I previously asked this on Stack Overflow, but was advised it was a better fit for Programmers.
I have recently begun working on a codebase (as a Java developer) with the following characteristics:
- Roughly 800,000 lines of source (including whitespace and comments). Primarily Java, but also XML, PHP, Shell scripts, JSP, JS, HTML, CSS
- Frameworks: Mainly Stripes, a bit of Struts, and some quite old customised Hibernate (and I expect lots of other bits and bobs).
- No particular methodology for stories, sprints etc.
- No tests
- No build process
- No dependency management (everything is just set up in IDEA)
- Some very old code dating back as far as 2002
- No coding standards
- Masses of code redundancy, though given the lack of tests this is difficult to track down
- Multiple modules and inter-dependencies
- A massive 10,000-line XML file containing named Hibernate queries
- 'Dump-all' classes at the top of the inheritance tree containing references to multiple services that may only be used in a handful of child classes (as an example, one class is extended by 573 classes, yet a particular service it contains is used by only 3 of those children)
- Demotivated developers
I know this is a really open-ended question, but where would you start in trying to tidy up this system?