I have recently inherited a large legacy system with a complex codebase and very little documentation. I'm the sole internal developer, embedded with the core users, and I am taking over some responsibility from an external contractor. While modules and functions are documented well enough, there is no documentation of the system itself: what it is, why it exists, and how the modules fit together.
This is a web application, and a good example of the problem is how it produces a page. The request executes PHP, which calls the database to retrieve some XSLT with custom extensions. That XSLT is then transformed (this step may happen many times), and the process may also retrieve PHP from the database. Sometimes, however, what it retrieves is a PHP stub, which indicates that it actually needs to load a PHP file from the file system.
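To make that flow concrete, here is a rough sketch of how I currently understand it. It is written in Python purely for brevity (the real system is PHP and XSLT), and every name, record key and file path in it is made up for illustration. The in-memory dictionaries stand in for the real database and file system, and no actual XSLT transformation is performed.

```python
# Hypothetical model of the page flow; none of these names exist in the real system.
# Each database record is (kind, payload):
#   "xslt"     -> payload is a list of further records the transform pulls in
#   "php"      -> payload is unused; the PHP is stored in the database itself
#   "php_stub" -> payload is the path of the real PHP file on disk
DATABASE = {
    "page:home": ("xslt", ["fragment:menu", "code:greeting"]),
    "fragment:menu": ("xslt", ["code:legacy_report"]),
    "code:greeting": ("php", []),
    "code:legacy_report": ("php_stub", "lib/legacy_report.php"),
}
FILE_SYSTEM = {"lib/legacy_report.php": "<?php /* real code lives on disk */"}


def resolve(key, trace, depth=0):
    """Follow one record and everything it references, recording each hop."""
    kind, payload = DATABASE[key]
    indent = "  " * depth
    if kind == "xslt":
        trace.append(f"{indent}db: transform XSLT {key}")
        for child in payload:  # a transform may pull in further records
            resolve(child, trace, depth + 1)
    elif kind == "php":
        trace.append(f"{indent}db: run PHP {key}")
    elif kind == "php_stub":
        if payload not in FILE_SYSTEM:
            raise FileNotFoundError(payload)
        trace.append(f"{indent}db: stub {key} -> file: {payload}")
    else:
        raise ValueError(f"unknown record kind: {kind}")
    return trace


def render_page(page_key):
    """Start from the PHP entry point and return the full list of hops."""
    trace = [f"php: entry point requests {page_key}"]
    return resolve(page_key, trace, depth=1)


if __name__ == "__main__":
    print("\n".join(render_page("page:home")))
```

Running it prints an indented trace of each hop from the entry point down to the file on disk, which is roughly the kind of macro view that is missing at the moment.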
While individual chunks of this process are captured, there is no documentation that takes a systemic or macro view of the world. So I have started writing down what I've been able to work out from emails with the current consultant, from reading the code, and through raw determination.
Because of the importance of this system, the people I work with feel it's important that the system can be understood, so that they are less reliant on a single person, be it me, the original developer or anyone else. To be honest, I am happy with this. I'm secure in my own abilities, and if part of my task is to help mitigate the organisational risk should I disappear, then that's fair. I also enjoy the task of unravelling a giant legacy system: it's a giant puzzle.
However, the challenge is this: I need to produce some solid documentation of how the system works, both for me and for whoever follows. While I have started to get a clear understanding, what I would like to know is this: what kinds of things would a new developer need to see to make sense of a legacy system that was new to them?
Similarly, knowing what needs to be written, how would you tackle diving into, dissecting and documenting a large legacy codebase?