Legacy code: what to do in regards to maintenance and moving forward?

Question

I am currently working on an application with legacy code that was built using proof of concepts (POCs). These POCs became the finished production-ready code, there were no tests, and the classes have become quite large over time (pretty much god objects and there are multiple objects that follow this structure). These questions are related to these god objects as legacy code.

For the application itself, it works well enough, but there are bugs that do hinder the users. There are some workarounds in place, but there is a vibe of unreliability in the overall product (bugs creep back into the application due to coupling). For the development team, there is a fear of adding/modifying functionality due to large classes, tight coupling, and typical problems with inheriting legacy code that implements certain business rules and one-offs.

I do understand that with legacy code there is a need to review the costs of maintaining or refactoring code. Most attempts at refactoring would not be worth it, since the usage and coupling of POCs as production-ready code would make the refactor a fairly large undertaking. I have been through part of the Working Effective with Legacy Code book, but I do not know if this scenario specifically comes up. Also, I feel that developing valid tests for these kind of classes, or even some specified functionality, could fall under the same cost review as a refactor. (correct me if I am wrong on the assumption)

As for new functionality, I do not have a grasp on what should happen. There is the idea that new functionality could be separated into the preferred structure determined by the team, but there is the argument for consistency by putting the functionality in with the rest of the methods of some large class. Consistency being that some class which contains many uses/CRUD for many entities should receive the new functionality. For the consistency argument, it could have an advantage for the "possible" payoff on technical debt by having all of the functionality in the same place.

These questions lean more towards preference, and management normally has more weight in determining the direction taken. The main takeaway for me would be the outside perspectives. There could be discussions and other approaches that are being missed by the team (myself included) but could help in the long run. We could be doing everything that we can do, and that leaves us at the point of pushing through these growing pains.

Questions

For Maintenance, is there ever an instance where you cannot avoid a refactor that involves pulling out a piece of functionality from the large class, even if consistency is preferred? If so, then what would be a good example? (I do not know if there is a "problem" that fits this scenario, like what is seen with the square rectangle problem)

(my possible example; some functionality breaks because the large class was meant to deal with some in focus entity, but it is being used for multiple entities that are not always in focus. I cannot change the idea of the entity being in focus otherwise other pieces of functionality will break.)

For Moving Forward: Should the preferred method of adding functionality be up to the team/management? (it could be decided to separate or keep with consistency based on the situation)

Should new functionality always be one or the other (always consistent or always separate), with or without some expected large scale refactor?

Can you elaborate on what makes your scenario different from others discussed previously here and in Feathers' book? All such systems are (a) badly structured and (b) nevertheless in valuable production use, or the question wouldn't arise. And virtually always you have to make compromises w.r.t. normal principles such as consistency in order to get anywhere. — Kilian Foth, Nov 28 '17 at 10:54
@Kilian Foth - I believe that my wording is not the best. It wasn't that the scenario is different, but that I didn't know if it was something that came up either as an overall idea that is fleshed out over the course of the book or some specific points when managing/modifying legacy code that has evolved into these large classes. — eparham7861, Nov 28 '17 at 13:19
@gnat that question/answer is pretty close, but I do not know if the answer(s) that I am looking for makes them necessarily the same. If it really is a duplicate, could I ask a question of implementing that answer to my scenario? Mainly because it would/could feel like I am missing something. Again, I am looking more for discussion points to help with the fear and reactive nature that the development process currently has. I think as far as answers this could possibly be seen as a duplicate for any of the consistency vs refactor questions. — eparham7861, Nov 28 '17 at 14:41
`could I ask a question of implementing that answer to my scenario?` That seems totally reasonable — Daenyth, Nov 28 '17 at 15:02
Here is the link to the current question in regards to implementing that answer: https://softwareengineering.stackexchange.com/questions/361525/follow-up-legacy-code-maintenance-and-moving-forward-implementing-previous-ans — eparham7861, Nov 29 '17 at 16:53

score 4 · Accepted Answer · 2017-11-29T20:04:45.970

I'll give a very informal and possibly unusual answer as one who has come from a similar background about a decade and a half ago in ways where the techniques described in Working Effectively With Legacy Code were inapplicable in large part because the team and management were unwilling to accept them.

I was that kind of outspoken young guy insisting on using profilers, writing tests, etc, preparing lengthy presentations which often had some colleagues rolling their eyes. In retrospect I was also rather dogmatic since I saw the existing codebase as the worst thing imaginable and in the foulest ways that Michael Feathers himself probably might not have anticipated. He talked about monster methods spanning a couple hundred lines of code. Let's talk about countless C functions that span over 20,000 lines of code with sporadic goto statements, over 400 variables all declared at the top of the function and uninitialized, and colleagues that get pissed off if you initialize them to try to make the runtime behavior more deterministic while swimming through uninitialized variable bugs because they're afraid of the computational cost of initializing a plain old data type (ignoring the fact that optimizing compilers will generally optimize away redundant initialization). There was a repeatedly echoed policy to not fix what isn't broken but, when examining the nature of the codebase, the fact that each new version of the software often introduced more bugs than it fixed, and the lack of testing, often begged the question of how to effectively tell what wasn't broken.

There's more to this than just software engineering. There's your own sanity at stake if you try to fight too hard against the tide. It can help to relax a little if you feel like your team isn't cooperating and become a cheerful pragmatist and take pleasure in more things in life than a codebase built using the most sound engineering principles, like naked ladies and beer.

Anyway, for a start, monoliths or God objects start to undermine coupling and cohesion in a similar way as having a bunch of global variables. When your program is tripping over state management and finding it cannot effectively maintain invariants, then the difficulty of correcting those issues is proportional to the scope/visibility of the state involved. Just as a global variable might be accessed by hundreds of functions, so too might a class member if the class has hundreds of methods which have access to this persistent state. If the state violates a conceptual invariant and causes bugs, then your list of suspects is proportional to the number of functions that can modify the state, class methods or not. Explaining things this way might convince your colleagues to take it easy on the God objects and start hoisting functions out of a class that do not need access to the class's internals as a compromise. As an extreme example, if the entirety of your codebase's functionality was offered by a single God class, then it would be no more capable of effectively maintaining invariants over its private state than the worst C codebase using nothing but global variables for everything. From a practical debugging standpoint, the first priority to making things sane is typically to simplify state management and be able to confidently maintain invariants. That precedes even principles like SOLID if you have a horrific legacy codebase that's repeatedly encountering invalid states to first get its state management under control.

As for embracing a test-driven mindset, in my case the codebase was too foul and the team was too uncooperative to apply that mentality. You need the entire team really on board including the management if you want to start creating effective unit and integration tests on a codebase that requires 7,000 global variables to be initialized just right by sporadic functions and a main entry point which, itself, is over 80,000 lines of code, just to test a small section of the codebase.

I tried, one time, just over some weekends to construct a unit test of the licensing system for our software in a standalone project that tried to just come up with the minimal code required to test it and a reasonable interface to test against (the former license interface was monolithic and included even sporadic GUI functions). I ended up having to pull in around 800,000 lines of code since just about everything was coupled with everything else just to verify a license code with an initialization process that required dozens of hours of trial and error to figure out what global states needed to be effectively initialized to do so. In that case constructing the unit test required far more time, thought, and debugging than it would have taken to simply create a new licensing library from scratch which applied the same algorithm to verify a license code as the former one (something which could have been done with a few hundred lines of code, not 800k lines of tightly-coupled code). I can exchange horror stories all day long about nasty codebases from that former experience.

If you are in such a scenario, maybe the best option is to leave. That's what I eventually did. That said, I've found it useful for sanity purposes to seek to recreate key sections of the software almost from scratch, only using the old code for reference purposes. The goal is to come up with a completely independent library that you can test which recreates crucial functionality required for your software. It might not be the most productive way to go about it, working completely outside of your software on code that is not going to immediately benefit it, but at the end you get this independent library you can use and confidently demonstrate to your team, showing that it works conceptually through the tests and perhaps convincing enough people to replace large sections of code with your library which is now shown to be effective and reliable. It's a good confidence-boosting exercise if you are swimming in a foul sea of hopeless complexity.

Honestly, thank you for that answer. Overall, you did capture the whole scenario of the team, and what I have tried to get to some semblance of moving on with the code base. And I agree that this goes beyond software engineering, and I begin to wonder how being pragmatic, as you have brought up, would work with the idea of staying professional. Not just to employers, but to the field, and that is probably from me needing a longer amount of experience. The best option is leaving, and I am currently waiting and attempting to make things better in the meantime. — eparham7861, Nov 28 '17 at 15:49
As far as the question, this best provides many points of discussion that have either not been happening or just plain refused, and other answers hit varying points as well. So, I could say that this answer covers the overall idea of what I was looking for as an answer. — eparham7861, Nov 28 '17 at 15:52
Cheers! I've developed some possibly odd ideas about complexity working in these very sub-optimal conditions. In my opinion code can be as complex as it wants to be provided it works and the probability of it ever needing to be modified in the future is minimal. The problem is when you have instability and complexity at the same time; code that needs to be constantly touched by one or, worse, many members of the team that is incredibly complex. That's a recipe for a never-ending slew of bugs that keep cropping up with no sense that your software is becoming more robust or comprehensible... — , Nov 28 '17 at 17:32
... so it's worth seeking either simplicity or stability -- either eliminate the reasons for the code needing to be modified or eliminate the complexity. Establishing an independent library with thorough testing, even though that might require a bit more work away from the main codebase, can achieve either or both. Most of all, I find it a good way to figure out how things work without wading through an epic codebase and getting lost in the implementation details. It drives the focus back to figuring out how to solve a problem and just using snippets from the old codebase as clues. — , Nov 28 '17 at 17:34
As for professionalism, I'm a bad source. I was never great about it, always driven by impatience. But sometimes if you become too passionate about the product and the engineering behind it, it can actually start to interfere with the professionalism. There's some level of professional detachment you need to maintain your cool, accepting that a lot of things are outside of your direct control -- even if you can conceptualize an almost certainly superior solution and feel it can be implemented in a very reasonable time, it often hurts rather than helps to try to move so far ahead of the pack. — , Nov 28 '17 at 17:38
I struggle the most with grasping the idea of incorporating any level of professional detachment. As for the rest, the goal will be starting a conversation with those discussion points, and figuring our what can be used to work on myself, and what I need to develop/grow more in terms of being in software development. So again, thank you for the response, from you and everyone else. — eparham7861, Nov 28 '17 at 18:19

score 2 · Answer 2 · answered Nov 28 '17 at 09:22

2

I do not agree with the consistency argument, where you say that

Consistency being that some class which contains many uses/CRUD for many entities should receive the new functionality.

One of the things you should strive for is Single Responsibility Principle, so classes that contain many functions for many entities should be avoided. Also, breaking with this approach and moving out new or updated functionality out of the current classes will allow you to easily keep track of what code is still "old". It will give you many benefits. You'll have a better understanding of what needs to be tested and what remains to be refactored, if the time comes.

The exact way to move out code to a new place depends on the architecture in place, the type of application in general and what the new architecture is going to look like.

answered Nov 28 '17 at 09:22

Jonathan van de Veen

608
3
10

For a more overall picture when discussing consistency in this scenario, this idea of consistency is a discussion that needs to happen differently for the ones involved (team and so forth), but it is an approach being pushed (this would be by either lead/supervisor and senior positions). (I can add this piece to the question if it clears up anything) – eparham7861 Nov 28 '17 at 13:25
As for the SRP and architecture piece, the code base has been through a few iterations where there has been a full team turnover, so there is a lot of second guessing on given architecture, and discussions on defining an architecture is not happening. Like stated above, there is a fear surrounding the code base, and it feels like the current discussions and the driving force for how things are developed are mostly reactive to get something that "works" out the door. – eparham7861 Nov 28 '17 at 13:36
The more I read from your replies, the more I feel that there is a very different problem. No consistent leadership and therefor no single vision as to what direction to follow. This now becomes an organizational problem more then a software engineering challenge. I'm not aware of the culture within this organisation, but if I would be in your position I would escalate this up the chain of command so to speak, until I find someone willing to make the changes needed to get back on track. If I'd hit a dead end there, I'd walk away, because this company is not going to last. – Jonathan van de Veen Nov 29 '17 at 11:21
Yes, you are correct on the actual problem. I wanted to aim more from the software side of things with the question and answers, but that overall problem came out. As far as dealing with that overall problem, it has been made known up the chain, so that will become whatever it will. (This comment is an attempt at being vague, but still constructive; sorry if it comes off as confusing) – eparham7861 Nov 29 '17 at 13:17

score 1 · Answer 3 · answered Nov 28 '17 at 08:59

1

First define what you are moving towards. Unless you have a clear vision about what this is you wont be adding value. eg :

upgrade from classic asp to mvc.
switch to a no-sql database.
move these long running processes to a worker-queue.

If its just "these objects are too big" I think you are going to have trouble defining the 'new way'

Ideally you then stop adding any code to the old way and put all new code in the new way. Obviously this isn't always possible, you will be tempted/forced to do some bug fixes.

Eventually everything is the 'new way' and you can remove the last bits of the 'old way' by then you might have a 'new new way' though

answered Nov 28 '17 at 08:59

Ewan

70,664
5
76
161

Yes, I do see your point as far as defining the new way. Currently, there is a reactive feel to how things are being developed now which leaves us with no real "new way". As far as my question and no defined new way, could there be a reason that I should keep modified, or new, functionality with some big object, given that it is with similar functionality/ideas, or could the changes become a hindrance rather than being consistent? (hindrance like coupling where the overall class works with one or more entities in a way that breaks/cannot follow some expected functionality) – eparham7861 Nov 28 '17 at 15:21
if the team doesnt agree on a coding style then my rule is the most prolific coder wins :) But yeah, if everyone is doing things dramatically differently it can be unproductive. Obvs if everyone is generally doing small encapsulated objects you can have competing styles in the same project without issue. – Ewan Nov 28 '17 at 15:31
Things are changing to where everyone does pull in different directions, and in some cases, or some developers. get micromanaged to some other direction, and you start to see some developers "getting" to do small encapsulated objects and others that don't. To the point where the micromanaged developer(s) are "sometimes" required to change the small encapsulated objects towards the other direction. And, that starts getting into a vague/out-of-scope discussion on the question, which normally ends up being a "it depends" type of answer. – eparham7861 Nov 28 '17 at 16:25

Legacy code: what to do in regards to maintenance and moving forward?

3 Answers3

Linked