169

I started a new job recently where I am working on a very large application (15M loc). In my previous job we had a similarly large application but (for better or for worse) we used OSGi, which meant the application was broken down into lots of microservices that could be independently changed, compiled, and deployed. The new application is just one large code base, with maybe a couple of .dlls.

So I need to change the interface of this class, because that's what my boss asked me to do. They initially wrote it with some assumptions that didn't generalize too well, and for a while they have been avoiding the problem of refactoring because it's so tightly coupled. I changed the interface and now there are over 25000 errors. Some of the errors are in classes with important sounding names like "XYZPriceCalculator" which reaaally should not break. But I can't start up the application to check if it's working until all the errors are resolved. And many of the unit tests either directly reference that interface, or are coupled to base classes which reference that interface, so just fixing those is a pretty huge task in itself. Plus, I don't really know how all these pieces fit together, so even if I could get it to start, I don't really know what it would look like if things were broken.

I never really faced a problem like this at my last job. What do I do?

Panzercrisis
  • 3,145
  • 4
  • 19
  • 34
user788497
  • 1,321
  • 2
  • 9
  • 6
  • 7
    You're going to need to give us more information about the kinds of errors and what "this class" is, we're not mind readers. – whatsisname Nov 04 '16 at 02:28
  • @whatsisname The class is a very domain specific concept and I am not sure it would be meaningful to explain it. – user788497 Nov 04 '16 at 02:48
  • 140
    "over 25000 errors" such numbers shown in VS are often wrong. There are a few errors, but with the build of an often used dll broken, other builds do fail also, giving raise to astronomic error numbers. I always start fixing with the first error message found in the build output, not in the error list. – Bernhard Hiller Nov 04 '16 at 08:41
  • 1
    Well, did you add a parameter that has a sensible default value, for example? – user253751 Nov 04 '16 at 09:20
  • 13
    I would like to see the before and after signatures on the method you changed. BTW - 25k errors doesn't really sound like too many to deal with. Tedious yes, daunting, yes, unmanageable, no. – jmoreno Nov 04 '16 at 10:49
  • @jmoreno depends on how heavily the surrounding code needs to be adjusted ... FWIW that could be a sinkhole for days and days of working time, and that needs to be handled differently than just adjusting the callsites a bit ... – Vogel612 Nov 04 '16 at 11:42
  • 1
    As @BernhardHiller said, this is likely a cascading error situation and you don't actually have 25,000 errors. To get a better handle on how many call locations there are, you can do a find in files on the method name if it's somewhat unique. See how many results that gives you. – 17 of 26 Nov 04 '16 at 12:58
  • 47
    A public interface is a contract. Don't break such contracts -- create new ones. – Matthew Read Nov 04 '16 at 15:09
  • 1
    "They initially wrote it with some assumptions that didn't generalize too well" seems like you can keep this itnerface, and create a more general one on the side, is that right ? – Asoub Nov 04 '16 at 16:01
  • Often, you can find a refactoring path to only use tool refactorings that are guaranteed to not break (renaming, adding a nullable method argument, etc.). That way you can make mass changes with confidence. – usr Nov 04 '16 at 18:04
  • 6
    25000 errors!? Houston, we got a problem. Definitely revert the change and talk to your manager, explain to him that you might have to create a new interface altogether. – Code Whisperer Nov 04 '16 at 18:53
  • My personal record was 4,500 errors with a compiler that stopped counting at 100 errors in a file. (That was porting a large app from one OS to another, after just importing the files and pressing "build". ) – gnasher729 Nov 05 '16 at 17:28
  • @user788497 Without going to extreme detail regarding anything proprietary or secret, what exactly is the class you altered? What sort of alteration(s) did you make? Did you change methods in the class, add new parameters, etc? It is quite possible that your mistake is due to your change creating a non-equivalent interface. Be careful. If you had a list but now it is a tree or set, you might find that things are breaking because they are relying upon a broken class. I would argue that due to the high number of errors, the class you altered is the MOST CRITICAL CLASS, not that XYZ calculator. – user64742 Nov 06 '16 at 02:07
  • 4
    Don't IDEs refactor everything for you? If I change some class name in Eclipse it changes the reference everywhere in the source classpath. – mid Nov 06 '16 at 19:55
  • 2
    IntelliJ refactorings will fix this for you. – noɥʇʎԀʎzɐɹƆ Nov 06 '16 at 20:54
  • 1
    This might be an excellent opportunity to bring a more experienced developer in, present the problem and _learn_ from the suggested solution. – Thorbjørn Ravn Andersen Nov 06 '16 at 23:56
  • 1
    I would suggest that before you contemplate changing an interface (especially such an important one) that you do some thorough analysis in order to understand the impact of your change. Sounds to me like you're changing things without really knowing what's likely to happen. That's a Heath Robinson approach to development. The likely result of that is you'll (a) potentially miss a better solution and (b) probably let other issues into production once you've resolved your 25,000 errors. Spend some time understanding WTH you're doing. And you might have to tell your boss that's time well spent. – Bradley Thomas Nov 06 '16 at 23:59
  • Why is your boss telling you such low level details of how to program? He or she should be giving you a problem statement and trusting your judgement on how best to achieve it. You may need to "manage up". – Brandon Nov 07 '16 at 18:57
  • @Midnightas I'd be amazed if Eclipse gracefully handled a project with 15 million lines of code. 15,000, yes. 150,000, probably. 1,500,000? Probably not. (But I have never tried it.) Compilation alone will take hours, let alone refactoring, interfacing with version control etc. – Peter - Reinstate Monica Nov 08 '16 at 13:38
  • 2
    The phrase [we're gonna need a bigger boat](https://www.urbandictionary.com/define.php?term=We%27re%20Gonna%20Need%20a%20Bigger%20Boat) can help convey the situation. This was expected to be simple. It's not, or at least the "obvious" approach isn't. Happens. – Steve Jessop Nov 08 '16 at 16:17
  • What your boss wanted then is probably not exactly *directly* changing that interface, since so many things depended on this interface already. Or you might just tell him this. The most obvious thing to do is to just code another interface to replace the original one, instead of changing it directly. – xji Nov 09 '16 at 12:26
  • I think there is a key bit of information missing from this question. Did you start trying to fix the errors or did you just panic? If you started to fix the errors and things began improving quickly, maybe you were on the right track. If you started fixing the errors and it cascaded into a bunch of new errors, you are right to back-off. Lots of gradations in between those extremes. – OldFart Nov 11 '16 at 20:49
  • Whatever route you take I suggest using source control directly on your computer. That will give you great freedom to experiment and recover. – radarbob Nov 11 '16 at 20:52
  • Start reading "Refactoring, Improving the design of existing code" by Martin Fowler. Skim it for ideas and read in detail where it hints at promise. Like any gut-ripping operation the code will get seemingly messier before it gets better. But if you have an idea where you want to go with it, the code *will* get better. – radarbob Nov 11 '16 at 20:58

9 Answers9

350

25000 errors basically means "don't touch that". Change it back. Create a new class that has the desired interface and slowly move the consumers of the class to the new one. Depending on the language, you can mark the old class as deprecated, which may cause all sorts of compiler warnings, but won't actually break your build.

Unfortunately these things happen in older code bases. There's not much you can do about it except slowly make things better. When you create the new classes, make sure you properly test them and create them using SOLID principles so they will be easier to change in the future.

Stephen
  • 8,800
  • 3
  • 30
  • 43
  • 80
    It does not necessarily have to be a new *class*. Depending on language and circumstances it might be a function, interface, trait, etc. – Jan Hudec Nov 04 '16 at 07:23
  • 33
    It also helps if the old interface can be replaced by compatibility wrapper around the new one. – Jan Hudec Nov 04 '16 at 07:24
  • 5
    I have to disagree, often tools can help to carry out the repetitive part of the changes (as pure refactorings), which in this case could mean you only need to touch e.g. 25 of the 25'000 clients manually. – theDmi Nov 04 '16 at 07:58
  • 82
    Sometimes, 25000 errors actually means _We never dared touching that, but now that a newcomer has arrived, let's give him that cleaning-out-Augean-stables task._ – mouviciel Nov 04 '16 at 10:46
  • 14
    @theDmi: I agree, 1 change and 25k errors most likely means 2.5k changes at most, and I wouldn't be surprised to find that it was around 250. Lots of work either way, but doable. – jmoreno Nov 04 '16 at 11:04
  • 34
    Those 25000 errors might all be fixable by a single change. If I break the base class of a huge heirarchy, every one of the derived classes will spew errors about invalid base, every use of those classes will spew errors about the class not existing, etc. Imagine it was "getName", and I added an argument. The override in the "HasAName" implementation class now doesn't work (error), and *everything inheriting from it* now generates errors (all 1000 classes), as well as every time I create an instance of any of them (24x per class on average). The fix is ... one line. – Yakk Nov 04 '16 at 13:42
  • 1
    Do you need to completely change the interface (as in, from `IInterfaceA` to `IInterfaceB`) or is it enough to change its definition? If the latter, than add new method declarations rather than change old ones. If the language allows, mark the old methods as deprecated (then you'll get warnings rather than errors). You can now start changing the code base to use the new methods without actually breaking anything. – MBender Nov 04 '16 at 14:37
  • 3
    If the wording in the question was chosen intentionally, and the new class is really a *generalization* of the existing one, then it should really be possible to let the old class remain as a specific instance/implementation of the new, more general interface. – Marco13 Nov 04 '16 at 17:38
  • 2
    Small bonus question what are the SOLID principles? – Mister Verleg Nov 04 '16 at 19:38
  • 6
    SOLID (single responsibility, open-closed, Liskov substitution, interface segregation and dependency inversion) – Laiv Nov 04 '16 at 20:04
  • My idea would be to create a feature branch that starts with the primary change, follows with the thousands of dependent changes, then finally gets merged months later – John Dvorak Nov 04 '16 at 20:34
  • 3
    Please whenever you suggest some kind of “_add a new, similar **X₂** instead of changing the existing **X**_” change, make it clear that this should nevertheless obey [DRY](https://en.wikipedia.org/wiki/Don%27t_repeat_yourself). I've far too often seen it that people _copy&paste_ hundreds of lines of code to avoid such interface problems, but then the old version would never be removed... in the end, this makes the codebase even more huge for no good reason, and of course makes similar or worse refactoring problems ever more likely in the future. – leftaroundabout Nov 06 '16 at 23:51
  • 1
    @leftaroundabout that's the point of deprecating the old class/interface/methods - it says "this still exists, don't use it any more, it's not being maintained and is going to go". When resources permit, any consumer of the deprecated interface can be moved to the new one. It's about managing the change. – Stephen Nov 07 '16 at 00:01
  • 2
    Yes, deprecation is one way to go about this. Just, it's important that this _is_ done, before somebody else starts modifying **X** or **X₂** in such a way you can't just get rid of one of them anymore. – leftaroundabout Nov 07 '16 at 00:14
  • 5
    Before you replace `OldClass` with `NewClass`, try to find out how many `VeryOldClass`, `VeryVeryOldClass`, and `NewButAbandonedClass` have previously been trying to solve the same problem. – Peter Nov 07 '16 at 22:48
  • 1
    More often 25,000 errors really means something much more benign like "oops, you forget a semicolon on line 762 and then the parser exploded". And in any case, I wouldn't advocate for a whole new class implementation. It should be sufficient to 1) leave the original method in place, 2) add a new method with the desired signature (to the _original_ class), 3) relocate all your actual business-logic to the new method, 4) rewrite the old method so that it just delegates to the new method, and 5) flag the old method as deprecated. Cleaner fix, much less disruptive, and shouldn't break anything. – aroth Nov 10 '16 at 05:57
  • Ahh Microsoft and their `CreateWindow` and `CreateWindowEx`(and several other examples, like `ID3DDevice`, `ID3DDevice1`, `ID3DDevice2` - or something like that)... – Bruno Ferreira Nov 10 '16 at 14:17
80

Divide and conquer with refactorings

Often, breaking up the change that you need to do into smaller steps can help because you can then perform most of the smaller steps in a way that doesn't break the software at all. Refactoring tools help a lot with such tasks.

Divide

First, identify the smallest possible changes (in terms of logical changes, not in terms of changed LoC) that sum up to the change you want to achieve. Especially try to isolate steps that are pure refactorings and can be carried out by tools.

Conquer

For complicated cases like yours, it may make sense to perform one small refactoring at a time and then let the problem rest, so that all continuous integration systems can verify the change, and maybe the testing team had a look as well. This validates the steps you make.

To carry out a particular refactoring, you absolutely need tool support for a case where you have 25,000 call sites of the method you want to change. Maybe search-and-replace works, too, but for such a critical case, I'd be scared by that.

Example

In C#, for example, it's possible to use Resharper to change the signature of a method. If the change is simple enough, e.g. adding a new parameter, then you can specify which value should be used at call sites that would otherwise have a compile error.

You are then at an error free code base immediately and can run all unit tests, which will pass, because it was a refactoring only.

Once the signature of the method looks good, you can go and replace the values that Resharper added as arguments to the newly introduced parameter. This is not a refactoring anymore, but you have an error-free code base and can run the tests after each line you change.

Sometimes that doesn't work

This is a very useful approach for cases like yours, where you have a lot of call sites. And you do have working software now, so it may be possible to make small refactoring steps to change the signature just a bit, then do another one.

Unfortunately it won't work if the signature change is too complex and cannot be broken down into smaller changes. But this is rare; splitting the problem into smaller problems usually shows that it is possible.

elixenide
  • 442
  • 1
  • 6
  • 17
theDmi
  • 898
  • 6
  • 10
  • 13
    This. Refactor if you can (safely), otherwise you need a proper migration. – sleske Nov 04 '16 at 07:46
  • 3
    And for the love of whatever, please make use of your IDE's built-in refactoring tools, rather than just changing the (e.g.) method signature locally and then going 'round fixing the consequent errors. – KlaymenDK Nov 07 '16 at 13:14
36

Clarify your task with your boss to help him to understand the problem and your needs as a professional software developer.

If you are part of a team, look for the lead developer and ask him for advice.

Good luck.

mmehl
  • 469
  • 3
  • 5
  • 17
    This is the first step I'd take - "I'm a junior and my boss told me to do a thing and now I'm getting a really big error message, but my boss didn't tell me about that." Step 1: "Hey boss, I did the thing, but I get 25k errors now. Is that supposed to happen, or..." – Pimgd Nov 04 '16 at 10:22
29

Don't touch it. Don't commit anything.

Instead, sit down on your chair and shout "Heeeeelp!!!!!" as loud as you can.

Well, not exactly like that, but ask any of your senior colleagues for advice. If you have 25,000 errors, you don't fix the errors, you fix what caused the errors. And a senior colleague should be able to advice you how to make the change your boss wants without the 25,000 errors involved. There are various ways to do this, but what is a good way depends on your specific situation.

And it might be the boss told your senior colleagues to make that same change, and they said "no". Because they knew what would happen. That's why you were given the job.

gnasher729
  • 42,090
  • 4
  • 59
  • 119
  • 2
    This is definitely an important thing to keep in mind. If a new employee is truly ambitious to a large degree, they might be industrious (gullible) enough to do the really big task that only ends up gaining a 5 byte memory advantage or be bit more logical to recode and maintain later. – user64742 Nov 06 '16 at 02:10
  • @TheGreatDuck: Tell me you've never seen an interface used everywhere and *wrong* everywhere. I sure have. – Joshua Nov 08 '16 at 04:18
  • @Joshua that's not even relevant to what I just said. There are times were fixing a minor bug isn't always the smartest idea of it means rewriting 10,000+ lines of code. Unfortunately, a new employee might be gullible enough to pull 10 all-nighters outside of work rewriting that code to impress their boss and get it done. Sure, it's a great thing to do, but sometimes enough is enough. You just have to accept the bug's existence. – user64742 Nov 08 '16 at 14:17
  • 1
    @Joshua btw, I only joined this community because this question came in on my popular questions feed. I have no experience with large-scale design like this. I just agreed with this person's view. – user64742 Nov 08 '16 at 14:18
22

Entrenched APIs can't simply be changed. If you really need to change them, anootate and/or document them as deprecated (by whatever means the language allows) and document which API should be used instead. The old API can then be phased out slowly... perhaps very slowly depending on your time budget for refactoring.

Kevin Krumwiede
  • 2,586
  • 1
  • 15
  • 19
11

Evaluate

Evaluate whether this change is necessary, or whether you can add a new method and deprecate the other.

Forward

If changing is necessary; then a migration plan is necessary.

The first step is to introduce the new method and have the old method massage its arguments so it can call the new one. This may require hard-coding a few things; that's fine.

This is a commit point: check that all tests pass, commit, push.

Migrate

The busy work is migrating all callers of the old method to the new one. Thankfully it can be done gradually thanks to the forwarder.

So go ahead; don't hesitate to use tools to assist (sed being the most basic, there are others).

Mark the old method as deprecated (with a hint of switching to the new method); it will help you spotting whether you forgot anything and will help if a coworker introduces a call to the old method whilst you are working on this.

This is a commit point (or maybe several commit points): check that all tests pass, commit, push.

Remove

After some time has passed (maybe as little as a day), simply remove the old method.

Matthieu M.
  • 14,567
  • 4
  • 44
  • 65
  • 5
    `sed` is probably a bad idea... Something that actually "understands" the language and won't make unintended drastic changes is better. – wizzwizz4 Nov 05 '16 at 20:02
  • 1
    @wizzwizz4: Unfortunately, I've found very few tools that actually *understand* the language well enough for C++; most tools seem to cater to renaming and that's it. Granted, renaming *only* the exact method calls (and not any overload or unrelated but similarly named method calls) is already impressive, but it's insufficient for anything more complicated. At the very least, you would need the abilities to (1) shuffle arguments, (2) apply a transformation to a given argument (call `.c_str()` for example) and introduce new arguments. `sed` kinda works, the compiler catches its issues afterward. – Matthieu M. Nov 06 '16 at 12:07
  • 1
    `sed` (or `ed`) *can* be adequate for this sort of thing - as long as you properly review the diff before you commit. – Toby Speight Nov 07 '16 at 16:19
  • This is the TCRR of html, yeah? :) –  Nov 07 '16 at 23:38
8

If your change to the method signature is merely a name change, the simple solution is to use tools to automate the change in the 25,000 classes that reference the method in question.

I assume that you have simply edited the code manually, which gave rise to all the errors. I also assume you are familiar with Java (seeing your reference to OSGi), so for example in Eclipse (I don't know which programming environment you use, but other environments have similar refactoring tools) you can use "Refactoring -> Rename" to update all the references to the method, which should leave you with no errors.

In case you are making other changes to the method signature than simply renaming (changing the number or types of parameters), you can use "Refactoring -> Change method signature". However, chances are you will have to be more careful as the other answers suggest. Also, regardless of the type of change it may still be quite a task committing all those changes in a busy code base.

MikkelRJ
  • 129
  • 5
  • 2
    OP specifically said OSGi was at their previous job, so it's not really relevant here. – user Nov 05 '16 at 13:38
  • 3
    @MichaelKjörling If the OP was a Java programmer in their previous job, there's a better-than-evens chance they're a Java programmer in this job too. – Rich Nov 07 '16 at 10:11
  • @MichaelKjörling I wanted to give a concrete recommendation, using Eclipse for refactoring because OP is familiar with Java. Whether OP is in fact using Java for the current project, I guess, is less important, but I should clarify my answer. Thanks. – MikkelRJ Nov 09 '16 at 07:44
6

Here my contribution.

I started a new job recently where I am working on a very large application (15M lines of code).

You probably are not familiar with the project and its "features". Before to type down a single line of code, It's important to be familiar with the project. So do rollback your changes and start by analysing the code. (At least the affected one)

Understanding the existing solution gives you a better perspective of where are you getting into. Contextualise the solution and Its importance.

As @Greg pointed out, you should be able to test the existing code to have a valid reference to compare with (regression tests). Your solution should be capable of generating the very same results than the existing one. At this stage, don't you care about whether or not the results are right. The first goal is to refactor, not to fix bugs. If the existing solution says "2+2=42", your solution should too. If it does not throw exceptions, yours should not either. If it returns nulls, yours should return nulls too. And so on. Otherwise, you will be compromising 25k lines of code.

This is for the sake of the retro-compatibility.

Why? Because right now, It's your unique guarantee of a successful refactor.

And many of the unit tests either directly reference that interface, or are coupled to base classes which reference that interface.

A way to guarantee the retro-compatibility is urgently needed for you So here your first challenge. Isolate the component for unit testing.

Keep in mind that those 25k lines of code were made assuming the possible results of the existing code. If you don't break this part of the contract, then you are half way to the final solution. If you do, well: may the force be with you

Once you have designed and implemented the new "contract", replace the old one. Deprecate it or take it out.

I have suggested leaving bugs alone because of refactoring and fixing bugs are different tasks. If you try to take them forward together you may fail on both. You may think you found bugs, however, they could be "features". So leave them alone (for a minute).

25k lines of code seem to me enough issues to make me focus on only one task.

Once your first task is done. Expose those bugs/features to your boss.

Finally, as @Stephen has told:

There's not much you can do about it except slowly make things better. When you create the new classes, make sure you properly test them and create them using SOLID principles so they will be easier to change in the future

Laiv
  • 14,283
  • 1
  • 31
  • 69
5

Test it.

Everyone else is recommending how to refactor so there is little impact. But with that many errors, even if you succeed in refactoring with as little as 10 lines of code (you probably can), then you have impacted 25,000 code flows, even if you didn't need to rewrite them.

So, the next thing to do is make sure your regression test suite passes with flying colors. And if you don't have one, then make one that will. Adding a comprehensive regression test suite to your monolithic project sounds boring but is a nice way to increase confidence in release candidates, as well as release them faster if the suite is automated well.

Greg
  • 406
  • 2
  • 4