2

After much hard work, I've convinced my manager that some absolutely awful code needs refactoring. As with any project, they've asked me for a time estimate and I've found myself stumped. How can I estimate how long a refactoring project will take?

Answers/comments have proven the following points to be relevant in my specific case, but I only came here with the intention of asking about the general one.

  • We have no tests. I'd struggle to call the code testable. Making it testable would require major (and by definition, untested) changes.
  • The code is so bad that refactoring it is a major project. Mere opportunistic refactoring won't do here.
  • We're not currently trying to add anything in particular. We're wanting to fix it because either when it breaks or when we do want to add anything, we know it's going to suck.
J. Mini
  • Do the refactoring and add 10%? – mmathis May 03 '23 at 17:48
  • 2
    Estimation is a complex subject - books have been written about it. But in short, break the refactoring down into smaller tasks, until each subtask is small and specific enough that you can give it a reasonable estimate. Then sum all the tasks. Of course, making the estimate itself takes time. – JacquesB May 03 '23 at 20:31
  • 1
    How many people have started renovating a house, only to learn that there was more to it than they initially thought? We cannot account for how accurately you've ascertained what needs to be refactored, how deeply you want to clean this, and where you draw the line when it gets to be too much. These three things, among others, are essential if you want to come up with any order of approximation of an estimate. I'm not sure what the core question here is that you want answered. There is no magic formula that can be calculated based solely on the information you've provided. – Flater May 04 '23 at 00:04
  • 1
    It is a pointless question, like "How long does it take to get the bugs out of a software product?". Refactoring should be a continuous, ongoing effort. You only start doing it on parts once you recognize problems with those parts. In any product you may have tens of subsystems that could be candidates for refactoring, and then you would have several options for how to change things in each subsystem. And as you start your effort with one, your initial plans would morph into better ones. Planning is not a meaningful concept when it comes to refactoring. – Martin Maat May 06 '23 at 14:49
  • Estimate how long it would take to write and debug new code with the same functionality from scratch. That should be your upper bound. – cwallach May 15 '23 at 07:36

6 Answers

8

As with any project, the only truthful time estimate is far more about how much patience you have with the problem than it is about its difficulty.

Think about it this way. If I only gave you a day to work on this what would you do if you had to be done on time?

Now if I gave you a week what would you do differently?

Organize the work so that if you have to be done sooner than you expected you could still turn in something useful.

Work that way and you can pick whatever deadline you want. More time just means more gets done.
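As a rough sketch of that idea: keep the refactoring as an ordered list of steps that each leave the code in a shippable state, and let the deadline decide how far down the list you get. The step names and hour figures below are hypothetical, and the "stop at the first step that doesn't fit" rule is just one simple way to do it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    hours: float  # rough guess; each step must leave the code shippable


# Hypothetical backlog, most valuable step first.
BACKLOG = [
    Step("pin current behaviour of the billing module with tests", 6),
    Step("rename misleading functions in the billing module", 2),
    Step("extract the duplicated tax calculation", 8),
    Step("split the oversized billing class", 30),
]


def plan(backlog, budget_hours):
    """Return the leading steps that fit in the budget, in priority order."""
    chosen, used = [], 0.0
    for step in backlog:
        if used + step.hours > budget_hours:
            break  # stop at the first step that does not fit
        chosen.append(step)
        used += step.hours
    return chosen


one_day = plan(BACKLOG, budget_hours=8)
one_week = plan(BACKLOG, budget_hours=40)
print("one day: ", [s.name for s in one_day])
print("one week:", [s.name for s in one_week])
```

With a longer deadline the same list simply gets consumed further; nothing about the plan itself has to change.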

And that means what you’re really estimating is how much time is worth throwing at the problem you’re fixing. Not how much work there is. Which is good because if you dig deep enough there’s always more you could do.

Now that estimating issue aside, let me ask, why are you asking management about refactoring? It’s none of their business. Management doesn’t understand refactoring. It’s not something you do for them. It gives them nothing. Refactoring is something you do to make it easier to give management what they do want.

Don’t ask for permission to refactor. Just refactor when you see the need. Regression tests should prove if you broke anything.

candied_orange
  • We have no tests. The code is so bad that refactoring it is a major project. – J. Mini May 03 '23 at 19:22
  • 3
    @J.Mini ah ha. OK, go read Michael Feathers' Working Effectively with Legacy Code. Your first job is to find ways to slip in tests. If you have requirements, great. But if this is like most projects, the only source of truth is the “working” code. – candied_orange May 03 '23 at 19:27
  • Even still, you should have eyes on something management does want. It’s easier to decide how to refactor when you know what feature you’re trying to add. – candied_orange May 03 '23 at 19:49
  • Good point, but we're not currently trying to add anything in particular. We're wanting to fix it because either when it breaks or when we do want to add anything, we know it's going to suck. – J. Mini May 03 '23 at 21:00
  • The best way to measure the flexibility of your code is to flex it. – candied_orange May 03 '23 at 23:32
  • 2
    @J.Mini then your first task is to write tests. Estimate that. That alone may end up indicating work to be done, estimate that. Baby steps. – jwenting May 04 '23 at 08:04
3

You won't be allowed to do such a refactor in one iteration. At this scale of change, ahead-of-time estimates are useless because the code keeps changing due to ongoing development.

Big changes are often surrounded by uncertainty and uncertainty is always interpreted as a risk. So, your goal is to propose a way to manage this uncertainty.

The idea then is to propose a roadmap. A roadmap you all can adapt and estimate progressively.

Decompose the roadmap into milestones. Order the milestones by complexity, with the simplest (lowest risk) at the beginning [1]. For example:

  • Document the existing code, starting with the code that is first in line to change. Consider adding comments that indicate how certain pieces of code should change, which other developers can use as tips and hints.
  • Set a code style and format to reduce the risk of merge conflicts.
  • Improve the existing naming conventions. Give meaningful names to classes, methods, variables, etc. If a component does a lot of things, don't be afraid to reflect that mess in its name. Names can help to spot code that needs attention.
  • Add and set up a test framework.
  • Add a canary test (it checks that the test framework works).
  • Add a few (automated) tests. Start with the code that is first in line to change. If the code wasn't designed to be testable, don't change it yet. Do your best to test it as is. This is very important.
  • Remove duplicated code. Back changes with tests.
  • Identify patterns (only identify and document)
  • Turn patterns into abstractions. Make sure these abstractions are testable. Back them with tests.
  • Replace legacy code with the new abstractions. Back changes with tests.
    • Work on one abstraction/change at a time, so you can make fine-grained estimations and narrow down the "blast radius".
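
To make the "canary test" and "test it as is" items more concrete, here is a minimal sketch using plain pytest-style test functions. The function `legacy_format_price` is a hypothetical stand-in for untouched legacy code; the point is that the tests pin down what the code does today, quirks included, rather than what it ought to do (Feathers calls these characterization tests).

```python
# Hypothetical legacy function, shown only so the tests have something to call.
# It is deliberately left exactly as it was "found".
def legacy_format_price(amount, currency):
    if currency == "USD":
        return "$" + str(round(amount, 2))
    return str(round(amount, 2)) + " " + currency


def test_canary():
    """Canary: can only fail if the test framework itself is not wired up."""
    assert True


def test_characterize_usd():
    # The missing trailing zero looks wrong, but it is the current behaviour,
    # so the test records it instead of "fixing" it.
    assert legacy_format_price(3.5, "USD") == "$3.5"


def test_characterize_other_currency():
    assert legacy_format_price(10, "EUR") == "10 EUR"
```

Once tests like these pass against the unchanged code, any later refactoring that alters the output shows up as a failure instead of a surprise in production.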

Given a roadmap, propose a strategy. For example, the dev team can work on (only) 1 or 2 milestones at a time per development cycle.

After each development cycle, review the roadmap with your PM. Let them choose the best time to introduce the next N milestones into the development cycle. This way, you estimate and work on small tasks every time.

Finally, remember to include the time spent on analysis, an error margin and a margin for the management.

The error margin has to be generous because you don't know who is going to implement these changes. More often than not, the person doing the estimate is not the one doing the refactoring.


1: The idea is to allocate code in every cycle, so that the next one is ready for refactoring and hence "cheaper" to execute.

Laiv
1

Estimation is a complex subject - books have been written about it. But in short, you estimate any task (whether refactoring or adding new features) by describing the task and breaking it into subtasks. You should end up with subtasks that are small enough and detailed enough that you can make a reasonable estimate on the scale of hours. If a task is more than a few hours, you need to break it down further. If there are several developers, it is a good idea to make separate estimates and then compare and discuss them afterward.
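
A tiny sketch of that bookkeeping, with hypothetical subtasks and a threshold of 4 hours as my own reading of "a few hours": sum what you have, and flag anything that is still too big to trust.

```python
MAX_HOURS = 4  # assumed cut-off for "a few hours"

# Hypothetical subtasks with estimates in hours.
subtasks = {
    "read and document the parser module": 3,
    "write tests for the parser's public functions": 4,
    "replace global state in the parser": 12,
    "remove dead code paths": 2,
}

# Anything above the cut-off is not an estimate yet, just a guess.
too_big = [name for name, hours in subtasks.items() if hours > MAX_HOURS]
total = sum(subtasks.values())

print(f"total so far: {total} h")
print("break down further:", too_big)
```

The total is only worth quoting once the "break down further" list is empty.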

Remember to include time spent understanding and documenting the code and time for writing tests.

This process of estimating is of course itself rather time-consuming, but if you need a reasonable estimate, there is no way around it.

In your particular case you probably need to write a bunch of tests before starting the refactoring (otherwise how would you know the refactoring doesn't break anything?). If the code is not written with testing in mind, this may itself be rather challenging and even require some smaller preliminary refactorings e.g. to enable mocking of external services. This in turn will be risky and require manual testing, so remember to include this in the estimate.
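
As an illustration of such a small preliminary refactoring, here is a sketch in which a hard-wired call to an external service becomes a replaceable parameter. All names and the URL are hypothetical; the technique shown is plain dependency injection through a default argument, which is only one of several ways to enable mocking.

```python
import urllib.request

RATE_URL = "https://rates.example.invalid/per-kg"  # hypothetical endpoint


# Before: the network call is hard-wired, so any test would hit the real service.
def shipping_cost_before(weight_kg):
    with urllib.request.urlopen(RATE_URL) as resp:
        rate = float(resp.read())
    return round(weight_kg * rate, 2)


def fetch_rate_from_service():
    """The same network call, moved behind a function that can be swapped out."""
    with urllib.request.urlopen(RATE_URL) as resp:
        return float(resp.read())


# After: the rate lookup is a parameter; production callers get the real one by default.
def shipping_cost(weight_kg, fetch_rate=fetch_rate_from_service):
    return round(weight_kg * fetch_rate(), 2)


# In a test, pass a stub instead of touching the network.
def test_shipping_cost_uses_rate():
    assert shipping_cost(2, fetch_rate=lambda: 1.5) == 3.0
```

Note that the step from "before" to "after" is itself untested, which is exactly why it needs manual checking and belongs in the estimate.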

In general I would be skeptical towards a large refactoring "for no particular purpose" as you describe. But if your manager is fine with that, go ahead.

JacquesB
1

"How can I estimate how long a refactoring project will take?"

Longer than any estimate you make, probably.

I would suggest a few useful starting points.

If you are intending to change the entire codebase, then estimates should probably start at the same number of man-hours the existing edifice took to produce.

If you're intending to set a higher standard this time, with more complicated or subtle code than before, then increase the estimate from the starting point - I would suggest increasing the estimate only in integer multiples, under this heading.

There may be circumstances which reduce the estimate.

For example, if you wrote the existing codebase and now have many more years of business experience, that will considerably speed the redevelopment process because you'll already have a considerable understanding.

If technology has moved on significantly since the original development, and you can identify a specific feature that was once bespoke and difficult and is now trivial, that may speed the redevelopment.

If the scope of the redevelopment is clearly localised and not global to all existing code, then that may speed the project.

And if the existing codebase mostly consists of cruft, legacy remnants, or grotesque distortions, which can all be purged in the redevelopment, then something much simpler may result from redevelopment, and thus you can reduce the estimate by some fraction.

But you can see that you're usually dealing with very big numbers and wide margins of error.
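
Purely to show the shape of that reasoning, here is a sketch. The way the factors are combined (multiply first, then reduce) and every number in it are my own assumptions, not a formula.

```python
def refactor_estimate(original_hours, quality_multiplier=1, reduction_fraction=0.0):
    """Start from the original effort, scale up by an integer, then cut by a fraction.

    This combination rule is an illustrative assumption.
    """
    if not isinstance(quality_multiplier, int) or quality_multiplier < 1:
        raise ValueError("quality_multiplier must be an integer >= 1")
    if not 0.0 <= reduction_fraction < 1.0:
        raise ValueError("reduction_fraction must be in [0, 1)")
    return original_hours * quality_multiplier * (1 - reduction_fraction)


# Hypothetical: the existing code took 10,000 hours to write.
baseline = refactor_estimate(10_000)
stricter = refactor_estimate(10_000, quality_multiplier=2)
stricter_less_cruft = refactor_estimate(10_000, quality_multiplier=2, reduction_fraction=0.25)

print(baseline, stricter, stricter_less_cruft)
```

Even small changes to the two knobs move the result by thousands of hours, which is the "wide margins of error" point in practice.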

Knowledge of a specific codebase, and knowledge of a particular kind of development, may alter estimates significantly, but in the abstract there is rarely any reason to deviate from the proven experience of how long some software took to write in the first place.

If you don't know where to start with estimating, start with that, and think carefully about whether that estimate is within your resources.

Steve
0

Break it up into manageable chunks, estimate each separately. Then double the estimates, add them all up, and add another 20%.

Try to sell that. When people say it's too much, say you can probably do it in 10% less time. THAT's the political game. And it's going to happen.
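
Spelled out as arithmetic, with hypothetical chunk estimates in days:

```python
chunk_estimates_days = [3, 5, 8, 2]  # hypothetical raw estimates, one per chunk

doubled = [2 * d for d in chunk_estimates_days]  # double each chunk
quoted = sum(doubled) * 1.20                     # add them up, plus another 20%
fallback = quoted * 0.90                         # the "10% less" you concede later

print(f"raw sum:  {sum(chunk_estimates_days)} days")
print(f"quoted:   {quoted:.1f} days")
print(f"fallback: {fallback:.1f} days")
```

Even after the concession, the figure is still a little over twice the raw sum (2 × 1.2 × 0.9 = 2.16).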

Then start planning, using the chunks as a guideline. Do NOT attempt to estimate everything as one big black box; that way lies chaos. And do NOT attempt to estimate it without intimate knowledge of both the current system and the desired end result.

For example, several years ago I was involved in a major refactoring where a legacy system had to be moved from JBoss 4 (years out of support) to JBoss 7 (the current production version). The person doing the estimate wasn't familiar with the current state of the codebase (he had worked on it, but 10 years prior), nor with the differences between JBoss 4 and JBoss 7. The result was a dangerously optimistic estimate: 1 person for about 2 weeks. By the time everything was working properly with the new JBoss release, we were 18 months into the project, with 2 people working full time on the upgrade. It hadn't just required changing some deployment scripts and a JDK version, but a complete rewrite of about a third of the 1.5 million line codebase, and the creation of thousands of unit tests that the person doing the estimate had assumed were already there (some were, but hadn't been maintained; many things had no tests at all). Do NOT fall into the trap of "it's just updating a version number for a dependency". It's rarely that simple.
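
To put a number on how far off that was, here is the back-of-the-envelope version. The only assumption is the months-to-weeks conversion (52 weeks / 12 months); everything else comes from the figures above.

```python
WEEKS_PER_MONTH = 52 / 12  # assumed conversion factor

estimated_person_weeks = 1 * 2                  # 1 person, about 2 weeks
actual_person_weeks = 2 * 18 * WEEKS_PER_MONTH  # 2 people, 18 months
overrun_factor = actual_person_weeks / estimated_person_weeks

rewritten_lines = 1_500_000 / 3  # "about a third" of the codebase

print(f"estimated: {estimated_person_weeks} person-weeks")
print(f"actual:    {actual_person_weeks:.0f} person-weeks")
print(f"overrun:   about {overrun_factor:.0f}x")
print(f"rewritten: about {rewritten_lines:,.0f} lines")
```

That is an overrun of nearly two orders of magnitude, which no padding percentage would have covered; only knowing the codebase would have.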

jwenting
0

Your manager decides the time budget, and that’s how long it takes. Maybe a few extra percent, and then you stop.

You decide on an area in your code where this time is enough to make a meaningful improvement, and then you make sure that area is refactored well. That means unit tests added, unit tests passing, interfaces clarified, changed interfaces used everywhere, and code cleaned up. AND everything working. In that one area.

gnasher729