To rewrite or slowly refactor old C++ project

Question

Our team has recently inherited a relatively large project from another company (~250k lines). It was developed using C++Builder and we intend to port the Ui side to Qt. Most of the Ui code is separate from the business logic (yay!) but the logic side is quite a mess.

There is a lot of diamond inheritance going on (virtual inheritance thankfully) but it makes understanding the code quite difficult to do.

We've got very little documentation to go on, what's there is out of date (comments included).

I generated class diagrams using Doxygen, here's the most complex one (I had to remove most of the class names but kept some of the more important ones and kept the standard C++ data types and std classes, yes, they inherit from std)

confusing inheritance

So far we've been able to convert the base program to Qt and we're at a point where we can start converting the program's functionality bit by bit. Problem is, is it worth it long term? We would like to be maintaining this software as our own long term.

Is there a general approach we should take to untangling this kind of inheritance mess or should we simply redesign from scratch and only keep bits and pieces of the existing code as we go?

EDIT: Some more info

Zavior posted a link to an article about why we should not start from scratch but Ptolemy also brought up some good questions and I'd like to add some information about our situation.

The program we have is not 'bug free.' There are known issues that users have workarounds for most. There is no 'list' of these issues, that is currently being compiled by talking to all the existing users one by one as they tend to keep things to themselves.

We are all new developers to this project. Our only resource is a developer who started working on the project about half-way through its lifetime. He is available via e-mail/chat mostly. He has also put together some documentation on how some parts of the code work.

The program has only been used as an internal tool so far. We would like to make it commercially viable.

EDIT 2:

One of the most important things we want to do is have the program be in Qt. It's currently using the VCL framework from C++Builder that no one on our team is familiar with and we only have 1 license for. It's during my work porting from VCL to Qt that I found the messy code structure and question the decision to 'convert' vs redo.

Obligatory joel on software post - http://www.joelonsoftware.com/articles/fog0000000069.html — Zavior, Sep 17 '13 at 13:50
What that article doesn't address is the situation that Boumbles is in, where the existing code is buggy, and that no one knows how it works. — Michael Shaw, Sep 17 '13 at 14:07
i've added some info to my question based on your guys' points — Boumbles, Sep 17 '13 at 14:15
I dare say not knowing how it works would be an even bigger argument against fiddling with it ;). Unless there is testing against existing functionality either approach can be tricky... — Rig, Sep 17 '13 at 15:05
there is no existing testing framework. we'd like to create unit tests....someday :S — Boumbles, Sep 17 '13 at 15:13
For obligatory link completeness: http://www.amazon.com/Working-Effectively-Legacy-Michael-Feathers/dp/0131177052 — psr, Sep 17 '13 at 17:17
What could this mystery program possibly be intended to _do_, I wonder? — AJMansfield, Sep 17 '13 at 21:42
[The most simple, methodological, trivial, mundane, boring kind of refactoring steps should be applied laboriously and patiently to convert this untamed beast into a circus gem.](http://www.amazon.com/Refactoring-Improving-Design-Existing-Code/dp/0201485672) Refactoring C++ would require more skill. I have done it, and I can tell you that every bit of backend (non-UI, or server-side) code is worth the journey; every bit of UI is eventually thrown away, because UI is strongly tied to "Modality of Computer-Human Interaction" (MoCHI), which changes too rapidly for large code base to catch up. — rwong, Sep 18 '13 at 05:51
@Ptolemy: what makes you believe that the code Joel was talking about (the Netscape browser) was in a better state? I am pretty sure it was as buggy as any other legacy project, and full of pieces hard to understand. — Doc Brown, Sep 18 '13 at 06:18
@Boumbles: "someday" means never. Your team did not understand the value of such tests for a legacy project like this, otherwise you would start writing such tests right now, daily. And do yourself a favor, get a copy of Michael Feathers book (see link given by psr above) and make sure every team member knows some of the core practices of that book. — Doc Brown, Sep 18 '13 at 06:22
@DocBrown, Netscape as a product then was much more usable as a product that Boumbles description of this project. I know, cause I was a user! Joel's article is about the fact that developers have a tendency to prematurely argue for a rewrite simply because they don't understand the problem in enough depth. That applies in this case, but there is much more than that here. There are commercial risks with the status quo, and they have very limited ability to support the code or even to make continual improvements (which is Joels recomendation). — Michael Shaw, Sep 18 '13 at 06:31
@DocBrown I strongly agree on the Unit Tests part. Writing unit tests should be a priority. It is a great way, albeit boring, to get to know the codebase. ATM you cannot be sure if your refactoring attempts fix or break existing functionality. You cannot refactor **mercilessly**. And if you think otherwise you would be proven wrong at the worst moment. I would be afraid to even touch the thing :P — rszalski, Sep 29 '13 at 16:12
Redesign by encapsulation. The diamond inheritance suggests there is not enough isolation of the independent functionality. Just as encapsulation helps one refactor a global state machine and increases the ability to understand a system by requiring only that one understand the interfaces of the encapsulated black boxes and not the implementation, further breaking down the classes is the same on a smaller scale. — DaveMorganTexas, Aug 21 '15 at 00:27

score 36 · Accepted Answer · answered Sep 17 '13 at 13:51

If you re-write & re-design from scratch, you're going to have two MASSIVE problems:

You don't have a spec. You might think you have a spec, but it turns out that the REAL spec is the old code. So you're going to have to dig into it to figure out what it really does, just so you know what to make the new system do.
You'll have a long period (looking at that class hierarchy, possibly years) in which you won't have a saleable product. Which means you either have to maintain the old one in parallel, or give up on shipping anything for a long time.

It stinks, but you probably need to just try to make the old one better. Write tests for everything you touch, don't hesitate to rip out a big section to make it better, but don't try to re-write it all at once. The project may not survive such an attempt.

I agree. There is little that can be done. The only way to make use of an old program is to understand it in its entirety. Before you can rewrite an old program, you have to first make use of an old program. — Neil, Sep 17 '13 at 14:14

17 of 26 · Answer 2 · 2013-09-17T14:29:16.840

7

Rewriting a project of this size is doomed to fail due to the reasons in Michael Kohne's post.

Refactoring a buggy codebase is difficult because you'll never know if bugs are due to your refactoring or due to the original code.

I would take the approach of doing nothing but bug fixes for a significant period of time. As part of these bug fixes, you can rewrite small sections that are blatantly wrong. The benefits of this are:

Gives the developers the opportunity to learn a lot about the code, architecture, and product behavior. You will need this knowledge to do any significant refactoring.
Stabilizes the codebase prior to any major restructuring work.
You can sell the stabilized product while major changes are made later.

It also sounds like you desperately need a bug tracking system and some sort of QA to find bugs & verify fixes.

edited Sep 17 '13 at 14:29

answered Sep 17 '13 at 14:23

17 of 26

4,818
1
21
25

We have a bug tracking system (redmine). But the original owners of the system do not. The team was using SVN for controlling the code, but due to some contract shenanigans we weren't able to get any of the history and were stuck with just the latest code. – Boumbles Sep 17 '13 at 14:34
It sounds like you've been stuck with a really horrible scenario :( – 17 of 26 Sep 17 '13 at 14:34
3

Horrible for my quest to stop biting my nails, excellent for my education ;) – Boumbles Sep 17 '13 at 14:37

Michael Shaw · Answer 3 · 2013-09-17T14:51:29.940

When I've implemented large scale re-factoring on projects, I've generally used the strategy of doing the architecture afresh, and proving the architecture first, whilst development continues on the original project. When the new architecture has been proven, there is a large amount of transferring existing logic to the new architecture. This seems to be best done quickly, rather than as a background task.

Is it worth doing? That's a difficult one to judge without being there on the project but think about these issues:

Is the existing code base stable and bug free? If you are having to spend significant time finding and patching the existing code without really understanding it, the time saved by a reduction in support by reimplementing using understood code can be significant.
Do you have at least as much knowledge of the product as the original coders? How much of the complexity is due to bug fixes for unexpected situations, and how much was poor design?
Do you have the time to finish the task? One of the things I value most about a code base is consistency. Inconsistent code is a nightmare to support, extend and introduce to new members of the team.
If you have not got a substantial understanding of the products business rules, you are likely to do worse job than the current version.

Conclusion - based on the questions updates, is that you do not have enough knowledge of the project to do a re-implementation. In your current situation, even doing a significant re-factor is risky. The good news at least, is that the current customers are not hit with bugs that prevent them from using the product, which means you have time to understand the code base. This needs to be your first goal.

The existing code is not stable. The users don't see most things as 'bugs' they see them as 'quirks.' So they area aware that in order for the program not to crash they have to check a mis-labeled checkbox before clicking OK. We have access to 1 of the maintainers of the project who joined about 5 years into its development. Time is our main concern. We would like to have something commercially available 'soon.' However, as-is the software is not sellable due to all the 'quirks.' There are many of them and we are unfamiliar with the code base. — Boumbles, Sep 17 '13 at 14:00
That is a fairly strong indicator of the need to re-factor, and that the cost of delaying it may be quite high. However, you still need to be sure you know enough of what the product needs to do for this to be successful. — Michael Shaw, Sep 17 '13 at 14:03
This approach only works if you have enough developers to do both things at the same time. Also, it's difficult to do a new architecture when you have zero knowledge of the existing product (which seems to be the case here). — 17 of 26, Sep 17 '13 at 14:26
This is our current plan. We've got a couple guys working on fixing existing bugs while I work on porting to Qt. Bug fixes are merged as we go along. I'll be getting a few devs to help me with the port so hopefully we'll be working on fixing bugs in the Qt version sooner rather than later. — Boumbles, Sep 17 '13 at 15:38

score 1 · Answer 4 · answered Sep 17 '13 at 19:10

Being an Delphi developer myself (and the VCL is written on Object Pascal), I agree with others that you must first get acquainted with VCL and the application codebase before get into radical decisions.

VCL itself is mostly tied with Windows API concepts, while creating an very precise OO translation of it. It's different of QT which have an multiplatform objective.

Regarding the business logic, decypher it (from the code) to the point you can write an very precise spec of the RULES (or, in ther words, an precise notion of real problems the codebase targeted to represent and solve). With that, you'll have an direction to walk.

score 0 · Answer 5 · answered Sep 17 '13 at 14:17

it depends on your business situation.

assume that

it will take a long time to totally refactor the code, especially since you are new to the code base in the first place and can't forsee like you could if you were refactoring your own first draft code.

given this assumption, you must consider the business need for porting this code. you said that you wanted 'to maintain' the code. what exactly does that entail? are you going to be adding additional functionality? or just fixing minor output bugs or logically easy, but showstopping, crashes.

what other benefits would you get from refactoring? would the code be more portable or usable in a new environment or OS? be more easily integrated into a project that together will adequately serve a business need?

take these questions into consideration when decide or present a case.

consider the short, medium, and long term objectives of this task. it may cost a lot in the short term to refactor, but if you intend to keep the conceptual model around, it very well could serve as the basis for a whole new generation of products.

conversely, if you are having wholesale changes in your business environment that may make, not just the messy code, but the entire conceptual model obsolete , it may be better to just cherry pick the 10 buggiest or confusing pieces, rewrite them in original language, but clearly, and move on. just keep it alive, and save money, and build something you really want later,taking lessons from what you have and like and don't like/need to guide you

score 0 · Answer 6 · answered Sep 17 '13 at 20:25

In general, you just said that you would refactorate it because it is a mess. Well, users want programs that run, do what they want, etc. They don't ask for new programs because the code of the older one (that they don't even see) is a mess.

If the old program is preventing you from doing something, then, perhaps, you could spend time, money, etc., doing it from scratch. But keep in mind that there are many points against it:

1 - bad or not, users know how to work-around some bugs. Even if you rewrite the app and it's perfect, your users won't know how to handle it in the beginning. No one likes changes, even for better.

2 - users know work-arounds = means that they are not documented, so you might miss some functionality when trying to do the new software (because it won't be in the old code, or someone will miss to tell you that)

3 - you have to have gains for the user to justify rewriting it: doing it all over to end up with the same system means that there is no benefit for the user

4 - new software will have new bugs, and that will play against you

5 - the users don't want something new, another system, etc., instead of just redoing what they already have ?

score 0 · Answer 7 · answered Sep 17 '13 at 20:46

When looking at large scale projects (specially the ones you haven't worked on), It is very important to realize one thing. Nobody should expect you to fully understand it. You should live by the mantra of divide and conquer. That is the technique I have always used it has yet to let me down.

Consider yourself a surgeon presented with a patient who has multiple severe injuries that require your attention. Lets say you have some infection in a wound in the leg. This would be analogous to a bad library/code you want to get rid of from your code. You'd first try to establish the extent of the infection. The analogy is a quick grep to see how many files actually use that library/code and what is the nature of use. This is a significant step in the whole process and usually the most cumbersome. But after doing this what you accomplish is the identification of your boundaries and these are key. Once you know the extents you can apply anesthesia appropriately (assuming local anesthesia as it better suits the example). In terms of code, this could be interfaces at the appropriate places close to the boundary, unit tests etc. Building interfaces and creating unit tests at the boundary lets you mark your boundary. After marking your boundary, you should be more or less free to attack the infection/bad code which every way however way you like while your interfaces and boundary unit tests ensure that you never lose your way. Once done with all the modifications within your boundary i.e: all the denting and painting, your application should behave as it did without having any knowledge what so ever of any any operation ever taking place.

You then repeat this whole process for the next library/bad piece of code until you target them all. So in summary picking your battles and localizing your focus is paramount, when tackling the big ugly beasts of such nature. Good luck.

To rewrite or slowly refactor old C++ project

7 Answers7

Linked

Related