Ramping Up On Legacy Code

Question

When starting to work on a project with an existing code base, the first thing to do is usually to understand the application and its existing code. Let's assume that the existing code is legacy code in the sense of Michael Feathers' definition: "code with no tests".

I am sure that there are many different ways to handle this ramp-up phase. The most straightforward way is to go through the UI of the application (if there is one) while debugging the application to understand what is happening at the code level. This approach is very time-consuming, and it is easy to forget what you learn in a debugging session. Furthermore, there is no real way to share what you learn during debugging with the rest of the team.

Understanding the downsides of this approach, I tried a different one on my most recent project. I wrote a kind of API layer that sits on top of the existing code base. This API covers pretty much everything a user can do in the UI.

To be more specific, let's assume that the existing application is a typical transactional application with orders, items and shopping carts. My API turned out to be something like this:

// Signatures only: each implementation delegates to the legacy code.
// Parameter types (String, long, Customer, ...) are illustrative.
public interface OrderAPI {
    Order createOrder(String customerName);
    boolean deleteOrder(long orderID);
    List<Order> getOrdersForCustomer(String customerName);
}

public interface OrderItemAPI {
    OrderItem createOrderItem(Order order);
    boolean deleteOrderItem(long orderItemID);
    List<OrderItem> getItemInOrder(Order order);
}

public interface ShoppingCartAPI {
    ShoppingCart createCart(Customer customer, Order order);
    boolean addItemToCart(ShoppingCart cart, OrderItem item);
    boolean removeItemFromCart(ShoppingCart cart, OrderItem item);
}

The methods in the API correspond to the actions that the user would perform at the UI level. Within these methods, the calls to the existing codebase are made.
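As a rough illustration of what one such wrapper method might look like, here is a minimal sketch. `LegacyOrderManager` and its methods (`insOrd`, `delOrd`, `findIdsByCust`) are hypothetical stand-ins for the real legacy code, and the `long` order ID and the shape of `Order` are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for the legacy code: terse names, status codes.
class LegacyOrderManager {
    private final Map<Long, String> customerByOrderId = new TreeMap<>();
    private long nextId = 1;

    // Returns the new order id, or -1 if the customer name is rejected.
    long insOrd(String custNm) {
        if (custNm == null || custNm.isEmpty()) {
            return -1;
        }
        long id = nextId++;
        customerByOrderId.put(id, custNm);
        return id;
    }

    // Returns 0 on success, 1 if the order did not exist.
    int delOrd(long id) {
        return customerByOrderId.remove(id) != null ? 0 : 1;
    }

    List<Long> findIdsByCust(String custNm) {
        List<Long> ids = new ArrayList<>();
        for (Map.Entry<Long, String> e : customerByOrderId.entrySet()) {
            if (e.getValue().equals(custNm)) {
                ids.add(e.getKey());
            }
        }
        return ids;
    }
}

// Assumed minimal domain object for the example.
final class Order {
    final long id;
    final String customerName;

    Order(long id, String customerName) {
        this.id = id;
        this.customerName = customerName;
    }
}

// The wrapper: one method per user-level action, delegating to legacy calls
// and translating status codes into return values and exceptions.
class OrderAPI {
    private final LegacyOrderManager legacy;

    OrderAPI(LegacyOrderManager legacy) {
        this.legacy = legacy;
    }

    Order createOrder(String customerName) {
        long id = legacy.insOrd(customerName);
        if (id < 0) {
            throw new IllegalArgumentException("legacy code rejected the customer name");
        }
        return new Order(id, customerName);
    }

    boolean deleteOrder(long orderID) {
        return legacy.delOrd(orderID) == 0;
    }

    List<Order> getOrdersForCustomer(String customerName) {
        List<Order> orders = new ArrayList<>();
        for (long id : legacy.findIdsByCust(customerName)) {
            orders.add(new Order(id, customerName));
        }
        return orders;
    }
}
```

Passing the legacy object in through the constructor is one possible design choice; it keeps the wrapper free of global state, which tends to make the tests easier to set up.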

Writing this API by itself, of course, doesn't mean much. So I wrote tests (I guess they automatically count as integration tests) to ensure that the API works well, which also shows that I have understood how the legacy code works.
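A test written against such an API could look roughly like the sketch below. It uses plain Java checks rather than a test framework so that it stands alone. `InMemoryShoppingCartAPI` is a hypothetical stand-in for the real wrapper (which would call the legacy code), and its method signatures are simplified to a single cart holding `String` items.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the cart wrapper; the real one would call legacy code.
class InMemoryShoppingCartAPI {
    private final List<String> items = new ArrayList<>();

    boolean addItemToCart(String item) {
        return items.add(item);
    }

    boolean removeItemFromCart(String item) {
        return items.remove(item);
    }

    List<String> itemsInCart() {
        return new ArrayList<>(items);
    }
}

class ShoppingCartApiTest {
    // Mirrors a user adding an item in the UI and then removing it again.
    static void addThenRemoveLeavesCartEmpty() {
        InMemoryShoppingCartAPI api = new InMemoryShoppingCartAPI();
        check(api.addItemToCart("book"), "adding an item should succeed");
        check(api.itemsInCart().contains("book"), "added item should be in the cart");
        check(api.removeItemFromCart("book"), "removing the item should succeed");
        check(api.itemsInCart().isEmpty(), "cart should be empty afterwards");
    }

    // Mirrors a user trying to remove something that was never added.
    static void removingUnknownItemReportsFailure() {
        InMemoryShoppingCartAPI api = new InMemoryShoppingCartAPI();
        check(!api.removeItemFromCart("ghost"), "removing an unknown item should fail");
    }

    static void check(boolean condition, String message) {
        if (!condition) {
            throw new AssertionError(message);
        }
    }

    public static void main(String[] args) {
        addThenRemoveLeavesCartEmpty();
        removingUnknownItemReportsFailure();
        System.out.println("all checks passed");
    }
}
```

Each test method corresponds to one user-level scenario, so the test names double as a shareable record of what was learned about the system.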

After all this introduction, here is my question: Can you define (possibly in software engineering terms) what I have done? In taking this approach, I went entirely by intuition. At some point, I remember being extremely confused: working on the API, then on my tests, then fixing my API, then my tests again. I was no longer sure whether my main objective was to learn the existing code base or to come up with a stable API layer.

I would greatly appreciate any kind of explanation; I am sure that this must be a practice that other people have already experimented with. I just need the right guidance to point me to the appropriate discussions/resources.

Hopefully, the architecture of the application set _allows_ for unit tests. Some languages/programming styles discourage anything less than integration/UA tests... RPG, for example, tends to throw the entire stack (database access through to display) into the same compiled unit - although if you're smart, there are some good ways around this. — Clockwork-Muse, Oct 19 '11 at 16:25
I hadn't considered this from that perspective; thanks! As you have pointed out, it would have been extremely difficult to write tests in that scenario. I have no experience with RPG; I will take a look at it. — Guven, Oct 20 '11 at 01:36
And before anyone disputes whether this is a duplicate or not, please read the accepted answer to the dupe target first. — , Dec 14 '15 at 00:12

Steve Jackson · Accepted Answer · 2011-10-19T19:36:58.227

The API is a Facade/Wrapper, right? Feathers might also call it creating a Seam. I call it "Getting the System under Test".

If possible, you now want to take the UI layer and integrate it so it works through this new API. At that point you have real integration tests, and have some confidence in the system's behavior from the API down. If you leave it as is, I would characterize them as Learning Tests, which can still be useful for hunting regressions.

Learning Tests were featured in Clean Code, but I don't know who invented the term. Here are some links for more info:

Perfect! Exactly the guidance I was looking for. Integrating this API into the UI is not something I would do right now, but that would be a very good (and a giant) step towards reaching a high test coverage. Currently, as you pointed out, I use them for regression purposes and they are helping a lot. — Guven, Oct 20 '11 at 01:18

score 2 · Answer 2 · answered Oct 19 '11 at 13:06

I like the idea of writing unit tests to familiarize yourself with the code, but I'm not sure about the API layer. Was the legacy code too difficult to write tests against without the wrappers? Do you plan to develop something that uses this API (other than the tests)?

I would not generally advocate writing much code that you did not expect to run in production. The person coming after you may be even more confused by this. If this is just part of your test suite, I'd just make sure that is very clear from the folder/project structure.

Jeremy

Yes, it would have been extremely difficult to write tests without these wrappers. I needed an abstraction (as Rodrigo pointed out below). I didn't have a clear goal when I developed this API, but now I am using it as a data generator and also for regression tests. Actually, as a data generator, it is working really well! When writing complicated integration tests for new functionality, the data generator comes in very handy. – Guven Oct 20 '11 at 01:20

score 2 · Answer 3 · answered Oct 19 '11 at 19:56

Characterization tests is the term that Michael Feathers uses for tests that try to understand and characterize what an unknown code base does. It seems that, to get to these tests, you were forced to also create this API, a wrapper or facade (as mentioned by Steve Jackson) that tries to expose a coherent set of functionality at the right level of abstraction without going through the UI. It's hard to test without good separation and coherence.
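A characterization test records what the code currently does, not what it ought to do. A minimal sketch, assuming a hypothetical legacy routine `LegacyPricing.totalCents` with undocumented rules (both the routine and its rules are invented for illustration):

```java
// Hypothetical legacy routine whose rules are not documented anywhere.
class LegacyPricing {
    static int totalCents(int quantity, int unitPriceCents) {
        if (quantity <= 0) {
            return 0;
        }
        int total = quantity * unitPriceCents;
        if (quantity >= 10) {
            total -= total / 10; // silent bulk discount
        }
        return total;
    }
}

class PricingCharacterizationTest {
    // The expected values were obtained by running the code and writing down
    // what it returned; they pin current behavior, right or wrong.
    static void run() {
        expectEquals(500, LegacyPricing.totalCents(1, 500), "single item");
        expectEquals(4500, LegacyPricing.totalCents(9, 500), "nine items, no discount");
        expectEquals(4500, LegacyPricing.totalCents(10, 500), "ten items, discount applied");
        expectEquals(0, LegacyPricing.totalCents(-3, 500), "negative quantity");
    }

    static void expectEquals(int expected, int actual, String what) {
        if (expected != actual) {
            throw new AssertionError(what + ": expected " + expected + " but got " + actual);
        }
    }

    public static void main(String[] args) {
        run();
        System.out.println("behavior unchanged");
    }
}
```

Here the test surfaces an oddity (nine and ten items cost the same) without judging it; if a later change alters any of these values, the test fails and prompts a decision about whether the change was intended.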

So to me, this API seems like necessary scaffolding to be able to write some characterization tests. It might be interesting to consider evolving it and incorporating it into the application itself, so that you don't have competing views of the same system, which might backfire and turn out to be a source of duplication (and maintenance headaches).

Other related terms that might be of interest include:

To build test data:

Big refactorings:

Strangler application

Design patterns:

Facade

Principles (on levels of abstraction):

Yes, very good analysis. I was really forced to create this API; I couldn't imagine writing tests without it. You are also right that I need to put a lot of effort into it **now** so that it doesn't become a maintenance nightmare. Again as you have pointed out, currently the API serves as a very useful 'test data builder'. In that sense, it is extremely helpful when I create integration tests for the new functionalities. Thanks to the API, I can simulate very complicated scenarios. — Guven, Oct 20 '11 at 01:25
