23

Context: I'm an enterprise developer in an all-MS shop.

Can anyone recommend a good way of objectively measuring maintainability of a piece of code or an application?

Why maintainability: I'm tired of "quality" metrics in my group revolving only around number of bugs and code coverage. Both metrics are easy to game, especially when you're not measuring maintainability. Shortsightedness and deadlines result in huge amounts of technical debt that never really get addressed.

Why the ability to measure objectively: I work in a big enterprise group. If you can't objectively measure it, you can't hold people accountable for it or make them get better at it. Subjective measurements either don't happen or don't happen consistently.

I'm looking at VS2010 code metrics, but I'm wondering if anyone has any other recommendations.

nlawalker
  • 3,002
  • 20
  • 21
  • @Anon - I agree, but at least it would give me a place to start. Right now there's nothing; it doesn't even have to be gamed. – nlawalker Jan 24 '11 at 22:23
  • 1
    I really don't see how you could do this without peer code reviews. Somebody needs to really understand the overall system design (and one must exist) in order to look at a unit of code and go... hm, this could be improved by a better design, or this is code repetition, or good lord your tools are outdated... On a similar note, you could maintain overarching guidelines like, "hey guys, it's not a good idea to hardcode indexes into gridviews, use itemtemplates and select columns by name instead". When it comes down to it, the devs just gotta be good and teamable. Da Vinci can't teach awesomeness. – P.Brian.Mackey Jan 24 '11 at 22:27
  • 8
    If you have developers gaming metrics instead of writing good code already, then adding more metrics will just result in them gaming those metrics as well, *but won't solve the problem*. The solution is to do away with metrics entirely and use other means (public code reviews, for example) to ensure code quality. – Anon. Jan 24 '11 at 22:27
  • 3
    "Everything that can be counted does not necessarily count; everything that counts cannot necessarily be counted." -Einstein – Jason Baker Jan 25 '11 at 09:17
  • @nlawalker In addition to the problems that answerers already raised, your question is loaded with a questionable assumption: that if such a measurement existed, people could do something about it. Low maintainability is the result of various factors external to the software itself: how difficult or well defined the problem the program attempts to solve is, staff experience, turnover, time-to-market requirements, scope changes... you simply can't put a bounty on this expecting the problem to be a matter of good will. – Diane M Sep 14 '19 at 12:47

13 Answers

7

The caveat with measuring maintainability is that you are attempting to predict the future. Code coverage, bug count, LOC, cyclomatic complexity all deal with the present.

The reality is that unless you have concrete evidence that the code is not maintainable as is (i.e. fixing a bug cost N hours of unneeded time due to unmaintainable code), having a leg to stand on will be inherently difficult. In this example the extra time could have been caused by the fact that an overly complex methodology was used when something much simpler would have sufficed. Treading into an area where you attempt to measure methodologies, paradigms, and best practices becomes increasingly difficult, with little to no long-term gain.

Going down this path is unfortunately a road to nowhere. Focus on uncovering root issues that have substantial merit and are not tied to personal feelings, such as a lack of naming conventions across the code base, and find a way to measure success and failure around that root issue. This will then allow you to begin putting together a set of building blocks from which you can formulate solutions.

Jim G.
  • 8,006
  • 3
  • 35
  • 66
Aaron McIver
  • 3,262
  • 16
  • 19
7

Well, the measure I use, or like to think I use, is this:

For each independent, single, one-line, take-it-or-leave-it functional requirement, snapshot the code base before implementing it. Then implement it, including finding and fixing any bugs introduced in the process. Then run a diff between the code base before and after. The diff will show you a list of all the insertions, deletions, and modifications that implemented the change. (For example, inserting 10 consecutive lines of code counts as one change.) How many changes were there? The smaller that number is, typically, the more maintainable the code is.
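
If you want to automate the counting, here's a rough sketch in Python (assuming the snapshots are git revisions; I'm approximating "one change" as one diff hunk, and the tag names are hypothetical):

    # Count how many separate changes (diff hunks) a requirement needed.
    # Assumes the code base is in git.
    import subprocess

    def count_changes(before_rev, after_rev, repo="."):
        """Number of diff hunks between two revisions of the code base."""
        diff = subprocess.run(
            ["git", "-C", repo, "diff", "--unified=0", before_rev, after_rev],
            capture_output=True, text=True, check=True,
        ).stdout
        # Every hunk header starts with "@@"; each one is a distinct change site.
        return sum(1 for line in diff.splitlines() if line.startswith("@@"))

    # N for the feature implemented between the two snapshots:
    # print(count_changes("before-feature-x", "after-feature-x"))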

I call that the redundancy of the source code, because it's like the redundancy of an error-correcting code. The information was contained in 1 chunk, but was encoded as N chunks, which all have to be done together, to be consistent.

I think this is the idea behind DRY, but it's a little more general. The reason it's good for that count to be low is, if it takes N changes to implement a typical requirement, and as a fallible programmer you only get N-1 or N-2 of them done correctly at first, you've put in 1 or 2 bugs. On top of the O(N) programming effort, those bugs have to be discovered, located, and repaired. That's why small N is good.

Maintainable does not necessarily mean readable, to a programmer who has not learned how the code works. Optimizing N may require doing some things that create a learning curve for programmers. Here's an example. One thing that helps is if the programmer tries to anticipate future changes and leaves how-to directions in the program's commentary.

I think when N is reduced far enough (the optimum is 1) the source code reads more like a domain-specific-language (DSL). The program doesn't so much "solve" the problem as it "states" the problem, because ideally each requirement is just restated as a single piece of code.

Unfortunately, I don't see people learning how to do this very much. Rather they seem to think that mental nouns should become classes, and verbs become methods, and all they gotta do is turn the crank. That results in code with N of 30 or more, in my experience.

Mike Dunlavey
  • 12,815
  • 2
  • 35
  • 58
  • Isn't this making a very grand assumption - that all functional requirements are roughly the same size? And wouldn't this metric discourage separation of responsibilities? I need to implement a horizontal feature; the most "maintainable" code is therefore a near-total rewrite of a program that is entirely contained within one monolithic method. – Aaronaught Jun 10 '11 at 17:51
  • @Aaronaught: I don't know how grand it is, but in our group we work off lists of requirements/features, some interdependent, some not. Each one has a relatively short description. If it takes a major rewrite, sure I've seen/done those, but it says to me there was probably a better way to organize the code. [This is my canonical example.](http://stackoverflow.com/questions/371898/how-does-differential-execution-work) I don't say it is easy to learn, but once learned it saves a large measurable amount of effort, getting changes made quickly without error. – Mike Dunlavey Jun 16 '11 at 13:02
5

Maintainability is not really that measurable. It's the subjective view of an individual, based on their experience and preferences.

For a given piece of code, come up with an idea of a perfect design.

Then, starting from a value of 100, decrease it by some number for every deviation of the real code from that perfect design. How much exactly depends on the consequences of the chosen non-perfect approach.

An example:

A piece of code reads and imports some data format and might show error message if something is wrong.

A perfect solution (100) would keep the error messages in one common place. If your solution has them hard-coded as string constants directly in the code, you take, say, 15 off. So your maintainability index becomes 85.
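
If you want to make the bookkeeping explicit, a minimal sketch in Python (the deviation names and penalty values are made up, just like the 15 above):

    # Start from a perfect score of 100 and subtract an agreed penalty
    # for every deviation from the "perfect design" found in review.
    PENALTIES = {                      # hypothetical rule book
        "hard-coded error messages": 15,
        "duplicated parsing logic": 10,
        "magic numbers": 5,
    }

    def maintainability_index(deviations):
        score = 100 - sum(PENALTIES.get(d, 0) for d in deviations)
        return max(score, 0)           # don't go below zero

    print(maintainability_index(["hard-coded error messages"]))  # 85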

4

One result of code that's hard to maintain is that it's going to take you longer (on "average") to fix bugs. So, at first glance one metric would appear to be the time taken to fix a bug from when it's assigned (i.e. the fix is started) to when it's "ready for test".

Now, this will only really work after you've fixed a reasonable number of bugs to get the "average" (whatever that means) time. You can't use the figure for any particular bug, as how hard that bug is to track down doesn't depend solely on the "maintainability" of the code.
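
The averaging itself is trivial once your bug tracker records both timestamps; a sketch in Python (the field names and dates are invented):

    # Average time from "fix started" to "ready for test" over a set of bugs.
    from datetime import datetime
    from statistics import mean

    FMT = "%Y-%m-%d %H:%M"
    bugs = [  # would come from your bug tracker's export
        {"assigned": "2011-01-10 09:00", "ready_for_test": "2011-01-11 15:30"},
        {"assigned": "2011-01-12 13:00", "ready_for_test": "2011-01-12 17:45"},
    ]

    def hours_to_fix(bug):
        start = datetime.strptime(bug["assigned"], FMT)
        end = datetime.strptime(bug["ready_for_test"], FMT)
        return (end - start).total_seconds() / 3600

    print(f"average fix time: {mean(hours_to_fix(b) for b in bugs):.1f} h")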

Of course, as you fix more bugs the code becomes "easier" to maintain, as you're making it better (or at least you should be) and you're becoming more familiar with it. Countering that is the fact that the remaining bugs will tend to be more obscure and hence even harder to track down.

This also suffers from the problem that people will tend to rush bug fixes to get a lower score, thus either causing new bugs or not properly fixing the existing ones, leading to yet more work and possibly even worse code.

ChrisF
  • 38,878
  • 11
  • 125
  • 168
2

I find the Visual Studio Code Metrics to be quite decent for providing a quick "maintainability" metric. 5 primary metrics are captured:

  • Cyclomatic Complexity
  • Depth of Inheritance
  • Class Coupling
  • Lines of Code (per method, per class, per project, whatever, depending on your level of roll-up)
  • Maintainability Index

Maintainability Index is the one I find handy. It's a composite index, based on:

  1. Total Size (Lines of Code)
  2. # of Classes or Files
  3. # of Methods
  4. Cyclomatic Complexity above 20 (or 10 -- configurable, 10 is my preference)
  5. Duplication

Occasionally I'll go look at my methods with a low Maintainability Index (low = bad for this one). Almost without fail, the methods in my project with the lowest Maintainability Index are the ones most in need of a rewrite and the hardest to read (or maintain).

See the white paper for more information on the calculations.
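
For reference, the formula commonly cited for the Visual Studio Maintainability Index (this is my reading of the published derivation, so treat the details as an assumption rather than a quote from the white paper) rescales the classic index to 0-100:

    # Commonly cited form of the Visual Studio Maintainability Index
    # (0 = worst, 100 = best). My understanding of the published formula,
    # not an excerpt from the white paper.
    import math

    def maintainability_index(halstead_volume, cyclomatic_complexity, lines_of_code):
        raw = (171
               - 5.2 * math.log(halstead_volume)
               - 0.23 * cyclomatic_complexity
               - 16.2 * math.log(lines_of_code))
        return max(0.0, raw * 100 / 171)

    # e.g. a 200-line method with complexity 25 and Halstead volume 3000:
    # print(maintainability_index(3000, 25, 200))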

Marcie
  • 3,009
  • 2
  • 19
  • 21
1

huge amounts of technical debt that never really get addressed

What about technical debt that is "overtaken by events"?

I write crappy code and rush it into production.

You observe -- correctly -- that it's not maintainable.

That code, however, is the last round of features for a product line which will be decommissioned because the legal context has changed and the product line has no future.

The "technical debt" is eliminated by a legislative change that makes it all obsolete.

The "maintainability" metric went from "bad" to "irrelevant" due to outside considerations.

How can that be measured?

Morgan Herlocker
  • 12,722
  • 8
  • 47
  • 78
S.Lott
  • 45,264
  • 6
  • 90
  • 154
  • "In a hundred years we will all be dead and none of this will matter. Kind of puts things in perspective, doesn't it?" If there's anything irrelevant, it is this response which is not an answer to the question. – Martin Maat Sep 14 '19 at 12:21
1

Two that will be meaningful are cyclomatic complexity and class coupling. You cannot eliminate complexity; all you can do is partition it into manageable pieces. These two measures should give you an idea of where difficult-to-maintain code is located, or at least where to look the hardest.

Cyclomatic complexity is a measure of how many paths there are through the code. Each path should be tested (but probably isn't). Something with a complexity above about 20 should be broken up into smaller modules. A module with a cyclomatic complexity of 20 (roughly what you get from 20 successive if-then-else blocks) will have an upper bound of 2^20 paths to test.
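
If you want a rough number without the IDE, counting decision points gets you most of the way there; a sketch in Python (the "+1 per branch" rule is a simplification of the real control-flow-graph definition):

    # Rough cyclomatic complexity: 1 + number of decision points.
    import ast
    import textwrap

    DECISION_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler,
                      ast.BoolOp, ast.IfExp)

    def cyclomatic_complexity(source):
        tree = ast.parse(textwrap.dedent(source))
        return 1 + sum(isinstance(node, DECISION_NODES)
                       for node in ast.walk(tree))

    snippet = """
    def classify(x):
        if x < 0:
            return "negative"
        elif x == 0:
            return "zero"
        return "positive"
    """
    print(cyclomatic_complexity(snippet))  # 3: two branches plus the base path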

Class coupling is a measure of how tightly bound the classes are. An example of some bad code I worked with at my previous employer includes a "data layer" component with about 30 items in the constructor. The person mostly "responsible" for that component kept adding business and UI layer parameters to the constructor/open calls until it was a really big ball of mud. If memory serves me correctly, there were about 15 different new/open calls (some no longer used anymore), all with slightly different sets of parameters. We instituted code reviews for the sole purpose of stopping him from doing more stuff like this - and to avoid making it look like we were singling him out, we reviewed everyone's code on the team, so we wasted about half a day for 4-6 people each and every day because we didn't want to hurt one idiot's feelings.

Tangurena
  • 13,294
  • 4
  • 37
  • 65
  • 2
    Having code reviews for everyone isn't a bad thing, honestly. You may feel like you're wasting time, but unless everyone's using it as an excuse to slack off, you *should* be getting valuable insights from them. – Anon. Jan 25 '11 at 01:48
1

Bottom line: maintainability can really only be measured once it is required, not before. That is, you can only tell whether a piece of code is maintainable when you actually have to maintain it.

It is relatively straightforward to measure, after the fact, how easy it was to adapt a piece of code to changing requirements. It is close to impossible to measure ahead of time how it will respond to changes in requirements, because that would mean predicting those changes. And if you can do that, you should get a Nobel Prize ;)

The only thing you can do is agree with your team upon a set of concrete rules (such as the SOLID principles) that you all believe generally increase maintainability.
If the principles are well chosen (I think SOLID would be a good starting point), you can quite clearly demonstrate when they are being violated and hold the authors accountable for that.
You will have a very hard time trying to promote an absolute measure of maintainability, whereas incrementally convincing your team to stick to an agreed set of established principles seems realistic.

back2dos
  • 29,980
  • 3
  • 73
  • 114
0

The next best thing to peer code reviews is to create a workable architecture prior to coding out a unit or product. Red-green-refactor is a pretty neat way to go about it. Have a senior guy throw together a workable interface and divvy up the work. Everybody can take their piece of the puzzle and red-green their way to victory. After this, a peer code review and refactor would be in order. This worked pretty darn well on a past major product I worked on.

P.Brian.Mackey
  • 11,123
  • 8
  • 48
  • 87
0

Questionnaire

What about making an anonymous questionnaire for the developers to fill out once a month or so? The questions would go something like:

  • How much of your time over the last month have you spent on project X (roughly)? [0% ... 100%]
  • How would you rate the state of the code base in terms of maintainability? [really poor, poor, neutral, okay, good, really good]
  • How complex would you rate the code base compared to the complexity of the project? [way too complex, just right, too simplified]
  • How often did you feel you were obstructed in solving your tasks due to excessive complexity of the code base? [not at all, once in a while, often, constantly]

(Feel free to add additional questions you think would be useful in measuring maintainability in the comments and I will add them.)

Bjarke Freund-Hansen
  • 1,176
  • 1
  • 11
  • 18
0

I can think of two ways to look at maintainability (I am sure there are more; hopefully others can come up with good definitions).

Modification without understanding.

Can a bug fixer come into the code and fix a problem without needing to understand how the whole system works?

This can be achieved by providing comprehensive unit tests (regression tests). You should be able to check that any change to the system does not change how the system behaves with any specific good input.

In this situation a bug fixer should be able to come in and fix a (simple) bug with only a minimal knowledge of the system. If the fix works then none of the regression tests should fail. If any regression tests fail then you need to move to stage 2.

maintainability1 = K1 * (Code Coverage) / (Coupling of Code) * (Complexity of API)

Modification with understanding.

If a bug fix becomes non-trivial and you need to understand the system, then what is the documentation of the system like? We are not talking about documentation of the external API (that is relatively useless here). What we need to understand is how the system works and where any clever tricks (read: hacks) are used in the implementation, etc.

But documentation is not enough; the code also needs to be clear and understandable. To measure the understandability of the code we can use a little trick. After the developer has finished coding, give him/her a month to work on something else. Then ask them to come back and document the system to an extent that a peer could now understand it. If the code is relatively easy to understand, this should be quick. If it is badly written, they will take longer to work out what they built and write the documentation.

So maybe we could come up with some measure of this:

maintainability2 = K2 * (Size of doc) / (Time to write doc)
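
Taking the two formulas above at face value, a quick sketch in Python (the constants and measurements are made up; you would have to calibrate K1 and K2 for your own team):

    # Plugging made-up numbers into the formulas above.
    K1, K2 = 1.0, 1.0   # arbitrary scaling constants

    def maintainability_1(code_coverage, coupling, api_complexity):
        # maintainability1 = K1 * (Code Coverage) / (Coupling of Code) * (Complexity of API)
        return K1 * code_coverage / coupling * api_complexity

    def maintainability_2(doc_pages, hours_to_write_doc):
        # maintainability2 = K2 * (Size of doc) / (Time to write doc)
        return K2 * doc_pages / hours_to_write_doc

    print(maintainability_1(code_coverage=0.85, coupling=12, api_complexity=30))
    print(maintainability_2(doc_pages=20, hours_to_write_doc=16))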
Martin York
  • 11,150
  • 2
  • 42
  • 70
0

I often find that the "shortest equivalent" solution tends to be most maintainable.

Here "shortest" means the fewest operations (not lines), and "equivalent" means that the shorter solution shouldn't have worse time or space complexity than the original one.

This means all logically similar repeating patterns should be extracted into the appropriate abstraction: Similar code blocks? Extract them into a function. Variables that seem to occur together? Extract them into a struct/class. Classes whose members differ only by type? You need a generic. You seem to recalculate the same thing in many places? Calculate it at the beginning and store the value in a variable. Doing this will result in shorter code. That's basically the DRY principle.
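
As a tiny illustration of the "similar code blocks" case (the names and the validation rule are invented):

    # Before: the same check repeated for every field.
    # if not name.strip(): errors.append("name is required")
    # if not email.strip(): errors.append("email is required")
    # if not phone.strip(): errors.append("phone is required")

    # After: one abstraction, one place to change the rule.
    def require(value, field, errors):
        """Append an error if a required field is blank."""
        if not value.strip():
            errors.append(field + " is required")

    errors = []
    for field, value in (("name", "Ada"), ("email", ""), ("phone", "555-0100")):
        require(value, field, errors)
    print(errors)  # ['email is required']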

We can also agree that unused abstractions should be deleted: classes and functions that are no longer needed are dead code, so they should be removed. Version control will remember them if we ever need to reinstate them.

What is often debated are abstractions that are referenced only once: non-callback functions that are called only once, with no reason to ever be called more than once; a generic that is instantiated with only one type, with no reason it will ever be instantiated with another; interfaces that are implemented only once, with no real reason they would ever be implemented by any other class; and so on. My opinion is that these things are unnecessary and should be removed; that's basically the YAGNI principle.

So there should be a tool that can spot code repetition, but I think that problem is akin to finding the optimal compression, which is the Kolmogorov-complexity problem and is not computable. On the other hand, unused and under-used abstractions are easy to spot based on the number of references: a check for that can be automated.

Calmarius
  • 1,883
  • 2
  • 14
  • 17
0

It is all subjective, and any measurement based on the code itself is ultimately irrelevant. In the end it comes down to your ability to meet demands. Can you still deliver the features that are being requested? And if you can, how often do those changes come back to you because something isn't quite right yet, and how serious are those issues?

I just (re)defined maintainability, but it is still subjective. On the other hand, that may not matter all that much. We just need to satisfy our customer and enjoy doing it; that's what we are aiming for.

Apparently you feel you have to prove to your boss or co-workers that something needs to be done to improve the state of the code base. I would argue it should be enough for you to say you are frustrated by the fact that for every little thing you have to change or add, you have to fix or work around 10 other issues that could have been avoided. Then name a notorious area and make a case for turning it upside down. If that does not gain any support from your team, you may be better off somewhere else. If people around you don't care, proving your point is not going to change their minds anyway.

Martin Maat
  • 18,218
  • 3
  • 30
  • 57