
There are many posts about the benefits of static code analysis tools. However, in which scenarios would you recommend NOT using (or significantly limit) them? For example, do you also run them on tests?

I use mypy and pylint and usually find them very helpful. However, in a few cases I spent more time working around the linter than on the actual feature or bugfix. For example, the attrs package often yields false positives in mypy and pylint, in particular when attrs is extended. Thus, I do sometimes ask myself whether the disadvantages outweigh the benefits and would be curious to hear your experience.

gebbissimo
    As with any other tool, the answer is "use it if the benefits outweigh the disadvantages". Only you can say whether that is true for your case, my project is different from your project. – Philip Kendall Mar 19 '22 at 11:19
    This question is both on topic and good quality. I’m disappointed with the reception it’s getting. – candied_orange Mar 19 '22 at 16:11
  • @candied_orange: the question is on-topic and interesting, but the way it is phrased could make it likely to attract several opinionated answers. Maybe we should leave it as it is, though; currently, the number of answers and their content seem to be pretty good. – Doc Brown Mar 21 '22 at 14:33
  • @DocBrown: Any tips on how to phrase it better? Would really appreciate it! – gebbissimo Mar 22 '22 at 15:41
    @gebbissimo: I think, in the light of the current answers, you better leave the question as it is. Maybe next time a more specific title could make such a question look less opinionated, something along the lines of *"How to weigh up benefits vs costs of static analysis tools"*. – Doc Brown Mar 22 '22 at 21:17

6 Answers


The goal of static code analysis is to have a positive expected return.

  • Using static code analysis tools requires effort (or other cost).
  • Dealing with bugs also costs effort (or other cost).
  • Finding and fixing bugs before they become a problem typically requires less effort (or other cost).

Thus, using these tools is worth it if the cost of using them is outweighed by the expected reduction in cost from bugs that slip through:

cost of doing QA + expected cost of fixing bugs early < expected cost of dealing with bugs later

where the expected cost is likelihood × impact.
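
To make the inequality concrete, here is a toy calculation (all figures are hypothetical assumptions for illustration, not data from any project):

```python
# Hypothetical figures, purely for illustration.
qa_cost = 0.5             # hours of tooling/config effort per bug caught
fix_early_cost = 1.0      # hours to fix a bug the tool flags immediately
escape_likelihood = 0.3   # chance the bug reaches production without QA
fix_late_cost = 8.0       # hours to diagnose and fix it in production

lhs = qa_cost + fix_early_cost           # cost of doing QA + fixing early
rhs = escape_likelihood * fix_late_cost  # expected cost of dealing with it later

print(lhs, rhs, lhs < rhs)  # 1.5 2.4 True -> the tool pays off here
```

With these made-up numbers the tool is worth it; halve the escape likelihood or the late-fix cost and the inequality flips, which is exactly the project-specific judgment the answer describes.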

Tools such as Pylint have low cost, mostly just creating a config file that disables warnings that you are not interested in. However, Pylint has rarely helped me find immediate problems.
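
That low cost mostly amounts to a small configuration file. A minimal sketch (the `[MESSAGES CONTROL]` section and `disable` option are real Pylint syntax; which checks to disable is a project-specific, illustrative choice):

```ini
# .pylintrc - illustrative; silence the checks your team has decided not to enforce
[MESSAGES CONTROL]
disable = missing-docstring,
          too-many-locals,
          invalid-name
```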

Tools such as Mypy have considerable cost, because type checking only works if you add useful type annotations throughout your code. And as you've experienced, these annotations don't always harmonize well with more dynamic programming techniques such as attrs. However, type checking has helped me find many problems up front.
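
For example, once a function is annotated, mypy can reject a whole class of call-site mistakes before the code ever runs (the function below is a made-up illustration, not from the question):

```python
def mean(values: list[float]) -> float:
    """Average of a non-empty list of numbers."""
    return sum(values) / len(values)

print(mean([1.0, 2.0, 3.0]))  # 2.0

# mypy (not the Python runtime) would reject this call before it ever runs:
#     mean(["1", "2"])
#     error: List item 0 has incompatible type "str"; expected "float"
```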

It is not so easy though, because these tools have second-order effects.

  • For example, Pylint will complain if you have a function with lots of local variables. That is not a bug. But it indicates complexity, and bugs love to hide within complexity. Pushing you towards simpler code has value.
  • Mypy is probably even more useful, since it encourages you to add type annotations throughout your code. This can make developers much more productive, in particular by enabling extremely useful IDE features such as precise go-to-definition or Intellisense-style type-based autocomplete. Being forced to think about types can also help design clearer interfaces, for example by clarifying whether a function can take only int or float arguments or both.
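
The int-versus-float question from the last bullet is a good example: an annotation answers it explicitly, because mypy treats `int` as acceptable wherever `float` is expected. A hypothetical sketch (the function name and numbers are my own, not from the answer):

```python
def apply_discount(price: float, percent: float) -> float:
    # Annotating with float answers "int or float?" with "both":
    # mypy accepts int arguments wherever a float is expected.
    return price * (1 - percent / 100)

print(apply_discount(200, 10))   # int arguments are fine: 180.0
print(apply_discount(19.99, 0))  # 19.99
```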

Thus, these tools don't just have value by pointing you towards bugs, they also have value by encouraging a certain programming style and helping you to think about the design of the software more consciously.

In this context, it's worth pointing out that libraries like attrs, useful as they are, also make the use of other tools more difficult. By the same logic with which Mypy might not be worth it, attrs might not be worth it either.

In practice, it is not possible to figure out precisely whether using a particular tool is worth it for a given project. It is not realistically possible to provide a useful estimate of expected costs of dealing with a bug now or later (though empirical evidence suggests dealing with bugs early is substantially cheaper). A lot of this comes down to your particular subjective preferences.

  • It is perfectly fine to decide that using these tools is not worth it, especially if you are dealing with a legacy code base that would produce a large amount of false positives. The same effort might be better invested in other QA measures, such as code reviews or better tests.
  • It is also perfectly fine to decide that these tools are worth it, even if it requires substantial changes to your code.

A lot of this depends on the context of the software – a Jupyter notebook with an explorative data science session will have completely different quality requirements from a business-critical financial system. There is no one-size-fits-all approach.

amon

Don’t use static analysis tools in any shop where reviewers abuse them by reducing their job to running the tools and reporting the results. This isn’t hyperbole. It’s happening now.

I’ve received reports from such departments. They are well-meaning, honest people, but all they do is slow down the review process. They never say anything the tool didn’t say.

These tools are only effective in the hands of coders who have a critical feel for the code: who can explain what’s happening in it, see why it does what it does, and be trusted to overrule the tool when appropriate. In the hands of those who can’t, these tools are nothing but an excuse to write useless, time-wasting reports. Beware of such people.

candied_orange
  • Why is the tool being run by people? Don't you have a build server to run it automatically? – bdsl Mar 21 '22 at 16:04
    @bdsl there are work environments where people who you've never met, who never read the requirements, who never talk to the stakeholders, will take your code and run it through these tools only so they can write reports about what the tools said. Seriously, I'm not kidding. This is a job. For an entire department. Complaining about how useless it is to sit around for a month waiting for a word document that tells you what your own computer can tell you in minutes just gets you painted as not caring about code quality. – candied_orange Mar 21 '22 at 18:19
  • OK, but the best way to stop that seems to be to make sure that the code reviewers are automatically provided with a copy of the tool output along with the code that they have to review. – bdsl Mar 21 '22 at 18:20
  • although waiting a month for code review and having code reviewed by people you've never met seem like much more serious problems. – bdsl Mar 21 '22 at 18:21
  • @bdsl I'm talking to the people who design these departments. Once they're designed you're already trapped. You might get lucky and take one of these guys to lunch to find out what version they're using, but most of them think asking that is like asking for a test's answer key. – candied_orange Mar 21 '22 at 18:23

Generally speaking, I can't think of a reason not to run static analysis tools. More specifically, though, there may be some code where certain types of static analysis don't make sense, or where the tools need specific configuration.

Using your example of tests, I would probably not run scans looking for vulnerabilities on test code. A lot of what these tools look for - improper control of file access, input validation, SQL injection, and so on - isn't really applicable to automated test code, especially if that code doesn't get built and deployed with the application. Similarly, I would run style linters on tests, but I might change the configuration to avoid certain types of findings if I don't feel they are appropriate for tests. One example is line length - it may be more beneficial to allow long lines, additional whitespace, or multiple expressions per line in tests, because that helps readability and understandability in the context of a test case.
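
As a concrete sketch of such a configuration: flake8 (one common Python linter, chosen here purely as an illustration; it is not mentioned in the answer) can relax rules for test files only, via its real `per-file-ignores` option:

```ini
# setup.cfg - illustrative: enforce line length in production code,
# but ignore it (E501, "line too long") under tests/
[flake8]
max-line-length = 88
per-file-ignores =
    tests/*: E501
```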

In my experience, it's almost never a good idea to not use static analysis. It's more about getting the configuration right, whether that's to account for the kind of code that's being analyzed or to focus the findings on the worst problems while preventing new issues.

Thomas Owens

When the number of false alarms the tool gives is so high that it's hard to spot the actual errors it finds.

If a tool gives too many false alarms, you may end up:

  • Rewriting code that is actually fine in order to keep the static analysis tool happy.
  • Skimming through the error list so fast that you miss the actual problems.
  • Constantly disabling features of the tool until it stops reporting errors.
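
A middle ground between rewriting correct code and disabling a check tool-wide is a targeted, per-line suppression of a confirmed false alarm. Pylint supports this with inline `# pylint: disable=...` comments (the function below is a made-up example; the message name is real):

```python
# Targeted suppression: silence one confirmed false alarm on one line,
# instead of disabling the check for the whole project.

def parse_flags(raw: str) -> dict:  # pylint: disable=missing-function-docstring
    return dict(item.split("=", 1) for item in raw.split(","))

print(parse_flags("color=red,size=4"))  # {'color': 'red', 'size': '4'}

# mypy offers the analogous per-line comment: "# type: ignore[<error-code>]".
```
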
Simon B
  • Remember that if you write good code, all alarms are false alarms. In one case, I found a static analysis tool reporting "if this happens, and then this happens, and then five more things happen, then you dereference a null pointer" - and with that information, I managed to make the app crash. – gnasher729 Mar 21 '22 at 19:21

Nothing speaks against generally using static code analysis.

There are, however, cases where blindly following the recommendations of such tools comes close to cargo cult:

  • The effort is just too high because what you're analyzing is not production code (prototypes, temporary implementations, ...).
  • The findings are false positives caused by inappropriate tooling or an inappropriate configuration of the tooling.
  • The code is legacy code that's just not worth fixing or refactoring (maybe because it's being phased out).

Take analysis results with a grain of salt and don't blindly follow every recommended fix - sometimes it's just not worth it. On production code you want to maintain and use for the foreseeable future, however, take the warnings of a good static analysis tool seriously.

If static analysis complains about the high complexity of your code, it might just be that the problem is complex...

tofro

From a practical point of view: many people live without it, but it is helpful and finds bugs quite cheaply. In my environment, static analysis is quite expensive - probably 2 to 3 times slower than a build with optimisations turned on, which in turn is probably 2 to 3 times slower than a build without optimisations.

So I don't run it on every build - that would just slow me down and 99% of the time find nothing - but, say, a week before a release is a good time. If you are on top of it, you will have only a few problems, and not much work to fix them. (In addition, my static analyser wants to analyse everything, so it is a complete rebuild every time.)

gnasher729