Are there any empirical studies on the effect of different languages on software quality?

Question

The proponents of functional programming languages assert that functional programming makes it easier to reason about code. Those in favor of statically typed languages say that their compilers catch enough errors to make up for the additional complexity of type systems. But everything I read on these topics is based on rational argument, not on empirical data.

Are there any empirical studies on what effects the different categories of programming languages have on defect rates or other quality metrics?

(The answers to this question seem to indicate that there are no such studies, at least not for the dynamic vs. static debate.)

As you can probably imagine, there are a ridiculous number of confounding factors involved. There are "empirical studies" out there, but they're little more than well-documented anecdotes and should be given about as much weight as that warrants. — C. A. McCann, Aug 04 '11 at 17:39
possible duplicate of [Dynamically vs Statically typed languages studies](http://programmers.stackexchange.com/questions/10032/dynamically-vs-statically-typed-languages-studies) — Steven A. Lowe, Aug 04 '11 at 17:45
@Steven: This question appears to be scoped more broadly (perhaps too broadly). — Robert Harvey, Aug 04 '11 at 19:08
@Robert there are COCOMO studies along these lines, but they are meaningless - unless you've studied your own team, and that's nearly impossible to do objectively — Steven A. Lowe, Aug 04 '11 at 20:34
The problem is that good engineers can produce quality software in any language, and the opposite is true for low-quality engineers. The real question is which language produces acceptable quality from average engineers, which is what most companies have anyway. Now you just have to define "acceptable" and "average". — Kevin, Aug 04 '11 at 23:08
If the study indicated, and strongly supported, a result that you personally found intolerable or disgusting, what would you do? — John R. Strohm, Sep 16 '14 at 20:41
@Kevin, while it may be true that good engineers can produce quality software in any language, we all live and work in the real world, and there is some evidence out there that suggests that, in the real world, with *typical* engineers, language makes a difference. — John R. Strohm, Sep 16 '14 at 20:44
Hmm, people who are good at static types know how to make the type system flexible enough to allow growth, how to express "invariants" as types, and how to get around not being able to do "monkey patching" and "dynamic reflection", and they can use refactoring tools to make up for the loss of flexibility. People who are good at dynamically typed languages are good at making their software "testable" by decomposing it into functions and using a REPL connected to the running program, and they know how to use "mocks", keyword arguments, "decorators", and eDSLs to reduce logic errors. — aoeu256, Apr 20 '21 at 22:55

redjamjar · Answer 1 · 2011-08-04T21:51:30.180

There is some research in academia on this subject. Here are some examples I know of, although you should treat the conclusions with caution:

"An experiment about static and dynamic type systems: doubts about the positive impact of static type systems on development time", Stefan Hanenberg. In Proc. OOPSLA, 2010. ACM Link
"An Empirical Study of Static Typing in Ruby", M. Daly, V. Sazawal, J. Foster. In Proc. PLATEAU, 2010. PDF
"A Controlled Experiment to Assess the Benefits of Procedure Argument Type Checking", Lutz Prechelt and Walter F. Tichy. IEEE TSE, 1998. IEEE Link

I'm sure there are other papers. Generally speaking, however, this area is extremely controversial for obvious reasons: it's really hard to make an objective assessment!

score 1 · Answer 2 · answered May 14 '12 at 01:32

One famous study is Lutz Prechelt, "An empirical comparison of seven programming languages", IEEE Computer 33(10):23-29, October 2000.

Published version (IEEE Computer subscription required)
Preprint version (freely available)

Prechelt discusses program reliability, and also examines execution time and memory consumption.

joshin4colours · Answer 3 · 2011-11-15T21:17:21.553

Although it's not related to code quality as such, this study looks at how novices learn using different languages. In particular, the authors compare how novices fare when learning Perl vs. Quorum, a teaching language the authors want to evaluate. What's really cool about this paper is that they actually come up with a control language whose syntax is generated randomly, as a sort of "placebo" language. This approach might be really interesting if applied to languages and code quality, and it could help control some of those tricky confounding factors when comparing languages.
