74

Are there any studies on the effectiveness of statically vs. dynamically typed languages?

In particular:

  • Measurements of programmer productivity
  • Defect Rate

I'm also interested in the effect of whether or not unit testing is employed.

I've seen lots of discussion of the merits of either side but I'm wondering whether anyone has done a study on it.

Winston Ewert
  • Forgive my ignorance, but what's static and dynamic typing...? – gablin Oct 06 '10 at 21:52
  • @Winston: Try http://cstheory.stackexchange.com/ – Maniero Oct 06 '10 at 21:56
  • @gablin: http://en.wikipedia.org/wiki/Type_system – Maniero Oct 06 '10 at 21:58
  • @bigown, it doesn't seem to me that issues of productivity and defects relate to computer science theory – Winston Ewert Oct 06 '10 at 22:04
  • @bigown: Ah, _that_ kind of typing. I thought he meant typing as in keyboard typing... Thanks! ^^ – gablin Oct 06 '10 at 22:08
  • Ha hah I thought exactly the same thing as @gablin. Wondering, huh, is he talking about a keyboard that adjusts while you're using it? – dash-tom-bang Oct 06 '10 at 23:13
  • @Winston: Studying this kind of issue is the job of computer scientists, not programmers. – Maniero Oct 06 '10 at 23:21
  • @bigown, yes, it's a computer science issue, but it's not a computer science theory issue. CS theory essentially deals with what we can mathematically prove about programs and computing. Issues of programmer productivity are not CS theory questions. There have been discussions of dynamic typing both here and on Stack Overflow. There have been none on cstheory. – Winston Ewert Oct 07 '10 at 00:09
  • The question's perfectly on topic. This question discusses one of the most important properties of the tools we use to program. – Frank Shearar Oct 15 '10 at 12:46
  • @Winston: Typing systems do belong in CS theory, but practical studies don't. – David Thornley Oct 15 '10 at 15:16
  • @David Thornley, Agreed. – Winston Ewert Oct 15 '10 at 17:05
  • @haylem, given http://meta.programmers.stackexchange.com/questions/105/should-we-worry-about-accept-rate, I haven't been concerned with accepting answers on Programmers.SE unless I thought they really hit the nail on the head. For many of my questions, I don't believe that any of the answers did that. – Winston Ewert Mar 03 '12 at 15:11
  • @WinstonEwert: Yes, I thought of that and did a search that took me to this Meta thread afterwards. Your questions, your call. – haylem Mar 03 '12 at 15:40
  • Studies of this nature never have enough context to be worthwhile, IMO. There's a big difference between what a 2,000-developer team that's probably way too big for the work needs to be successful and what a 5-dev team needs. IMO, every dev should learn to write maintainable code in a dynamic language. There's a lot of experienced developers out there claiming you can't write maintainable JavaScript whose strictly typed code could stand to benefit from STFUing and actually learning how, IMO. But it's not an either/or thing. – Erik Reppen Jan 18 '14 at 01:07
  • Today, we also have languages with templates, languages that derive types, languages that can use protocols instead of types, languages that allow typed enumerations. So there isn’t just “typed” and “dynamic” anymore. – gnasher729 Jan 05 '19 at 17:02
  • @gnasher729, I'm now convinced that the question is unanswerable because so many other factors influence productivity in a language. People's convictions one way or another are heavily influenced by which dynamic or static languages they've used. – Winston Ewert Jan 10 '19 at 15:04

7 Answers

46

Some suggested reading:

Not exactly on static typing, but related:

Some interesting articles or essays on the subject or on static analysis of programs in general:

And for the ones who would be wondering what this is all about:

However, I doubt any of these will give you a direct answer, as they don't do exactly the study you're looking for. They will be interesting reads though.

Personally, I firmly believe that static typing facilitates bug detection better than dynamic typing does. I spend way too much time looking for typos and similar minor mistakes in JavaScript or even Ruby code. As for the view that dynamic typing gives you a boost in productivity, I think that mostly comes down to tooling. If statically typed languages have the right tools to allow for background recompilation and provide a REPL, then you get the benefits of both worlds. Scala provides this, for instance, which makes it very easy to learn and prototype in the interactive console while still giving you the benefits of static typing (and of a stronger type system than a lot of other languages, ML languages aside). Similarly, I don't think I lose productivity by using Java or C++ (because of the static typing), as long as I use an IDE that helps me along. When I revert to coding with only a simple setup (editor + compiler/interpreter), it feels more cumbersome and dynamic languages seem easier to use. But you still hunt for bugs. I guess people would say that the tooling argument cuts both ways: if tooling were better for dynamic languages, then most bugs and typos would be pointed out at coding time. In my opinion, though, that reflects the flaw in the system. Still, I usually prototype in JRuby and later code most of what I do in Java.

WARNING: Some of these links are unreliable, and some go through the portals of various computing societies that require fee-based member access. Sorry about that; I tried to find multiple links for each of these, but it's not as good as I'd like it to be.

haylem
  • That bug finding - is it mainly because of mis-spelled variable names, or method names, or somewhere in between? (I _loathe_ implicit variable declaration for precisely this reason: in Smalltalk, you declare all your variables at the top, so you immediately know when you've mistyped a variable name. (Method name typos are sometimes caught too, if the method name's never been used in the image before.)) – Frank Shearar Oct 16 '10 at 11:53
  • Re tooling, I have to say that it depends on your language - Smalltalk has excellent tools, including a Refactoring Browser that Eclipse has (I'm told) yet to catch up to. – Frank Shearar Oct 16 '10 at 11:55
  • @Frank Shearar, since I started with Ruby (coming from Java), I've realized that what @haylem's saying probably does not apply to Smalltalk. Nor does my mantra about automatic refactoring being impossible in dynamically-typed langs. I completely agree with haylem's "personally" section.... ignoring SmallTalk of course :) This is fair, to some extent, since SmallTalk, while not dead, is definitely not enjoying the popularity that Python or Ruby have (now in Oct 2010). – Dan Rosenstark Oct 16 '10 at 14:22
  • @haylem, personally I thank you for making me feel that I'm not the only person in the world that works in dynamic languages but, when given a choice, many times CHOOSES statically-typed languages (same case, Java vs. JRuby or Groovy). – Dan Rosenstark Oct 16 '10 at 14:24
  • @Frank Shearar: Tools can help you diagnose typos (Clang uses the Levenshtein distance to look for potential candidates), however dynamic typing allows you to pass anything to a method, and it may fail far further down the call tree :/ – Matthieu M. Oct 16 '10 at 14:41
  • I'm curious as to why you choose to prototype in JRuby. When prototyping the goal is to develop quickly and it would seem that having your compiler point out typos and silly mistakes as you type them would be helpful. – Winston Ewert Oct 16 '10 at 19:08
  • @Winston Ewert: yes, and I usually tend to agree with that. But if it's just to try out something, then I like to have an REPL in an interactive environment (be it Ruby, JRuby to directly mock access to Java Collections, Groovy or (J)Python or what have you). Take SO for instance: When I need to answer a Java or C++ question, it bugs me to have to open an IDE to prototype some Java (because dealing with imports in a simple text editor in Java, honestly, that's quite a pain, or you go the short route of using .* everywhere possible). – haylem Oct 16 '10 at 21:16
  • @Frank Shearar: Yes, that would be one of my biggest grudges against JavaScript/ECMAScript for instance. – haylem Oct 16 '10 at 21:17
  • @Frank Shearar and @Yar: I agree that Smalltalk seems to be an exception... but maybe because they did get that right: great tooling from the start. The Xerox PARC people were a pretty good bunch of geniuses AND practical/pragmatic programmers, Alan Kay not the least and last one of them. I believe you can do refactoring with dynamic languages; it's just a different type of process for the compilers (I'm not really a CS / language expert in any way though, so just stating my views on this). Oh and Eclipse has a *perspective* that resembles Smalltalk's editor, but I've also been told it's a good. – haylem Oct 16 '10 at 21:22
  • It is interesting because my own preference for dynamic typing is for rather different reasons. I mean fast compiles and interactive interpreter are useful but they aren't why I like dynamic typing. I like dynamic typing because I find many situations in which the static typing languages just make it difficult to impossible to describe a particular concept. – Winston Ewert Oct 18 '10 at 00:50
  • @Winston Ewert: In my experience it takes more time to get the types correct in a statically-typed language, but after you got the types right (accepted by the compiler) the chances are higher that your program is correct. So, yes, I like to use dynamically-typed languages for prototyping and statically-typed languages for more mission-critical code. – Giorgio Mar 30 '14 at 16:21
  • @Giorgio, I think the critical point isn't when the compiler accepts my code, but when it passes my unit tests. At that point I find little difference in terms of the chance that my program is correct. – Winston Ewert Mar 31 '14 at 00:42
  • @Winston Ewert: But you probably have to write more unit tests to check properties that would otherwise be enforced by a compiler. In my own experience, I can obtain a comparable code quality with fewer tests when using a static language. Of course, I might have to spend more time in order to design proper types. – Giorgio Mar 31 '14 at 15:47
  • @Giorgio, actually, that doesn't match my experience. I write pretty much the same tests in both dynamic and statically typed languages. I'm actually curious, what kinds of extra tests do you find yourself writing in dynamic languages? – Winston Ewert Mar 31 '14 at 16:08
  • @Winston Ewert: I usually test most functions in Python whereas I write almost no tests when programming in Haskell. As a specific example, I recently had a bug in Python because I expected a function `def foo(x)` to only accept list. Since the function internally only referenced `x` in an `if` statement (`if x:`), it also accepted strings (and maybe other types as well). This occurred to me only later and I added the extra test and code. – Giorgio Mar 31 '14 at 16:13
  • In general, type checking restricts heavily what you can do with data, making it more difficult to perform operations that do not make sense. So, with type checking in place, you only need to check the behaviour of functions when they are given values of valid types and you do not need to check what happens when they are given some nonsense values. See also the article cited by PBrando. – Giorgio Mar 31 '14 at 16:26
  • @Giorgio, ah. My experience is with comparing C++/Java to Python. Haskell, as I understand it, is quite different in that it catches many more errors and requires less busy-work to please than C++ or Java. So I'd be willing to believe that Haskell might change the game in favor of static typing. – Winston Ewert Apr 01 '14 at 02:27
  • In your example, was the bug merely that `def foo(x)` accepted a string without complaint? Or did that actually cause a bug elsewhere? – Winston Ewert Apr 01 '14 at 02:29
  • @Winston Ewert: The function then called some other function passing the string as an argument. So the wrong string value (instead of list) was propagated along the call stack. Eventually, some other function tried to do something with it (concatenating it with a list) and threw an exception. – Giorgio Apr 01 '14 at 15:22
  • @Giorgio, the usual practice in Python is not to be concerned about the behavior of functions when passed incorrect types. You are writing extra tests because you are worried about what `foo` does when passed the incorrect type. Typically, Python coders only test the cases with correct types. So a Python coder wouldn't add a check to foo for list types, and wouldn't test the behavior of foo for anything that wasn't a list. You appear to be trying to get python to act more like a statically typed language which naturally requires much more code/testing. – Winston Ewert Apr 05 '14 at 16:09
  • @Giorgio, so the extra checking tests that you mention are typically just not written in Python. The question is whether or not defects in the code arise as a result. My experience has been that, for the most part, I catch incorrect types being passed during unit testing, and the problem is trivial to find. I.e., I don't find these types of errors escaping to production, and these types of errors don't generate hard-to-find bugs. – Winston Ewert Apr 05 '14 at 16:14
  • @Giorgio, but ultimately the point of my question here was to see whether actual studies exist on programmer productivity and defect rates regarding the typing issue. There are lots of people who have opinions and arguments, but there is a great lack of data. I think you can only hope to settle this question with data. – Winston Ewert Apr 05 '14 at 16:17
  • @Giorgio, actually, if you're interested I'd be game to try an experiment where we pick something and both develop the same project in our preferred language and compare notes. – Winston Ewert Apr 05 '14 at 17:40
  • @Winston Ewert: We could do such an experiment, why not (even though I do not think it will have much statistical significance). Do you have anything in mind? I would discuss the details and other stuff in a chat. Hope to catch up with you on here some time. – Giorgio Apr 09 '14 at 16:43
  • http://chat.stackexchange.com/rooms/13863/winstonewert-and-giorgio – Winston Ewert Apr 10 '14 at 03:42
  • An additional recent study at http://macbeth.cs.ucdavis.edu/lang_study.pdf – mkobit Sep 15 '16 at 17:50
  • "I like dynamic typing because I find many situations in which the static typing languages just make it difficult to impossible to describe a particular concept.". Actually, it's quite the opposite. Dynamic languages hardly allow you to define any concept. You cannot even define an integer variable! – gardenhead May 04 '17 at 02:00
  • @gardenhead your comment should be addressed to Winston Ewert. – haylem May 08 '17 at 18:21
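
For readers who want to see the `def foo(x)` situation Giorgio describes in the comments above in concrete form, here is a minimal sketch. The helper `prepend_header`, the `"header"` value, and both function bodies are hypothetical; only the `if x:` truthiness check and the list concatenation that fails further down the call stack come from the comments. The annotated variant assumes an external static checker (mypy, for example) is run over the code, since Python itself does not enforce annotations at runtime.

```python
from typing import List


def prepend_header(items):
    """Hypothetical downstream helper that silently assumes a list."""
    # list + str raises TypeError, but only once this line actually runs.
    return ["header"] + items


def foo(x):
    """Meant to accept only lists, but only checks truthiness."""
    if x:  # true for any non-empty list, string, tuple, dict, ...
        return prepend_header(x)
    return []


def foo_typed(x: List[str]) -> List[str]:
    """Same logic with annotations; a static checker can flag a call
    such as foo_typed("abc") before the program is run."""
    if x:
        return ["header"] + x
    return []


if __name__ == "__main__":
    print(foo(["a", "b"]))  # intended use
    print(foo(""))          # wrong type, yet passes silently
    try:
        foo("abc")          # wrong type, fails one call deeper
    except TypeError as exc:
        print("failed inside prepend_header:", exc)
```

Note that a test suite exercising `foo` only with lists would pass and reach every line, which is the scenario both sides of the exchange are arguing about: whether the string case is worth a test, or should simply be ruled out by a type checker.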
22

Just yesterday I found this study: "Unit testing isn't enough. You need static typing too."

Basically, the author used a tool that automatically converts a project from a dynamically typed language into a statically typed one (Python to Haskell).

Then he selected a number of open source Python projects that also included a reasonable number of unit tests, and automatically converted them to Haskell.

The translation to Haskell revealed a series of errors related to the type of the variables: the errors weren't discovered by the test units.

PBrando
  • Uncomfortable truth of dynamic typing. – Den Jan 24 '14 at 13:57
  • "The translation to Haskell revealed a series of errors related to the type of the variables: the errors weren't discovered by the test units.": With a dynamically-typed language you have to manually test properties of your code that in a statically-typed language are automatically checked by the compiler. This is both more time-consuming and error-prone. +1 – Giorgio Mar 30 '14 at 16:18
  • [I responded to a posting on this link on Reddit.](http://www.reddit.com/r/Python/comments/2hfa67/unit_testing_isnt_enough._you_need_static_typing_too./cksoa87?context=3) I don't think the conclusions drawn from the paper are reasonable. – Veedrac Oct 31 '14 at 22:14
  • Both typing systems have pros and cons and their uses. It's like discussing whether to bring a machine gun to a hand-to-hand fight. This is a religious war that is far from over. That said, I agree with Veedrac. Non-static languages need more test cases to catch errors caused by types. That's their nature and their drawback. But a programmer needs to write tests that catch errors in code caused by unexpected input state, not necessarily exhaustive tests of input types. – Andre Figueiredo Nov 25 '18 at 05:19
10
  • Link to discussion of the ACM paper "An Experiment About Static and Dynamic Type Systems" (2010) by Stefan Hanenberg (referenced by Lorin Hochstein in a previous post).
      • Conclusion: Productivity for similar quality was higher in a dynamic language.
      • Potential biases/validity issues: Experimental subjects were all students. Also, limited variety of the programming tasks (subjects were asked to implement a scanner and parser).
  • ACM paper "Do Programming Languages Affect Productivity?" (2007) by Delorey, Knutson, and Chun.
      • Conclusion: JavaScript, Tcl, and Perl were more productive than C#, C++, and Java. Python and PHP fell in the middle.
      • Potential biases/validity issues: No measure of quality (such as bugs discovered post-release). No measure of reliability (is software written in statically typed languages more dependable?). Sample bias: all projects were taken from open source CVS repositories. Also, no distinction between weakly and strongly typed languages (i.e. pointers).
  • Thesis "Empirical Study of Software Productivity and Quality" (2008) by Michael F. Siok.
      • Conclusion: Choice of programming language does not significantly influence productivity or quality. However, it does affect labor costs and "quality within the overall software projects portfolio".
      • Potential biases/validity issues: Restricted to the avionics domain. The programming languages could all have been statically typed. I didn't read the thesis, so I cannot evaluate its rigor.

My opinion: Although there is weak evidence that dynamically typed languages are more productive, it is not conclusive. (1) There are many factors that were not controlled, (2) there are too few studies, and (3) there has been little or no discussion about what constitutes an appropriate test method.
ahoffer
6

Here's a starting point:

The paper is challenging the commonly received wisdom that, all else being equal, programmers write the same number of lines of code per time regardless of language. In other words, the paper should serve as supporting empirical evidence that mechanical productivity (lines of code written) is not a good measure of functional productivity, and must at least be normalized by language.

gnat
Pi Delport
  • For the non-IEEE people, what's the basic summary? – Frank Shearar Oct 16 '10 at 07:21
  • @Frank Shearar, the conclusion they draw is that programming language does affect productivity. They are measuring lines of code per programmer per language per year; I'm not sure that's a good measure of productivity. – Winston Ewert Oct 18 '10 at 15:04
  • @Winston: That's definitely a flawed metric. You'd find COBOL to be a very productive language by it: it takes a lot of lines to do anything useful, but they're fairly easy to write. – David Thornley Oct 20 '10 at 16:07
  • Winston, David: I'm pretty sure the authors are not suggesting that lines-of-code productivity is a measure of *functional* productivity. Rather, the paper is challenging the commonly received wisdom that, all else being equal, programmers write the same number of lines of code per time regardless of language. In other words, the paper should serve as supporting empirical evidence that mechanical productivity (lines of code written) is *not* a good measure of functional productivity, and must at least be normalized by language. – Pi Delport Oct 20 '10 at 16:26
  • I agree with that. But it doesn't serve to answer my original question. – Winston Ewert Oct 20 '10 at 17:26
  • I think their findings may reflect how "upward trending" any language is. JavaScript is getting much attention and it's also getting more time and more lines written, compared to, say, Pascal (which was at the other end of the spectrum in this study). – Rolf Sep 06 '14 at 01:04
4

I found "Static vs. dynamic languages: a literature review", which lists some studies on the subject and gives a nice summary of each.

Here's the executive summary:

Of the controlled experiments, only three show an effect large enough to have any practical significance. The Prechelt study comparing C, C++, Java, Perl, Python, Rexx, and Tcl; the Endrikat study comparing Java and Dart; and Cooley’s experiment with VHDL and Verilog. Unfortunately, they all have issues that make it hard to draw a really strong conclusion.

In the Prechelt study, the populations were different between dynamic and typed languages, and the conditions for the tasks were also different. There was a follow-up study that illustrated the issue by inviting Lispers to come up with their own solutions to the problem, which involved comparing folks like Darius Bacon to random undergrads. A follow-up to the follow-up literally involves comparing code from Peter Norvig to code from random college students.

In the Endrikat study, they specifically picked a task where they thought static typing would make a difference, and they drew their subjects from a population where everyone had taken classes using the statically typed language. They don’t comment on whether or not students had experience in the dynamically typed language, but it seems safe to assume that most or all had less experience in the dynamically typed language.

Cooley’s experiment was one of the few that drew people from a non-student population, which is great. But, as with all of the other experiments, the task was a trivial toy task. While it seems damning that none of the VHDL (static language) participants were able to complete the task on time, it is extremely unusual to want to finish a hardware design in 1.5 hours anywhere outside of a school project. You might argue that a large task can be broken down into many smaller tasks, but a plausible counterargument is that there are fixed costs using VHDL that can be amortized across many tasks.

As for the rest of the experiments, the main takeaway I have from them is that, under the specific set of circumstances described in the studies, any effect, if it exists at all, is small.

Moving on to the case studies, the two bug finding case studies make for interesting reading, but they don’t really make a case for or against types. One shows that transcribing Python programs to Haskell will find a non-zero number of bugs of unknown severity that might not be found through unit testing that’s line-coverage oriented. The pair of Erlang papers shows that you can find some bugs that would be difficult to find through any sort of testing, some of which are severe, using static analysis.

As a user, I find it convenient when my compiler gives me an error before I run separate static analysis tools, but that’s minor, perhaps even smaller than the effect size of the controlled studies listed above.

I found the 0install case study (that compared various languages to Python and eventually settled on Ocaml) to be one of the more interesting things I ran across, but it’s the kind of subjective thing that everyone will interpret differently, which you can see by looking.

This fits with the impression I have (in my little corner of the world, ACL2, Isabelle/HOL, and PVS are the most commonly used provers, and it makes sense that people would prefer more automation when solving problems in industry), but that’s also subjective.

And then there are the studies that mine data from existing projects. Unfortunately, I couldn’t find anybody who did anything to determine causation (e.g., find an appropriate instrumental variable), so they just measure correlations. Some of the correlations are unexpected, but there isn’t enough information to determine why.

The only data mining study that presents data that’s potentially interesting without further exploration is Smallshire’s review of Python bugs, but there isn’t enough information on the methodology to figure out what his study really means, and it’s not clear why he hinted at looking at data for other languages without presenting the data.

Some notable omissions from the studies are comprehensive studies using experienced programmers, let alone studies that have large populations of “good” or “bad” programmers, looking at anything approaching a significant project (in places I’ve worked, a three month project would be considered small, but that’s multiple orders of magnitude larger than any project used in a controlled study), using “modern” statically typed languages, using gradual/optional typing, using modern mainstream IDEs (like VS and Eclipse), using modern radical IDEs (like LightTable), using old school editors (like Emacs and vim), doing maintenance on a non-trivial codebase, doing maintenance with anything resembling a realistic environment, doing maintenance on a codebase you’re already familiar with, etc.

If you look at the internet commentary on these studies, most of them are passed around to justify one viewpoint or another. The Prechelt study on dynamic vs. static, along with the follow-ups on Lisp, are perennial favorites of dynamic language advocates, and the GitHub mining study has recently become trendy among functional programmers.

Mr.WorshipMe
0

I honestly do not think that Static vs Dynamic typing is the real question.

I think that there are two parameters that should come first:

  • the expertise level in the language: the more experienced you are, the more you know about the "gotchas" and the more likely you are to avoid them / track them down easily. This is also true about the particular application/program you are working on
  • testing: I love static typing (hell, I like programming in C++ :p) but there is only so much that a compiler / static analyzer can do for you. It's just impossible to be confident about a program without having tested it. And I am all for fuzz testing (when applicable), because you just can't think of all possible input combinations.

If you are comfortable in the language, you'll write code and you'll track down bugs with ease.

If you write decoupled code and test each piece of functionality extensively, then you'll produce well-honed code, and thus you'll be productive (because you cannot qualify as productive if you do not assess the quality of the product, can you?)

I would therefore deem that the static vs dynamic debate with regard to productivity is quite moot, or at least vastly superseded by other considerations.

Matthieu M.
  • If this is a counter-question, where is the question in it? :) I agree that other factors are more important than static vs dynamic typing. However, dynamic typing advocates claim better productivity and static typing advocates claim better code quality. I was wondering whether anyone had actual evidence to support their claims. – Winston Ewert Oct 16 '10 at 18:17
  • @Winston: I removed the counter bit :p As you mentioned, it's mostly claims. I think most advocates of dynamic typing are mixing ease of use with dynamic typing, while ease of use is mostly about tools. I do agree that the possibility to write quick throw-away prototypes and to experiment with short commands using an interpreter is a productivity boost, but even Haskell (perhaps the language with the most impressive type system of the moment) has an interpreter for quick experimentation :) – Matthieu M. Oct 17 '10 at 10:23
  • But until someone actually does a study that considers this question - whether methodology, tools have a larger impact than language on defect rates, productivity - we just end up comparing anecdotes. – Frank Shearar Oct 20 '10 at 08:22
0

Here are a few:

  • Stefan Hanenberg. 2010. An experiment about static and dynamic type systems: doubts about the positive impact of static type systems on development time. In Proceedings of the ACM international conference on Object oriented programming systems languages and applications (OOPSLA '10). ACM, New York, NY, USA, 22-35. DOI=10.1145/1869459.1869462 http://doi.acm.org/10.1145/1869459.1869462

  • Daniel P. Delorey, Charles D. Knutson, Scott Chun. 2007. Do Programming Languages Affect Productivity? A Case Study Using Data from Open Source Projects. In First International Workshop on Emerging Trends in FLOSS Research and Development (FLOSS'07: ICSE Workshops 2007), p. 8.

  • Daly, M.; Sazawal, V.; Foster, J.: Work in Progress: an Empirical Study of Static Typing in Ruby, Workshop on Evaluation and Usability of Programming Languages and Tools (PLATEAU) at Onward! 2009.

  • Lutz Prechelt and Walter F. Tichy. 1998. A Controlled Experiment to Assess the Benefits of Procedure Argument Type Checking. IEEE Trans. Softw. Eng. 24, 4 (April 1998), 302-312. DOI=10.1109/32.677186 http://dx.doi.org/10.1109/32.677186

Lorin Hochstein