59

There are some very complex open source projects out there. I think I could contribute to some of them, and I wish I could, but the barrier to entry is too high for a single reason: to change one line of code in a big project, you have to understand all of it.

You don't need to read all the code (and even if you did, it wouldn't be sufficient) or understand what every single line does and why, because the code is probably modularized and compartmentalized, so there are abstractions in place. Even then, you need an overview of the project to know where the modules are, where one module interfaces with another, what exactly each module does and why, and in which directories and files each of these things happens.

I'm calling this a code overview: a section that open source projects could have on their website or in their documentation, explaining their code to outsiders. I think it would benefit potential contributors, as they would be able to identify places where they could build; the primary coders involved, as writing everything down would let them reorganize their own thinking; and users, as it would help them understand and better report the bugs they experience, and maybe even become contributors.

But I have still never seen one of these "code overviews". Why? Are there things like these and I'm missing them? Things that do the same job as I am describing? Or is this a completely useless idea, as everybody, except for me, can understand projects with thousands of lines of code easily?

Peter Mortensen
fiatjaf
  • 7
    You mean a design document? I've seen the rare project with a description of each package but that's usually an API already. – ratchet freak Nov 30 '14 at 18:28
  • 1
    Watch out for some smelly code if you have all these implicit dependencies across the system. Apart from that, I do agree that there would be value in maintaining some high-level map for most software projects. If that is not easy to do for the ones involved, something is probably wrong. – Alex Nov 30 '14 at 19:21
  • 14
    Why? Because there are few projects whose maintainers want to invest the effort to write and maintain high-quality documentation, and often they may not understand the benefits either. – Alex D Nov 30 '14 at 19:39
  • 9
    Documentation can be out-of-date or inaccurate relative to actual behavior. Code can't. So most projects prefer code. – Paul Draper Dec 01 '14 at 08:31
  • 5
    Also it's easy to underestimate how much you can learn about a project if you set a kitchen timer for 2 hours or so and Just Read It (tm). – Kos Dec 01 '14 at 09:57
  • Well, some applications do have a code overview. There's the [Architecture of Open Source Applications](http://aosabook.org/en/index.html) series. – Frank Kusters Dec 01 '14 at 10:17
  • 3
    @PaulDraper But a code _overview_ should be stable enough to describe the bird's-eye view of the system. If the system is consciously architected, there should be some kind of documentation to at least mitigate architectural erosion. I've never heard any software architect suggest that relying on the code is enough, on the contrary rather - they beg for at least a basic architectural documentation. – Alex Dec 01 '14 at 11:20
  • I fail to understand how one can approach a multi-person project without requirements analysis, a carefully considered software architecture and good, detailed design. Take it from 35+ years of s/w development experience, if you don't do that it will severely lower your code quality and lengthen your project. Your only current hope of comprehension is to run DoxyGen over it, even if there are no DoxyGen comments in it. – Mawg says reinstate Monica Dec 01 '14 at 12:19
  • A high level overview is essential in medium and large projects. A simple flowchart or class diagram can make a huge improvement to the understanding of the project. Unfortunately, even something simple like that is often not provided. – Mast Dec 01 '14 at 12:20
  • 43
    Welcome to the community-driven world: _if it's not done, that's because you haven't done it_ :) – mgarciaisaia Dec 01 '14 at 15:17
  • 1
    Most open source contributors are not paid to contribute, and they are going to do what they find most rewarding to do during their limited free time - that would be coding, not writing docs. – GrandmasterB Dec 01 '14 at 19:03
  • 1
    This doesn't only affect open source projects, I've rarely seen commercial code projects having an overview document. Especially not one that's up to date. – Matthew Lock Dec 02 '14 at 01:39
  • The premise of the question is based on a broad generalization. Example: "Why don't Californians eat bananas?" Some of us might actually eat bananas. Similarly, some open-source projects might actually have code overviews. The question should be reworded/changed to be less presumptuous. – Jackson Dec 02 '14 at 10:22
  • Writing good documentation is hard and time consuming. There is an Open Source project which tries to overcome this problem: http://arc42.org/ - they have a template for writing good architecture documentation... (yes, I am a contributor to this project...) – rdmueller Dec 04 '14 at 21:48
  • @GrandmasterB: You're wrong. A lot of open-source projects have very comprehensive documentation on how to use the software, tutorials etc. – fiatjaf Dec 05 '14 at 16:38
  • @mgarciaisaia: I have to write code overviews about every project? Even the ones I'm trying to understand and, because of that, _searching_ for a code overview written by someone else? – fiatjaf Dec 05 '14 at 16:39
  • 1
    @fiatjaf What I said and what you said are not mutually exclusive. There are a lot of well documented projects. But most projects aren't documented for the reason I stated. – GrandmasterB Dec 05 '14 at 17:28
  • 1
    @fiatjaf: read it the proper way - it's not about it being your fault that the docs don't exist, but it's your responsibility, and mine, and everyone else's if they don't exist. – mgarciaisaia Dec 05 '14 at 17:51

6 Answers

59

Because it's extra effort to create and maintain such a document, and too many people don't understand the associated benefits.

Many programmers aren't good technical writers (although many are); they rarely write documents strictly for human consumption, therefore they don't have practice and don't like doing it. Writing a code overview takes time that you can't spend on coding, and the initial benefit to a project is always greater if you can say "We support all three encoding variants" rather than "We have really neat explanations of our code!" The notion that such a document will attract more developers so that in the long run more code will get written isn't exactly foreign to them, but it's perceived as an uncertain gamble; will this text really make the difference between snagging a collaborator or not? If I keep coding right now, we will certainly get this thing done.

A code overview document can also make people feel defensive; it's hard to describe higher-level decisions without feeling the need to justify them, and very often people make decisions without a reason that "sounds good enough" when actually written down. There's also an effect related to the aforementioned one: since updating the text to suit the changing code causes additional effort, this can discourage sweeping changes to the code. Sometimes this stability is a good thing, but if the code really does need a mid-level rewrite, it turns into a liability.

Kilian Foth
  • So there is such a thing, the concept of "code overview", then? Although nobody is doing it, it has been done in the past and people still consider doing it sometimes? It would make me feel better if you said this isn't a novel idea of mine. – fiatjaf Nov 30 '14 at 22:30
  • 6
    Well, it seems the answer is yes: https://gnunet.org/gnunet-source-overview – fiatjaf Nov 30 '14 at 23:02
  • 5
    If you want it to exist, volunteer to write it. The whole point of open-source projects is that people can and should contribute what they can, subject to the community agreeing that it's worth integrating. – keshlam Dec 01 '14 at 05:37
  • 8
    @keshlam - that makes sense if you're **already** a contributor to the project... but if you're a potential contributor who is trying to get a basic idea of how the code works, you're the worst person possible to write that document.... – Jon Story Dec 01 '14 at 11:49
  • 13
    @JonStory Your point is a good one, but in practice I've found the opposite is true sometimes, too. In some projects I've ended up writing a bunch of documentation based on notes I made while learning an undocumented code base. It was better documentation because I had to start at the API I could see and then dig deeper and deeper. The developers who had written the code already had a model of the code in their heads, and so had lots of assumptions about what someone would already know. Documentation by someone new to the project *can* be better documentation for someone new to the project. – Joshua Taylor Dec 01 '14 at 13:09
  • 6
    @JonStory: If you're getting involved in a less-than-perfectly-documented project, you're going to have to start figuring this out anyway. Making your notes part of the project helps save the next person work. (I don't know that anyone would use the presence or absence of docs as a deciding factor on whether to contribute.) Simply improving the javadoc comments (or equivalent) can be a valuable way to start contributing. Seriously, that's the basic principle behind open-source: If you see something that needs to be done, DO it rather than waiting for someone else to. – keshlam Dec 01 '14 at 13:13
  • 1
    And although not always easy, writing this stuff is *much* easier than maintaining it. You have a document that not all contributors have necessarily ever read (alternative: you have to somehow enforce that all existing and new contributors read it), and probably aren't familiar with, and each change they make *may or may not* affect something stated in the document, and they can't tell just by looking at the code they're modifying. Function-by-function documentation is much easier to maintain because (assuming normal doc tools) it's right there in the code. – Steve Jessop Dec 01 '14 at 18:53
14

The dry, harsh truth?

Documentation is not made because projects can do without it.

Even open source projects often face stiff competition. Most such projects don't start with large shoulders; they start from a bright idea, often one person's bright idea.

As such, they can't afford the time and cost of bringing in human documenters, even if those offered to cooperate for free. A documented project, in fact, has usually gone through several early iterations first. It often starts with 1-3, maybe 5 people writing their novel idea down and showing it to the world as a proof of concept. If the idea proves good then "followers" may join, and they tend to start asking for extensions, new options, translations... At this point the code is still a prototype, usually with hard-coded options and messages.

Not all open source projects go beyond this phase, only those that break the "critical mass" needed to attract public interest. Moreover, one of the beginning developers has to "think big and far" and plan for expansions and so on. He might as well become the project "evangelist" and sometimes also "project manager" (other times it's different people). That's a necessary step to bring the project up, from proof of concept to an industry established reality.

Even then, the project manager may opt to not create documentation.

A dynamic, growing project would be slowed down by it, and the documentation would still lag behind the code while the code is being heavily enhanced to implement translations, options, plug-in managers...

What usually happens is:

  1. A brief introductory document is made, about what the project is and where it's going to (the famous "roadmap").
  2. If possible, an API is developed and that one is elected as "documented code" over the bulk of the underlying code.
  3. Especially the API, but also the other code, is reformatted and "PHPdoc" / "Javadoc" etc. special comments are added. They offer a decent compromise between time spent and reward: even a modest programmer usually knows how to write a one-liner describing his functions, parameters get "auto" documented as well, and the whole is tied to its pertaining code, which avoids documentation "desyncing" and lagging behind development.
  4. Most often, a forum gets created. It's a powerful social medium where end users and programmers may talk to each other (and among their peers, possibly in "devs only" subforums). This allows a lot of knowledge to slowly emerge and get consolidated in community-made (read: not weighing on the developer team) FAQs and HOWTOs.
  5. In really large projects, a wiki is also produced. I state "large projects" because they are often those with enough followers to create a wiki (a dev does) and then actually fill it beyond the "bare bones" (the community does).
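
The doc-comment compromise from step 3 can be sketched roughly as follows. This is only an illustration in Python, whose docstrings play a role similar to PHPdoc or Javadoc comments; the function `encode_variant` and its parameters are hypothetical and not taken from any real project. It also hints at the limit of "auto" documentation: the signature can be extracted mechanically, but the explanation exists only because someone wrote it.

```python
import inspect


def encode_variant(text: str, variant: str = "utf-8") -> bytes:
    """Encode text using one of the supported encoding variants.

    Args:
        text: The string to encode.
        variant: Name of the target encoding (hypothetical default).

    Returns:
        The encoded bytes.
    """
    return text.encode(variant)


# The signature (names, type hints, defaults) can be recovered by tooling...
signature = str(inspect.signature(encode_variant))

# ...but the summary line is only there because a human wrote the docstring.
summary = inspect.getdoc(encode_variant).splitlines()[0]

print(f"encode_variant{signature}")
print(summary)
```

Because the comment lives next to the function it describes, a change to the function puts the documentation in front of whoever is editing it, which is the "no desyncing" argument made above.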
Dario Fumagalli
  • 2
    WOW!! we live (and work) in two totally different worlds. Wherever you are currently working, get out of there fast & find a company (there are many) where it gets done correctly because that actually saves you money. Don't ever let pointy headed managers / cowboy coders try to tell you otherwise. – Mawg says reinstate Monica Dec 01 '14 at 12:21
  • 6
    +1, I agree with almost all of your points, the only statement I strongly reject is that *parameters get "auto" documented*. When we think of explanations rather than the mere syntax/type constraints, *nothing* gets "auto-documented"; a generated comment in the style *Returns the X.* for a *getX* method is not helpful documentation, it's just a filler without any extra information. – O. R. Mapper Dec 01 '14 at 13:29
  • 3
    @Mawg providing good documentation is an investment, you forego developer time in return for (hopefully) more contributors in the future, and some other benefits. But like many of its kind, it's only worthwhile if you know there's a good chance the project will succeed, and most software projects fail. It's important to be aware of survivorship bias when you lament the lack of documentation in successful projects. – congusbongus Dec 02 '14 at 00:13
  • Isn't it possible that those projects fail because they don't document? And by document, I mean plan, so that you understand, rather than sit down at the keyboard & pound away. Here's my estimate for a project life-cycle, all figures +/- 5%. Up front stuff (requirements & use cases, architecture, detailed design) 50%, coding 10 to 15%, testing, the rest. "If you fail to plan, you plan to fail" – Mawg says reinstate Monica Dec 02 '14 at 21:12
6

Overview documents such as you describe are rare even on commercial projects. They require extra effort with little value for the developers. Also, developers tend not to write documentation unless they really need to. Some projects are lucky to have members who are good at technical writing, and as a result have good user documentation. Developer documentation, if it exists, is unlikely to be updated to reflect code changes.

Any well organized project will have a directory tree which is relatively self-explanatory. Some projects will document this hierarchy and/or the reasons it was chosen. Many projects follow relatively standard code layouts, so if you understand one you will understand the layout of other projects using the same layout.
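
As a rough illustration of how a documented hierarchy could be surfaced, here is a minimal sketch that lists each top-level directory of a checkout together with the first line of its README, if it has one. The helper names and the README file names it looks for are assumptions made for this example, not a convention any particular project is known to follow.

```python
from pathlib import Path


def first_line(path: Path) -> str:
    """Return the first non-empty line of a text file, or '' if there is none."""
    with path.open(encoding="utf-8", errors="replace") as handle:
        for line in handle:
            if line.strip():
                return line.strip()
    return ""


def directory_overview(root: str) -> list[tuple[str, str]]:
    """Pair each top-level directory with the first line of its README.

    The README names checked below are an assumption; real projects
    may use other conventions.
    """
    overview = []
    for entry in sorted(Path(root).iterdir()):
        # Skip plain files and hidden directories such as .git
        if not entry.is_dir() or entry.name.startswith("."):
            continue
        description = "(no README)"
        for name in ("README.md", "README.rst", "README.txt", "README"):
            candidate = entry / name
            if candidate.is_file():
                description = first_line(candidate) or description
                break
        overview.append((entry.name, description))
    return overview


if __name__ == "__main__":
    for name, description in directory_overview("."):
        print(f"{name:20} {description}")
```

Run from the root of a source tree, this prints one line per directory. It only works as an overview to the extent that the project actually keeps a short README in each directory.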

To change a line of code you need a limited understanding of the surrounding code. You should never have to understand the whole code base in order to do so. If you have a reasonable idea of the kind of function that is broken, it is often possible to navigate the directory hierarchy rather quickly.

To change a line of code you need to understand the method within which the line is found. If you understand what the expected behavior of the method is, you should be able to make corrective changes, or extensions to the functionality.

For languages which provide scoping, you can refactor privately scoped methods. In this case you may need to change callers as well as the refactored method or methods. This requires a broader, but still limited, understanding of the code base.

See my article Adding SHA-2 to tinyca for an example of how such changes can be done. I have an extremely limited understanding of the code used to generate the interface.

BillThor
  • 1
    The important point here wasn't to assert how much you need to know about the code in order to make a change. Of course this will depend on a lot of things, but you'll never need to understand the whole code, neither an **overview** will give you that understanding, but even to *find* the line of code you'll change you need a certain knowledge of the general project structure. – fiatjaf Nov 30 '14 at 22:19
  • +1 There is nothing special about open source. In my over 10 years experience working in industry I've never once seen an overview document. What typically happens is that employers expect the first month of your employment to have zero productivity because you're studying the codebase. "Overviews" are usually implemented as asking your co-workers questions – slebetman Dec 02 '14 at 03:45
5

Are there things like these and I'm missing them? Things that do the same job as I am describing?

There is an excellent book called The Architecture of Open Source Applications that provides detailed descriptions of a variety of high-profile open source software projects. However, I'm not sure if it exactly fills the role you're imagining, because I believe its primary audience is intended to be developers looking for patterns to follow when creating their own applications, not new contributors to the projects featured in the book (though I'm sure it could be helpful there).

bjmc
  • this reads more like a comment, see [answer] – gnat Dec 01 '14 at 19:14
  • 4
    I don't find your comment constructive. What, specifically, do you feel is lacking? Many of the other answers here are lengthy speculation about possible reasons why developers might not write overview documentation. I've linked to a specific example of good overview documents. – bjmc Dec 01 '14 at 20:06
  • 1
    I feel an answer to the question asked is lacking, "Why aren't there code overviews for open-source projects?" – gnat Dec 01 '14 at 20:08
  • 3
    I would argue it's not possible to respond accurately to the question as written when, in fact, there *are* code overviews for some open-source projects. I've edited my answer to make it clear that I'm narrowly responding to a request for examples the user may have missed. – bjmc Dec 01 '14 at 20:14
  • 1
    The question as written asks "Are there things like these and I'm missing them?" This answer responds definitively, pointing to an existing collection of such code overviews. As such I think it's a great (and appropriate) answer to the question. – Jim Garrison Dec 01 '14 at 20:40
  • @JimGarrison see **[Your answer is in another castle: when is an answer not an answer?](http://meta.stackexchange.com/q/225370/165773)** – gnat Dec 02 '14 at 08:58
4

Because there are far more open-source programmers than open-source technical writers.

Documentation takes maintenance and time to keep up to date. The more bulky the documentation, the more it takes. And documentation that isn't in sync with the code is worse than useless: it misleads and conceals instead of revealing.

A well documented code base is better than one less documented, but documentation can easily take as long as writing the code in the first place. So your question is, is it better to have a well documented code base, or a code base that is twice as large? Is the cost to keep the documentation up to date whenever code changes worth the contributions of extra developers it may or may not bring?

Shipping code wins. Reducing the amount of effort put into things other than shipping code can make code ship more often, and be more likely to ship before it runs out of resources.

This doesn't mean that things besides shipping don't matter. Documentation adds value to the project, and with a large enough project the interconnect cost of adding another developer might be far higher than adding a documenter. And, as noted, documentation can increase investment in the project (by making it easier for new programmers to join).

However, nothing sells like success: a project that isn't working or doing anything interesting rarely attracts developers either.

Documentation of a code base is a form of meta-work. You can spend a lot of time writing up fancy documents describing a code base that doesn't do much of value, or you can spend time making stuff that consumers of your code base want and make your code base have value.

Sometimes making things harder makes those who do the task better, either due to a higher degree of commitment to the project (spending hours upon hours learning the architecture), or because of skill bias (if you are already an expert in related tech, getting up to speed will be faster, so the barrier of lacking such documentation is less important: thus more experts join the team, and fewer beginners).

Finally, for reasons noted above the current developers are likely to be experts on the code base. Writing such documentation doesn't help them understand the code base much, as they already have the knowledge, it only helps other developers. Much of open source development is based off of "scratching an itch" that the developer has with the code: lack of documentation that already says what the developer knows rarely itches.

Yakk
  • +1 "documentation can easily take as long as writing the code in the first place" -- or longer! – Marco Dec 04 '14 at 14:55
-1

Besides it being extra effort, some open source projects cripple their documentation on purpose, in order to get freelancing jobs for their maintainers (to implement something, or to hold trainings). Not only do they lack a code overview, but their API documentation and tutorials are bad or missing lots of things.

Just to name one that is quite popular: bluez. Good luck finding a good tutorial, other than one on scanning for nearby devices.

BЈовић
  • 8
    No matter how many examples you can list for badly documented open source projects, in my opinion, the claim that they "are crippling their documentations on purpose" needs to be supported by conclusive evidence (and even then it probably doesn't hold as a general statement). – O. R. Mapper Dec 01 '14 at 13:32
  • @O.R.Mapper Let's start with ["Bluez - greatest linux mystery"](http://www.danielhnyk.cz/blog/view/bluetooth-linux-indecipherable-mystery). As it is the only bluetooth library for linux, I find it hard to believe that it has no documentation because it is an extra effort. Hell, there is doxygen, and how hard is it to write simple tutorials? – BЈовић Dec 02 '14 at 12:06
  • @O.R.Mapper Then there is linux kernel. If you are missing something (like a kernel driver), if your company is missing the expertise, you can either hire someone, or find a freelancer or a company that will do it for you. So, it is open source, but it is coming with a price – BЈовић Dec 02 '14 at 12:08
  • @O.R.Mapper Then there are open source projects with documentation in paper format. So you buy a book, and no other documentation is given. Is this documentation crippling, or not? – BЈовић Dec 02 '14 at 12:10
  • @O.R.Mapper Then there are projects, where you need to pay for a training. I can only think of codesys, which is not open source. Good luck reading their "documentation" – BЈовић Dec 02 '14 at 12:11
  • All of the examples listed may be examples of poor documentation, but you have yet to present any conclusive evidence for the (so far, rather far-fetched) story about projects regularly crippling their documentation on purpose. – O. R. Mapper Dec 02 '14 at 12:20
  • @O.R.Mapper Yes, what I wrote is my personal opinion, without a concrete proof. If any project openly writes that they are going to cripple the documentation, in order to get freelancing work - no matter how good idea, it is going to fail right there, in the start. – BЈовић Dec 02 '14 at 12:41
  • Statements by former OSS developers who have become disillusioned with OSS and revealed the practices formerly used in some now abandoned projects, declarations of projects who have decided to change their methods, or simply statements with a less crass wording than "cripple the documentation" would be more believable than mere speculation, just as well. – O. R. Mapper Dec 02 '14 at 12:47
  • @O.R.Mapper Yes, it would. However, internet is a huge place and I couldn't find such article or blog. BTW English is not my mother tongue, and I am not sure how to express "cripple the documentation" in a less crass wording. – BЈовић Dec 02 '14 at 12:54
  • 2
    For what it's worth, i've seen enough profiteering off of shoddy documentation to at least *wonder* whether it's intentional. When the same groups putting half-assed documentation online are more than happy to sell you a book or a training class, it doesn't take much cynicism at all to reach that conclusion. – cHao Dec 02 '14 at 14:27