The source is the documentation - part 1

Question

While searching for the specification of the btrfs file system, I once more stumbled over the catchphrase: "The code is the documentation". This implies that changing the code changes the specification.

Does this approach make sense when working on a file system that will be used by various drivers or applications?

I can understand that a small group can split the job and everyone works on her own task. But how should this work with a larger group, spread over several places?

(btrfs spec would still be welcome)

possible duplicate of [What's with the aversion to documentation in the industry?](http://programmers.stackexchange.com/questions/202167/whats-with-the-aversion-to-documentation-in-the-industry) — gnat, Nov 08 '14 at 21:39
For something like a file system, I would expect a reference implementation and/or a test suite, which would be the documentation. — jmoreno, Nov 08 '14 at 22:37
@jmoreno they are different approaches. For a filesystem, the specification would be the POSIX functions. You are building a module that must be used by code that already exists (as opposed to the usual task of building a module that will be used by programs still to be written). — SJuan76, Nov 09 '14 at 17:37
@gnat I read the question about the aversion before posting mine. I did not ask why people DON'T document. I would like to know HOW a larger group can work together without getting in each other's way, and do so without a specification. — Kitana, Nov 09 '14 at 19:21
If the source is the documentation, then all bugs are documented features. You can mess up everything, and it is still correct because it is as documented. There must be some place where you state what you want, as distinct from what you have. The code is the documentation, maybe for an API. — Florian F, Nov 10 '14 at 23:00

Stephen C · Answer 1 · 2014-11-10T22:07:26.603

3

I can understand that a small group can split the job and everyone works on her own task. But how should this work with a larger group, spread over several places?

Well, if you think about it, it works the same way that all distributed development projects work. The developers use tools (e.g. distributed version control) and procedures that are tried and tested for this kind of situation.

And the evidence is that this approach really does work ... if done properly.

One thing that works in favour of the "the code is the spec" approach for a file system is that a Linux file system only requires a single master implementation. From the developers' perspective, there is no need for multiple implementations of (say) BTRFS, and certainly no need for independent (e.g. clean-room) re-implementations. If you look at it from their point of view, there is no value to them in writing a spec, setting up a committee to manage changes to the spec, and constraining themselves to conforming to the spec.

jmoreno comments:

For something like a file system, I would expect a reference implementation and/or a test suite, which would be the documentation.

You could say that the master implementation is the reference implementation, and the master test suite is the reference test suite. The only issue is that the test suite will have been designed as a functionality test suite for the master (reference) implementation rather than as a compliance test suite for all possible implementations of a (hypothetical) spec.
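The difference between the two kinds of test suite can be sketched roughly as follows. This is only an illustration in Python: `FileStore`, `DictStore`, `ListStore` and both check functions are made-up names, not anything from BTRFS or its test suite.

```python
from abc import ABC, abstractmethod


class FileStore(ABC):
    """Hypothetical public contract that a written spec would define."""

    @abstractmethod
    def write(self, name: str, data: bytes) -> None: ...

    @abstractmethod
    def read(self, name: str) -> bytes: ...


class DictStore(FileStore):
    """Stand-in for the master implementation (keeps files in a dict)."""

    def __init__(self):
        self._files = {}

    def write(self, name, data):
        self._files[name] = data

    def read(self, name):
        return self._files[name]


class ListStore(FileStore):
    """Stand-in for an independent re-implementation (keeps a list)."""

    def __init__(self):
        self._entries = []

    def write(self, name, data):
        # Drop any older entry with the same name, then append the new one.
        self._entries = [(n, d) for n, d in self._entries if n != name]
        self._entries.append((name, data))

    def read(self, name):
        for n, d in self._entries:
            if n == name:
                return d
        raise KeyError(name)


def compliance_check(store: FileStore) -> bool:
    """Uses only the public contract, so it applies to any implementation."""
    store.write("a", b"1")
    store.write("a", b"2")
    return store.read("a") == b"2"


def functionality_check(store) -> bool:
    """Inspects DictStore internals, so it only fits that implementation."""
    store.write("a", b"1")
    return store._files == {"a": b"1"}
```

Both stores pass `compliance_check`, but `functionality_check` fails with an `AttributeError` on `ListStore`, even though `ListStore` behaves correctly through the public contract.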

edited Nov 10 '14 at 22:07

answered Nov 09 '14 at 00:40

Stephen C


I disagree with the example of the filesystem... after all, the definition of the POSIX functions provides a standard API, so it is not as if the different file systems provide different functions (apart from the ones they choose to implement specifically), just different implementations of the same functions – SJuan76 Nov 09 '14 at 17:39
To support a filesystem on a different OS, I cannot get around re-implementing it. If that filesystem is not already supported by this non-Linux OS, I also cannot rely on any API. In such a situation an OS-independent specification would be highly welcome. – Kitana Nov 09 '14 at 19:35
@Kitana - The point is that support of (say) BTRFS on (say) Windows is not something that the developers of BTRFS are interested in. But I'm sure that if >>you<< wanted to document and publish the file system layout, invariants, etc. as part of your project to port BTRFS, they would not stand in your way. – Stephen C Nov 10 '14 at 21:55
@SJuan76 - We are not talking about file system APIs. We are talking about the design; i.e. what goes into the disk blocks on a BTRFS file system, the procedures / invariants for doing various operations. Internal APIs are not part of a spec of the file system. (They might be the spec of a file system **implementation** ... but that serves a different purpose, and "the code is the spec" is standard practice for internal APIs in a module codebase that is maybe a few tens of thousands of lines.) – Stephen C Nov 10 '14 at 22:00
@Stephen C - I agree that a test suite could be part of the answer; if I could, I would give +1. But these tests have to be written in advance to be useful as a specification for writing the actual code – Kitana Nov 12 '14 at 11:28
Arrrgh, 5 minutes over. A version control system does help after I did it wrong, but it doesn't help me while I sit there staring at the code editor. Mailing lists may help to discuss a topic, but later one has a hard time finding the definitive conclusions in these often large heaps of text. – Kitana Nov 12 '14 at 11:36
@Kitana - You misunderstood my point about the test suite. I am saying that a test suite probably does exist, but it will have been written for a different purpose. The tests are *unlikely* to have been designed or written as a specification. Even using them as a compliance test suite is likely to be problematic, because they are likely to be coded to work against internal APIs, and assume implementation-specific behaviour that *need not* be exhibited by a different implementation of the file system. – Stephen C Nov 13 '14 at 10:18

score 1 · Answer 2 · answered Nov 10 '14 at 22:38

I disagree: The tests are the documentation/specification. The code is the documentation of how it was implemented.

Tests are stable, long-lived, publishable and legible to most readers (you don't publish the test code, just what is being tested). The implementation is impermanent, or at least there is no requirement or expectation that it be permanent, and code is not realistically publishable (only developers can decipher it).

My team uses the Spock testing framework, which can publish reports. We publish these reports as our specifications - the name of the test describes the behaviour. Whenever a spec change is requested, we create a test that captures the desired behaviour, then code to make that test go green (BDD).
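As a rough illustration of the idea, here is a minimal sketch in Python's `unittest` rather than Spock (which is Groovy-based). The `ToyStore` class, the test names and the `spec_lines` helper are all made up for this example; Spock's own reports are produced differently.

```python
import unittest


class ToyStore:
    """Hypothetical in-memory stand-in for the system under test."""

    def __init__(self):
        self._files = {}

    def write(self, name, data):
        self._files[name] = data

    def read(self, name):
        if name not in self._files:
            raise FileNotFoundError(name)
        return self._files[name]


class ToyStoreSpec(unittest.TestCase):
    # Each test name is written as a sentence describing one behaviour.

    def test_reading_a_written_file_returns_its_contents(self):
        store = ToyStore()
        store.write("a.txt", b"hello")
        self.assertEqual(store.read("a.txt"), b"hello")

    def test_reading_a_missing_file_raises_file_not_found(self):
        with self.assertRaises(FileNotFoundError):
            ToyStore().read("missing.txt")


def spec_lines(test_case_class):
    """Turn test method names into human-readable specification lines."""
    names = unittest.TestLoader().getTestCaseNames(test_case_class)
    return [name[len("test_"):].replace("_", " ") for name in names]


if __name__ == "__main__":
    # Print the behaviours as a plain-text "specification".
    for line in spec_lines(ToyStoreSpec):
        print("-", line)
```

Only the list of behaviours would be published; the test bodies and the implementation stay internal.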

Btw I can recommend Spock.
