21

I want to document my code so that there is minimal need to read and browse through it again months later.

I know that there are different types of documentation (in the source code and outside it, sequence diagrams, and so on).

I just want to know an efficient way to document my code so that, when I come back to it a few months later, I spend less time reading it and working out the code flow.

psmears
  • 44
    The best method for spending less time reading code later on is to write *clearer* and *more understandable* code. – Mael Mar 08 '18 at 06:43
  • 2
    Possible duplicate of [How to make sure the application source code has a proper documentation for new programmers?](https://softwareengineering.stackexchange.com/questions/150479/how-to-make-sure-the-application-source-code-has-a-proper-documentation-for-new) – Doc Brown Mar 08 '18 at 06:46
  • The documentation depends on the target. If you are addressing developers, then comments in the code can be quite useful. If you are addressing analysts, overview diagrams are useful too. If you are addressing a tech-savvy audience, write a user guide. – Laiv Mar 08 '18 at 07:20
  • @Laiv Well, from the point of view of a developer of my own code, and maybe of other developers' code. – Reza Akraminejad Mar 08 '18 at 07:57
  • Biggest thing is to keep the code small. If the code needed to implement an item in your issue tracking system is large, then your team may need to learn how to break it up some more so the amount of code reviewed isn't overwhelming. – Berin Loritsch Mar 08 '18 at 21:17
  • Maybe you can comment out sample calls at complicated parts of your code. –  Mar 09 '18 at 07:51

6 Answers

57

IMO the best documentation is the documentation you don't actually need. I also hate writing documentation and comments.

With that being said:

  • Pick readable, self-explanatory names. Don't use n; use numberOfItemsFound instead, for example.
  • Don't shy away from storing parts of a calculation in a constant variable rather than pushing everything into one line.
  • Move partial tasks from branches into their own (inline) functions if you're reusing them or if the parent function becomes long and tedious to follow.
  • Be more verbose, and only favor optimization over readability where it's really required.
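
As a rough illustration of the first two points, here is a hypothetical before/after in Python. The functions and all their names are invented for this sketch and follow Python's snake_case convention rather than the camelCase used above:

```python
# Terse version: correct, but the reader has to decode every name.
def f(l, t):
    return len([x for x in l if x > t]) / len(l) if l else 0.0


# Same logic with descriptive names and named intermediate values.
def fraction_of_items_above_threshold(measurements, threshold):
    if not measurements:
        return 0.0

    number_of_items_found = sum(1 for value in measurements if value > threshold)
    total_number_of_items = len(measurements)
    return number_of_items_found / total_number_of_items
```

Both functions return the same result; the second one simply needs no accompanying explanation.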
Mario
  • 20
    [Here's a good metric](https://xkcd.com/1343) for documentation (mandatory link). – Neil Mar 08 '18 at 07:19
  • 4
    This should also be on the list: explain in code **why** you are doing things you're doing. – t3chb0t Mar 08 '18 at 11:59
  • 2
    +1 for the last bullet item, since [premature optimization is (in 97% of the times) the root of all evil](https://softwareengineering.stackexchange.com/questions/80084/is-premature-optimization-really-the-root-of-all-evil) – gmauch Mar 08 '18 at 14:13
  • @stannius A "variable" in programming isn't "a thing that varies", it's a thing with a name. – Nic Mar 08 '18 at 18:20
  • 5
    `numberOfItemsFound` is quite verbose though; too verbose is *also* an issue. – Matthieu M. Mar 08 '18 at 18:48
  • 6
    @MatthieuM., Rarely is "too verbose" a problem with names in code. Too terse or cryptic is a very common problem though. – David Arno Mar 08 '18 at 19:39
  • so basically, a short summary of the proposed doc style of "Clean Code" by Uncle Bob – BlueWizard Mar 09 '18 at 07:03
  • 1
    @MatthieuM. Well, that's also part of the crux. ;) For example, a signature `bool MyImage::loadFromStream(std::istream &stream);` is nice, but the name is more complex than it has to be, since the argument speaks for itself. So it's safe to just use `bool MyImage::load(std::istream &stream);` But always keep in mind that names are typically just a human thing. After optimization most names become addresses (if at all), no matter how long the original name was. – Mario Mar 09 '18 at 07:11
27

Treat your code as documentation

Your code is your primary documentation. It precisely describes what the resultant app, library or whatever, actually does. As such, any attempt to speed up the understanding of that code has to start with the code itself.

There's lots written on how to write readable code, but some of the key points are:

  • do not rely on comments to explain bad code, make the code better and get rid of the comments,
  • write short focused functions, methods, classes etc,
  • use names appropriate to the context (eg n is good for a loop, longer descriptive names are needed for items with greater scope),
  • treat function names as if they were comments, eg don't use UpdtTbl with a comment explaining it updates the table with the supplied rules when UpdateTableContentsWithSuppliedRules can be used as the name,
  • avoid mutability. Every time you change the contents of a variable, you increase the complexity of the code. Assign that new value to a new variable (with a good name) where feasible.
  • lastly, and most importantly, avoid "clever" code. The only really clever code is code that is easy to read. If you write some complex bit of code and find yourself thinking "wow, aren't I clever here?", the answer is almost guaranteed to be "no, you aren't".
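
To make the mutability point concrete, here is a small Python sketch. The pricing functions are made up purely for illustration and are not taken from any real codebase:

```python
def total_price_mutating(net_price, tax_rate, discount):
    # One variable means three different things over its lifetime.
    price = net_price
    price = price - discount
    price = price * (1 + tax_rate)
    return price


def total_price(net_price, tax_rate, discount):
    # Each intermediate value gets its own name and is assigned once.
    discounted_price = net_price - discount
    price_including_tax = discounted_price * (1 + tax_rate)
    return price_including_tax
```

In the second version a reader who lands on any line can tell what the value on that line represents without tracing the earlier assignments.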

Become better at reading code

Reading code, regardless of how simple it is, is a learned skill. No one is naturally good at reading code. It takes practice; lots of practice. So, for example, go to GitHub or whatever and read the code of the libraries that you use, rather than just using those libraries. Find code to read and read it.

Comments are a code smell

Only then do we get to other types of documentation. Firstly, as previously stated, avoid comments. If I come across code containing comments, I prepare for the worst: the code is likely to be bad, and to be honest the comments are likely to be bad too. Someone who can't communicate well through code is unlikely to be able to communicate any better through natural language.

Beware autogenerated API documentation

Also, beware autogenerated API documentation. If I have to resort to reading such docs, it'll be because your code is so hard to read. Again, make the code simple and I can read that directly.

Tests are docs, too

Tests are documentation too. So don't treat your unit tests as a chore. Treat them as a way of communicating with others (your six-months-later self included) as to how the code works and is intended to be used.
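
A minimal sketch of what that can look like with Python's standard `unittest` module. The `parse_key_value` helper is a hypothetical function invented for this example; the point is that the test names read as a list of the behaviours a future reader can rely on:

```python
import unittest


def parse_key_value(line):
    """Split 'key=value' into a (key, value) tuple, stripping whitespace."""
    key, separator, value = line.partition("=")
    if not separator:
        raise ValueError(f"missing '=' in {line!r}")
    return key.strip(), value.strip()


class ParseKeyValueTest(unittest.TestCase):
    def test_splits_on_first_equals_sign_only(self):
        self.assertEqual(parse_key_value("url=a=b"), ("url", "a=b"))

    def test_surrounding_whitespace_is_ignored(self):
        self.assertEqual(parse_key_value(" name = Ada "), ("name", "Ada"))

    def test_line_without_equals_sign_is_rejected(self):
        with self.assertRaises(ValueError):
            parse_key_value("no separator here")
```

Saved in a file, these tests can be run with `python -m unittest`, and unlike a comment they fail loudly when they stop being true.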

Draw pictures if it helps

If you like UML, then by all means find yourself a good tool and generate UML diagrams from your code. Just never ever ever ever try to use it to generate code. It's not good as a design tool and you will end up with horrible code as a result.

Have a "1000ft" view document

Finally, write yourself an overview document. What does the app do? How does it do it? What other systems does it connect to? Things like that. Do not attempt to describe the code structure here though. Let the code do that. Let this document remind you why you wrote the code in the first place.

David Arno
  • 15
    I agree with all your points, except that comments do have their place. While I agree there's no point in comments like `add 1 to i`, comments should explain _why_ the code does what it does. For example, the code `if (!something.Exists()) {...}` can use a comment like: `// something exists only when (explanation of the broader scenario)`. – Jonathan Mar 08 '18 at 12:58
  • 1
    @Jonathan. From experience, all such comments can be replaced with extracting out that piece of code into another function and using the function name to explain why. So my "no comments" position stands in my view. – David Arno Mar 08 '18 at 14:01
  • 17
    We've all seen our fair share of `// increment x` `x++;` comments that are of no use, but it's wrong to throw the baby out with the bathwater and declare that comments are always bad. For example, comments of the form `// this case should never happen because xyz` `throw exception "unreachable"`. – angrydust Mar 08 '18 at 14:48
  • 7
    Very nice list. But, like @Jonathan, I do not agree about the comments. Sometimes you have to account for bugs in third-party frameworks. While this can be refactored into its own function, it's still nice to leave a bit of description of why the workaround is required (bug number or bug name/description of the bug). – magu_ Mar 08 '18 at 16:34
  • 17
    @DavidArno But you cannot do that for a comment explaining why something was **not** done. Like `//XXX: Not using straight-forward method Foo here because ...`. Such comments can be immensely valuable, but are impossible to convey with code for obvious reasons. – cmaster - reinstate monica Mar 08 '18 at 16:37
  • 6
    I think the most important point about comments is that they should be written at the highest possible abstraction level. No one profits from comments like `i++; //next index` or `//> parameter 'foo': a Foo instance`, but comments explaining why a class exists, what use cases it was designed to facilitate, what other classes it exists to serve, etc. are valuable. This actually goes quite well with your 1000ft view document, but is at an intermediate level. The closer you get to the implementation, the fewer comments you should find. – cmaster - reinstate monica Mar 08 '18 at 16:49
  • 7
    I like it even more dramatic: _every comment is a failure to express yourself well in code_. For example, I have a 4 line comment in one method, explaining the workaround for a 3rd party bug. _I failed to express that well in code, so it's in a comment_. I'd say it improved readability though, because I doubt anybody would enjoy scrolling horizontally to read a _very_ long and _very_ descriptive method name. "Comments are a code smell" - yes, but we have to remember that not everything that smells is sh*t. – R. Schmitz Mar 08 '18 at 18:18
  • 1
    @R.Schmitz, Spot on: code smells are an indication that something *might* be wrong. In most cases it is. There are edge cases though. Comments explaining work-arounds to third party bugs are one such edge case. – David Arno Mar 08 '18 at 18:51
  • 3
    Documentation comments are universally used in public APIs; without them, the API is unlikely to be used. It is a code smell to avoid them, not to add them. For me, "self documented code" is the same as "only the original author can understand or love this code". I don't know why it is, but the less documentation, the more likely the author believes it is easy to understand. – Frank Hileman Mar 08 '18 at 19:15
  • @FrankHileman, nope it really isn't universal. In fact I have a reasonably popular library that contains zero comments. The documentation is hand-written, thus both adds to the information already in the code and avoids cluttering the code with that documentation. Documentation comments are a really crappy way of producing API docs. They're a classic "emperor's new clothes" scenario. – David Arno Mar 08 '18 at 19:34
  • 3
    As a historical comment on the "clever" code: in a plea for clear and straightforward code, [Kernighan & Pike](https://en.wikipedia.org/wiki/The_Practice_of_Programming) argued that (paraphrasing) to debug something, you have to be twice as smart as the person who wrote it. If you write code that's as "clever" as you can be, you are by definition not smart enough to debug it. – Ti Strga Mar 08 '18 at 20:54
  • I agree with most of this answer, but the part about comments is so bad, I have to downvote it. – Jack Aidley Mar 09 '18 at 09:15
  • 1
    @JackAidley, That's fine. My bit about the comments is probably the most important point in my answer as it's the part most folk get wrong. I appreciate others will disagree, but expressing what I've found to be the best approach to writing readable code is way more important to me than avoiding downvotes. – David Arno Mar 09 '18 at 09:19
  • @DavidArno: As it should be! – Jack Aidley Mar 09 '18 at 09:43
  • 1
    @theguywholikeslinux I disagree with the example, because you could write `throw exception "We assumed xyz, so this shouldn't have happened."`. – Eric Mar 09 '18 at 10:05
  • 1
    I like the "1000ft view document" idea. That's the one document I miss most when trying to understand someone else's code. – Ralf Kleberhoff Mar 09 '18 at 11:57
  • @DavidArno Documentation comments are comments that can be extracted from the code, and processed using a tool, that then reformats the contents into a more readable form, usually a type of web format. Having worked on reference documentation the old way, and used documentation comments for 14 years or so, I can say that the automated process has saved me tremendous amounts of time. – Frank Hileman Mar 09 '18 at 22:19
  • @Eric Good point, although I then raise to you cmaster's example of a comment about why something was not done. – angrydust Mar 10 '18 at 15:09
  • I suspect your metric for "readable" is not the same as those reading your code. Comments can be good or bad, just like code. They can be redundant or critical for understanding. Engineering code is often an implementation of an algorithm in a book or paper, with complex formulas. Assuming the reader can understand the (extremely complex) code, without references back to the original source is a typical mistake engineers make in their code. I always ask them to add comments in this case. – Frank Hileman Mar 12 '18 at 21:04
16

I must admit I do not agree with some of the things that the other answers recommended, so I'm going to throw in my two cents:

Comments

Documentation is extremely helpful for strangers reading your code. Usually many things will not be verbose enough to be read and understood immediately, and you should then explain what you are doing.

Edit: the discussion in the comment section has made a valid point: over-commenting usually goes hand in hand with bad code.

Comments should be precise and minimal but, in my opinion, should definitely be present: at least one comment for every 15 lines of code. For example, add a line above each block of code about what you're doing:

import time

# md5, db and db_entities come from the surrounding project (not shown here).

def login(username: str, password: str, create_session: bool = True):

    # Filter the user we need from the database
    password_hash = md5(password)
    users = db.table("users", db_entities.USER)
    results = [x for x in users.query(lambda c: c.get("username") == username and c.get("password_hash") == password_hash)]

    if len(results) == 0:
        return None, None
    else:
        # Create a login session record in the database.
        if create_session:
            sessions = db.table("sessions", db_entities.SESSION)
            ses = sessions.new()
            # 31536000 seconds = 365 days, so the session expires in one year.
            ses.set("username", username) \
                .set("expiery", 31536000 + time.time())
            sessions.update(ses)
            return results[0], ses
        else:
            return results[0], None

Minimal comments that explain why and what you're doing are very helpful throughout the code. I do not agree with the answer that states

If I come across code containing comments, I prepare for the worst: the code is likely to be bad, and to be honest the comments are likely to be bad too.

Often, thankfully, good code is documented too. It is true that bad programmers treat their documentation like "Alright, my code is bad, let's add a few sentences to make it clearer".

Yes, this occurs quite a lot, but it is also true that good programmers who write clean code want to be able to return to it and understand why a function behaves the way it does, or why they needed that line that seems a bit redundant, and so on.

Yes, comments that explain obvious things, comments that are unclear, and comments that were just put together to make sure that "this code is documented, yeah, whatever" are a code smell. They make reading the code harder and more irritating. (An example follows.)

# Logging into Gmail when the module is imported
_client = login()
def get_client():
    global _client
    return _client

Example clarification: "No shit, Sherlock. does _client = login() log into the mail service? OMG!"

More clarification: the login() method has no relation to the login() method from the above example.

But comments that do match the standards, explain the why's and not the how's, and answer the right questions, are very very (very) helpful.

Inline comments

One thing you should NOT do (and if I could write that bigger, I would) is write your comments on the same line as the code. It makes comments very line-specific, which completely misses the purpose of commenting your code.

For example, bad inline comments:

outer = MIMEText(details["message"]) # Constructing a new MIMEText object
outer["To"] = details["to"] # Setting message recipient
outer["From"] = "xAI No-Reply" # Setting message sender
outer["Subject"] = details["subject"] # Setting message subject
outer.preamble = "You will not see this in a MIME-aware mail reader.\n" # I don't know what I'm doing here, I copied this from SO.
msg = outer.as_string() # Getting the string of the message
_client = details["client"] # Assigning the client
_client.sendmail(SENDER, details["to"], msg) # Sending the mail

This code would be much easier to read and understand without the comments, which make it messy and unreadable.

Instead, comments inside your code should be placed above blocks of code, and they should answer the important questions that may arise while reading the code block.

# Constructing the email object with the values 
# we received from the parameter of send_mail(details)
outer = MIMEText(details["message"])
outer["To"] = details["to"]
outer["From"] = "xAI No-Reply"
outer["Subject"] = details["subject"]
outer.preamble = "You will not see this in a MIME-aware mail reader.\n"
msg = outer.as_string()

# Sending the mail using the global client (obtained using login())
_client = details["client"]
_client.sendmail(SENDER, details["to"], msg)

Much clearer, right? Now you also know that you have to use the login() function and provide the parameters to send_mail() with everything you used. Helps a bit, but one thing is still missing.

Function documentation

This has been widely discussed. You should always let your readers know what your function is about: what it does and why. How it does it does not belong in the documentation, but perhaps in footnotes to the function.

You should clearly describe what you expect your parameters to be, and whether they need to be obtained or created in a specific way. You should state what your function returns, what it is used for, etc.
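
As a sketch of what such function documentation could look like, here is the message-building part of the mail example above wrapped in a function with a docstring. The function name `build_message` and the exact docstring layout are assumptions made for this illustration, not part of the original code:

```python
from email.mime.text import MIMEText


def build_message(details: dict) -> str:
    """Build the plain-text email described by ``details`` and return it as a string.

    Why: keeps message construction separate from sending, so it can be
    checked without a mail server.

    Args:
        details: Mapping with the keys "message" (body text), "to"
            (recipient address) and "subject".

    Returns:
        The complete message, headers included, ready to be passed to
        ``smtplib.SMTP.sendmail``.

    Raises:
        KeyError: If one of the required keys is missing.
    """
    outer = MIMEText(details["message"])
    outer["To"] = details["to"]
    outer["From"] = "xAI No-Reply"
    outer["Subject"] = details["subject"]
    return outer.as_string()
```

The docstring covers the what, the why, the expected parameters and the return value; the how is left to the four lines of code underneath.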

Again, that's my opinion and methodology while writing my code. These are not the only points, just some of the things on which I could not agree with the other answers. Oh, and of course, it's not just the comments that should explain your code, but the code itself. Write clean, understandable, and maintainable code. Think about your future self while coding ;-)

Yotam Salmon
  • 6
    Disagree with a given example - instead of writing many comments in one huge function, you should compose it from many smaller functions with descriptive names, that will act as comments. Without a risk of being out of sync with what the code is _really_ doing. – user11153 Mar 08 '18 at 14:06
  • You might be right. I should not have given an example with a large function. Specifically this function I'd leave long and not break it to smaller functions since it follows a very strict algorithm for reading and parsing a stream, so because it does one basic task I'd leave it together. So yes, a better example can be given, I guess... – Yotam Salmon Mar 08 '18 at 14:14
  • 6
    Finally some sanity. Extracting every piece of code that could use a comment into its own function is how you end up with thousands of functions spread across hundreds of files. – Uyghur Lives Matter Mar 08 '18 at 15:45
  • 1
    @cpburnz So? I don't need to read their implementation, ever, if they are extracted **well** (I am aware of the existence of "ravioli code"). Just don't use 10 levels of nesting `void` functions that "do stuff" on mutable variables. By the same logic, you end up with thousands of classes, package, modules, and namespaces, so better write everything in a few files 20k LOC each. – user11153 Mar 08 '18 at 16:47
  • 2
    That second example is lovely. – Lightness Races in Orbit Mar 08 '18 at 19:20
  • 7
    The comments in the second example are much too verbose. Some of them (e.g. "Did we find anything?") just repeat what the code says and would be better removed. For other, you could gain more readability by refactoring, like making (stream.is_empty()) the loop condition, or moving the check for accept_literals outside. – Frax Mar 08 '18 at 20:10
  • 1
    Seems like my first 2 examples are hated here and I understand why. I took it from code I am writing for a project I'm working on, with fellow developers that are undergraduate. Most likely without a high level of explanations and commenting I would have to actually verbally explain it to everyone, which I don't want. About the factoring — great advice, @Frax. Thanks. Still haven't really taken care for the shape of the code but I'm sure to implement them. – Yotam Salmon Mar 08 '18 at 20:26
  • @YotamSalmon Honestly, I prefer your second example. I've had to dig through too many legacy projects and third-party libraries without any code comments to appreciate comments that explain what's happening and why. – Uyghur Lives Matter Mar 08 '18 at 23:40
  • 3
    @cpburnz, "I've had to dig through too many legacy projects and third-party libraries without any code comments to appreciate comments that explain what's happening and why". exactly my point all along: comments are there to explain crap code. Since the question is "how do I write code that's easy to read" then clearly this answer is wrong as it focuses on writing comments to explain bad code, rather than writing good code in the first place. – David Arno Mar 09 '18 at 09:16
  • @DavidArno I don't see it as you see it, I think. I see comments as there to explain code. Not only bad code. They are there to help you flow through the code faster and more easily. – Yotam Salmon Mar 09 '18 at 09:31
  • @YotamSalmon, good code explains itself. In such cases, comments become a hindrance to reading that code. Comments are therefore only beneficial in explaining difficult to understand code. But in most cases, fixing the code so it's easier to understand is the better thing to do. – David Arno Mar 09 '18 at 09:35
  • 1
    I find comments never get the same love as the rest of the code does. They invariably get out of sync with the code and end up lying to the reader. Frankly, when i run into a comment, i have to double-check whether it's actually even effing true anymore, which means more work. :P – cHao Mar 09 '18 at 10:35
  • 1
    The problem with the "bad" examples from your post isn't the lack or bad quality of comments, but the poor readability of the code. – Ralf Kleberhoff Mar 09 '18 at 12:02
  • `// Did we find anything? expression_found = lexicon.hasOwnProperty(note);` is exactly equivalent to `// Increment i ++i;` – Jeremy Mar 09 '18 at 13:12
  • The comment `// If the note is optional, we can just give up on it.` on the else clause does nothing to help me understand why `(accept_literals && isNum(note))` being false means that the note is optional. – Jeremy Mar 09 '18 at 13:18
  • 1
    @DavidArno I think this answer provides an alternative perspective from the current top two answers which advocate: the best comments are no comments, and self-documenting code is the panacea (in my experience self-documenting code still doesn't explain why and frequently lacks any explanation). Most of the advice is good, but I disagree when it comes to comments because they're frequently touted as being an "anti-pattern". – Uyghur Lives Matter Mar 09 '18 at 14:40
  • @RalfKleberhoff Well you must be right. I updated my example. I meant that there should always be comments (personal opinion) but just to describe what you're doing, *minimally*. I hope the new example clarifies my opinion better. – Yotam Salmon Mar 09 '18 at 15:03
  • 2
    I have seen entire code bases thrown out due to a lack of documentation. It depends on how well it is written, but when the original developers are unavailable, documentation is usually critical. Comments are merely inline documentation. Like all other documentation, they can either be well done or poorly done; we cannot make general statements, such as "comments are a code smell". That in itself is smelly to me. – Frank Hileman Mar 09 '18 at 22:22
5

Provide a cover letter

Unless you are in a very technical domain, most questions around the code will not be about the 'how' but about the 'why' or the 'what'.

As such, the way to reduce how often people have to look into your code is to write a short description of it. The advantage is that you can compile an overview of such descriptions quite easily, and that this is much more accessible (even to people who won't or are not allowed to see the code).

Even if people are technical, the cover letter should offer guidance on where they should be looking for something.

Simple, extremely minimalistic points:

  1. Introduction, why does this code (base) exist
  2. What function does the code subset fulfill
  3. Where is the code (script name for instance)

Example

  1. This set of scripts scrapes StackOverflow and upvotes answers by Dennis Jaheruddin
  2. a. This script is responsible for parsing the HTML and analyzing whether it is the right user
  3. a. The script is found at: ScrapeAndVote/RecognizeDennis.scr
1

I usually get the biggest speed gain from building separate commits, each representing an intermediate step that compiles and works.

So if I have to introduce a new parameter to a function in order to implement a specific feature, then there is one commit that does nothing but add the parameter in the declaration, in the definition and at all call sites. Then, the next commit introduces functionality, and the third updates the call sites that make use of the new feature.

This is easy to review, because the purely mechanical changes can be glanced over quickly, and then get out of the way.

Similarly, if you reformat code, that should always be a separate commit.

Simon Richter
1

Although there are one or two apparent points of disagreement among the existing answers, if only in emphasis, I'll try to summarise the usual advice in a way that makes clear where everyone's been coming from:

  1. Firstly, write clean code; any other "documentation" will take care of itself after that. Clean code is a whole set of principles to learn in the first place: single-responsibility classes, short methods that do one thing, good variable and method names, better class/type names than these by focusing on metaphors (e.g. call a MultiButtSupporter a sofa), unit tests to indicate requirements, DRY, SOLID, a consistent paradigm and so on.
  2. Code reveals how code works; comments reveal why code works. For example, explain a +1 with "prevents an off by 1 error", or some complicated formula with "derived in this textbook or webpage".
  3. Whatever you've been doing with comments, point 1 above may well achieve that in clean code. See comments as failures/necessary evils, or even lies if over time they get out of sync with code as both are edited. Comments shouldn't compensate for badly written code, because why would comments be written with any more talent or care than the code was?
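
For instance, a "why" comment on a +1 might look like the following Python sketch. The function `days_covered` is hypothetical and only serves to show the kind of comment meant in point 2:

```python
from datetime import date


def days_covered(first_day: date, last_day: date) -> int:
    # +1 because both endpoints count: a booking from Monday to Monday
    # covers one day, although the difference between the dates is zero.
    return (last_day - first_day).days + 1
```

The code already says how the result is computed; the comment only records why the obvious-looking subtraction is not enough.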

On the other hand, if anything I probably err too far the other way, almost never using comments. Your code reviewers will let you know if you've got the balance in the wrong place for them, but if you make a conscious effort to follow the above 3-point plan you'll probably be close to their optimum anyway.

J.G.
  • 2
    How is a "prevents an off by 1 error" comment different from a comment that says "the +1 is not a typo" or "I am not aware of off by one errors in my program"? (Useful comments generally relate to something bigger than +1 in source code, or to something outside the source code.) So that still leaves "derived in this textbook or webpage" as a valid and actually great example in your point #2. Then your point #3 seems to suggest that you might be able to express "derived in this textbook or webpage" using clean enough code without any comments; wow, I'd like to see that in action. – Jirka Hanika Mar 09 '18 at 09:46
  • @JirkaHanika Maybe off-by-one was a bad example. As for 3, what I meant was "each may" rather than "maybe each"; so no, I don't think code alone can clarify such things. (Well, you could try gaussianFromThisTextbookNamesApproximation as a variable name, but that's a bad idea!) – J.G. Mar 09 '18 at 10:53