Is reading javadoc preferable to reading source code to familiarise yourself with a library?

Question

I just came across the following in a lab manual at university:

You do need to study the interfaces of the classes by generating the javadoc for them so you know what operations are provided (feel free to look at the code, but when using somebody else’s code, as here, you should work from the javadoc rather than the code whenever possible).

I don't understand why this is the case, since the javadoc could be out of date or could describe the behaviour of the code badly. Surely looking at the source code and reading the javadoc comments together is best?

Is there a reason why, or a case when reading only the javadoc is the best thing to do?

In the majority of cases, you won't have a chance at being able to read and understand all the code you would need to. It can also be non-obvious from the code how edge cases are handled. — raptortech97, Feb 04 '15 at 13:32
this has been asked and answered many times before, starting with very first question at this site - [“Comments are a code smell”](http://programmers.stackexchange.com/questions/1/comments-are-a-code-smell) and _multiple_ questions [linked](http://programmers.stackexchange.com/questions/linked/1?lq=1) to it — gnat, Feb 04 '15 at 15:09

score 23 · Accepted Answer · answered Feb 04 '15 at 13:38


The recommendation is probably about programming to an interface rather than the implementation.

Sure, if you have access to the code then there's nothing stopping you from looking at the implementation to understand how it works. But you should always make sure that the how doesn't influence your consumption of the API.

When you're consuming an API you're working against an abstraction. Try to concern yourself only with what the API offers (the contract) and not the how (the implementation).

This is because there is no guarantee that an API's implementation won't change drastically from one version to the next, even if the contract has remained unchanged.
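As a minimal sketch of what working against the contract can look like, consider the following. All of the names here (`IntStack`, `ListStack`, `DequeStack`, `ContractDemo`) are hypothetical and invented for this illustration; they are not from the lab code in the question. The client method `sumAndDrain` relies only on what the interface's javadoc promises, so either implementation can be swapped in without changing it.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** A last-in, first-out store of ints (hypothetical interface, for illustration only). */
interface IntStack {
    /** Pushes a value onto the top of the stack. */
    void push(int value);

    /**
     * Removes and returns the most recently pushed value.
     *
     * @throws IllegalStateException if the stack is empty
     */
    int pop();

    /** Returns true if the stack holds no values. */
    boolean isEmpty();
}

/** One possible implementation, backed by an ArrayList. */
final class ListStack implements IntStack {
    private final List<Integer> items = new ArrayList<>();

    public void push(int value) { items.add(value); }

    public int pop() {
        if (items.isEmpty()) throw new IllegalStateException("stack is empty");
        return items.remove(items.size() - 1); // remove(int index) returns the removed element
    }

    public boolean isEmpty() { return items.isEmpty(); }
}

/** A different implementation of the same contract, backed by an ArrayDeque. */
final class DequeStack implements IntStack {
    private final Deque<Integer> items = new ArrayDeque<>();

    public void push(int value) { items.push(value); }

    public int pop() {
        if (items.isEmpty()) throw new IllegalStateException("stack is empty");
        return items.pop();
    }

    public boolean isEmpty() { return items.isEmpty(); }
}

final class ContractDemo {
    /** Client code: uses only what the IntStack javadoc promises. */
    static int sumAndDrain(IntStack stack) {
        int total = 0;
        while (!stack.isEmpty()) {
            total += stack.pop();
        }
        return total;
    }

    public static void main(String[] args) {
        IntStack[] stacks = { new ListStack(), new DequeStack() };
        for (IntStack stack : stacks) {
            stack.push(1);
            stack.push(2);
            stack.push(3);
            System.out.println(stack.getClass().getSimpleName() + ": " + sumAndDrain(stack));
        }
    }
}
```

If the client had instead been written after reading `ListStack` and depended on its internal list, replacing it with `DequeStack` could break that client even though the documented contract never changed.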

MetaFight

One of the biggest omissions in a lot of class documentation is a clear specification as to what aspects of a class's apparent behaviors may be legitimately relied upon by consumers (and must not change in future versions of the class), and what behaviors may be legitimately changed and may thus not legitimately be relied upon. For example, while it's expensive for a mapping collection to provide any guaranteed order of enumeration after any items have been deleted, it's cheap to guarantee that as long as no items have *ever* been deleted, items will enumerate in the order they were added. – supercat Feb 04 '15 at 20:12
There are many cases where code may need to build a mapping collection from a sequence of items and later process items in the original sequence. If the collection guarantees that items will enumerate in the sequence they were added, the original sequence may safely be abandoned, but in the absence of such a guarantee, it must be retained. Documenting a behavior the class would "naturally" abide by would cost the implementation nothing, but would have made the class more useful. – supercat Feb 04 '15 at 20:16
@supercat: That restricts later tweaking / re-writing of the class though. Which means any unfortunate decision can never be corrected. – Deduplicator Feb 06 '15 at 10:21
@Deduplicator: There are trade-offs; the question to be asked is whether it's worth foregoing a potential benefit for consumers in order to facilitate certain kinds of potential implementation changes. I would suggest that the YAGNI principle would favor giving consumers the benefits unless one can actually articulate the kinds of changes one would want to make, and one would not be able to efficiently accommodate such changes without denying consumers the benefits. Alternatively, one could have an `AddOnlyDictionary` which promises to maintain insertion order and offers... – supercat Feb 06 '15 at 16:01
...one-writer multi-reader thread safety, or figure that if other kinds of dictionary might be needed they could be derived from `Dictionary` and people could migrate toward the new one when writing code that didn't need the old behavior. Note that the ability to maintain addition order isn't generally relevant for code which *receives* a `Dictionary` from elsewhere (since a dictionary received from elsewhere may have had an item deleted at some point), but only for code which creates instances via the constructor. In any case, if a dictionary won't honor an addition-order guarantee... – supercat Feb 06 '15 at 16:07
...even when no item has ever been deleted, its enumerator should return items in mixed-up order to prevent anyone from relying upon such behavior. Otherwise, it's likely that code *will* rely upon the insertion-order behavior whether it "should" or not, and changing the enumeration behavior will break code which worked perfectly and usefully before the change. A change which makes code that worked 99.999% of the time work 20% of the time isn't a breaking change, but a change which makes what used to be 100%-working code fail is a breaking change. – supercat Feb 06 '15 at 16:10
@supercat: Sure, there are always tradeoffs. And due to how many people program (it works for me, ship it!), if you don't make sure a nice behavior you don't want to guarantee does not happen to exist in $version in $circumstances, you are always in great peril of uncontrollable compatibility-constraints. Because some will complain over any change, whether it is nigh-undetectable, makes bugs more obvious, uncovers bugs, or breaks the contract. Interestingly, one can get away with the last much better than the first in many cases. – Deduplicator Feb 06 '15 at 16:58
@Deduplicator: The better class authors are about actually guaranteeing useful behavior which they are, in practice, always going to uphold, the less reason clients will have to make assumptions about non-guaranteed behaviors. Given a choice between code which is "guaranteed" to work, or code which is simpler and (in some cases) much faster, and which will be 100% reliable unless a Framework changes in a fashion that it's unlikely to, there are some sound arguments in favor of the latter, based upon how well one can assess the likelihood of an undesirable change. – supercat Feb 06 '15 at 17:56
@supercat: Well, that's the ideal world. Fact is, for that ever to be true, too many people program by happenstance, not plan and definitely not docs. Sometimes they deign to acknowledge the existence of docs after programming themselves into a corner, at least. Though looking on SO, not opening the docs even then seems to be common too. Still, I guess we are in violent agreement on what should be at a minimum. – Deduplicator Feb 06 '15 at 17:59
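The comments above talk about a `Dictionary`, but the same distinction can be sketched in Java, which is the language the question is about. The javadoc for `java.util.HashMap` states that the class makes no guarantees as to the order of the map, while the javadoc for `java.util.LinkedHashMap` documents insertion-order iteration. Code that needs the original sequence can therefore rely on the documented guarantee rather than on whatever order an implementation happens to produce. The class and method names below (`InsertionOrderDemo`, `keysInDocumentedOrder`) are hypothetical, chosen only for this sketch.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class InsertionOrderDemo {
    /**
     * Builds a map from the given keys and returns the keys as the map iterates them.
     * LinkedHashMap is chosen because its javadoc documents insertion-order iteration;
     * a plain HashMap might appear to preserve order for some inputs, but its javadoc
     * promises nothing of the kind.
     */
    static List<String> keysInDocumentedOrder(String... keys) {
        Map<String, Integer> lengthByKey = new LinkedHashMap<>();
        for (String key : keys) {
            lengthByKey.put(key, key.length());
        }
        return new ArrayList<>(lengthByKey.keySet());
    }

    public static void main(String[] args) {
        System.out.println(keysInDocumentedOrder("pear", "fig", "banana"));
    }
}
```

Because the ordering is part of the documented contract here, the caller does not need to keep the original sequence around separately, which is the benefit described in the comments.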

score 4 · Answer 2 · edited Apr 12 '17 at 07:31

Aside from the difference between the interface and the implementation, already explained in the previous answer, there is another important aspect: complexity.

Real-life systems are usually complex. If you start reading through the code of a class, you'll find that you should also go and read the code of another class, then another one, etc. A few hours later, you'll be simply lost in all the complexity and won't remember who calls what and in what cases.

When you use only the comments of the interface, you mitigate all this complexity. It might be that under the hood, everything is simple. Or it might be that under the hood, dozens or hundreds of classes interact with each other, making it practically impossible to keep the whole picture in your head.

You can do an experiment.

Find a part in OpenCV which interests you. Read through the interface documentation. How long does it take to be able to grasp the basics and effectively use the library?
Now look at the implementation. How many classes are called directly by the interface? How many classes are called by each of those classes? How many lines of code are there? How many methods? How long would it take you to explore all this source code before having a stack overflow in your brain?

Main ma, that must be why updates take so long, and why security holes can go for so long without being discovered, because it's so difficult to look over the implementation of a sophisticated program. I tried just looking over the Java source of the first two semesters of a Java programming course. I think there wasn't one class that didn't call at least 2 other classes, and those classes they called also called any number of classes. I was never able to follow a trail of code to its final completion. It would simply take too long, and it was too difficult to keep track of where I was in und – Progfram, Feb 04 '15 at 18:32

score 0 · Answer 3 · edited May 23 '17 at 11:33

Is there a reason why, or a case when reading only the javadoc is the best thing to do?

While you're entirely correct that the JavaDoc may be out of date or bad, it does tend to be in a better format for reading wholesale than code in an IDE. And more importantly, it's in natural language. That is important for two cases:

People not used to reading code. University students, for example, are likely to be better served by reading natural language descriptions of functions than by trying to understand code that they're still in the process of learning to read.
People who do not use English (or languages that use phonetic alphabets at least) as their primary language. Since JavaDoc can work with characters that identifiers can't, it can provide better descriptions of what is going on to those users. JavaDoc in particular seems to even have some ability to localize the documentation for you.

That said, I'm a fairly strong believer in readable code. For experienced developers, I expect reading the code to be a better approach almost all of the time if that option is available.
