How to understand an OS's source?

Question

I usually understand code by fiddling around, or by inserting printfs here and there. This time around that isn't possible, since the end product is an OS. Any suggestions on how to tackle this problem?

Possible duplicate of [How do you dive into large code bases?](http://programmers.stackexchange.com/questions/6395/how-do-you-dive-into-large-code-bases) — Robert Harvey, Oct 03 '16 at 15:46
@RobertHarvey As I said before, the conventional techniques used by me and those suggested by the linked post are not really applicable to an OS, since you can't run a debugger or insert print statements... — Shahe Ansar, Oct 03 '16 at 15:48
@RobertHarvey Yes, it was written by my friend, part of it was supposed to be written by me, but the code was too spread out and scarcely documented for me to get in — Shahe Ansar, Oct 03 '16 at 15:50
OS-coding is hard. The best thing you can do is to be mentally prepared for the fact that it is _really hard_. Don't be afraid to make mistakes, they _will_ happen! — T. Sar, Oct 03 '16 at 15:52
@ThalesPereira Nah, I have already coded another OS from the ground up, but it was by myself, so I had a pretty good idea of how the code fit together, this time around that isn't the case.. — Shahe Ansar, Oct 03 '16 at 15:53
Then I don't see any distinction. Source is source; you read it and understand it the same way you would any other source. If you want an actual answer, your question needs to be more specific than this. — Robert Harvey, Oct 03 '16 at 15:54
@RobertHarvey If it had documentation, yes, but reading the source of a pretty advanced OS isn't quite so easy without a way to print messages for me, since that's how I understand code — Shahe Ansar, Oct 03 '16 at 15:56
I'm not sure how helpful we can be in this context. The best advice I can personally give is "read a good OS book." — Robert Harvey, Oct 03 '16 at 16:10
How can we help you in three paragraphs or less? I'll just ask it that way. — Robert Harvey, Oct 03 '16 at 16:20
*"that isn't possible, since the end product is an OS"* - well, what kind of visible I/O **is** possible for your system? There must be at least something, otherwise your device/program/whatever it is which you forgot to tell us would be pretty useless. — Doc Brown, Oct 04 '16 at 10:57
@DocBrown Yes, the VGA driver does work, but most of my work involves replacing it with VESA... So yeah... — Shahe Ansar, Oct 04 '16 at 13:20
@ShaheAnsar: well, for debugging/logging/tracing you need to use whatever is available. When you can display things on a gfx card, use that. If you can send some output to a file, a network, shared memory, or simply an LED display on your device, use that. If you don't have better tools available, think about how you can create your own. And it would surely be a good idea to ask your friend who created the OS how he did the debugging. — Doc Brown, Oct 04 '16 at 14:44
@DocBrown The thing is, he didn't debug it, so it's a big buggy mess right now, and he himself has no idea about what exactly is going on.. — Shahe Ansar, Oct 04 '16 at 15:51

Useless · Accepted Answer · 2016-10-03T16:33:08.943

This isn't really specific to an OS, but to any large and/or complex codebase. This remark:

> I usually understand code by fiddling around, or by inserting printfs here and there ...

(and from comments)

> ... without a way to print messages for me, since that's how I understand code

sounds worrying.

Adding print statements can be a good way to build your intuition about how something works, either when you're starting out or when it's too much trouble to get a debugger to stop in the code you're worried about. It's a good technique.

But it's not the only technique, it isn't always available (as you've found), and you can't rely on it as a crutch.

When you can't use print statements to help you follow the execution path, or see intermediate values, you just need to think - hard! - or write it down on paper, and figure it out yourself.

The intuition you built with the help of print statements in easier settings should help with this. Think of those print statements as scaffolding, which helped develop your skills and intuition, to the point where you don't need the scaffolding any more.

Obviously there's no harm in going back to using it in problems where it is available, if it's quicker. But you'll be choosing it as the right tool for a particular job, from your whole toolkit - not relying on it as the only tool you know.

Anyway, that's a general essay on understanding code. The particular issues around large codebases are that you also have to understand structure. That's a different skill, which you can only really get by studying large code bases. Particularly, if there aren't a lot of comments, you have to infer structure and intent from code, which is even harder.

Start by breaking it up. What components do you expect an OS to have? One or more filesystems, a scheduler, drivers, what else? Can you see code modules or libraries corresponding to those large-scale components?
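One rough way to get that first map is to count source lines per top-level directory. This is only a sketch in Python, and it assumes the tree is laid out in directories that loosely match components (something like `fs/` or `drivers/`). The extension list and the name `survey` are illustrative choices of mine, not anything taken from the project in question.

```python
import os
from collections import Counter

# Assumed set of source extensions; adjust for the project at hand.
SOURCE_EXTS = {".c", ".h", ".s", ".S", ".asm"}

def survey(root):
    """Map each top-level directory under root to its total source line count."""
    totals = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        rel = os.path.relpath(dirpath, root)
        # Files sitting directly in root are grouped under ".".
        top = "." if rel == "." else rel.split(os.sep)[0]
        for name in filenames:
            if os.path.splitext(name)[1] in SOURCE_EXTS:
                path = os.path.join(dirpath, name)
                with open(path, errors="replace") as f:
                    totals[top] += sum(1 for _ in f)
    return totals
```

Calling `survey(path).most_common()` lists the largest directories first. Line counts say nothing about intent, but they show where the bulk of the code lives and which directories are small enough to read in one sitting.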

If so, pick one (if not, you've got real troubles). Have a brief look at the top-level structure and interface of that component - can you see what it's intended to do, or how it's intended to be used? If so: good, that's your high-level view of one component, now onto the next. If not: see if you can find some other code using this component; maybe it'll demonstrate the author's intent.
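Finding "some other code using this component" can be partly mechanized by listing every line that mentions one of its functions. A minimal sketch in Python, where `find_uses` is a hypothetical helper of my own: it matches identifiers purely as text, so it also reports the definition itself and any mention in comments or strings.

```python
import os
import re

def find_uses(root, identifier, exts=(".c", ".h")):
    """Return sorted (relative path, line number, stripped line) tuples
    for every line that mentions identifier as a whole word."""
    pattern = re.compile(r"\b" + re.escape(identifier) + r"\b")
    exts = tuple(exts)  # str.endswith needs a tuple, not a list
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(exts):
                continue
            path = os.path.join(dirpath, name)
            with open(path, errors="replace") as f:
                for lineno, line in enumerate(f, start=1):
                    if pattern.search(line):
                        rel = os.path.relpath(path, root)
                        hits.append((rel, lineno, line.strip()))
    return sorted(hits)
```

The whole-word match keeps a search for a name such as `vfs_open` from also matching `vfs_open_all`. A proper cross-referencing tool will do better than this, but even a plain list of call sites usually shows which arguments callers pass and in what order they use the interface.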

Now you have a high-level overview of the component parts and some idea how they fit together. If you need more detail about something in particular, dive into the implementation. If you need to, e.g., write a driver, read a few existing drivers, try to understand their differences and similarities, pick the one most similar to what you need and start from there.
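When comparing existing drivers, a crude similarity score can hint at which ones share a skeleton and are worth reading side by side. The sketch below uses Python's standard `difflib`; the helper name `driver_similarity` is my own, and comparing stripped lines is an assumption chosen so that indentation differences don't count.

```python
import difflib

def driver_similarity(path_a, path_b):
    """Return a ratio in [0, 1]: 1.0 means the two files have identical
    lines (ignoring leading/trailing whitespace), 0.0 means none in common."""
    with open(path_a, errors="replace") as fa, open(path_b, errors="replace") as fb:
        lines_a = [line.strip() for line in fa]
        lines_b = [line.strip() for line in fb]
    return difflib.SequenceMatcher(None, lines_a, lines_b).ratio()
```

A high score only says the text is similar, not that the hardware is, so treat it as a pointer to which files to read first, not as an answer.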

I doubt using pen and paper (which I actually do for smaller code bases) is efficient for a code base this large... Also, the source doesn't have many comments... — Shahe Ansar, Oct 03 '16 at 17:42
Well, _add_ comments. Break up the large codebase into smaller units which _are_ susceptible to pen & paper attacks. There isn't some magic number of lines of code above which techniques just stop working, you just divide and conquer. — Useless, Oct 04 '16 at 11:09
