1

I'm not sure if this is an acceptable question, but compiler-os-design-where-to-start was, so I figured that I'd take a shot at it.

I have taken no formal Computer Science classes. I have programmed in Python and attempted C# without success. My technical vocabulary is expansive, yet scattered over a very wide range of computer science topics.

I have a very long way to go before I can get to a level where I could reasonably read a book about compiler design/theory. **I am asking what steps I need to take before attempting compiler design.** I have some examples here already:

  • Computer architecture
  • Binary
  • How booting/kernels/OSes work
  • Imperative vs. Declarative language design
  • "Grammars"

At least these are some examples of what I've seen.

Edit: I can't for the life of me see how this problem is unclear. I quite clearly bolded what I was asking. I was expecting it to be marked as biased, vague/broad, or too generalized, but certainly not unclear. Don't be afraid to say it's unconstructive (just make sure whatever it is classified under is accurate).

person27
  • 8
    How about "Pick up the dragon book, when unsure of term, google". Lack of vocabulary isn't really a reason not to even try to read something. – daniel gratzer Sep 18 '13 at 05:05
  • Generally speaking, the better programming books on these types of subjects attempt to avoid using unfamiliar terms or clarify new terms, as understanding the contents of the book is the objective. If they are using terms which aren't clarified, chances are they are terms you should probably already know, which is all the more reason to google them yourself. – Neil Sep 18 '13 at 06:13
  • @Stopforgettingmyaccounts... specifically, what problem do you have understanding? Lexing? Parsing? Symbol tables and name spaces? Optimizations? Code generation? Symbol trees? How far did you get into it before getting confused? –  Sep 18 '13 at 22:07
  • @jozefg I tried it, but didn't want to continue because I feel like I'm missing a lot of things (page 3). I'm using version 1 of the book, since I cannot access version 2 as an ebook right now (and I suspect it's mostly inconsequential). And it's somewhat counter-intuitive to learn compiler design and theory before I learn the framework that makes them up. I think learning for computers should be circuits -> Memory allocation -> CPU/Booting/OS -> Software, but I think that I am missing some prerequisites along the way. That's why I'm asking for them. – person27 Sep 18 '13 at 23:31
  • @Stopforgettingmyaccounts... You don't need to learn the things that compose abstractions to use them. It's sufficient to have a "surface understanding" of assembly and go from there. Or even read a book about compiling to the JVM and ditch assembly entirely. I read the dragon book at 15 and it took about 9 months; when I didn't know something, I went and learned it. – daniel gratzer Sep 18 '13 at 23:44
  • @Stopforgettingmyaccounts... you don't need to learn biology to cook. You don't need to learn optics to look at a cell under a microscope. You don't need to learn quantum mechanics to design a lens system. You certainly don't need to know the flavor of a quark to cook. It helps to learn about what goes on beneath and behind - but you don't need to know it to use the tools of the higher level. Understanding the theoretical automata (regular, context-free and such) will likely be more useful than understanding the logic gates or specifics of a CPU when it comes to writing a compiler. –  Sep 19 '13 at 00:29
  • @MichaelT I don't need to know it, but I want to. I can't expect to understand a book about compiler design _to the same degree_ as I might if I already understood the prerequisites. Knowing these means a deeper and faster understanding of the contents of the book, as well as a significantly smaller chance of misinterpretation. Knowing the prerequisites allows me to _fit the pieces together_ and draw parallels to my current knowledge, which is much easier to do when it's done in a linear fashion. – person27 Sep 19 '13 at 06:41
  • A compiler is typically thought of (at a high level) as being made up of a lexer, a parser, and a code generator, and these need to be addressed in that order. The lexer and parser have more to do with CS theory than physical realities. Don't even *think* of a CPU until you get to the code generator (and if you target the JVM, LLVM, Parrot, or the CLI, you don't even need to think of it then). Note that targeting a virtual machine is a growing trend with modern languages (as an aside, saying "I know how to develop for LLVM" is a bigger plus in many cases on the resume than "I know how to target x86"). –  Sep 19 '13 at 17:45
  • So, design the language, write the lexer, write the parser, and *then* start to worry about the cpu/machine that it targets (or write the interpreter and set aside thoughts about targeting any machine). –  Sep 19 '13 at 17:47
  • You may find [Understanding Computation: From Simple Machines to Impossible Programs](http://www.amazon.com/Understanding-Computation-Machines-Impossible-Programs/dp/1449329276/) (I often like O'Reilly books) to be a basis for understanding the theory behind the necessary parts of a compiler (chapter 3 covers the theory behind the lexer and chapter 4 the theory behind the parser). It will help in doing that first step too: design the language. –  Sep 20 '13 at 03:00
  • Book suggestions are rarely good answers - there are too many of them to be useful. **I** like O'Reilly - that's an opinion. –  Sep 20 '13 at 04:07
  • @MichaelT Is there a point to your last comment? Nobody is offering an **actual** list of steps to take; I'd even say that my own crummy example is the current best answer. It's against SE to select external resources as a best answer, but since nobody else responded with anything better than your suggestion and I don't like to vote up my own (esp. incomplete) answers, I would select yours. Of course I knew O'Reilly was your opinion; I certainly wouldn't think it was fact. – person27 Sep 20 '13 at 04:19
  • 2
    Stumbled across a question you might like to read [How do I create my own programming language and a compiler for it](http://programmers.stackexchange.com/questions/84278/how-do-i-create-my-own-programming-language-and-a-compiler-for-it) - the top answer is by Eric Lippert who knows [a few things about compilers](http://blogs.msdn.com/b/ericlippert/). –  Sep 22 '13 at 02:43
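
The lexer, parser, code generator order described in the comments above can be sketched in a few dozen lines of Python. This is only a toy under stated assumptions: the "language" is integer arithmetic with `+ - * /` and parentheses, the target is a made-up stack machine, and every name here (`lex`, `parse`, `generate`, `run`, the `PUSH`/`ADD` instruction names) is hypothetical and not taken from any book mentioned in this thread.

```python
import operator

DIGITS = "0123456789"
OPCODES = {"+": "ADD", "-": "SUB", "*": "MUL", "/": "DIV"}
OPS = {"ADD": operator.add, "SUB": operator.sub,
       "MUL": operator.mul, "DIV": operator.floordiv}

def lex(text):
    """Lexer: turn a string of characters into (kind, value) tokens."""
    tokens, i = [], 0
    while i < len(text):
        ch = text[i]
        if ch.isspace():
            i += 1
        elif ch in DIGITS:
            start = i
            while i < len(text) and text[i] in DIGITS:
                i += 1
            tokens.append(("NUM", int(text[start:i])))
        elif ch in "+-*/()":
            tokens.append(("OP", ch))
            i += 1
        else:
            raise SyntaxError(f"unexpected character {ch!r}")
    return tokens

def parse(tokens):
    """Parser: recursive descent, producing a nested-tuple syntax tree."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else ("EOF", None)

    def advance():
        nonlocal pos
        tok = peek()
        pos += 1
        return tok

    def expr():  # expr := term (("+" | "-") term)*
        node = term()
        while peek() in (("OP", "+"), ("OP", "-")):
            op = advance()[1]
            node = (op, node, term())
        return node

    def term():  # term := factor (("*" | "/") factor)*
        node = factor()
        while peek() in (("OP", "*"), ("OP", "/")):
            op = advance()[1]
            node = (op, node, factor())
        return node

    def factor():  # factor := NUM | "(" expr ")"
        kind, value = advance()
        if kind == "NUM":
            return ("num", value)
        if (kind, value) == ("OP", "("):
            node = expr()
            if advance() != ("OP", ")"):
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token {value!r}")

    tree = expr()
    if peek()[0] != "EOF":
        raise SyntaxError("trailing input")
    return tree

def generate(node):
    """Code generator: emit instructions for the made-up stack machine."""
    if node[0] == "num":
        return [f"PUSH {node[1]}"]
    op, left, right = node
    return generate(left) + generate(right) + [OPCODES[op]]

def run(code):
    """Tiny interpreter for the generated code, only to check the output."""
    stack = []
    for instr in code:
        if instr.startswith("PUSH "):
            stack.append(int(instr[5:]))
        else:
            right, left = stack.pop(), stack.pop()
            stack.append(OPS[instr](left, right))
    return stack.pop()

code = generate(parse(lex("1 + 2 * 3")))
print(code)       # ['PUSH 1', 'PUSH 2', 'PUSH 3', 'MUL', 'ADD']
print(run(code))  # 7
```

Note that nothing before `generate` knows anything about the target machine, which is the point made above: the CPU (or virtual machine) only matters once you reach code generation.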

2 Answers

10

Compilers are not some mythical creatures, even though some people might like you to think that.

A compiler is a program like any other program. It takes some input, tries to make sense of it, and generates some output. Have you ever written a program which reads a text file in some format and outputs some HTML based on that text? Well, congratulations: you already have written a compiler. A very simple one, I admit, but it is a compiler.
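
To make that concrete, here is a minimal sketch of such a text-to-HTML translator in Python. The input format is invented purely for illustration (a line starting with `# ` is a heading, a line starting with `- ` is a list item, anything else is a paragraph), and the function name `compile_to_html` is hypothetical.

```python
import html

def compile_to_html(source: str) -> str:
    """Translate a tiny, made-up text format into HTML.

    Assumed format (hypothetical, for illustration only):
      "# text"  -> heading
      "- text"  -> list item
      other non-blank line -> paragraph
    """
    out = []
    in_list = False
    for line in source.splitlines():
        line = line.strip()
        is_item = line.startswith("- ")
        # Close an open list as soon as a non-item line appears.
        if in_list and not is_item:
            out.append("</ul>")
            in_list = False
        if not line:
            continue  # blank lines only separate blocks
        if line.startswith("# "):
            out.append(f"<h1>{html.escape(line[2:])}</h1>")
        elif is_item:
            if not in_list:
                out.append("<ul>")
                in_list = True
            out.append(f"<li>{html.escape(line[2:])}</li>")
        else:
            out.append(f"<p>{html.escape(line)}</p>")
    if in_list:
        out.append("</ul>")
    return "\n".join(out)

print(compile_to_html("# Title\n\nHello & bye\n- a\n- b"))
```

It reads input in one language, works out what each line means, and writes output in another language. A "real" compiler does the same thing with a richer input language and a more demanding output.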

You approach it like any other program: try, fail, learn, repeat.

Some resources to help you fail less and learn more :-)

Jörg W Mittag
2

Niklaus Wirth's text on compiler construction is arguably one of the two most approachable texts on the subject. The other is Jack Crenshaw's series of articles, cited by the other guy.

If you want to go the Dragon Book route, there's no easy approach, but you can start by working your way through Course 6 at ocw.mit.edu. You'll have to pick and choose the Computer Science classes.

John R. Strohm