
I'm working on a C compiler for Linux out of personal curiosity and for fun. How can I test the generated assembly before the compiler is complete enough to do anything useful?

For example, if I have the following program:

```c
int a = 1;
```

and it compiles to

```
a:
    .long 1
```

How can I be sure that this output is actually correct? My assumption is that the usual ways of interacting with compiled code rely on relatively advanced features (e.g. I/O, static libraries) that are best implemented at a later stage.

Any tricks here?

  • Are there other (pre-existing) compilers on this ISA that you're trying to be compatible with? – Erik Eidt Aug 08 '18 at 22:44
  • (depending on your assembler you may need an external linkage specification for `a`, something like a `.globl a`.) – Erik Eidt Aug 08 '18 at 22:45
  • @ErikEidt I do not currently care about compiler compatibility, though I confess I'm not sure whether I ought to. –  Aug 08 '18 at 22:46
  • Wouldn't this be a simple string comparison check? – Robert Harvey Aug 08 '18 at 22:53
  • @RobertHarvey I'm looking to verify that the assembly does what I think it should do, rather than verifying that the output matches what I think it should be. –  Aug 08 '18 at 22:56
  • No tricks, just hard work. Assemble it, then link it together with something that tests it, like perhaps `int main(void) { extern int a; printf("a=%d\n", a); return 0; }`, whether also in assembly or compiled by a (more trusted) compiler. – Erik Eidt Aug 08 '18 at 23:03
  • What kind of source language is your compiler for? That is in practice an important consideration (and knowing that could improve the answers you've got). So please **edit your question** to give more details and motivations – Basile Starynkevitch Aug 09 '18 at 05:37

2 Answers


In general, you won't "test" the output by parsing the generated assembler without running the compiled program, because that is too difficult. (In that sense a better word is "check", or more precisely "statically check", as in static program analysis.) You could, however, have test scripts that parse the assembler generated for a given input program and check its structure, though I don't think that is wise.

In practice, most compilers have an extensive test suite. If you are writing a compiler for an existing language (e.g. a C compiler), you might try to reuse one: for example, GCC has a well-established test suite that you could adapt to your compiler, and some of its tests do "parse" the generated assembly or the emitted diagnostics.

Also look at the CompCert project, a formally verified C compiler. An important part of that work is the formalization of the semantics of C and of the behavior of the compiler.

> How can I test the generated assembly before the compiler is complete enough to do anything useful?

You probably cannot do that, except by manually inspecting the assembly code. You probably want to get a tiny part of your language compilable as quickly as possible, so that it produces a program you can test. For example, you might work hard to make the empty program compilable, then a one-line, single-assignment program, then a few tiny five-line programs, and so on. That way your compiler accumulates a growing test suite.

You may want to compile your compiler with itself. This is a long tradition, and the ability to compile your own compiler is a strong test. Read about bootstrapping compilers (and look into J. Pitrat's blog about bootstrapping artificial intelligence; it has many interesting pages).

You could also build your compiler on top of a code-generation library such as libgccjit or LLVM, or choose to compile to C (or to some other language that is higher level than assembly). That could save you a lot of effort.

Be aware that in practice, C compilers are expected to optimize, which is why it is hard to compete with existing compilers. See also this.

Basile Starynkevitch
  • It's worth noting that GCC is the first project I ever encountered with an automated test suite -- back in the 90s, when almost nobody else was doing automated testing. – Jules Aug 09 '18 at 12:02

> How can I be sure that this output is actually the correct thing to do?

If I understand what you're asking correctly, you can't. The correct behavior of your program is derived from your understanding of the problem it tries to solve. Testing validates that your implementation matches your understanding but does nothing to validate your understanding.

A flawed understanding of basic arithmetic that has you believing add(2, 3) should return 6 will lead you to write code that does that, along with a test to verify that it does. That test would pass and everything would look good. (Or, if you're doing TDD, you would write the test first, watch it fail, and then adjust the code until it passes.) The implementation, while testably correct with respect to the requirement, is based on a flawed requirement. Garbage in, garbage out applies as much to developing code as it does to processing data.

Blrfl
  • So to follow along with your example, one could manually verify the expected output of `add(2, 3)` with a calculator. –  Aug 09 '18 at 19:25
  • @JETM Right. The calculator would be a tool to verify that your understanding is correct before you go ahead with the implementation and writing tests. – Blrfl Aug 09 '18 at 19:56