For some time now I have wanted to make my own programming language.

I'm 17 and the only language I know is Color BASIC. I know that compared to today's complicated languages, it's pretty weak.

But I was wondering whether it would be hard to create a language that is kinda like Color BASIC/BASIC, and how I would start on it.

What do I need to do, or learn, to make my own programming language?

Dan McGrath
landon
  • 14
    If you know only Color BASIC, you are a long way from writing your own language and compiler. Why not start out a little easier, and simply learn a more powerful language? – asthasr Dec 13 '11 at 02:45
  • 2
    I'd second the motion to learn another language; I'd aim for something like Python (perhaps via [Learn Python the Hard Way](http://learnpythonthehardway.org/)) or Ruby. Writing a language in CoCo BASIC isn't much fun. You'll be better served by learning more programming paradigms first, even if you learn them by trying to build your own language. – Dave Newton Dec 13 '11 at 03:44
  • 2
    Very curious how you came to know Color BASIC first. Many of us old timers got started with the Color Computer but I think it's been discontinued for longer than you've been alive. Interesting to see someone learning to use the CoCo that didn't grow up during the period it was somewhat common. – Jason Dec 13 '11 at 15:23
  • 1
    Problem is, if you only know BASIC for the CoCo, you have no frame of reference to know what it's missing. When all I knew was BASIC (although an earlier version), I tried writing out a description of a new language, and in retrospect it was really lame. Unless you've gotten familiar with a few different languages, you really don't know what you're missing. – David Thornley Dec 13 '11 at 17:30

5 Answers

A programming language implementation typically starts by transforming the source code in passes:

  • Lexical analysis: break up the input text into tokens. This deals with whitespace and comments. For example:

    a := 1 + 2 * 3 -- comment
    

    might be turned into:

    identifier a
    operator :=
    number 1
    operator +
    number 2
    operator *
    number 3
    

    The above is pseudocode representing a list of values, where each value is a token.

  • Parsing: Build a tree representing the source code:

    assignment:
        lvalue:
            variable a
        rvalue:
            add:
                number 1
                multiply:
                    number 2
                    number 3
    

    Parsing determines things like where statements start and end, and order of operations.
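The two passes above can be sketched in Python. This is only an illustration, not a reference implementation: the `tokenize` function, the `Parser` class, and the tuple-based tree shape are all names and choices made up for this example, and the grammar covers just a single assignment with `+ - * /` and parentheses.

```python
import re

# Token patterns are tried left to right; whitespace and "--" comments are skipped.
TOKEN_RE = re.compile(r"""
    (?P<skip>\s+|--[^\n]*)
  | (?P<number>\d+)
  | (?P<identifier>[A-Za-z_]\w*)
  | (?P<operator>:=|[+\-*/()])
""", re.VERBOSE)

def tokenize(text):
    """Lexical analysis: turn source text into a list of (kind, text) tokens."""
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at {pos}")
        if m.lastgroup != "skip":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens

class Parser:
    """Parsing: build a nested-tuple tree from the token list."""
    def __init__(self, tokens):
        self.tokens, self.pos = tokens, 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else ("eof", "")

    def next(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def expect(self, kind, value=None):
        tok = self.next()
        if tok[0] != kind or (value is not None and tok[1] != value):
            raise SyntaxError(f"expected {value or kind}, got {tok}")
        return tok

    def assignment(self):            # identifier ':=' expression
        name = self.expect("identifier")[1]
        self.expect("operator", ":=")
        tree = ("assignment", ("variable", name), self.expression())
        self.expect("eof")
        return tree

    def expression(self):            # term (('+' | '-') term)*
        node = self.term()
        while self.peek() in (("operator", "+"), ("operator", "-")):
            op = self.next()[1]
            node = ("add" if op == "+" else "subtract", node, self.term())
        return node

    def term(self):                  # factor (('*' | '/') factor)*
        node = self.factor()
        while self.peek() in (("operator", "*"), ("operator", "/")):
            op = self.next()[1]
            node = ("multiply" if op == "*" else "divide", node, self.factor())
        return node

    def factor(self):                # number | identifier | '(' expression ')'
        kind, value = self.next()
        if kind == "number":
            return ("number", int(value))
        if kind == "identifier":
            return ("variable", value)
        if (kind, value) == ("operator", "("):
            node = self.expression()
            self.expect("operator", ")")
            return node
        raise SyntaxError(f"unexpected token {(kind, value)}")

print(Parser(tokenize("a := 1 + 2 * 3 -- comment")).assignment())
```

Order of operations falls out of the structure: `expression` only ever combines whole `term`s, so `2 * 3` is grouped before the `+` is applied.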

There may be additional passes after this, depending on the complexity of the language and the quality of the implementation.

Finally, the resulting tree is either interpreted or compiled.

  • Interpreted: The program is run directly. This is usually easier to implement.

  • Compiled: The program is converted into one of the following:

    • Machine code, so the CPU can run it directly.

    • Bytecode, so it can be executed by a virtual machine.

    • Another programming language. For example, some programming language implementations compile to C, then invoke a C compiler to produce machine code.
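To make the difference concrete, here is a rough Python sketch that takes the tree for `a := 1 + 2 * 3` and handles it both ways: once by walking it directly, and once by flattening it into bytecode for a tiny stack machine. The tuple tree shape, the instruction names (`PUSH`, `LOAD`, `STORE`, and so on), and all function names are assumptions invented for this example.

```python
import operator

OPS = {"add": operator.add, "subtract": operator.sub,
       "multiply": operator.mul, "divide": operator.truediv}

# Hand-built tree for: a := 1 + 2 * 3
TREE = ("assignment", ("variable", "a"),
        ("add", ("number", 1), ("multiply", ("number", 2), ("number", 3))))

def interpret(node, env):
    """Interpreted: walk the tree and compute the result directly."""
    kind = node[0]
    if kind == "number":
        return node[1]
    if kind == "variable":
        return env[node[1]]
    if kind == "assignment":
        name = node[1][1]
        env[name] = interpret(node[2], env)
        return env[name]
    return OPS[kind](interpret(node[1], env), interpret(node[2], env))

def compile_to_bytecode(node, code=None):
    """Compiled: flatten the tree into instructions for a stack machine."""
    code = [] if code is None else code
    kind = node[0]
    if kind == "number":
        code.append(("PUSH", node[1]))
    elif kind == "variable":
        code.append(("LOAD", node[1]))
    elif kind == "assignment":
        compile_to_bytecode(node[2], code)
        code.append(("STORE", node[1][1]))
    else:  # binary operator: operands first, then the operation
        compile_to_bytecode(node[1], code)
        compile_to_bytecode(node[2], code)
        code.append((kind.upper(), None))
    return code

def run_bytecode(code, env):
    """A minimal virtual machine that executes the instruction list."""
    stack = []
    for op, arg in code:
        if op == "PUSH":
            stack.append(arg)
        elif op == "LOAD":
            stack.append(env[arg])
        elif op == "STORE":
            env[arg] = stack.pop()
        else:
            right, left = stack.pop(), stack.pop()
            stack.append(OPS[op.lower()](left, right))
    return env

print(interpret(TREE, {}))                           # 7
print(compile_to_bytecode(TREE))
print(run_bytecode(compile_to_bytecode(TREE), {}))   # {'a': 7}
```

The interpreter is shorter, which matches the point that interpreting is usually the easier route. The bytecode version does the tree walk once up front, after which the virtual machine only needs a simple loop.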

This is a rough overview. As you can see, implementing a programming language is rather involved. To avoid feeling overwhelmed, start by implementing a very small set of features. Don't start by writing a lexer or parser that supports your entire language; you'll get burnt out before you get to see something working. The key is to get something working. It will set up a strong feedback loop: adding new features will be fun because you'll get to see them work right away.


I don't recommend jumping into writing an interpreter for BASIC right away. Here are some starting points:

  • Implement a calculator supporting order of operations. Example:

    > 1 + (2 + 3) * (4 + 5)
    46
    
  • Implement a random sentence generator, using a template that looks something like this:

    <weapons>
    a snake
    many snakes
    a gun
    a rocket launcher
    a thick board and something sharp
    
    <verb>
    kill
    destroy
    dismember
    impale
    
    <sentence>
    I will <verb> you with <weapons>.
    

    This is a nice example of a domain-specific language.

  • Learn the C programming language. It'll give you a better understanding of how computers work. C will seem less convenient at first (e.g. you can't simply take two strings and concatenate them; you have to allocate a buffer large enough for both strings first), but you'll find that it provides better methods of structuring data than you had in BASIC.
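As one possible take on the random sentence generator, here is a short Python sketch. It assumes a template format inferred from the example above: a `<name>` line starts a section, the lines under it are alternatives, and `<name>` inside an alternative is a placeholder to expand. The function names `parse_template` and `generate` are made up for illustration.

```python
import random
import re

TEMPLATE = """\
<weapons>
a snake
many snakes
a gun
a rocket launcher
a thick board and something sharp

<verb>
kill
destroy
dismember
impale

<sentence>
I will <verb> you with <weapons>.
"""

def parse_template(text):
    """Map each section name to its list of alternatives."""
    rules, current = {}, None
    for line in text.splitlines():
        line = line.strip()
        header = re.fullmatch(r"<(\w+)>", line)
        if header:
            current = rules.setdefault(header.group(1), [])
        elif line and current is not None:
            current.append(line)
    return rules

def generate(rules, name="sentence", rng=random):
    """Pick one alternative for `name`, then expand any <placeholders> in it."""
    choice = rng.choice(rules[name])
    return re.sub(r"<(\w+)>", lambda m: generate(rules, m.group(1), rng), choice)

print(generate(parse_template(TEMPLATE)))
```

Because `generate` calls itself for each placeholder, alternatives can themselves contain placeholders, which is what makes this a small language rather than a fixed fill-in-the-blank.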

Joey Adams

Slow down. This is a major project, and you're nowhere near ready for this. I'm not trying to put you down, just to save you a lot of wasted time.

Learn several other languages, and make sure that C is one of them. Learn assembly (maybe start with a simple instruction set like ARM). Learn how compilers work. Write a few programs that actually do useful stuff for people. By that time, you'll have a good foundation to come back to this plan, if you still want to.

Mike Baranczak

I found the book Game Scripting Mastery very good. It's not very well known, but it goes from implementing a super basic scripting language all the way to a custom-designed compiled scripting language. It's also easy to follow, and all the examples are related to games. The only problem is that it's in C++, but all the basics are very well explained.

You can find a pretty cheap used copy on Amazon or elsewhere; just make sure you get a copy that includes the CD.

krolth
  • I think that if a person doesn't understand C++, they probably aren't going to be able to understand how to write a programming language. Therefore, I don't really see that as a *downside*, per se. +1 – Richard Dec 13 '11 at 20:48

Remember that Bill Gates got started by writing a BASIC interpreter for the Altair.

There are better languages around, but you can easily write an interpreter for a small language of your own, and it's fun and good experience.

Start with whatever language you've already got. You probably want to keep it really simple: for starters, support only numeric variables, and keep the variable names to single characters like A, I, or X. You can do arrays later.

For expressions like (F-32)*5/9, a recursive-descent parser is easy to write and excellent practice. You don't need to build a parse tree - you can just calculate the results as you parse along.
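Here is roughly what that looks like, sketched in Python rather than BASIC. It assumes single-letter variables, decimal numbers, `+ - * /`, unary minus, and parentheses. The function name `evaluate` and its helpers are invented for this example. Note that no tree is built: each function returns a number as soon as it has parsed its piece.

```python
def evaluate(text, variables=None):
    """Recursive descent that computes the value while parsing."""
    variables = variables or {}
    src = text.replace(" ", "")
    pos = 0

    def peek():
        return src[pos] if pos < len(src) else ""

    def take():
        nonlocal pos
        ch = peek()
        pos += 1
        return ch

    def expression():                 # term (('+' | '-') term)*
        value = term()
        while peek() in ("+", "-"):
            if take() == "+":
                value += term()
            else:
                value -= term()
        return value

    def term():                       # factor (('*' | '/') factor)*
        value = factor()
        while peek() in ("*", "/"):
            if take() == "*":
                value *= factor()
            else:
                value /= factor()
        return value

    def factor():                     # '-' factor | '(' expression ')' | variable | number
        ch = take()
        if ch == "-":
            return -factor()
        if ch == "(":
            value = expression()
            if take() != ")":
                raise SyntaxError("expected ')'")
            return value
        if ch.isalpha():
            return variables[ch]
        if ch.isdigit() or ch == ".":
            digits = ch
            while peek().isdigit() or peek() == ".":
                digits += take()
            return float(digits)
        raise SyntaxError(f"unexpected {ch!r}")

    result = expression()
    if pos != len(src):
        raise SyntaxError(f"unexpected {peek()!r}")
    return result

print(evaluate("(F-32)*5/9", {"F": 212}))   # 100.0
```

Each grammar rule becomes one function, and precedence comes from which function calls which: `expression` calls `term`, and `term` calls `factor`.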

Forget functions/subroutines. You can do those later. For control structure, if you wanted to be really minimalist, you could stick to IF and GOTO. You can do the more structured stuff when you get more confident.
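A minimalist IF/GOTO language along those lines might run like the Python sketch below. Everything about the toy syntax is an assumption made for illustration: numbered lines, space-separated words, `LET`, `PRINT`, `GOTO`, `IF a < b THEN GOTO n`, and `END`. Expressions are deliberately limited to one operand or `operand op operand` so the focus stays on control flow.

```python
import operator

ARITH = {"+": operator.add, "-": operator.sub,
         "*": operator.mul, "/": operator.truediv}
COMPARE = {"<": operator.lt, ">": operator.gt, "=": operator.eq}

def run(source):
    """Run a numbered-line program and return the list of PRINTed values."""
    program = {}
    for line in source.strip().splitlines():
        number, *words = line.split()
        program[int(number)] = words
    order = sorted(program)            # line numbers in execution order
    variables, output = {}, []

    def operand(word):
        # Single-letter variable (defaulting to 0) or a numeric literal.
        return variables.get(word, 0) if word.isalpha() else float(word)

    def expr(words):
        value = operand(words[0])
        if len(words) == 3:            # operand op operand
            value = ARITH[words[1]](value, operand(words[2]))
        return value

    index = 0
    while index < len(order):
        words = program[order[index]]
        index += 1                     # default: fall through to the next line
        keyword = words[0]
        if keyword == "LET":           # LET X = expr
            variables[words[1]] = expr(words[3:])
        elif keyword == "PRINT":       # PRINT expr
            output.append(expr(words[1:]))
        elif keyword == "GOTO":        # GOTO n
            index = order.index(int(words[1]))
        elif keyword == "IF":          # IF a < b THEN GOTO n
            left, rel, right = words[1:4]
            if COMPARE[rel](operand(left), operand(right)):
                index = order.index(int(words[6]))
        elif keyword == "END":
            break
        else:
            raise SyntaxError(f"unknown statement: {' '.join(words)}")
    return output

PROGRAM = """
10 LET I = 1
20 PRINT I
30 LET I = I + 1
40 IF I < 4 THEN GOTO 20
50 END
"""
print(run(PROGRAM))   # [1.0, 2.0, 3.0]
```

The whole control-flow mechanism is the `index` variable: every statement falls through to the next line unless `GOTO` or a true `IF` overwrites it.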

When you get something working, you can elaborate it in whatever directions you want.

Have fun!

Mike Dunlavey
  • 3
    Recursive descent parsers are... annoying in BASIC. Especially old CoCo BASIC. Wrote one a million years ago in TRS-80 Level II BASIC, and it was probably a bit beyond where OP is. – Dave Newton Dec 13 '11 at 03:42
  • @Dave: Annoying, maybe, but I had students writing expression parsers ages ago. It's not too bad if the syntax is simple, like add, subtract, multiply, divide, unary minus, 1-character variable names, decimal numbers, parentheses, and no comments. – Mike Dunlavey Dec 13 '11 at 13:24

I recommend that you learn Python, and do not try to write a compiler until your skill set includes C and C++.

Python is a modern language and easy to learn; if you can do BASIC, you can do Python.

Warren P