12

Say, for example, I wanted to pay somebody to create a programming or scripting language for me. What type of document would they need in order to fully understand exactly what I want?

I mean, are there standard documents that describe the new programming/scripting language in question?

sepp2k
J.T.S.
  • as this is about programming, not programmers, this is probably a better fit for StackOverflow. – Muad'Dib Oct 16 '10 at 20:21
  • 14
    I do not agree with Muad'Dib. I think this is a good place for this question. – Chris Oct 16 '10 at 21:14
  • 5
    I think that rather than inventing your own scripting language, at the cost of a huge amount of work for you and a new language for your users to learn, you would do better to embed an existing scripting language. Some languages, e.g. Python and JavaScript/ECMAScript, are designed so that they can be embedded in a bigger framework. In short, you'll only need to design the API and figure out a way to embed the script interpreter into your own program. – Lie Ryan Oct 16 '10 at 23:01
  • 1
    There is an advantage in doing this if the language is going to be a DSL; for a general language, not so much. Of course, some general languages are very good bases for DSLs, e.g. Lisp or TCL – jk. Jan 26 '12 at 11:33

5 Answers

16

What you need to write is called a language specification.

It should contain a description of the language's grammar (preferably in Extended Backus-Naur Form, EBNF) and of its semantics.

For the semantics, you could either write a description in your own words (taking care to be precise) or give a formal semantics.
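To make that concrete, here is a minimal sketch of what the two halves of a specification might look like for a made-up toy language (the name "Calc", the grammar, and the stated semantics are all assumptions for illustration). The EBNF lives in the comment, the semantics are stated in words, and the small recursive-descent evaluator below is just one way an implementer could turn the two into code:

```python
import re

# Hypothetical toy language "Calc", specified in EBNF:
#   expr   = term , { ( "+" | "-" ) , term } ;
#   term   = factor , { ( "*" | "/" ) , factor } ;
#   factor = number | "(" , expr , ")" ;
#   number = digit , { digit } ;
# Semantics (in words): "+", "-", "*", "/" are left-associative,
# "/" is true division, and a program's value is the value of its expr.

def tokenize(source):
    # Numbers and single-character operators; whitespace (and, in this
    # sketch, any other character) is simply skipped.
    return re.findall(r"\d+|[-+*/()]", source)

class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def expr(self):
        # expr = term , { ( "+" | "-" ) , term } ;
        value = self.term()
        while self.peek() in ("+", "-"):
            op = self.take()
            rhs = self.term()
            value = value + rhs if op == "+" else value - rhs
        return value

    def term(self):
        # term = factor , { ( "*" | "/" ) , factor } ;
        value = self.factor()
        while self.peek() in ("*", "/"):
            op = self.take()
            rhs = self.factor()
            value = value * rhs if op == "*" else value / rhs
        return value

    def factor(self):
        # factor = number | "(" , expr , ")" ;
        tok = self.take()
        if tok == "(":
            value = self.expr()
            if self.take() != ")":
                raise SyntaxError("expected ')'")
            return value
        if tok is not None and tok.isdigit():
            return int(tok)
        raise SyntaxError(f"unexpected token: {tok!r}")

def evaluate(source):
    parser = Parser(tokenize(source))
    value = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token: {parser.peek()!r}")
    return value
```

Notice that the grammar alone does not say what `8 - 3 - 2` evaluates to; the sentence about left-associativity does. That is the kind of detail the semantics part of the specification has to pin down.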

sepp2k
  • 1
    BNF is only useful for context-free grammars; scripting languages aren't always context-free, e.g. TCL (though I think you can still argue it is preferable to have a context-free language in most cases) – jk. Jan 26 '12 at 11:30
  • @jk. I wouldn't say BNF is entirely useless for non-context-free languages. Depending on how far from context-free the syntax is, it can still make sense to specify it in EBNF and then resolve the ambiguities in words. That's what the C++ standard does, for example. In most cases I imagine that's still clearer than explaining everything in words or specifying it using a context-sensitive or unrestricted grammar. – sepp2k Jan 26 '12 at 14:20
  • True, my point was more that there are languages like Lisp, Tcl or Forth (which are actually very good for defining DSLs in) that have degenerate syntaxes, so the BNF tells you very little – jk. Jan 26 '12 at 17:29
  • @jk. Sure, but in that case any other means of describing the syntax will tell you equally little, simply because there is so little to tell. That just means that the syntax part of the specification will be very short. – sepp2k Jan 26 '12 at 17:32
13

You will need the following:

  • A reason for creating a new language
  • A Philosophy
  • A Semantic Definition
  • A lexical description of your tokens
  • A Syntax Analysis definition

How will your language be different? What is its mission? Is it functional? Is it object-oriented? Is it a meta-language? What are its unique features? What will it give the world that doesn't exist (or exists in an ugly way)? How do you want to change things? Is it compiled or interpreted? A DSL or a general-purpose language? This is your philosophy, and it dictates a lot about your language's design.

Next, work on scratching out rough syntax and semantics on paper. This will be your semantic definition ... writing fake code is a great way to develop your thoughts. Read "The C Programming Language" for an excellent example of how this is done. Play with it.

You will then need to define your tokens and syntax in some way. Programs then process these definitions into automata capable of reading in strings and processing the syntax. Tools such as Lex/Flex and Yacc/Bison take regular expressions and a BNF-style grammar for lexical and syntax analysis respectively. There are also Lex- and Yacc-like tools for other languages.
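As a rough illustration of the lexical half, here is a small sketch of a regular-expression tokenizer. The token names and patterns are made up for a hypothetical toy language, and a generated lexer would build a proper automaton rather than lean on Python's `re` module, but the shape of the token description is similar:

```python
import re

# Hypothetical token definitions for a toy language; the names and patterns
# are illustrative assumptions, not taken from any real language.
TOKEN_SPEC = [
    ("NUMBER", r"\d+(?:\.\d+)?"),   # integer or decimal literal
    ("IDENT",  r"[A-Za-z_]\w*"),    # identifiers
    ("OP",     r"[-+*/=()]"),       # single-character operators
    ("SKIP",   r"\s+"),             # whitespace, discarded
    ("ERROR",  r"."),               # anything else is a lexical error
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def tokenize(source):
    """Split source text into (kind, text) pairs."""
    tokens = []
    for match in MASTER.finditer(source):
        kind, text = match.lastgroup, match.group()
        if kind == "SKIP":
            continue
        if kind == "ERROR":
            raise SyntaxError(f"unexpected character {text!r} at offset {match.start()}")
        tokens.append((kind, text))
    return tokens
```

For example, `tokenize("x = 3.14 * (y + 2)")` yields an `IDENT`, an `OP`, a `NUMBER` and so on, which is the stream a BNF-driven parser would then consume.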

You will also need a grounding in language theory and compilers to know what NOT to do. Typical pitfalls include ambiguous grammars and problems with AST generation and manipulation; the theory also shows you how to make life simpler for yourself. Knowing the theory is very important. I would consider getting the following to start off:

Compilers: Principles, Techniques and Tools (Dragon Book)
Modern Compiler Implementation in C or Modern Compiler Implementation in Java
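To show why an ambiguous grammar is a problem in practice, here is a small sketch (my own illustration, with made-up helper names `parses` and `value`) that enumerates every parse tree of a token list under the ambiguous rule `expr = expr "-" expr | number`. The same input ends up with two trees and two different results:

```python
def parses(tokens):
    """Return every parse tree of `tokens` under the ambiguous grammar
    expr = expr "-" expr | number."""
    if len(tokens) == 1:
        return [tokens[0]]
    trees = []
    # Try every "-" in the input as the top-level operator.
    for i in range(1, len(tokens), 2):
        if tokens[i] != "-":
            continue
        for left in parses(tokens[:i]):
            for right in parses(tokens[i + 1:]):
                trees.append((left, "-", right))
    return trees

def value(tree):
    """Evaluate a parse tree: a number, or a (left, "-", right) tuple."""
    if isinstance(tree, tuple):
        left, _, right = tree
        return value(left) - value(right)
    return tree

tokens = [8, "-", 3, "-", 2]
trees = parses(tokens)
for tree in trees:
    print(tree, "=>", value(tree))
```

One tree groups the input as `8 - (3 - 2)` and the other as `(8 - 3) - 2`, so the grammar by itself does not determine the meaning of the program. Rewriting the rule so that it only recurses on one side, or stating the associativity explicitly, removes the ambiguity.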

Aiden Bell
8

99.9% of the time creating a new language is completely unnecessary. The return on investment would most likely be tiny, and you would have just wasted your time.

Most likely you can use JavaScript as a suitable scripting language, and there are parsers available for most languages already. You can also use another scripting language you like if you can find a suitable parser for it. Embedding one of those in your program would require much less work and give a bigger return. People don't have to learn another language; they just have to learn your API. It's a much better solution.
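A minimal sketch of the "just design the API" idea, using Python's built-in `exec` as a stand-in for whatever embedded interpreter you would actually choose. The `Host` class and the `add_item`, `say` and `total` functions are hypothetical names invented for this example:

```python
# Hypothetical host application that exposes a small API to user scripts.
class Host:
    def __init__(self):
        self.items = {}
        self.log = []

    def add_item(self, name, price):
        self.items[name] = price

    def say(self, message):
        self.log.append(str(message))

    def total(self):
        return sum(self.items.values())

def run_script(host, source):
    # The dictionary below is the whole scripting API: scripts see these
    # names as ordinary functions.
    # Caveat: exec is NOT a security sandbox; untrusted scripts would need
    # real isolation, which this sketch does not attempt.
    api = {
        "add_item": host.add_item,
        "say": host.say,
        "total": host.total,
    }
    exec(source, api)

USER_SCRIPT = """
add_item("apple", 3)
add_item("pear", 4)
say(total())
"""

host = Host()
run_script(host, USER_SCRIPT)
```

The script author only has to learn three function names, and the host gets loops, variables and functions for free from the existing language.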

Creating a new language is almost always bad.

Bryan Oakley
TheLQ
  • 9
    Except for the multitude of times when it isn't bad. Creating your own simple DSLs can be very useful. Now, creating your own general-purpose language would be more in line with what your answer says. – ChaosPandion Oct 16 '10 at 23:45
  • @ChaosPandion but a lot of languages already excel at creating a DSL that uses code from that language (e.g. Ruby is good at this) – alternative Oct 16 '10 at 23:52
  • 2
    I agree with your answer, but I believe it's not the right answer for this question. I believe the asker is looking for the generalities of creating a scripting language, not the pros and cons of creating one. – Tim Murphy Oct 17 '10 at 04:19
  • Creating a new language is almost always the best solution. http://en.wikipedia.org/wiki/Language-oriented_programming – SK-logic Jan 26 '12 at 18:37
3

You can describe your language's grammar in BNF.

For instance, this is Python's grammar.

grokus
  • 6
    The grammar by itself isn't enough information to implement a language though. He'll also have to specify the semantics one way or another. – sepp2k Oct 16 '10 at 20:15
0

If you're using .NET, here is something I stumbled across some time back. I only gave it a curious glance, but maybe it would be of use to you: Irony.

Irony is a development kit for implementing languages on the .NET platform.

svick
DevSolo