16

In other words, a language where every possible string is valid syntax?

EDIT: This is a theoretical question.
I have no interest in using such a language; I'm just asking whether it's possible.

Further edit

I went ahead and designed such a language. See ErrorFree.

SLaks
  • 1,204
  • 11
  • 16
  • 2
    If we could do that, we'd have created AI. – Michael K Jan 19 '11 at 14:07
  • 5
    @Michael: No; I don't think so. – SLaks Jan 19 '11 at 14:10
  • Would it make any sense? – Sorantis Jan 19 '11 at 14:11
  • This: http://programmers.stackexchange.com/questions/34737/the-most-mind-bending-programming-language/34780#34780 sounds pretty much like it. – Mchl Jan 19 '11 at 14:24
  • Then it'll be impossible for someone else to debug the code! – Abimaran Kugathasan Jan 19 '11 at 14:36
  • 7
    !!! **Perl** !!! –  Jan 19 '11 at 15:07
  • might fit better in this SE site: http://cstheory.stackexchange.com/ – Louis Rhys Jan 19 '11 at 15:48
  • This should be posted on cstheory.SE, but I'm afraid that in its current state it might get closed there as well. I suggest expanding on your question; take a look at the 6 guidelines for asking questions in our FAQ. Additionally, our FAQ explicitly calls hypothetical questions off-topic/not constructive. – Walter Jan 19 '11 at 16:09
  • Of course yes. Here is the proof: There is a natural number that corresponds to each valid Java program (because the set is recursive). There is a natural number that corresponds to each source file (just look at the bits!). Make that correspondence, and you have your language. [Note to some: This is not a joke, just a simple proof. The mathematical approach is to first try trivial solutions. If you don't like the answer, you'll need to ask another question!] – Macneil Jan 19 '11 at 16:13
  • 9
    I strongly object to the question being closed! It's neither subjective nor not constructive!!! – Felix Dombek Jan 19 '11 at 20:47
  • Felix: I think you mean **un**constructive. – SLaks Jan 19 '11 at 20:49
  • 1
    Why not take an assembly language with exactly 256 instructions, 128 registers, and a general syntax of `instruction operand*`, where an operand may be a register or a number between 0-127 (and everything above that is treated as a register), and if an operand is missing for a multi-arity instruction, '0' is assumed? – Felix Dombek Jan 19 '11 at 20:50
  • @slaks: I wanted to write that first, but then decided to just copy the reason as stated below. – Felix Dombek Jan 19 '11 at 20:52
  • So, apparently someone went through and -1'd every answer that said you can't get rid of syntax errors. I'd say that's not constructive. – Berin Loritsch Jan 19 '11 at 22:07
  • @Berin: Incorrect answers "should" be downvoted, because they aren't helpful. There are two proofs above (one by me, and another by Felix) that show it's possible to have a language with no syntax errors. I personally did not downvote, knowing that the question would be closed. – Macneil Jan 20 '11 at 15:52
  • The proofs are incomprehensible to folks with just practical knowledge. I wish math people would use real words instead of single letter substitutions when discussing theory. And I do disagree with the assertion you can have a practical language without syntax errors. Of course, just because it is "possible" doesn't mean it should be done.... – Berin Loritsch Jan 20 '11 at 16:24
  • I wish people would use real words instead of single letter variables in their code. – Michael K Jan 20 '11 at 16:42
  • Recently I saw people commenting on TECO, stating that nearly any input could be run, giving unpredictable results... I can imagine a language attempting to do automatic spelling fixes (cnost to const) with possible funny results (azerty to assert)... – PhiLho Mar 01 '11 at 13:38
  • The Scratch Programming Language has no syntax errors. It's a visual programming language where users snap blocks together. If you can imagine, this means that you can't put a string block into a number block (visually, it looks something like putting a square peg into a circle slot, and the GUI is specifically programmed to forbid it). Because you can only construct syntactically correct programs, there are no syntax errors. Unlike some answers suggest, it is very easy to assign meaning to these programs. – michaelsnowden Dec 15 '15 at 20:48
  • @michaelsnowden: See also http://blog.slaks.net/2014-04-01/programming-without-errors-errorfree/ – SLaks Dec 16 '15 at 01:49
  • The flip-side of syntax errors (the "advantage" if you will) is that they tell you when you have made a mistake (for a class of mistakes). In a language where everything is syntactically valid, those mistakes become runtime errors ... or programs that don't behave as the programmer intended. – Stephen C Dec 16 '15 at 07:56
  • @StephenC: I know; I never intended for this to be _used_. – SLaks Dec 16 '15 at 14:57
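
Macneil's counting argument in the comments above can be sketched in a few lines of Python. This is only an illustration under loud assumptions: balanced-parenthesis strings stand in for "valid Java programs" (any decidable set of valid programs would do), and the names `text_to_number`, `valid_programs`, and `meaning_of` are hypothetical. Every source text is turned into a natural number, and that number selects a valid program from an enumeration, so no input is ever rejected.

```python
from itertools import count, islice, product


def text_to_number(source: str) -> int:
    """Map any text to a distinct natural number (bijective base-256 over its UTF-8 bytes)."""
    n = 0
    for byte in source.encode("utf-8"):
        n = n * 256 + byte + 1
    return n


def is_balanced(candidate: str) -> bool:
    """Validity check for the stand-in language: balanced parentheses."""
    depth = 0
    for ch in candidate:
        depth += 1 if ch == "(" else -1
        if depth < 0:
            return False
    return depth == 0


def valid_programs():
    """Enumerate every valid program, shortest first, by filtering all candidates."""
    for length in count(0, 2):
        for chars in product("()", repeat=length):
            candidate = "".join(chars)
            if is_balanced(candidate):
                yield candidate


def meaning_of(source: str) -> str:
    """Return the valid program that an arbitrary source text stands for."""
    return next(islice(valid_programs(), text_to_number(source), None))
```

The brute-force enumeration is only practical for very short inputs; the point is that the mapping exists, not that it is usable.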

6 Answers

16

Yes, of course it's possible, it's even trivially easy.

<program> ::= char | char <program> |

I don't understand how anybody can say "no". That said, it might be rather hard to define meaningful semantics for such a language, but that's possible too. Just look at Whitespace.
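
A minimal sketch of the idea, assuming a made-up counter language (not Whitespace itself): `+` and `-` carry meaning, and every other character is a no-op, so any string at all is a valid program. The name `run_counter` is hypothetical.

```python
def run_counter(source: str) -> int:
    """Interpret any string: '+' increments, '-' decrements, everything else is ignored."""
    accumulator = 0
    for ch in source:
        if ch == "+":
            accumulator += 1
        elif ch == "-":
            accumulator -= 1
        # Any other character is accepted and does nothing (a NOP).
    return accumulator
```

There is no input for which `run_counter` reports a syntax error; a typo simply changes (or fails to change) the result.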

user281377
  • 28,352
  • 5
  • 75
  • 130
  • So if the language ignores it, it's valid syntax? And isn't "tabtabspace" a valid string? – Michael K Jan 19 '11 at 14:27
  • 1
    The semantics of the language were my issue with it, but can't really discuss it without it getting off-topic to philosophy/linguistics proper. – StuperUser Jan 19 '11 at 14:30
  • 2
    Michael: Exactly. Everything is syntactically valid, but it can possibly be a NOP (have no special meaning). Nothing wrong with a language ignoring lots of stuff. Just look at all the things C ignores in this sample program: int main() { 3;;; /* comment */ } – user281377 Jan 19 '11 at 14:36
  • Many people say "no" because they have no conceptual distinction between syntax and semantics. "It doesn't compile? Must be a syntax error, then!" – fredoverflow Jan 19 '11 at 17:40
  • Many people say "no" because there is no real meaning in this. As soon as you add structure (i.e. more than a self recursive parsing rule) you have the concept of syntax. Violation of the structure is a syntax induced error. Syntax induced error is a Syntax Error, whether the parser flags it as such or not. – Berin Loritsch Jan 20 '11 at 16:29
  • Berin: I think you could have a very complex structured language and still accept any possible string, but nobody wants that; because it would be hard to find bugs in a language where the compiler accepts any typo. – user281377 Jan 20 '11 at 22:13
  • AmmoQ: I assert that any error introduced by not following proper syntax (aka typo) _is_ a syntax error. The fact that the parser didn't flag it as such is irrelevant. I think we have fundamental differences of opinion on what constitutes a syntax error. I.e. the "proofs" come from the aspect of parser flagging the error, and I come from a more abstract level--i.e. it's an error because it doesn't follow the proper syntax. – Berin Loritsch Jan 21 '11 at 18:52
  • What would be the difficulty of saying that a FORTH-like language has 95 identifiers, many of which are initially bound to the value 0 but some of which are bound to operations? I would think one could fairly easily devise a language which was capable of useful operations, but where any sequence of characters could be valid depending upon the bindings established before it executed. – supercat May 10 '14 at 15:05
8

Yes. Looked at analytically: if you can construct a deterministic Turing machine that always halts in an accepting final state for every single string of a given language, then you have demonstrated that it is possible. The demonstration is pretty straightforward: you need a regular TM whose transition function has only one transition, which looks like this:

TF(w,q) -> (w,Qa) 

Considerations:
    L = { w | w is any possible string }
    w e L
    q e Q
    F is a set with all good final states {Qa,Qr}
    Qa e F

It's been demonstrated that a TM has the same computing power as any real-life computer, so this is absolutely possible.
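
For readers without the formal background, the machine above can be approximated in Python. This is a rough sketch, not a full Turing machine simulator: `TF` and `Qa` are taken from the notation above, while the start state `q0` and the function name `accepts` are assumptions added for illustration.

```python
from typing import Tuple

ACCEPT = "Qa"  # the accepting final state from the answer
START = "q0"   # hypothetical name for the start state


def TF(w: str, q: str) -> Tuple[str, str]:
    """The single transition: leave the input w untouched and move to the accepting state."""
    return w, ACCEPT


def accepts(w: str) -> bool:
    """Run the machine on w; it halts after one step, always in the accepting state."""
    tape, state = w, START
    while state != ACCEPT:
        tape, state = TF(tape, state)
    return state == ACCEPT
```

Because the transition never inspects `w`, the machine accepts every string, which is exactly the language L described above.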

guiman
  • 2,088
  • 13
  • 17
  • 1
    What on earth does this mean, to us common layfolk? What is 'w', 'e', 'L', 'q', 'Q', 'Qa', 'Qr', 'F', 'TF'. Without any of these defined I have no frame of reference. – Berin Loritsch Jan 20 '11 at 16:26
  • 2
    Sorry but there is no easy way to explain the Turing Machine approach to this answer. Check this link to clarify a little: http://en.wikipedia.org/wiki/Turing_machine – guiman Jan 20 '11 at 17:25
5

I guess it depends on what you mean by valid syntax.

You could design a language that accepted any string but ignored anything that had not been prescribed specific meaning. This is basically the equivalent of saying "I'll get rid of syntax errors by saying they're not errors" - pretty pointless and hugely undesirable for many reasons.

Beyond that the only way you could have a language which had no syntax errors would be to have every possible string have a valid instruction / use associated with it. The only way I can see to do that would be to have all operations as single characters and to ensure that every single character had an operation assigned to it.

There are a million things wrong with this. Obviously there are no reserved words; everything depends on where a character is used in context, so the code would be basically illegible and, while immune to syntax errors, far more likely to suffer every other sort of error.

So theoretically possible (AmmoQ puts it far more neatly than I) but entirely undesirable.
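
To make the single-character idea concrete, here is a small sketch of a hypothetical stack language in which every character has an operation. The operation set is invented for illustration: ASCII digits push their value, `+` and `*` combine the top two values (a missing operand counts as 0), and any other character pushes its code point.

```python
DIGITS = "0123456789"


def pop_or_zero(stack: list) -> int:
    """Pop the top value; an empty stack behaves like an endless supply of zeros."""
    return stack.pop() if stack else 0


def run_stack(source: str) -> list:
    """Execute any string; no character is ever rejected."""
    stack = []
    for ch in source:
        if ch in DIGITS:
            stack.append(int(ch))
        elif ch == "+":
            stack.append(pop_or_zero(stack) + pop_or_zero(stack))
        elif ch == "*":
            stack.append(pop_or_zero(stack) * pop_or_zero(stack))
        else:
            stack.append(ord(ch))  # every remaining character pushes its code point
    return stack
```

The downside described above shows up immediately: `run_stack("23+")` and the typo `run_stack("2e+")` both run without complaint, and only the result reveals the mistake.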

Jon Hopkins
  • 22,734
  • 11
  • 90
  • 137
  • I've read that TECO was like that, each character being assigned a meaning. – David Thornley Jan 19 '11 at 14:41
  • 3
    Machine code works pretty much that way. Every possible combination of bytes can be viewed as a program that does *something*, even if all it does is causing an interrupt. – user281377 Jan 19 '11 at 15:19
  • David, that's what I was thinking, very TECO like. Although IIRC TECO input could contain syntax errors. But it demonstrates the difficulty of such a dense language: very hard to read and prone to difficult-to-understand errors. – Omega Centauri Jan 19 '11 at 15:22
  • @user281377: On the 6502, there are quite a few instructions with no defined meanings. Some have behaviors which are consistent, useful, and not available with any documented instruction (my favorite is nicknamed "DCP"--decrement a memory address and compare the result with the accumulator, setting flags appropriately), but some have behavior that depends upon bus timings in weird and bizarre ways, and some will lock up the processor hard enough to require a reset (even a "non-maskable" interrupt won't help). I think the latter instructions could be considered "syntax errors". – supercat May 10 '14 at 15:02
5

Code in a non-text based programming language may not have syntax errors.

I am thinking of a visual language like BYOB. You cannot accidentally type "if x ten else foo" because the "syntax" is defined by graphical blocks.

LennyProgrammers
  • 5,649
  • 24
  • 37
3

The very purpose of syntax is to differentiate between valid and non-valid in a manner that's faster and more effective than executing the code. Syntax is just an optimisation; what goes into it and what goes into semantics is arbitrary.

Usually you want quite the opposite: to make the syntax stretch as far as possible to save more time. But of course you can also omit syntax altogether and declare every error a semantic one: you'll end up with a non-tokenizing interpreter.

biziclop
  • 3,351
  • 21
  • 22
0

Ahbefiasdlk aslerhsofa;f jwi [asdfasdf]aew /&Q!@#$ }{ ;-P

So what does that mean?

As long as the language has structure and grammar, there will always be the concept of a syntax error. The question is whether you enforce it or not. People will make mistakes, and syntax errors are what most language designers reach for to help programmers avoid stupid mistakes.

A syntax error is an error introduced by programmers writing code that has no meaning to the language.

It is impossible to get rid of syntax errors based on the above definition. We've all misspelled identifiers, we've all misspelled method names. Having the language silently accept the misspelling and happily do nothing is not my idea of an enjoyable experience.

It is possible to design a language that can use any valid Unicode character (or character sequence) as identifiers. There are challenges, such as normalizing equivalent characters/character sequences so that they are recognized as the same thing--but it's possible. NOTE: there are four standard Unicode normalization forms (NFC, NFD, NFKC, and NFKD).
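
As a small illustration of the normalization challenge, Python's standard `unicodedata` module exposes all four forms. The helper name `canonical_identifier` is hypothetical, and choosing NFC here is an assumption; a real language would have to pick one form deliberately.

```python
import unicodedata

composed = "caf\u00e9"     # "café" with é as a single code point
decomposed = "cafe\u0301"  # "café" as e followed by a combining acute accent


def canonical_identifier(name: str) -> str:
    """Normalize an identifier so that equivalent spellings compare equal (NFC assumed)."""
    return unicodedata.normalize("NFC", name)


# The raw strings differ, but their normalized forms are identical.
same_identifier = canonical_identifier(composed) == canonical_identifier(decomposed)
```

The compatibility forms (NFKC, NFKD) go further and also fold characters such as the "ﬁ" ligature into plain "fi", which may or may not be what a language designer wants for identifiers.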

Berin Loritsch
  • 45,784
  • 7
  • 87
  • 160
  • 1
    The need for structure does not require a grammar. Consider Piet where the structure is in the position of the character (or color) in a grid, not its relation to other characters in a morpheme. – Mike Samuel Jan 19 '11 at 19:03
  • 1
    Violate the structure and what happens? – Berin Loritsch Jan 19 '11 at 19:29