17

Does there exist a Turing complete programming language such that for a fixed alphabet (say, ASCII), every possible permutation of those characters is a semantically valid program capable of being executed?

We consider infinite loops to also be semantically valid.

I know some data formats, such as Markdown, possess universal semantic validity (every input is valid), but I cannot off-hand think of a programming language with this property.

mp-
  • 287
  • 1
  • 2
  • @PhilipKendall That is a valid (if degenerate) answer. Of course almost all programs will immediately JMP into unknown space and crash. But that doesn't mean they aren't meaningful. Reminds me of [superoptimzation](https://en.wikipedia.org/wiki/Superoptimization). – mp- Dec 24 '18 at 07:19
  • Suppose you have a trivial language when given a letter ascii it prints it and moves onto the next character. That technically satisfies your definition. Any input is valid. It would be a rather silly language as you can't do anything productive with it, but of course we're not talking about practicality. – Neil Dec 24 '18 at 07:24
  • 1
    @Neil: that is not a Turing-complete language. – Doc Brown Dec 24 '18 at 10:35
  • I've heard this about APL, but I'm not sure if it is true, might have been true once upon a time, or is merely a comic exaggeration. Snippets of APL I've seen online don't bear evidence against it, though... – trent Mar 11 '21 at 22:18

5 Answers5

12

Every octet sequence can be interpreted as valid Z80 code as there are no invalid opcodes or arguments; I imagine the same would apply to various other processors, I just personally know Z80.

For low-level things like this, you possibly start running into questions about what it means to "execute a program":

  • what happens if it jumps outside the initialized space?
  • How does the "program" terminate anyway?
alecxe
  • 331
  • 4
  • 15
Philip Kendall
  • 22,899
  • 9
  • 58
  • 61
  • 2
    *"what happens if it jumps outside the initialised space"* - by simply adding a rule "jumps to adress 0 in this case", it should be quite simple to handle. Should not be too hard to define some extra semantics for deailing with those edge cases. – Doc Brown Dec 24 '18 at 10:26
  • 2
    jumps outside initialised space is syntactically valid, and possibly even semantically valid as well (e.g. its well defined semantics is to produce a segfault). – Lie Ryan Dec 24 '18 at 11:47
10

Such questions about programming languages are almost universally answered with yes. If there currently is no language that has the requested property, you can bet that someone will see it as a challenge to create a (toy) language that does have the property.

As an example of a language where every permutation of the alphabet's characters is syntactically valid is the whitespace language, where the alphabet of the language itself consists of space, tab and linefeed.

Bart van Ingen Schenau
  • 71,712
  • 20
  • 110
  • 179
  • 5
    Also, since any non-whitespace character is ignored as a comment, not just every permutation of whitespace, but in fact every permutation of ASCII characters is also syntactically valid. – Jörg W Mittag Dec 24 '18 at 09:00
  • 2
    To be nitty: having a short look into the [Whitespace tutorial](https://web.archive.org/web/20151108084710/http://compsoc.dur.ac.uk:80/whitespace/tutorial.html), it seems a stack manipulation [Space] must not be followed by a [Tab] character. But I guess it will be simple to extend the language for these special cases (for example, by giving them No-Op semantics). – Doc Brown Dec 24 '18 at 10:47
  • @docbrown, in the examples I do see sequences of [Space][Tab], so I am not sure where you got your conclusion from. – Bart van Ingen Schenau Dec 24 '18 at 13:04
  • 1
    @BartvanIngenSchenau: look into the tutorial: a stack manipulation is introduced by a [Space] followed by one of four possible commands, which start either with another [Space] or [LF]. The sequence [Space][Tab] can occur as part of a number, or as part of a different command, of course, but a [Tab] is not a valid character after a stack command introduced by [Space] . – Doc Brown Dec 24 '18 at 13:13
0

Why settle for strings (of characters)? You should rephrase that question to "any series of tokens". Not bits, bits are so 20th century computerish.

So, infinite loops are OK you say. What about division by zero? If you are lenient enough wjth your definition of valid you may get a yes but most combinations would still be meaningless. So I would say you may always get a technically valid program but hardly ever a semantically valid program.

Martin Maat
  • 18,218
  • 3
  • 30
  • 57
  • "bits are so 20th century computerish": What do you mean by this? – Giorgio Dec 24 '18 at 12:51
  • I mean strings, characters and bits are a technical detail, essentially you are talking about instructions and operants. How those are implemented is irrelevant to the question. – Martin Maat Dec 24 '18 at 13:50
  • Yes, but this observation was true even 50 years ago (formal languages are even older), so I am still not sure I understand what you mean. – Giorgio Dec 24 '18 at 17:19
  • 1
    I just wanted to bring down the question to its logical essence. It is not about strings or bytes, it is about instructions and arguments. Can you mix those up in an arbitrary way and be sure to get a valid program? – Martin Maat Dec 24 '18 at 19:41
0

If you reduce the question to a language for which all characters it accepts are valid programs, Brainfuck does this.

https://gsurma.medium.com/meta-intelligence-writing-programs-that-write-programs-part-1-genetic-evolution-679b65c37c5f

There are only 8 characters, all of them are valid operations, so randomly generated strings of these 8 characters are all valid programs.

  • 2
    If you only have an opening brace and no closing brace is it a valid program? – Jerry Jeremiah Mar 11 '21 at 21:39
  • @JerryJeremiah No. But it could be interesting to explore how to extend the language so that unmatched brackets are still syntactically valid and preferably also carry some semantics. For comparison, according to Wikipedia, the runtime error of invalid memory access is treated differently by implementation, e.g. by using the address modulo the memory size. – Elias Hasle May 24 '21 at 13:59
0

Simple. Start with the C language. Now we build a compiler for a language C' which works like this: The compiler looks for the longest prefix of the program that could be extended to a valid C program, and ignores the rest of the program. Then it adds the shortest and alphabetically first string to that prefix which produces a valid C program, and translates the result using a C compiler. Obviously any program is valid.

Oh well. C is not turing complete. Add arbitrary large integers and pointers to make it turing complete.

gnasher729
  • 42,090
  • 4
  • 59
  • 119