Why don't compilers support non-English keywords?

Question

When you read C, C#, Java, Python, PHP and many other programming languages all the syntax is written in English.

Simple code like this

if (X+1 > 4) {
}
while (A == true) {
}

Is written in English even if the programmer's native language is something else.

Would it not be more effective to allow programmers to use their own native language in the source code, like this (my Spanish is bad, so forgive me if it's wrong)?

si (X+1 > 4) {
}
mientras (A == verdadero) {
}

There is no technical reason why a compiler or parser should have its syntax hard-coded. Any developer who has done internationalization knows there are many easy ways to support other spoken languages.
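
As a rough illustration of what such a "language pack" could look like, here is a minimal sketch that rewrites localized keywords to canonical ones before the text reaches a parser. The pack contents, the `to_canonical` helper, and the regex are assumptions made for this example; no real compiler is claimed to work this way.

```python
import re

# Hypothetical "language pack": localized keyword -> canonical keyword.
SPANISH_PACK = {
    "si": "if",
    "mientras": "while",
    "verdadero": "true",
}

# Match a double-quoted string literal first so its contents are left alone,
# otherwise match an identifier-like word (Unicode letters included).
TOKEN = re.compile(r'"(?:\\.|[^"\\])*"|[^\W\d]\w*')


def to_canonical(source: str, pack: dict[str, str]) -> str:
    """Rewrite localized keywords to canonical ones, token by token."""
    def swap(match: re.Match) -> str:
        word = match.group(0)
        # String literals and unknown names are not in the pack and pass through.
        return pack.get(word, word)
    return TOKEN.sub(swap, source)


localized = 'si (X+1 > 4) {\n}\nmientras (A == verdadero) {\n}\n'
print(to_canonical(localized, SPANISH_PACK))
```

The sketch deliberately ignores comments, and a variable that happens to be named `si` would be rewritten too, so even this toy version shows that the translation table effectively reserves extra words.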

You can run most IDE platforms in different languages. Yet, the programming language the developer is using is stuck in English.

Why have none of the major languages added support for language packs?

The Basic dialect in Excel 95 was localized. And that ... [caused some problems](http://office.microsoft.com/en-au/help/sharing-excel-workbooks-across-language-versions-HA001138296.aspx). — Joachim Sauer, Jan 01 '14 at 18:27
@JoachimSauer I'm not looking to change the native language of source code once it's written. I would assume Spanish source code would always remain Spanish. So that should fix any issues.... — Reactgular, Jan 01 '14 at 18:30
one main problem was that code written on a German version of excel 95 simply did not run in an English version. Only later versions of Excel fixed that (by always translating to English). — Joachim Sauer, Jan 01 '14 at 18:31
@JoachimSauer sounds more like an issue with scripting than compiling. Once compiled or converted to byte code the source is no longer needed. — Reactgular, Jan 01 '14 at 18:32
Long story short, it doesn't provide any benefits and causes some problems (cf. Yannis Rizos above). Translating the keywords to another language does not help programmers speaking that language. Why would it? — , Jan 01 '14 at 18:34
The key words are part of the language. You translate them, it's no longer the same language. "Yes" means the same thing as "Oui", but that doesn't make "Oui" English. Same idea here. The source code is for human consumption. You'd no longer be able to learn "C#" if the key words could be in any spoken language. — MetalMikester, Jan 01 '14 at 18:37
@MetalMikester However, the languages are closely related (so closely that you can convert between them with a regex) and are subsumed by a language which permits each translation of the keywords as an alias for the others. Learning C#-with-french-keywords is trivial once you know C#-with-english-keywords, just mentally replace one with the other. — , Jan 01 '14 at 18:40
@delnan Not all languages are closely related. Besides, I'm usually busy enough figuring out what the other guy did (or what I did 6 months ago :D) without having to mentally replace the languages key words. I mean, seriously... Don't we have enough garbage to learn and deal with already? — MetalMikester, Jan 01 '14 at 18:43
English is the lingua franca of programming. The reasons don't matter. You can argue that it's arbitrary. But it's certainly not more arbitrary than any human language in itself. As for `Any developer who has done internationalization knows there are many easy ways to support other spoken languages` ... let's say that you're lucky to be a native English speaker - not because you didn't have to learn English later in life, but because you were spared the pain of being surrounded by poorly localized software. Using such software is certainly not more effective ;) — back2dos, Jan 01 '14 at 19:22
BTW: http://en.wikipedia.org/wiki/Non-English-based_programming_languages — el.pescado - нет войне, Jan 01 '14 at 19:24
I feel like the two current answers are missing the point of the question. You seem to be asking about non-English _keywords_ (not variables) alongside IDEs. Are you imagining the IDE could change the keywords between English and whatever language the user uses, when editing the source? — Izkata, Jan 01 '14 at 19:29
Wait a minute, I didn't ask what would happen if you reviewed code or shared code from a non-English language. Surely there are many businesses where all employees speak another language and there is no benefit to using English. Why can't they just flip a switch and code using that language? Wouldn't this be easier from a mental problem solving perspective? They can read the code using the language they are used to. — Reactgular, Jan 01 '14 at 20:15
Because English is the national language of the Planet Earth. — שינתיא אבישגנת, Jan 01 '14 at 20:18
@MathewFoscarini "Wouldn't this be easier from a mental problem solving perspective". No. Look at all the books, help etc about any programming language (the ones with English keywords anyway). I can only see this as counterproductive. English is my second language and I have zero interest in this. I remember a French version of BASIC back in the 80s. That seemed cool on paper, but then it was pretty obvious we weren't dealing with the same language anymore. — MetalMikester, Jan 02 '14 at 00:08
I wondered about this the other day and thought that it's harder for non-native English speakers to learn the language, but the consensus seems to be that it isn't that much more trouble. As others have pointed out, it seems that supporting non-English keywords would negatively impact the semantic meaning of the source code. — Robert Gomez, Jan 04 '14 at 04:35
The keywords in a language like C or C++ are not actually English words. They are C and C++ words. They have a very specific meaning. "int" is not an English word. Lookup "char" and "auto" in an English dictionary for a good laugh. What about "float" and "double"... This makes C actually easier for non-native English speakers because they are not misled by the meaning of similar English words. — gnasher729, Jul 31 '15 at 19:48
related: [Do people in non-English-speaking countries code in English?](http://programmers.stackexchange.com/q/1483/31260) — gnat, Feb 08 '16 at 21:46

score 17 · Answer 1 · answered Jan 01 '14 at 19:01

Too much effort for too little added value.

The amount of English required to make sense of the keywords of any programming language is really very small - for the majority of second-language speakers it is probably way less than they already knew of the language even before they became programmers. Conversely, allowing all keywords to be replaced by their translations in other languages (and there are thousands of other languages!) would add to the complexity of the language implementation, would pollute the default namespace with thousands of tokens that are no longer legal identifiers, and would run the risk that whenever you have to review any code, you suddenly have to know what 'if' means in Czech to get it right. Almost all practitioners find these costs too high for the benefit of allowing people to program without rudimentary command of English.
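
The namespace point can be made concrete with a small sketch. The keyword packs below are hypothetical and the word choices are only illustrative; the point is that the reserved set is the union of every enabled pack, so ordinary identifiers start to collide with it.

```python
# Hypothetical keyword packs; the words are illustrative, not taken from any real compiler.
KEYWORD_PACKS = {
    "en": {"if", "else", "while", "for", "return", "end"},
    "es": {"si", "sino", "mientras", "para", "retorna", "fin"},
    "fr": {"si", "sinon", "tantque", "pour", "retourne", "fin"},
    "cs": {"pokud", "jinak", "dokud", "pro", "vrat", "konec"},
}


def reserved_words(enabled):
    """Union of all keywords across the enabled packs."""
    reserved = set()
    for lang in enabled:
        reserved |= KEYWORD_PACKS[lang]
    return reserved


def broken_identifiers(identifiers, enabled):
    """Identifiers that stop being legal once the given packs are enabled."""
    return sorted(set(identifiers) & reserved_words(enabled))


names = ["para", "pro", "fin", "count", "si"]
print(broken_identifiers(names, ["en"]))                    # []
print(broken_identifiers(names, ["en", "es", "fr", "cs"]))  # ['fin', 'para', 'pro', 'si']
```

With only four small packs, four of the five sample names are already unusable; with thousands of natural languages the reserved set would grow accordingly.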

Note that method and field names in standard or contributed libraries are another matter. Most vendors stick to English for those as well, for largely the same reasons, but under particular circumstances the trade-off can be worthwhile, so there are some third-party languages with non-English public identifiers, but still a tiny minority.

Note also that this has nothing whatsoever to do with the question of whether English is particularly suited to express programming logic, or whether it is fair that U.S. and British programmers have a head start. The language has gained its present de-facto status as a computer standard largely through contingent historical processes (events that might have turned out otherwise, but in fact didn't). Although many people don't like this, few dislike it so much that they are willing to go back and deliberately re-introduce the possibility of fragmentation just to spite the Americans.

Good point about illegal identifiers (which, of course, is only necessary if interoperability between dialects is necessary, and if user-identifiers follow the same syntax as keywords (e.g. older Perl versions sidestepped this completely with sigils. But we wouldn't want all languages to be like Perl \*shudder\*)) — amon, Jan 01 '14 at 19:07
...like Perl... y nt? We could go the full Cobol instead.. "Add 5 to MyVar giving MyRes on size error display "Error" end-add". I thought the reason a language like C# was popular was because of its "terseness", after all C# and VB.NET are exactly the same thing except for the words used. — gbjbaanb, Jan 01 '14 at 19:43
@amon one can program perl without sigils... and in non-english too. See [Lingua::Romana::Perligata](http://search.cpan.org/~dconway/Lingua-Romana-Perligata-0.50/lib/Lingua/Romana/Perligata.pm) — , Jan 01 '14 at 20:38
@MichaelT *Technically*, Perligata is a different language that transpiles to Perl source code (compare the CoffeeScript and JavaScript relationship). There is also a [Klingon version](https://metacpan.org/pod/Lingua::tlhInganHol::yIghun). In Perl4 user subroutines had to be prefixed with an ampersand: `&while(1)` which avoids any clash with builtin operators or keywords. This was relaxed in Perl5. — amon, Jan 01 '14 at 21:08

Arseni Mourzenko · Answer 2 · answered Jan 01 '14 at 19:20

The main reason is that all source code should be written in English. This applies as well to variable names, comments, etc.

The reason becomes obvious when you see for the first time a piece of code which is written in a language you don't know. For example:

// Записать изменения конфигурации.
var имя = this.RefreshMeta().ПолныйПуть;
this.Хостинг.Записать(имя);

Suddenly, you can't understand it, so you can't review, modify or unit test it. Additionally, in my example, you can't even type it because you don't have a Russian keyboard. In other words, you can't work with this code any longer.

This might be acceptable for closed source projects done by small companies, with the drawback that they can never hire a skillful developer from abroad who doesn't speak the language the code is written in.

Working in a company where French is preferred for code comments and documentation, I notice several issues with that:

As said above, hiring is de facto limited to people who speak French.

If the language itself is localized, this means even more hassle than when only comments or variable names are.
If the code is one day released as open source, most developers in the world won't be able to work with it.
While we work with .NET Framework and SQL Server, which are fully Unicode-compatible, colleagues are often afraid of putting the appropriate accents on letters, resulting in barely readable and extremely ugly names such as DELAI_EXPIRE.

For people who don't speak French: DELAI_EXPIRE should be written DÉLAI_EXPIRÉ. Without the accents it is hard to understand and can have two meanings: either that a delay is currently expiring, or that the delay has already expired.

If keywords of a language are localized, should accents be included? Should they be removed?
Again related to accents, French keyboards are pretty bad when it comes to writing capital accented letters. You can write the letter À or Ê, but not É, such as in Également. This makes it pretty difficult to use auto-completion within an IDE.

What would happen if one of the keywords of a language begins with a character which is difficult or impossible to type on a keyboard?
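
The accent problems raised in the list above can be sketched in a few lines. This is only an illustration under stated assumptions: `strip_accents` is a helper written for this example, and `début` is used as a hypothetical localized keyword.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Remove combining marks after canonical decomposition (NFD)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# Two names with different accents collapse to the same ASCII spelling.
print(strip_accents("DÉLAI_EXPIRÉ"))  # DELAI_EXPIRE
print(strip_accents("DÉLAI_EXPIRE"))  # DELAI_EXPIRE

# The same visible keyword can also arrive as two different code point sequences.
composed = unicodedata.normalize("NFC", "début")
decomposed = unicodedata.normalize("NFD", "début")
print(composed == decomposed)          # False
print(len(composed), len(decomposed))  # 5 6
```

A compiler with accented keywords would have to decide whether to normalize its input, and whether the unaccented spelling is accepted as well; either choice adds rules that English keywords never needed.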

A note about IDEs

IDEs are localized because:

They contain a lot of terms, sometimes complicated ones.

By comparison, most languages contain only a few dozen keywords, which are often words learnt in school.
They contain text which is intended to be read.

By comparison, keywords are... keywords. For example, the keyword var in C# doesn't have any meaning in English. It's an abbreviation of variable, but still not an English word. Knowing that var is an abbreviation of variable won't help me understand the concept of implicitly typed variables. It could have been implicit or imptype or unknown.
The usage of a language in an IDE by a colleague doesn't affect other developers. One can have Eclipse in Italian, while his colleagues are happily working in Portuguese.

By comparison, using a specific language in source code means everybody should understand it, because version control won't magically translate comments or keywords written in Russian by a Russian developer to Japanese for his colleagues in Japan.
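
The claim that keywords are few can be checked for at least one language. The sketch below uses Python's standard `keyword` module; the exact count depends on the Python version, and the short list of "not really English" keywords is my own selection for illustration.

```python
import keyword

# Reserved words of the Python version running this script.
words = keyword.kwlist
print(len(words))
print(", ".join(words))

# Several keywords are abbreviations or coinages rather than dictionary words.
not_plain_english = [w for w in ("def", "elif", "nonlocal", "del") if w in words]
print(not_plain_english)
```

The whole vocabulary fits on one line of output, which is a very different scale from the menus, dialogs and help text that an IDE localizes.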

So why English (5.52% of world's population being native speakers), and not, for example, Mandarin, (14.1% of world's population being native speakers), or Spanish (5.85% of world's population being native speakers)?

For historical reasons, English is used internationally and learnt in schools in most countries. It's even more predominant on the web, and more so still among programmers. One can either stick with that and use English for communication with peers, or try to change the world so that Spanish, Romanian or Kurdish becomes the next language every programmer agrees to use.

Your argument supports using a single language for keywords and comments, not exclusively using English. In fact, it's a pretty good argument to utilize a translated language, since in some cases English obviously isn't used. — DougM, Jan 01 '14 at 22:10
I'd remove `implicit` as an example replacement for `var` in C#, since `implicit` is also a C# keyword which means something completely different. — Darrel Hoffman, Jan 02 '14 at 03:19

score 11 · Accepted Answer · edited Nov 05 '22 at 09:12

In layman's words:

They do.

In this Wikipedia article you can see there are lots of programming languages, compilers and interpreters based on languages other than English.

Here is some sample source code in Linotte, a French-based programming language:

 nombre Fibonacci : 
  a est un nombre
  début
   questionne a sur "Entrez un nombre :" 
   affiche fibo(a)
 
 fibo :    
  * n est un nombre     
  début
   si n est < 2, retourne n          
   retourne fibo(n-1) + fibo(n-2)

... and a hello world program in SAKO, a Polish-based one:

K) PROGRAM DRUKUJE NAPIS HELLO WORLD
   LINIA
   TEKST:
   HELLO WORLD
   KONIEC

The question would be why most languages, compilers and interpreters use English reserved words.

Compilers and languages are more useful the more people in the world have access to them, and software projects often require effort from people in different countries. For historical reasons English has become the predominant language of communication in the modern world, so it is only logical that most programming languages use English keywords. Had the course of history been different, perhaps programming languages would be mostly Latin-based or French-based.

English has another advantage: it doesn't use accents and diacritical marks such as áéíúóâêîôûäëïöü, which makes it the lowest common denominator for keyboard layouts.

It's interesting to note that Niklaus Wirth, who designed Pascal among other languages, was Swiss, yet he chose English for the reserved words, as mentioned in this MSDN article.

Another curiosity:

While Perl's keywords and function names are generally in English, it allows modification of its parser to modify the input language, such as in Damian Conway's Lingua::Romana::Perligata module, which allows programs to be written in Latin or his Lingua::tlhInganHol::yIghun Perl language in Klingon. They do not just change the keywords but also the grammar to match the language.

score 2 · Answer 4 · answered Jan 01 '14 at 22:49

It's an error to state that "[in] programming languages all the syntax is written in English" or "Simple code (...) is written in English". Programming languages and natural languages are different things. Programming languages tend to borrow some words, mostly from English (the most prominent exception is probably null, from German), in the same way that English borrowed the word "program" from French.

Other than a few words, virtually nothing is borrowed from English by programming languages: not syntax, tenses, conjugation, declension, etc.

One may ask why English is the language that programming languages borrow their vocabulary from. The answer is geopolitical rather than technical in nature.

Having said that, I know several people who do not speak English and are proficient at coding in several programming languages.

"The contract was declared null and void". But then take "char" with the primary meaning according to my dictionary "noun (mass noun) material that has been charred, example: She trimmed the char from the wicks of the oil lamps." — gnasher729, Aug 01 '15 at 00:09
