What is the difference between syntax and grammar?

Question

I understand the difference between syntax and semantics -

Syntax: how the symbols are combined to form a valid expression or statement.
Semantics: the meaning of those symbols that form an expression or statement.

But what is the grammar? For example: sometimes I hear people say that some construct is "grammatically incorrect but syntactically it is correct". What does it mean?

FWIW, this sounds like nonsense to me. If the language's grammar accepts the piece of code, it conforms to the syntax. Perhaps someone has a very broad (and nonstandard) definition of "syntax". Context/source? — , Oct 30 '11 at 20:35
@delnan. Not true. For example `int;` is grammatically valid, but syntactically ill-formed in C++. The grammar has no problem with this code, but syntax constraints require that a name is provided if the first part of a declaration contains no *class-specifier* or *enum-specifier* or, in C++11, *friend-specifier*. — Johannes Schaub - litb, Oct 30 '11 at 21:16
@JohannesSchaub-litb: Care to cite the part of the grammar that makes this valid? — , Oct 30 '11 at 21:32
@Johanes That's the reverse of the situation in the question. — Nicole, Oct 30 '11 at 22:11
@ Johannes Schaub: What rule makes "int;" valid? The grammar defines the syntax. — Casey Patton, Oct 30 '11 at 22:18
When I learned my native language in school, I was taught that grammar includes morphemics (including morphology) and syntax. Morphemics may be compared with what lexing deals with in programming languages, i.e., the formation of words from phonemes and morphemes, the classification of words by part of speech, and the definition of their properties. Syntax deals with how words are joined together to make sentences. If we extend this understanding of grammar to programming languages, then a grammar includes the alphabet, the lexing rules, and the syntax. So, syntax is part of grammar. — ach, Nov 17 '16 at 10:05

Jerry Coffin · Answer 1 · 2022-09-08T16:00:52.923

8

Syntax just refers to how you express ideas, independent of the underlying ideas themselves.

A grammar is a set of (mostly syntactical) rules about how you form valid statements in a particular language, and the "type" of a statement, based on which rules were used to form that statement.

For example, C++ and Java have similar syntax in many respects, but completely separate grammars. Somebody who can read code in one can probably read code in the other with only minimal difficulty. But being able to read code reasonably well doesn't mean they can write it nearly as well. For example, a C++ programmer might try to use a typedef, not realizing that Java has neither typedef nor anything even roughly equivalent (so in that respect, their syntax is completely different).

edited Sep 08 '22 at 16:00

answered Oct 30 '11 at 21:02

Jerry Coffin


+1 but you never actually clearly say which of those is the syntax and which of those is the grammar. Is there a standard convention? In contexts where those hairs *are* split, do people normally say "syntax" to refer to "that needs to parse as an unsigned integer"? Do they normally say "grammar" to refer to "that parsed unsigned integer needs to match the number of elements parsed in the following assignment list"? Or what? – mtraceur Sep 08 '22 at 00:13
I'd love to say it could be inferred from your answer, which is "syntax" and which is "grammar", but there seems to be a contradiction in the sentences "those syntactical rules that are encoded using the generator's rules, vs. those parts that are enforced separately by code attached to a rule" and "The grammar rule might basically say something like [...] and then separately, there would be a bit of code to check that the unsigned_int was non-zero." – mtraceur Sep 08 '22 at 00:19
@mtraceur: You're right--it didn't really answer the question. I've edited (more accurately, completely replaced) the answer. – Jerry Coffin Sep 08 '22 at 16:02

score 4 · Answer 2 · answered Oct 30 '11 at 21:53

The difference is fuzzy and not worth worrying about too much.

People will sometimes include context-sensitive constraints under the umbrella of syntactic correctness. The most common example is a type system. Another is Java's "no statements after return" rule. This simplifies formal discussion: the syntax yields a language (a set of sentences/expressions/programs) which is the domain of the semantics; anything else is "not a program", and the semantics need not bother with it.

In contrast, "grammar" typically refers to a method of describing context-free languages (attribute grammars notwithstanding).
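As a rough illustration of that split, here is a minimal Python sketch. The toy language (statements of the form `let NAME;` or `use NAME;`) and the names `parse` and `check_declared` are invented for this example and are not taken from any real language. The first function checks only the shape of each statement; the second enforces a context-sensitive constraint (a name must be declared before use) that some people would still file under "syntax".

```python
import re

# Hypothetical toy language: each statement is "let NAME;" or "use NAME;".
STATEMENT = re.compile(r"\s*(let|use)\s+([A-Za-z_]\w*)\s*;")


def parse(source):
    """Return (keyword, name) pairs, or raise ValueError if the text
    does not match the grammar."""
    statements = []
    pos = 0
    while pos < len(source):
        # Trailing whitespace after the last statement is allowed.
        if source[pos:].strip() == "":
            break
        match = STATEMENT.match(source, pos)
        if match is None:
            raise ValueError(f"grammar error at offset {pos}")
        statements.append((match.group(1), match.group(2)))
        pos = match.end()
    return statements


def check_declared(statements):
    """Context-sensitive constraint: every used name was declared earlier."""
    declared = set()
    for keyword, name in statements:
        if keyword == "let":
            declared.add(name)
        elif name not in declared:
            return False
    return True
```

Under these assumptions, `use y;` is accepted by `parse` but rejected by `check_declared`, which is one way a program can pass the grammar and still fall outside the broader notion of "syntactically correct".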

The reason it's not worth worrying about much is that type systems are as often considered the "static semantics" of a language as they are a "syntactic discipline for correctness". And sometimes a language doesn't quite have a proper context-free grammar; C, for example, must feed information from the parser back into the lexer.

Pragmatically, anyone who relies on a distinction between "syntactic" and "grammatical" had better say so and explain what they mean.

I don't understand why the difference is fuzzy. The grammar describes the syntax. — Casey Patton, Oct 31 '11 at 00:59
@Casey, no, according to one usage of the word "syntax", the grammar specifies a *superset* of the syntax. — Ryan Culpepper, Oct 31 '11 at 01:25

score 1 · Answer 3 · answered Sep 08 '22 at 16:14

A grammar is a set of formulas that describe a syntax. There are a number of ways one can do this, ranging from regular expressions (for extremely simple languages), to context-free grammars, to far less common options.

Similarly, there are a number of methods you can use to describe semantics; denotational, operational, and axiomatic semantics are classes of such methods.
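To make the range concrete, here is a small Python sketch under assumed examples that are not from the answer itself: unsigned integers as a language a regular expression can describe, and balanced parentheses as a language that needs a context-free grammar. The function names are hypothetical.

```python
import re

# A regular expression is enough for a very simple language:
# one or more decimal digits.
UNSIGNED_INT = re.compile(r"[0-9]+")


def is_unsigned_int(text):
    return UNSIGNED_INT.fullmatch(text) is not None


def is_balanced(text):
    """Recognize the context-free grammar  S -> "(" S ")" S | empty."""

    def parse_s(pos):
        # Returns the position just past one S starting at pos,
        # or None if the first production was chosen and then failed.
        if pos < len(text) and text[pos] == "(":
            after_inner = parse_s(pos + 1)
            if (
                after_inner is None
                or after_inner >= len(text)
                or text[after_inner] != ")"
            ):
                return None
            return parse_s(after_inner + 1)
        return pos  # S -> empty

    return parse_s(0) == len(text)
```

Both functions answer the same kind of question (is this string in the language?), but the second relies on recursion to track nesting depth, which a classical regular expression cannot express.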

Casey Patton · Answer 4 · 2011-10-30T20:49:55.527

A grammar is a set of rules that defines a language. More precisely, the grammar describes the syntax and the semantics, so a language might have two different grammars:

Syntax grammar (a set of rules that describes the ordering of symbols in the language)
Semantics grammar (a set of rules describing the valid semantic placement and use of those symbols)

For example, a part of the grammar in C might look something like:

if_statement -> if_keyword "(" expression ")" if_block
if_keyword -> "if"
logical_statement -> some other stuff here...

Meaning:

an if statement is made of an if keyword, followed by an opening parenthesis, followed by an expression, followed by a closing parenthesis, followed by an if block
an if keyword is ....
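The rules above can be turned into code fairly directly. The following Python sketch writes one function per rule over a list of token strings. The answer leaves `expression` and `if_block` undefined, so the versions here are stand-in assumptions (a single identifier, and an empty `{` `}` pair); all function names are hypothetical.

```python
def literal(text):
    """Build a rule that matches exactly one token equal to text."""
    def rule(tokens, pos):
        if pos < len(tokens) and tokens[pos] == text:
            return pos + 1
        return None
    return rule


# if_keyword -> "if"
if_keyword = literal("if")


def expression(tokens, pos):
    # Assumption: an expression is a single identifier token.
    if pos < len(tokens) and tokens[pos].isidentifier() and tokens[pos] != "if":
        return pos + 1
    return None


def if_block(tokens, pos):
    # Assumption: an if block is the two tokens "{" "}".
    pos = literal("{")(tokens, pos)
    if pos is None:
        return None
    return literal("}")(tokens, pos)


def if_statement(tokens, pos=0):
    # if_statement -> if_keyword "(" expression ")" if_block
    for rule in (if_keyword, literal("("), expression, literal(")"), if_block):
        pos = rule(tokens, pos)
        if pos is None:
            return None
    return pos


def is_if_statement(tokens):
    """True if the whole token list is exactly one if statement."""
    return if_statement(tokens) == len(tokens)
```

For instance, `is_if_statement(["if", "(", "x", ")", "{", "}"])` follows every rule in order and succeeds, while dropping the opening parenthesis makes the second rule fail.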

Take a look at this way of defining a grammar. If you're really curious about grammars, take a look at GNU Bison, which is basically a tool for describing the grammar of a language.

The phrase "grammatically incorrect but syntactically correct" doesn't make much sense. Maybe they're referring to a grammar that describes the semantics of a language. It would certainly make more sense to just say "not semantically correct", though.

No, grammar does not define semantics and should never do it, unless it is something exotic, like http://www.contextfreeart.org/ — SK-logic, Oct 31 '11 at 12:38

score 0 · Answer 5 · answered Sep 08 '22 at 17:07

When you are comparing programming languages to human languages (which the OP is implicitly doing), the best way to think about it is this:

Syntax is the set of valid words (and correct spelling of those words) that make up a language, and how to identify proper nouns (i.e. names)
Grammar is how those words should be structured to make sense

In computer science specifically, source code is tokenized. For example, there may be a single numerical value that represents a keyword, and a separate one for an open parenthesis. Unlike human languages, programming languages place no restriction on what can be considered a token.

The string of tokens then has to conform to certain patterns in order to generate the machine code that the CPU understands. For example, the grammar of a language dictates whether object.verb() is meaningful in the language, and what should happen.

When the lexer cannot generate a valid token for some part of the source code, it will report a syntax error. If done well, the error will also include enough information to find the offending code (for example, a line number).

When you have valid syntax, the parser will attempt to make sense of those tokens so that machine code can be generated. If there is an error at this stage, you will commonly get a generic compilation error (hopefully with a line number).
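The two stages described above can be sketched in Python as follows. This is a minimal sketch, not how any particular compiler works: the token names, the `LexError` and `ParseError` classes, and the single accepted pattern (`NAME . NAME ( )`, modeled on the object.verb() example) are all assumptions made for illustration.

```python
import re

# Hypothetical token set: names, a dot, and parentheses.
TOKEN = re.compile(
    r"(?P<NAME>[A-Za-z_]\w*)|(?P<DOT>\.)|(?P<LPAREN>\()|(?P<RPAREN>\))"
)


class LexError(Exception):
    """No valid token can be formed at some position."""


class ParseError(Exception):
    """The tokens are valid but do not fit the expected pattern."""


def tokenize(source):
    tokens = []
    pos = 0
    while pos < len(source):
        if source[pos].isspace():
            pos += 1
            continue
        match = TOKEN.match(source, pos)
        if match is None:
            raise LexError(f"unexpected character {source[pos]!r} at offset {pos}")
        tokens.append(match.lastgroup)  # name of the alternative that matched
        pos = match.end()
    return tokens


def check_structure(tokens):
    # Assumed pattern for a method call: NAME "." NAME "(" ")"
    expected = ["NAME", "DOT", "NAME", "LPAREN", "RPAREN"]
    if tokens != expected:
        raise ParseError(f"expected {expected}, got {tokens}")
    return True
```

With these definitions, `object$verb()` fails in `tokenize` because `$` forms no token, while `object.verb(` tokenizes cleanly and only fails in `check_structure`, which mirrors the two kinds of error the answer distinguishes.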
