24

I was writing Markdown inside the XML comments of a config file when the XmlParser reported that two hyphens (--) are not allowed in XML comments.

According to the XML specification, an XML comment may not contain two consecutive hyphens, for compatibility with SGML parsers.
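For anyone who wants to reproduce the error, here is a minimal sketch using Python's standard `xml.etree.ElementTree` rather than the XmlParser mentioned above. The helper name `is_well_formed` and the sample `<config>` documents are made up for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts xml_text, False if it raises ParseError."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# A single hyphen inside a comment is fine.
ok = is_well_formed("<config><!-- foo - bar --></config>")

# Two consecutive hyphens inside a comment make the document ill-formed.
bad = is_well_formed("<config><!-- foo -- bar --></config>")

print(ok, bad)
```

The second document is rejected purely because of the `--` in the middle of the comment; nothing else differs between the two inputs.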

Why do SGML parsers disallow double hyphens in comments?

OnesimusUnbound

2 Answers

40

[This page](http://www.howtocreate.co.uk/SGMLComments.html) outlines quite a bit of the HTML/SGML history, and the rather convoluted rules governing two consecutive hyphens (the double dash).

The relevant part about SGML:

To put it simply, the double dash at the start and end of the comment do not start and end the comment. Double dash indicates a change in what the comment is allowed to contain. The first -- starts the comment, and tells the browser that the comment is allowed to contain > characters without ending the comment. The second -- does not end the comment. It tells the browser that if it encounters a > character, it must then end the comment. If another -- is added, then it goes back to allowing the > characters.
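The toggling behaviour described in that quote can be sketched in a few lines of Python. This is a simplified illustration under my own assumptions, not a real SGML parser: the function name `sgml_declaration_end` is hypothetical, and the sketch ignores the SGML restriction on what may appear between comments inside a declaration.

```python
def sgml_declaration_end(text, start=0):
    """Return the index just past the '>' that closes the '<!' declaration at start.

    Every '--' toggles comment mode. A '>' only closes the declaration
    when comment mode is off.
    """
    if not text.startswith("<!", start):
        raise ValueError("not a markup declaration")
    i = start + 2
    in_comment = False
    while i < len(text):
        if text.startswith("--", i):
            in_comment = not in_comment  # each double dash flips the state
            i += 2
        elif text[i] == ">" and not in_comment:
            return i + 1  # '>' outside a comment ends the declaration
        else:
            i += 1
    raise ValueError("unterminated declaration")


simple = "<!-- hello --> tail"
tricky = "<!-- a -- b --> still comment -->"

# The simple comment ends where you would expect, right after the first '>'.
print(simple[:sgml_declaration_end(simple)])

# With an extra '--' in the middle, the first '>' is inside a comment,
# so the declaration runs on until the final '-->'.
print(tricky[:sgml_declaration_end(tricky)])
```

In the second string the stray `--` flips the state an odd number of times before the first `>`, so text that looks like it sits outside the comment is still swallowed by it. Forbidding `--` inside XML comments removes this ambiguity.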

hakre
Joris Timmermans
  • 7
    [The section you're referring to](http://www.howtocreate.co.uk/SGMLComments.html#doubledash). When I read what the SGML specs intended for `--` within the comment, my head spins around on the complexity it will introduce later on. – OnesimusUnbound May 17 '13 at 14:22
  • 2
    The advice to never use `--` inside a comment seems good to me. But, is there a standard way of escaping it? Suppose I want to create (and share) an output filter to ensure `foo -- bar` never causes a problem. Is there an SGML equivalent of `foo -\- bar`? (I'm sure it's not backslash though!) Or `-` (see [this answer](http://stackoverflow.com/a/2233736/473249)), or something else? If we just replace `--` with `-` or `- -`, the escaping is not reversible. – fazy Oct 09 '14 at 16:59
  • @fazy Sorry, but it is a *comment*, something that is there just to explain what follows to a human reader, with a high chance that no user of that document ever reads it. I really do not understand the point of insisting on double dashes as part of a comment. – Timothy Truckle Jul 01 '22 at 15:13
16

Because a double hyphen is the comment delimiter in SGML. The <! starts an SGML declaration, and each -- marks the start or end of a comment inside it. So basically it is for the same reason that a C-style block comment (/* ... */) cannot contain */.

Daniel Griscom
Jörg W Mittag