84

I'm sure lots of developers are familiar with XML and JSON and have used both, so there's no point in explaining what they are or what they're for, even briefly.

If we try to map their concepts, we can say (correct me if I'm wrong):

  1. XML tags are equivalent to JSON {}
  2. XML attributes are equivalent to JSON properties
  3. XML tag collection is equivalent to JSON []

The only thing I can think of, which doesn't exist in JSON, is XML Namespaces.
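As a rough illustration of the mapping above, here is a minimal Python sketch using only the standard library. The helper name `element_to_dict` and the `#text` key are assumptions made for this example, not part of any standard. It ignores namespaces and mixed content, and it would break if an attribute and a child tag shared a name.

```python
import json
import xml.etree.ElementTree as ET

def element_to_dict(elem):
    """Map one XML element to a JSON-style dict (hypothetical helper)."""
    node = dict(elem.attrib)            # XML attributes -> JSON properties
    for child in elem:                  # repeated child tags -> JSON arrays
        node.setdefault(child.tag, []).append(element_to_dict(child))
    text = (elem.text or "").strip()
    if text:
        node["#text"] = text            # "#text" is an assumed key name
    return node

xml_source = '<order id="7"><item sku="a1"/><item sku="b2"/></order>'
root = ET.fromstring(xml_source)
as_json = json.dumps({root.tag: element_to_dict(root)})
print(as_json)
```

The tag becomes an object, the attributes become properties, and the repeated `item` tags become an array, which is about as far as this naive mapping goes.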

The question is: given this mapping, and given that JSON is much lighter, can we see a world in the future (or at least imagine one) without XML, with JSON doing everything XML does? Can we use JSON everywhere XML is used?

PS: Please note that I've seen this question. It's something entirely different from what I'm asking here, so please don't flag this as a duplicate.

Saeed Neamati
  • 14
    We can (and should) replace all of that overbloated, ill-designed stuff with S-expressions, obviously. A world without XML would be a much better place indeed, but that is, unfortunately, nothing but wishful thinking. – SK-logic Sep 16 '11 at 11:35
  • 19
    Ugh. I loathe these questions. I think this is really a case for using the right tool for the job, and not whether one can replace another entirely. There are so few absolutes in the world, even with computers. I couldn't imagine doing any of the things I do with JSON, at least where the respective technologies stand now. – Philip Regan Sep 16 '11 at 12:17
  • +1 This question, combined with Michael's answer and Philip's comment, perfectly exposed a little more of my ignorance. – Phil Sep 16 '11 at 13:49
  • 2
    @Philip, this is not a question for demolishing something. It just tries to see what JSON lacks, so that we can improve it. :) – Saeed Neamati Sep 16 '11 at 14:13
  • 4
    A discussion about the differences between two technologies to see where improvements can be made is very different than asking whether one can be replaced with the other. The former is more scholarly review than the latter which sounds more antagonistic from frustration than anything – Philip Regan Sep 16 '11 at 14:42
  • 2
    Purely hypothetical thought experiments are [not on-topic here](http://programmers.stackexchange.com/faq#dontask): this isn't a discussion board. If you have a specific problem you're actually facing that's prompted you to speculate about doing this, feel free to ask about that, instead. –  Sep 16 '11 at 15:51
  • 2
    @Mark, I appreciate your comment. I didn't mean that this is a **purely hypothetical thought experiment**. Rather, this is my question (or was, since Michael's answer gives a solid reason), and it's valid enough to follow _six_ guidelines. In lots of scenarios, a developer has to decide what to use for data transfer. I've even thought of replacing DTOs with JSON between the layers of a multi-layered application. :) – Saeed Neamati Sep 16 '11 at 16:09
  • 12
    This isn't hypothetical. JSON seems to lack a feature that XML possesses. – S.Lott Sep 16 '11 at 16:46
  • @Saeed If it's actually something you're facing right now, can you revise your question to talk about that instead of asking a hypothetical about it? –  Sep 16 '11 at 20:04
  • 2
    How do you validate that a JSON structure follows a defined template, especially in terms of valid datatypes and ranges? – Paul Tomblin Sep 18 '11 at 22:25
  • 1
    @Paul Tomblin: By using a JSON-schema? – maaartinus Sep 20 '11 at 12:51

12 Answers

163

The thing that gives XML its power and a lot of its complexity is mixed content. Stuff like this:

<p>A <b>fine</b> mess we're in!</p>

Don't even try to do that in JSON, or manipulate it in conventional programming languages. They weren't designed for the job.

This kind of question usually comes from people who forget that the M in XML stands for markup. It's a way of taking plain text and adding markup to create structured text. It's quite handy for old-fashioned data too, but that's not what it was designed for or where its main strengths lie. There are plenty of ways of handling simple data, and JSON is one of them.
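To make the mixed-content point concrete, here is a small Python sketch using the standard `xml.etree.ElementTree` module. It shows how the text around the `<b>` element is split into a `text` part and a `tail` part, and how the paragraph could be flattened into a JsonML-style array. The `to_jsonml` helper is an assumption written for this example (it ignores attributes), not a library function.

```python
import json
import xml.etree.ElementTree as ET

def to_jsonml(elem):
    """Convert an element to a JsonML-like list: [tag, child, child, ...].

    Attributes are ignored in this sketch.
    """
    node = [elem.tag]
    if elem.text:
        node.append(elem.text)          # text before the first child element
    for child in elem:
        node.append(to_jsonml(child))
        if child.tail:
            node.append(child.tail)     # text that follows the child element
    return node

p = ET.fromstring("<p>A <b>fine</b> mess we're in!</p>")
print(repr(p.text), repr(p[0].text), repr(p[0].tail))
print(json.dumps(to_jsonml(p)))
```

The order of text and elements survives only because everything is pushed into one positional array. A plain object keyed by tag name would lose it.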

Michael Kay
  • 34
    +1: This is the distinguishing feature. Excellent point. – S.Lott Sep 16 '11 at 12:42
  • 7
    @Michael, you just taught me something valuable. This is a great answer. +1. – Saeed Neamati Sep 16 '11 at 14:15
  • 9
    .... There are 3 nodes inside of P: `A `, the B element, and ` mess we're in!`. It's an array, which you can simply express in JSON. – Incognito Sep 16 '11 at 18:38
  • 2
    @Incognito would you really like to format text on forums with JSON? Really? – Rob Sep 19 '11 at 02:29
  • 5
    @Rob No, but I'm explaining that you could define things expressed by HTML in greater clarity, and perhaps faster parsing via JSON (as less parsing of the text is required to find the different types of nodes). If HTML were JSON-ML, we might have more devs that actually understand the DOM, text nodes, and bindings. – Incognito Sep 19 '11 at 03:29
  • The example above is not XML, or at least not *valid* XML. It's HTML. I don't think the questioner is wondering whether JSON should replace HTML, but rather RSS or a true XML document. – Byrne Reese Sep 19 '11 at 05:27
  • 3
    @Rob: no, but I wouldn't really like to format text in XML either. But I would like a GUI editor that produced an underlying format that was less bloated than XML for transfer and parsing. – gbjbaanb Sep 20 '11 at 12:53
  • 1
    @gbjbaanb Less bloated than `<b>This is bold</b> <i>This is italic</i>`? – Rob Sep 20 '11 at 23:09
  • 5
    @ByrneReese: yes it's XML, and yes it's valid. That it’s also HTML is beside the point; in fact, XHTML is also valid XML. :-) – Martijn Sep 22 '11 at 17:19
  • 5
    There's a reason people forget what that M stands for. It's not often used for markup in practice. Still a good point about XML's advantage for that purpose. – PeterAllenWebb Sep 22 '11 at 20:26
  • One of the major differences I have ever read about! And it proves that XML has a plus point in readability! – Rookie Programmer Aravind Nov 30 '12 at 16:54
  • 2
    `{ "p": { "#text": [ "A ", " mess we're in!" ],"b": "fine" }}` – Martin Wickman Jun 25 '13 at 09:02
  • 3
    @MartinWickman: I don't understand your JSON... and I don't see how to find out that "fine" should be in the middle. What about `{"p": ["A ", {"b": "fine"}, " mess we're in!"]}`? – maaartinus Apr 15 '14 at 23:05
  • @maaartinus: There's a problem with these representations, what if there are several `p` tags? Because an object is actually a hash-map, you can't put two `p` properties in the same object. Therefore you have to use something equivalent to cwallenpoole's answer below (using a property to indicate tag name), which is much more verbose. – xzhu Nov 27 '14 at 00:18
  • 1
    @maaartinus: Or use the JsonML format: `["p", "A ", ["b", "fine"], " mess we're in!"]` which is essentially using JSON to mimic SXML. – xzhu Nov 27 '14 at 01:19
  • 3
    @trVoldemort No, several p-tags like `<p>par1</p><p>par2</p>` would look like `[{"p": "par1"}, {"p": "par2"}]`. Actually, I'm only using a hash key to distinguish the tag from the body. Mimicking SXML would be probably clearer. *While I might agree that XML may be better than JSON for markup, it's a real mess for anything else.* Especially configuration and object serialization. – maaartinus Nov 27 '14 at 07:35
32

The main difference, I think, is that XML is designed to be self-explaining, with its DTDs and everything.

With JSON, you have to assume a lot about the data you are receiving.

Maarten van Leunen
  • JSON has schemas coming, but just as with XML it would be foolish to have mixed content in a tagged structure without writing one to check against. – Philip Regan Sep 16 '11 at 12:20
  • 8
    "XML is designed to be self-explaining". Can you provide a link or a reference for this? I don't see it in the W3C standards for XML, and I'm wondering where this notion comes from. It seems like an urban legend more than a stated design goal. – S.Lott Sep 16 '11 at 12:43
  • 6
    @S.Lott: I think what he means by that is the nature of XML tags, in and of themselves, allows tagged content to be self-explanatory, i.e., DTDs are optional so well-formed XML can be parsed without one. But I agree with your take on the issue because, technically, JSON has the same capability, so I don't see self-explanation being the main difference at all (I'm not sure why this keeps getting voted up), but rather Michael Kay is more on the mark. – Philip Regan Sep 16 '11 at 12:51
  • 1
    @Philip Regan: "tagged content to be self-explanatory". It clearly isn't. `tells me nothing`. All it does is help parsing. It provides no semantic information of any kind. – S.Lott Sep 16 '11 at 12:53
  • @S.Lott: it does if you use meaningful tag names. – Michael Borgwardt Sep 16 '11 at 12:57
  • 5
    @S.Lott agreed. I'd have to say the JSON here http://json.org/example.html is easier to understand and better self documented than the associated XML due to its lack of verbosity. – Doug T. Sep 16 '11 at 13:00
  • 4
    @Michael Borgwardt: Without a full XSD (including some kind of ontology support) tag names tell me nothing. "meaningful" is hard to accomplish in general. That leaves me unclear on what "self-explaining" is supposed to mean in the answer. And I don't have evidence that it was even a design goal for XML. – S.Lott Sep 16 '11 at 13:11
  • 1
    @S.Lott: If you take the spec by itself, then no, you are not going to find support for self-explanation as a design goal. But as programmers, then I think we have a responsibility by default to ensure that it is self-explanatory. I think the same could be said of JSON or any other tagging language. – Philip Regan Sep 16 '11 at 14:00
  • 4
    @Philip Regan: As with "self-explaining code", it appears not to be a **feature** of XML. If it's just a universal implementation objective that applies to all software languages (code, data access, markup, whatever) then perhaps folks shouldn't mention it around XML specifically. – S.Lott Sep 16 '11 at 14:10
  • 1
    @S.Lott: I think it is unfair to judge XML on the face value of the spec like you are. I agree that the answer given here is incorrect because it states self-explanation as fact; self-explanation is not an explicit feature of XML within the spec. At the same time, I do think that the *spirit* of XML is to allow data to be self-explanatory, and that is where the responsibility of developers comes in. XML provides the tools, but it is up to the developer to answer the particular question of self-explanation when there is no DTD. And that would apply to pretty much everything we do. – Philip Regan Sep 16 '11 at 14:57
  • +1 because I agree with the idea being expressed. I understand self descriptiveness to mean the document contains an unambiguous declaration of *how it is to be interpreted*. XML, with Doctypes/schemas/namespaces has, in a standard way, features that permit self descriptiveness. JSON *could* contain similar ideas; with external references in the data; but it's unclear that this should be understood as self descriptiveness. In XML, there's a sharp divide between the data (tags and text) and the metadata (doctypes) but json doesn't make that distinction. – SingleNegationElimination Sep 16 '11 at 15:10
  • 2
    I'm not indicting XML. I'm asking for clarification of the claim that XML is "self-explaining". "the spirit of XML" is the same as the spirit of all software. Why is only XML characterized this way with no factual support? "has...features that permit self descriptiveness" True. No debate. So does everything else. Why does XML, however, always get that tag? Where's the link or reference or quote? – S.Lott Sep 16 '11 at 15:34
  • @S.Lott: Again, I think you are being too literal here. It would seem to me that the mere *act* of tagging content *can*—has the potential—to make it self-explanatory. `random datum` is still better than just `random datum`. Whether the developer chooses to use meaningful tags isn't XML's problem to solve. – Philip Regan Sep 16 '11 at 16:42
  • @Philip Regan: I see you're just repeating your position that all languages (including XML) can be self-explanatory. That's possibly a good thing to repeat. However, it doesn't explain why folks seem to put "self-explaining" with XML more than any other language. Scheme is every bit as self-explaining as XML. Yet, somehow, XML gets all the press and Scheme is ignored. – S.Lott Sep 16 '11 at 16:45
  • @S.Lott: I see your question now. I think because it is a cultural issue, anyone would be hard-pressed to find real evidence of how XML gets the "self-explanatory" moniker. Popularity, perhaps? Marketing? I don't know much about Scheme, so I can't make a specific comparison. – Philip Regan Sep 16 '11 at 16:54
  • @Philip Regan : I asked ""XML is designed to be self-explaining". Can you provide a link or a reference for this?" So you have no explanation, either? Is that the point? You agree but have no evidence? No link or anything? That's all I'm looking for is evidence for the constant claim that XML is self-explaining. Not bashing XML. Not disagreeing with any of your points. Just looking for a link or a reference or a quote. – S.Lott Sep 16 '11 at 17:05
  • @S.Lott: I understand where you are coming from. Yes, I agree with the notion that XML can be self-explanatory, in and of itself. I am neither agreeing nor disagreeing with the notion that XML deserves that title over other technologies because I do not have enough experience to properly speak to that point. I have a suspicion that you will not find a point of evidence that says XML is *most* deserving of that title because the actual statement of the concept has been lost in the mist of programming culture, that simply the goal of XML—tagging content—is what drives that notion. – Philip Regan Sep 16 '11 at 17:48
22

A literal translation to JSON is often less succinct and less clear. Consider:

<foo>
   <x:bar x:prop1="g">
      <quuz />
   </x:bar>
</foo>

The most effective JSON representation I have seen of this:

{"localName":"foo",
 "children": // you need to have a special array to hold all children
 [
    {"localName": "bar",
     "namespace": "x",
        // once again, to ensure that there are no collisions,
        // attributes should be brought out into their own JSON structure
        "attributes":[
            {"localName":"prop1",
             "namespace":"x",
             "value":"g"}
        ],
         "children":[
             {"localName":"quuz"}
         ]
     }
 ]}

Now, imagine that for an entire XML file. I am not saying that JSON does not have its place, but XML should not be ruled out.

cwallenpoole
  • 8
    Now consider SXML: `(foo (x:bar (@ (x:prop1 "g")) (quuz)))` – SK-logic Sep 16 '11 at 11:48
  • 2
    @SK-Logic: That's great for a trivial example, but I couldn't imagine doing deeply nested, mixed content—like a book—with that. I think SXML is as much an academic exercise as anything. – Philip Regan Sep 16 '11 at 12:22
  • 3
    @Philip Regan: How can writing an S-exp be any harder than using chevronitis, when it's a trivial 1:1 transformation into a less verbose form? – maaartinus Sep 16 '11 at 12:48
  • @maartinus: My field of expertise is in book publishing: textbooks of any kind are deep, complex beasts with a wide array of content that requires explicit management. DocBook and DITA are much more readable than the example given above. – Philip Regan Sep 16 '11 at 12:54
  • 1
    @Philip Regan, SXML is very easy to edit, quite unlike XML. And of course it is a much better choice for protocols, needless to mention the superiority of the available tooling. – SK-logic Sep 16 '11 at 12:56
  • @SK-logic are you seriously suggesting a lack of good tooling for xml? – Andy Mar 28 '15 at 16:56
  • @Andy of course I do. Just compare the functionality and flexibility of a full-blown Scheme vs. any of the XML tools, especially something as awful as xslt. – SK-logic Mar 30 '15 at 08:55
  • @SK-logic On the other hand, plain XML is already used as an interchange format pretty much everywhere. When compared to a format which can leverage full-blown Python, PHP, ECMAScript, Java, and C#, SXML is pretty weak. (And, to be honest, the weakest standard XML library I've found has been in Common Lisp) – cwallenpoole Mar 30 '15 at 17:05
  • The whole {"localName":"prop1", "namespace":"x", "value":"g"} can be replaced with "x:prop1": "g" so your apparent complexity is just due to a clumsy translation. – gnasher729 Jan 14 '17 at 12:33
14

JSON and XML are both ways of formatting data. Both are capable of doing it perfectly well, so can JSON do everything XML does? Yes.

But..... A more relevant question might not be what XML/JSON can do, but rather, what can you do with XML/JSON.

There are several things you can do with XML that I don't think you can with JSON, such as transform with XSLT, search with XPath and validate with schemas. All very, very useful.
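As a small illustration of the XPath point, Python's standard `xml.etree.ElementTree` module supports a limited XPath subset, so a search can be written as a path expression. The catalog data below is invented for this sketch. The JSON half shows the same query written as an ordinary hand-coded traversal, since the standard `json` module has no query language of its own.

```python
import json
import xml.etree.ElementTree as ET

# Invented sample data for the sketch
catalog = ET.fromstring(
    "<catalog>"
    "<book lang='en'><title>XML Basics</title></book>"
    "<book lang='de'><title>JSON Grundlagen</title></book>"
    "</catalog>"
)

# Declarative search: titles of all books whose lang attribute is 'en'
xml_titles = [t.text for t in catalog.findall("./book[@lang='en']/title")]

# The same data as JSON, queried with a hand-written comprehension
catalog_json = json.loads(
    '{"books": [{"lang": "en", "title": "XML Basics"},'
    ' {"lang": "de", "title": "JSON Grundlagen"}]}'
)
json_titles = [b["title"] for b in catalog_json["books"] if b["lang"] == "en"]

print(xml_titles, json_titles)
```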

Qwerky
  • 5
    Except for mixed-content where the data contains tags. JSON doesn't do that very well at all. – S.Lott Sep 16 '11 at 12:44
11

There's a lot of functionality using XSLT that may not be possible with JSON. So, if they're not functionally equivalent they couldn't replace each other.

svick
StuperUser
  • 3
    That said, you could use another language to deserialise, manipulate and serialise JSON, and XSLT isn't XML, so this point is moot really. – StuperUser Sep 16 '11 at 10:52
  • 3
    XSLT ***is*** XML -- see the schema [here](http://www.w3.org/XML/2000/04schema-hacking/xslt.xsd) – treecoder Sep 16 '11 at 12:19
  • Thanks @greengit, I have only had brief exposure to it, updated the answer. – StuperUser Sep 16 '11 at 12:21
  • 2
    @StuperUser: How could it be *"impossible"* with JSON? It's just a transformation, maybe the tools are missing yet. Or is the problem related to the lack of attributes in JSON? – maaartinus Sep 20 '11 at 12:57
  • @maaartinus where are you getting "impossible" to quote? Transformation can be done with any language, but that's not JSON. XSLT is XML and the manipulation is done without needing another language. – StuperUser Sep 20 '11 at 13:03
  • 1
    @StuperUser: XSLT is a language (a subset of XML) for which interpreters have been written in at least one other language (probably in C, Java, ...). The same could be done for JSON (define some JSON-T, write the interpreter), couldn't it? – maaartinus Sep 20 '11 at 15:45
  • @maaartinus I don't see why not. – StuperUser Sep 20 '11 at 15:53
8

The fact is, we're going to have to live with both for a long time, and being a JSON bigot is "considered harmful."

Scott C Wilson
7

JSON is fairly new and legacy systems won't support it. Upgrading legacy systems is expensive and introduces bugs. JSON won't replace XML any time in the near future.

Tom Squires
  • 2
    Thanks for your reply. What I have in mind is a technical review rather than an implementation strategy. I just want to know, for example, whether new versions of those legacy systems could drop XML entirely and use JSON. If not, what do we miss in JSON? – Saeed Neamati Sep 16 '11 at 10:42
  • On the other hand, I haven't used any XML, just JSON, in the last few years. And good riddance. Of course XML is more enterprisy. Which is great for job security, not so great for efficiency. – gnasher729 Jan 14 '17 at 12:35
6

I'd say that cwallenpoole makes an excellent point. While most XML can be translated to JSON, whether doing so is better for it is a separate point.

JSON lends itself to data structures at least as well as XML and probably better, but XML reads much more naturally than JSON when marking up textual documents, where tags are used within a larger flow of text rather than simply as a way to delimit a hierarchy of fields.

While HTML 5 may have its own parser, that still leaves applications like DocBook.

ssokolow
6

It depends on the domain. In terms of web services? Absolutely. It's utterly shameful that vendors are still pushing SOAP on their customers. REST + JSON all the way.

Now, when you're talking about complex, structured data with style information, like DocBook or a similar format? That's a proper domain for XML.

Jason Lewis
4

Why limit yourself to JSON when YAML is a superset of it, and much more expressive and therefore more powerful than XML or JSON?

That said, if you use the correct serialization frameworks you should be able to serialize and de-serialize all the above mentioned formats with a couple of simple lines of code.

3

It gets ugly when you try to model these two objects in JSON:

<customer><name>John Doe</name></customer>
<employee><name>John Doe</name></employee>

Using JSON the way it is used in 99% of cases, one ends up with:

{ name: "John Doe" } 

And now you have to add some meta-structures and all the beauty of JSON is gone while you are left with the downsides.
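One possible form of such a meta-structure, sketched in Python, is an explicit discriminator field on each object. The key name `type` is an assumption chosen for this example. JSON itself does not prescribe one, which is the point the answer is making.

```python
import json

# "type" is an assumed discriminator key, not anything JSON defines
customer = {"type": "customer", "name": "John Doe"}
employee = {"type": "employee", "name": "John Doe"}

payload = json.dumps([customer, employee])

# The receiver has to know about the convention to tell the two apart
kinds = [record["type"] for record in json.loads(payload)]
print(kinds)
```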

  • 11
    the equivalent JSON to your provided XML is `{ customer: { name: 'John Doe' }, employee : { name: 'John Doe' } }`. So technically, your answer is not correct. :) – Saeed Neamati Sep 20 '11 at 03:57
  • Sure, the only thing lacking in JSON is attributes, and they are useless for modeling *objects* (unlike for markup). Sometimes attributes get used as a shortcut for what can be expressed using nested data (e.g., in Hibernate config files), which is handy but actually the existence of choice makes it harder. Config files and modeling objects are two places where JSON is clearly superior. – maaartinus Sep 20 '11 at 13:03
  • 2
    @SaeedNeamati, so how would you model `John DoeJohn Doe` in JSON? – svick Sep 22 '11 at 19:26
  • 6
    { customers : [ { name: 'John Doe' }, { name: 'John Doe' } ] } ? – scrwtp Sep 22 '11 at 19:59
  • @SaeedNeamati @svick, Do you think JSON can be adapted well for this sort of structure? `` – Matt R Nov 27 '12 at 13:41
  • @V3ss0n First: please avoid condescension -- even if someone doesn't know Javascript well, it's unkind to lambast them for their lack of knowledge. Second: Your proposed scheme doesn't actually solve the problem I posed -- please look again at my example, which contained *two* cars. – Matt R Apr 29 '13 at 19:54
  • @MattR `{ "person": { "vehicles": [ { "type": "car", "colour": "red" }, { "type": "car", "wheels": 3 }, { "type": "bicycle", "wheels": 2 } ] } }` you can copy-paste it to http://jsonlint.com/ for easier reading. – Stijn Jul 26 '13 at 14:28
  • 2
    @Stijn -- right, and that works, but it confirms the comment from the original answer, that "you have to add some meta-structures" to model certain things that fall out more naturally in XML. – Matt R Jul 30 '13 at 08:11
  • `John Doe John Doe` – TRiG Jan 04 '14 at 04:02
  • @MattR: Right, translating attributes into JSON makes it look bad. However, I'd say that the very existence of attributes is bad. Imagine you need some details on wheels. So you get `` which looks terrible itself and you have to support the now obsolete attribute `wheels`. And a possible inconsistency when both the attribute and details are present. – maaartinus Apr 15 '14 at 23:40
  • @maartinus -- I don't think attributes are the problem, rather, JSON doesn't have a direct way of denoting the type of an object, so the encoding can be more cumbersome for applications where you want that. – Matt R Jul 11 '14 at 13:17
3

I don't know if such a facility exists for JSON, but in .NET at least you can validate XML against a given schema. That's a valuable advantage of XML in my eyes.
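For JSON, the usual route would be JSON Schema together with a validator library. As a dependency-free stand-in, here is a hand-rolled Python sketch that checks datatypes and ranges against a template. The `TEMPLATE` layout and the `validate` helper are assumptions invented for this example, not JSON Schema. Note that Python treats booleans as integers, so a real validator would need to handle them separately.

```python
import json

# Hypothetical template: field name -> (expected type, minimum, maximum)
TEMPLATE = {
    "name": (str, None, None),
    "age": (int, 0, 150),
}

def validate(document, template):
    """Return a list of problems; an empty list means the document passed."""
    problems = []
    for field, (expected_type, low, high) in template.items():
        if field not in document:
            problems.append(f"missing field: {field}")
            continue
        value = document[field]
        if not isinstance(value, expected_type):
            problems.append(f"{field}: expected {expected_type.__name__}")
            continue
        if low is not None and value < low:
            problems.append(f"{field}: below {low}")
        if high is not None and value > high:
            problems.append(f"{field}: above {high}")
    return problems

print(validate(json.loads('{"name": "John Doe", "age": 42}'), TEMPLATE))
print(validate(json.loads('{"name": "John Doe", "age": -1}'), TEMPLATE))
```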

Grant Palin