Is serialization strategy part of an ABI?

Question

According to semver, the major version of a component must be incremented when an ABI-breaking change is incorporated. Wikipedia does a good job of describing how an ABI defines the interaction between different components, and it includes data types in its definition. But I'm unsure whether that definition includes serializable objects (abstractly, not Java's horrible interface).

If the serialization strategy is not included in the definition, and a change is made to said strategy between minor versions, then I can imagine the following scenarios happening:

A machine running v1.0 makes a remote call to a machine running v1.1, and the call fails.
A server with a large persistent work queue updates from v1.0 to v1.1, and the queue can no longer be processed.

Additionally, if the modification to the format can be made in a backwards compatible way, does that change anything?

I just read the semver page again; it actually says nothing about ABI, only API. – David Cowden, Mar 12 '15 at 23:17
I think it is reasonable to say that if you are using serialized objects, your application has two ABIs: one for its code and one for its persistent data. You can break or not break them independently of each other. – 5gon12eder, Mar 12 '15 at 23:38

score 3 · Answer 1 · 2015-03-12T23:45:43.770

Serialized data is, well, data. If you break support for input files that used to work, or change the output files in ways that break external programs, that's a backwards incompatible change. If previously working input files are still supported and the changes to output files are such that they won't break external programs (or there are no external programs), it's a backwards compatible change. This holds for any files and data formats, regardless of whether "serialization" figures into their creation.

As with many API changes and bug fixes, this is a bit tricky: Almost anything has the potential to break some use case, however obscure, even if it technically doesn't affect the documented API. Serialization formats are usually undocumented, but (for good reasons!) generally the expectation is that serialized data from older versions can be read, unless explicitly and prominently documented otherwise. Violating this expectation will be considered a backwards incompatible change and no amount of hairsplitting will save you from the wrath of your users.
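To make that expectation concrete, here is a minimal sketch of a reader that keeps accepting data written by an older release. Everything in it is made up for illustration: the `version` tag, the field names, and the assumption that the old layout carried no version tag at all.

```python
import json

CURRENT_VERSION = 2  # hypothetical version number of the current layout


def dump_job(job: dict) -> str:
    """Serialize a job in the current layout, tagged with its version."""
    return json.dumps(
        {"version": CURRENT_VERSION, "name": job["name"], "retries": job["retries"]}
    )


def load_job(text: str) -> dict:
    """Read a job written by this release or by an older one."""
    raw = json.loads(text)
    # Assumption: the old (v1) layout had no version tag at all.
    version = raw.get("version", 1)
    if version == 1:
        # Hypothetical v1 layout: {"job_name": ...} with no retry count.
        return {"name": raw["job_name"], "retries": 0}
    if version == 2:
        return {"name": raw["name"], "retries": raw["retries"]}
    # Refuse loudly instead of guessing at an unknown layout.
    raise ValueError(f"unsupported format version: {version}")
```

With something like this in place, a work queue persisted by the old release can still be drained after the upgrade. Dropping the `version == 1` branch is exactly the kind of change users would treat as breaking.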

If the serialization format is documented (even if only as "it's a pickle file"), the case is even more clear-cut. You documented that the file is in that format; changing the format in a way that breaks previously compliant readers is backwards incompatible. Note that one can change a file format without breaking readers, as long as the change is not too radical and the original format was designed with extension in mind.
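As a rough sketch of what "designed with extension in mind" can look like: the old reader ignores fields it does not know, and the new reader supplies a default for fields that older writers never emitted. The function names and the `priority` field are hypothetical, and JSON is only a stand-in for whatever format is actually in use.

```python
import json


def read_v1_0(text: str) -> dict:
    """Old reader: picks out the fields it knows and ignores the rest."""
    raw = json.loads(text)
    return {"name": raw["name"]}


def write_v1_1(name: str, priority: int) -> str:
    """Newer writer: keeps every old field and adds an optional one."""
    return json.dumps({"name": name, "priority": priority})


def read_v1_1(text: str) -> dict:
    """Newer reader: tolerates old data by defaulting the added field."""
    raw = json.loads(text)
    return {"name": raw["name"], "priority": raw.get("priority", 0)}
```

Under these assumptions a v1.0 machine can still consume what a v1.1 machine sends, and a v1.1 machine can still read data persisted by v1.0. Renaming or removing `name`, by contrast, would break the old reader.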

edited Mar 12 '15 at 23:45

answered Mar 12 '15 at 23:39

Not sure if I'm getting this right, but some well-known serialization libraries [do](https://developers.google.com/protocol-buffers/docs/encoding) [have](http://avro.apache.org/docs/current/spec.html) [documentation](https://github.com/msgpack/msgpack/blob/master/spec.md)...? – h.j.k. Mar 13 '15 at 01:48
@h.j.k. Those libraries document their formats, yes, but applications that either use those or roll their own custom format (binary or more often JSON/XML/etc.) frequently don't. – Mar 13 '15 at 09:01
ah got you, thanks for the explanation. I mistakenly read "(for good reasons!)" before the "but" and thought you meant there are good reasons for undocumented serialization formats, when you really mean there are good reasons for "the expectation is that...". My bad. :) – h.j.k. Mar 13 '15 at 09:11
@delnan although I tend to agree with you, do you happen to know if there is any literature that might support this conclusion? – David Cowden Mar 13 '15 at 17:28
@DavidCowden Which part specifically do you want a reference for? That input and output files can be part of an API? That serialization formats are usually undocumented? That they are expected to remain compatible? Anything else? – Mar 13 '15 at 17:46
@delnan the first one (that input and output formats are part of your API). I'm guessing you don't have any refs off the top of your head though since you responded with a question (; – David Cowden Mar 13 '15 at 21:47
