Should serialization and deserialization be the responsibility of the class being serialized?

Question

I'm currently in the (re)design phase of several model classes of a C# .NET application. (Model as in M of MVC). The model classes already have plenty of well-designed data, behaviors, and interrelationships. I am rewriting the model from Python to C#.

In the old Python model, I think I see a wart. Each model knows how to serialize itself, and the serialization logic has nothing to do with the rest of the behavior of any of the classes. For example, imagine:

An Image class with .toJPG(String filePath) and .fromJPG(String filePath) methods
An ImageMetaData class with .toString() and .fromString(String serialized) methods

You can imagine how these serialization methods are not cohesive with the rest of the class, yet only the class itself is guaranteed to have all the data needed to serialize itself.

Is it common practice for a class to know how to serialize and deserialize itself? Or am I missing a common pattern?

Zymus · Accepted Answer · 2015-07-04T04:00:46.803

I generally avoid having the class know how to serialize itself, for a couple of reasons. First, if you want to (de)serialize to/from a different format, you now need to pollute the model with that extra logic. If the model is accessed via an interface, then you also pollute the contract.

public class Image
{
    public void toJPG(String filePath) { ... }

    public Image fromJPG(String filePath) { ... }
}

But what if you want to serialize it to/from PNG and GIF as well? Now the class becomes:

public class Image
{
    public void toJPG(String filePath) { ... }

    public Image fromJPG(String filePath) { ... }

    public void toPNG(String filePath) { ... }

    public Image fromPNG(String filePath) { ... }

    public void toGIF(String filePath) { ... }

    public Image fromGIF(String filePath) { ... }
}

Instead, I typically like to use a pattern similar to the following:

public interface ImageSerializer
{
    void serialize(Image src, Stream outputStream);

    Image deserialize(Stream inputStream);
}

public class JPGImageSerializer : ImageSerializer
{
    public void serialize(Image src, Stream outputStream) { ... }

    public Image deserialize(Stream inputStream) { ... }
}

public class PNGImageSerializer : ImageSerializer
{
    public void serialize(Image src, Stream outputStream) { ... }

    public Image deserialize(Stream inputStream) { ... }
}

public class GIFImageSerializer : ImageSerializer
{
    public void serialize(Image src, Stream outputStream) { ... }

    public Image deserialize(Stream inputStream) { ... }
}
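To make the shape concrete, here is a minimal sketch of one implementation plus a caller. Real JPG/PNG/GIF encoders are far more involved, so this assumes a toy format invented for illustration (`RawImageSerializer`: width, height, byte count, then the bytes) and a hypothetical `Image` that only holds dimensions and raw pixel bytes. The `Demo.RoundTrip` helper is also hypothetical.

```csharp
using System.IO;
using System.Text;

// Hypothetical model class: just dimensions and raw pixel bytes.
public class Image
{
    public int Width { get; }
    public int Height { get; }
    public byte[] Pixels { get; }

    public Image(int width, int height, byte[] pixels)
    {
        Width = width;
        Height = height;
        Pixels = pixels;
    }
}

public interface ImageSerializer
{
    void serialize(Image src, Stream outputStream);

    Image deserialize(Stream inputStream);
}

// Toy format invented for this sketch: width, height, byte count, pixel bytes.
public class RawImageSerializer : ImageSerializer
{
    public void serialize(Image src, Stream outputStream)
    {
        // leaveOpen: true, so the caller keeps ownership of the stream.
        using var writer = new BinaryWriter(outputStream, Encoding.UTF8, leaveOpen: true);
        writer.Write(src.Width);
        writer.Write(src.Height);
        writer.Write(src.Pixels.Length);
        writer.Write(src.Pixels);
    }

    public Image deserialize(Stream inputStream)
    {
        using var reader = new BinaryReader(inputStream, Encoding.UTF8, leaveOpen: true);
        int width = reader.ReadInt32();
        int height = reader.ReadInt32();
        int length = reader.ReadInt32();
        return new Image(width, height, reader.ReadBytes(length));
    }
}

public static class Demo
{
    // The caller picks the format and the destination; Image knows about neither.
    public static Image RoundTrip(ImageSerializer serializer, Image image)
    {
        using var buffer = new MemoryStream();
        serializer.serialize(image, buffer);
        buffer.Position = 0; // rewind before reading back
        return serializer.deserialize(buffer);
    }
}
```

Because the serializer only sees a `Stream`, swapping the `MemoryStream` for a `FileStream` or a `NetworkStream` should not require any change to the serializer or to `Image`.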

Now, one caveat with this design is that each serializer needs to know the identity of the object it's serializing. Some would say that this is bad design, as the implementation leaks outside of the class. Whether the risk is worth the reward is really up to you, but you could tweak the classes slightly to do something like:

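public class Image
{
    // Assumes ImageSerializer is changed to read and write the raw pixel data
    // (shown here as byte[], an illustrative choice) instead of a whole Image.
    private byte[] pixelData;

    public void serializeTo(ImageSerializer serializer, Stream outputStream)
    {
        serializer.serialize(this.pixelData, outputStream);
    }

    public void deserializeFrom(ImageSerializer serializer, Stream inputStream)
    {
        this.pixelData = serializer.deserialize(inputStream);
    }
}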

This is more of a general example, as images usually have metadata that goes along with them, such as compression level and colorspace, which may complicate the process.
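One way to carry options such as compression level without adding them to `Image` is to pass a separate settings object to the serializer. The following is only a sketch: `SerializerSettings`, its `Compression` property, and the toy `DeflateImageSerializer` format are hypothetical names invented here, and the `Image` class is a stand-in.

```csharp
using System.IO;
using System.IO.Compression;

// Hypothetical stand-in for the model class.
public class Image
{
    public byte[] Pixels { get; set; } = new byte[0];
}

// Hypothetical options bag; each format reads only the options it understands.
public class SerializerSettings
{
    public CompressionLevel Compression { get; set; } = CompressionLevel.Optimal;
}

public interface ImageSerializer
{
    void serialize(Image src, Stream outputStream, SerializerSettings settings);

    Image deserialize(Stream inputStream);
}

// Toy format for illustration: the pixel bytes, deflate-compressed.
public class DeflateImageSerializer : ImageSerializer
{
    public void serialize(Image src, Stream outputStream, SerializerSettings settings)
    {
        // Disposing the DeflateStream flushes it; leaveOpen keeps the caller's stream usable.
        using var deflate = new DeflateStream(outputStream, settings.Compression, leaveOpen: true);
        deflate.Write(src.Pixels, 0, src.Pixels.Length);
    }

    public Image deserialize(Stream inputStream)
    {
        using var inflate = new DeflateStream(inputStream, CompressionMode.Decompress, leaveOpen: true);
        using var buffer = new MemoryStream();
        inflate.CopyTo(buffer);
        return new Image { Pixels = buffer.ToArray() };
    }
}
```

A format without compression could accept the same settings object and simply ignore `Compression`, so the model class stays untouched either way.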

I'd recommend serializing to/from an abstract IOStream or a binary format (text being a specific kind of binary format). That way you are not restricted to writing to a file; sending the data over the network would be an important alternative output location. — unholysampler, Jul 04 '15 at 03:46
Very good point. I was thinking about that, but had a brain fart. I'll update the code. — Zymus, Jul 04 '15 at 03:55
I assume that as more serialization formats are supported (i.e. more implementations of the `ImageSerializer` interface are written), the `ImageSerializer` interface will also need to grow. For example: a new format supports optional compression and previous ones didn't, so you add compression configurability to the `ImageSerializer` interface. But then the other formats are cluttered with features that don't apply to them. The more I think about it, the less I think inheritance applies here. — kdbanman, Jul 06 '15 at 16:07
While I understand where you're coming from, I feel it is not an issue, for a couple of reasons. If it's an existing image format, chances are that the serializer already knows how to deal with compression levels, and if it's a new one, you'll have to write it anyway. One solution is to overload the methods, something like `void serialize(Image image, Stream outputStream, SerializerSettings settings);`. Then it's just a case of wiring up the existing compression and metadata logic to the new method. — Zymus, Jul 06 '15 at 16:26

score 3 · Answer 2 · answered Jul 04 '15 at 04:07

Serialization is a two-part problem:

1. Knowledge about how to instantiate a class, aka structure.
2. Knowledge about how to persist/transfer the information that is needed to instantiate a class, aka mechanics.

As far as possible, structure should be kept separate from the mechanics. This increases the modularity of your system. If you bury the information on #2 within your class then you break modularity because now your class must be modified to keep pace with new ways of serialization (if they come along).

In the context of image serialization, you would keep the serialization knowledge out of the class itself and put it in the algorithms that determine the serialization format; hence separate classes for JPEG, PNG, BMP, etc. If a new serialization algorithm comes along tomorrow, you simply code that algorithm and your class contract remains unchanged.

In the context of IPC, you can keep your class separate and then selectively declare the information needed for serialization (via annotations/attributes). Your serialization algorithm can then decide whether to use JSON, Google Protocol Buffers, or XML. It can even decide whether to use the Jackson parser or your own custom parser. You get many such options easily when you design in a modular fashion!
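As a rough illustration of that split in C#, the class below only declares which members take part in serialization (structure), while a separate helper chooses the wire format (mechanics). This is a sketch under assumptions: the `ImageMetaData` members and the `MetaDataFormats` helper are hypothetical, and the .NET `System.Text.Json` and `XmlSerializer` libraries stand in for the parsers named above.

```csharp
using System.IO;
using System.Text.Json;
using System.Text.Json.Serialization;
using System.Xml.Serialization;

// Structure: the class only declares which members are serialized.
public class ImageMetaData
{
    public string Title { get; set; } = "";
    public int Width { get; set; }
    public int Height { get; set; }

    // Derived value, excluded from both formats.
    [JsonIgnore]
    [XmlIgnore]
    public int PixelCount => Width * Height;
}

// Mechanics: the format is chosen outside the class.
public static class MetaDataFormats
{
    public static string ToJson(ImageMetaData meta) => JsonSerializer.Serialize(meta);

    public static ImageMetaData FromJson(string json) =>
        JsonSerializer.Deserialize<ImageMetaData>(json)
        ?? throw new JsonException("JSON document was null");

    public static string ToXml(ImageMetaData meta)
    {
        var serializer = new XmlSerializer(typeof(ImageMetaData));
        using var writer = new StringWriter();
        serializer.Serialize(writer, meta);
        return writer.ToString();
    }
}
```

Adding another format would mean adding another method or class on the mechanics side; `ImageMetaData` itself would only change if the set of serialized members changed.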

Can you give me an example of how those two things can be decoupled? I'm not sure I understand the distinction. — kdbanman, Jul 06 '15 at 15:28
