How to separate serialization code from the application in a large C++ project

Question

I'm working on a fairly large C++ project that uses Boost's serialization library. My issue with the way it is currently organized is that serialization is woven into the main source code at every level of the application, from the GUI to the core. Sometimes I need to build the core (or other components) separately, and I don't want to have to link against Boost for its serialization when I don't need serialization at all. So the issues are:

No way to build low-level components separately without having to link to boost;
There's no need for serialization when separate components are compiled (for testing for example), so it's a waste of compilation time;
Serialization creates code clutter when it's mixed with logic.

I came up with a scheme that I'd like to share with the community before starting a project-wide refactoring, which is time-consuming. I have also looked into making the serialization non-intrusive, as suggested in the Boost documentation, but that only solves one of the problems (code clutter), and even then it's unclear how to make it work with derived classes.

So at the moment, the typical serialized class looks somewhat like this

#include <boost/serialization/...>
...
#include <boost/serialization/...>

class A 
{
    T1 field1;
    T2 field2;
public:
    T operation1();
    T operation2();
private:
    friend class boost::serialization::access;

    template<class Archive>
    void serialize(Archive &ar, const unsigned int version) {
        ar & boost::serialization::make_nvp("field1", field1);
        ar & boost::serialization::make_nvp("field2", field2);
    }
};

BOOST_CLASS_VERSION(A, 1);
BOOST_CLASS_IMPLEMENTATION(A, boost::serialization::object_class_info)
//BOOST_CLASS_EXPORT(A) // for derived classes

And I'd rather it looked like this

#include <serialization.h>

class A 
{
    SERIALIZABLE
private:
    T1 field1;
    T2 field2;
public:
    T operation1();
    T operation2();
};

Here serialization.h contains includes of all necessary boost headers and defines a macro SERIALIZABLE that expands to

private:
friend class boost::serialization::access;
template<class Archive> void serialize(Archive &ar, const unsigned int version);

In the top-level CMakeLists.txt an option ENABLE_SERIALIZATION is introduced. If the option is ON, the names SERIALIZABLE and SERIALIZABLE_SPLITTED are defined as above (the latter declares save() and load() for when we need those separately). If ENABLE_SERIALIZATION is OFF, SERIALIZABLE and SERIALIZABLE_SPLITTED are defined empty and no Boost headers are included in serialization.h.
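To make the idea concrete, here is a minimal sketch of what serialization.h could look like. It assumes the build system passes ENABLE_SERIALIZATION as a compile definition when the option is ON; the exact Boost headers pulled in, the field types in A, and the bodies of operation1() and operation2() are placeholders I chose for illustration.

```cpp
#include <cassert>  // only needed for checks on the example class below

// serialization.h (sketch). ENABLE_SERIALIZATION is assumed to come from the
// build system as a compile definition when the option is ON.
#ifdef ENABLE_SERIALIZATION
  #include <boost/serialization/access.hpp>
  #include <boost/serialization/nvp.hpp>
  #include <boost/serialization/split_member.hpp>

  // Declares serialize() only; the body lives in the *_serialization library.
  #define SERIALIZABLE \
      private: \
      friend class boost::serialization::access; \
      template<class Archive> void serialize(Archive &ar, const unsigned int version);

  // Declares save()/load() and lets Boost dispatch to them.
  #define SERIALIZABLE_SPLITTED \
      private: \
      friend class boost::serialization::access; \
      template<class Archive> void save(Archive &ar, const unsigned int version) const; \
      template<class Archive> void load(Archive &ar, const unsigned int version); \
      BOOST_SERIALIZATION_SPLIT_MEMBER()
#else
  // Serialization disabled: the markers vanish and no Boost header is touched.
  #define SERIALIZABLE
  #define SERIALIZABLE_SPLITTED
#endif

// Example class header (illustrative field types): it only carries the marker.
class A
{
    SERIALIZABLE
private:
    int field1 = 1;
    double field2 = 2.0;
public:
    int operation1() const { return field1; }
    double operation2() const { return field2; }
};
```

With the definition absent, A compiles as a plain class with no dependency on Boost, which is the case the core-only builds and tests would rely on.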

The implementation of serialize() (or save() and load() when needed) lives in a separate .cpp file. For example, for a library called geometry we'll have a library geometry_serialization containing the .cpp files with the implementations of the serialize(), save() and load() functions. A key feature is that this library is compiled and linked only when ENABLE_SERIALIZATION is ON.
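One consequence of moving the body of a member template into a separate .cpp is that this .cpp has to explicitly instantiate serialize() for every archive type the project uses, otherwise the linker will not find it. Below is a rough single-file sketch of that layout. DemoOArchive and DemoAccess are hypothetical stand-ins (for a Boost archive such as boost::archive::text_oarchive and for boost::serialization::access) so that the sketch compiles without Boost; the file names in the comments are also just assumptions.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical stand-in for a Boost output archive (not a Boost type).
struct DemoOArchive {
    std::ostringstream out;
    template<class T>
    DemoOArchive &operator&(const T &value) { out << value << ';'; return *this; }
};

// Hypothetical stand-in for boost::serialization::access.
struct DemoAccess {
    template<class Archive, class T>
    static void serialize(Archive &ar, T &t, const unsigned int version) {
        t.serialize(ar, version);
    }
};

// ---- a.h (core library): declaration only, no serialization body ----
class A
{
    friend struct DemoAccess;
    template<class Archive> void serialize(Archive &ar, const unsigned int version);
    int field1 = 1;
    int field2 = 2;
public:
    int operation1() const { return field1 + field2; }
};

// ---- a_serialization.cpp (geometry_serialization library) ----
// Built only when serialization is enabled.
template<class Archive>
void A::serialize(Archive &ar, const unsigned int /*version*/) {
    ar & field1;
    ar & field2;
}

// Explicit instantiation: one line per archive type used by the project.
template void A::serialize<DemoOArchive>(DemoOArchive &, const unsigned int);
```

The price of this layout is that the list of archive types is fixed in the serialization library; code that wants a new archive type has to add another explicit instantiation there.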

The advantages that I see in it:

solves the problems that have motivated this refactoring;
looks a lot cleaner;
while the serialization is now separate from the main source code and doesn't cause clutter, there is a reminder left in the code that there's serialization code for this class somewhere else (which is good because total omission of that fact might also lead to problems).

Possible shortcomings that I worry about:

the system is homebrewed so it needs to be explained to each and every developer working on the project;
the whole thing might be a reinvention of something that already exists in the boost serialization library and I just missed it;
possible existence of a more common and sensible approach that I don't know about.

Thank you in advance for any feedback and suggestions!

It appears that [these macros already exist](https://www.boost.org/doc/libs/1_72_0/libs/serialization/doc/wrappers.html#nvp). You can define your own macros that map to the Boost ones when serialization is enabled and expand to nothing when you need to compile without it. You can also use a variadic macro to accept multiple member fields and expand into a concatenation expression. — rwong, Mar 20 '20 at 23:55
I think the difficulty you are seeing is due to a different issue: the issue of the body of member function templates. Being a template function, any code that calls this template function has to have access to the body of this template function within the same compilation unit. Thus, you are left with several options, one is to decide exactly what classes you will use with `Archive`, and convert the member function template into a non-template member function. — rwong, Mar 21 '20 at 01:52
