I'm working on a fairly large C++ project which uses boost's serialization. The problem with the way it is currently organized is that serialization is woven into the main source code at all levels of the application, from the GUI to the core. But sometimes I need to build the core (or other components) separately, and I don't want to have to link to boost for its serialization when I don't need serialization at all. So the issues are:
- No way to build low-level components separately without having to link to boost;
- There's no need for serialization when separate components are compiled (for testing for example), so it's a waste of compilation time;
- Serialization creates code clutter when it's mixed with logic.
I came up with a scheme that I'd like to share with the community before initiating a project-wide refactoring, which is time-consuming. I have also looked into ways of making the serialization non-intrusive, as suggested in the boost documentation, but that only solves one of the problems, namely code clutter, and even then it's unclear how to make it work with derived classes.
So at the moment, the typical serialized class looks somewhat like this:

    #include <boost/serialization/...>
    ...
    #include <boost/serialization/...>

    class A
    {
        T1 field1;
        T2 field2;
    public:
        T operation1();
        T operation2();
    private:
        friend class boost::serialization::access;
        template<class Archive>
        void serialize(Archive &ar, const unsigned int version) {
            ar & boost::serialization::make_nvp("field1", field1);
            ar & boost::serialization::make_nvp("field2", field2);
        }
    };
    BOOST_CLASS_VERSION(A, 1);
    BOOST_CLASS_IMPLEMENTATION(A, boost::serialization::object_class_info)
    //BOOST_CLASS_EXPORT(A) // for derived classes

And I'd rather it looked like this:

    #include <serialization.h>

    class A
    {
        SERIALIZABLE
    private:
        T1 field1;
        T2 field2;
    public:
        T operation1();
        T operation2();
    };

Here serialization.h contains includes of all necessary boost headers and defines a macro SERIALIZABLE that expands to

    private:
        friend class boost::serialization::access;
        template<class Archive> void serialize(Archive &ar, const unsigned int version);

At the top-level CMakeLists.txt an option ENABLE_SERIALIZATION is introduced. If the option is ON, the names SERIALIZABLE and SERIALIZABLE_SPLITTED are defined as described above (the latter is for save() and load() functions when we need those separately). If ENABLE_SERIALIZATION is OFF, the names SERIALIZABLE and SERIALIZABLE_SPLITTED are defined empty and no boost headers are included in serialization.h.
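To make the scheme concrete, here is a minimal sketch of what serialization.h could look like. It assumes that the CMake option is forwarded to the compiler as a preprocessor definition with the same name, ENABLE_SERIALIZATION, and that SERIALIZABLE_SPLITTED relies on boost's BOOST_SERIALIZATION_SPLIT_MEMBER() macro; neither detail is fixed yet. The field types and the function core_only_smoke_test() are placeholders for illustration only.

```cpp
#include <cassert>

// serialization.h (sketch). ENABLE_SERIALIZATION is assumed to be set by the
// build system as a preprocessor definition when the CMake option is ON.
#ifdef ENABLE_SERIALIZATION
  #include <boost/serialization/access.hpp>
  #include <boost/serialization/nvp.hpp>
  #include <boost/serialization/split_member.hpp>

  // Declares serialize() only; the definition lives in a separate cpp.
  #define SERIALIZABLE \
    private: \
      friend class boost::serialization::access; \
      template<class Archive> \
      void serialize(Archive &ar, const unsigned int version);

  // Declares save()/load() and lets boost generate the dispatching serialize().
  #define SERIALIZABLE_SPLITTED \
    private: \
      friend class boost::serialization::access; \
      template<class Archive> \
      void save(Archive &ar, const unsigned int version) const; \
      template<class Archive> \
      void load(Archive &ar, const unsigned int version); \
      BOOST_SERIALIZATION_SPLIT_MEMBER()
#else
  // Option OFF: both names expand to nothing and no boost header is pulled in.
  #define SERIALIZABLE
  #define SERIALIZABLE_SPLITTED
#endif

// A class from the core. int and double are placeholders for T1 and T2.
class A
{
    SERIALIZABLE
private:
    int field1 = 1;
    double field2 = 2.0;
public:
    int operation1() const { return field1; }
    double operation2() const { return field2; }
};

// Hypothetical check: with the option OFF, A is usable as a plain class.
inline void core_only_smoke_test()
{
    A a;
    assert(a.operation1() == 1);
}
```

Built without ENABLE_SERIALIZATION, this translation unit has no dependency on boost at all, which is the point of the exercise.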
The implementation of the function serialize() (or save() and load() when needed) is in a separate cpp. For example, for a library called geometry we'll have a library geometry_serialization containing a cpp with implementations of the serialize(), save() and load() functions. One of the key features is that it compiles and links only when ENABLE_SERIALIZATION is ON.
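One consequence of this split is worth spelling out: since serialize() is a member template that is only declared in the header, the separate cpp has to explicitly instantiate it for every archive type that is actually used, otherwise the linker will not find it. Below is a rough single-file sketch of that mechanism. To keep it compilable without boost, it uses a hypothetical stand-in archive, CountingArchive, and a hypothetical friend function, count_fields_of(), in place of boost::serialization::access; with boost, the instantiations would name real archive types such as boost::archive::text_oarchive and boost::archive::text_iarchive.

```cpp
#include <cassert>

// Hypothetical stand-in for a boost archive: it only counts the values
// that are passed to operator&.
struct CountingArchive
{
    int count = 0;
    template<class T>
    CountingArchive &operator&(T &) { ++count; return *this; }
};

// --- what the header in the core library would contain ---
class A
{
    // Roughly what SERIALIZABLE expands to, minus the boost friend declaration.
private:
    template<class Archive>
    void serialize(Archive &ar, const unsigned int version);
    // Hypothetical helper that plays the role of boost::serialization::access.
    friend int count_fields_of(A &a);

    int field1 = 1;      // placeholder for T1
    double field2 = 2.0; // placeholder for T2
};

// --- what a cpp in geometry_serialization would contain ---
template<class Archive>
void A::serialize(Archive &ar, const unsigned int /*version*/)
{
    ar & field1;
    ar & field2;
}

// Explicit instantiation for each archive type in use. Without it, other
// translation units see only the declaration and fail at link time.
template void A::serialize<CountingArchive>(CountingArchive &, const unsigned int);

// Drives serialize() through the stand-in archive and reports how many
// fields were visited.
int count_fields_of(A &a)
{
    CountingArchive ar;
    a.serialize(ar, 0);
    return ar.count;
}

// Hypothetical check that both fields are visited.
inline void serialization_smoke_test()
{
    A a;
    assert(count_fields_of(a) == 2);
}
```

The list of explicit instantiations is the price of keeping the definitions out of the header: every new archive type needs one more line per serialized class.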
The advantages that I see in it:
- solves the problems that have motivated this refactoring;
- looks a lot cleaner;
- while the serialization is now separate from the main source code and doesn't cause clutter, there is a reminder left in the code that there's serialization code for this class somewhere else (which is good because total omission of that fact might also lead to problems).
Possible shortcomings that I worry about:
- the system is homebrewed, so it needs to be explained to each and every developer working on the project;
- the whole thing might be a reinvention of something that already exists in the boost serialization library and that I just missed;
- there may be a more common and sensible approach that I don't know about.
Thank you in advance for any feedback and suggestions!