Deserializing an object

#include <x/deserialize.H>

#include <fstream>
#include <iterator>
#include <string>
#include <vector>

int intvalue;

std::vector<std::string> strarray;

std::ifstream ifs("object.dat");

std::istreambuf_iterator<char> beg(ifs.rdbuf()), end;

typedef x::deserialize::iterator<std::istreambuf_iterator<char> > deser_t;

deser_t deser(beg, end);

deser(intvalue);
deser(strarray);

The x::deserialize namespace defines the iterator template class, whose operator() deserializes an object. The template parameter is the input iterator type; std::istreambuf_iterator<char> is the usual choice. The input iterator's value type must be either char or unsigned char. The constructor takes references to a beginning input iterator and an ending input iterator. iterator saves references to both input iterators, so both must continue to exist for as long as the iterator object itself exists.

iterator::operator() deserializes the passed object from the byte stream. It returns a reference to the iterator itself (*this), allowing for the following shorthand:

deser(intvalue)(strarray);

An exception gets thrown if the type of objects that were serialized does not match the type of objects to deserialize (but see below).

Note that an object must be constructed before it gets deserialized: operator() deserializes into an existing object.

An exception gets thrown if the input iterator has reached the ending input iterator before all objects have been deserialized.
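As a rough illustration of the behavior described above (a sketch only, not the library's implementation), the following self-contained class keeps references to two input iterators, returns *this from operator() so that calls can be chained, and throws when the input runs out early. The names toy_deser and decode_two, and the wire format (each 32-bit integer stored as four bytes, least significant byte first), are assumptions made for this example; the library's actual byte format is not described here.

```cpp
#include <cassert>
#include <cstdint>
#include <iterator>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for x::deserialize::iterator; NOT the library's code.
// Assumed toy wire format: each int32_t is four bytes, least significant first.
template<typename iter_type>
class toy_deser {
    iter_type &cur;   // References only: the caller's iterators must outlive us
    iter_type &end;

    unsigned char next_byte()
    {
        if (cur == end)
            throw std::runtime_error("premature end of serialized data");
        unsigned char c = static_cast<unsigned char>(*cur);
        ++cur;
        return c;
    }

public:
    toy_deser(iter_type &b, iter_type &e) : cur(b), end(e) {}

    // Fills in an object that the caller has already constructed.
    toy_deser &operator()(std::int32_t &value)
    {
        std::uint32_t v = 0;
        for (int i = 0; i < 4; ++i)
            v |= static_cast<std::uint32_t>(next_byte()) << (8 * i);
        value = static_cast<std::int32_t>(v);
        return *this;   // Allows deser(a)(b)
    }
};

// Decodes two integers from bytes; returns false if the input ran out.
bool decode_two(const std::string &bytes, std::int32_t &a, std::int32_t &b)
{
    std::istringstream in(bytes);
    std::istreambuf_iterator<char> beg(in.rdbuf()), end;
    toy_deser<std::istreambuf_iterator<char>> deser(beg, end);

    try {
        deser(a)(b);
    } catch (const std::runtime_error &) {
        return false;
    }
    return true;
}
```

Because toy_deser holds references, beg and end are declared before deser and stay alive until decode_two returns, mirroring the lifetime requirement stated above.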

Maximum sequence sizes

When deserializing an object, an optional second parameter to the iterator call sets the maximum sequence size that will be deserialized:

std::string objname;

deser(objname, 255);

If the serialized string's size exceeds 255 characters, an exception gets thrown. This is meant for environments where the serialized byte stream comes from an untrusted source. Normally, after receiving a sequence's size, the deserialization iterator allocates a suitable amount of memory for it. The limit prevents an untrusted source from sending a large sequence size that would allocate a huge amount of memory.
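The reason for checking the size before allocating can be sketched as follows. This is a hedged illustration, not the library's code: read_bounded_string is a hypothetical helper, and the wire format (a four-byte length, least significant byte first, followed by the characters) is assumed for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper, not part of the library.
// Assumed toy wire format: a four-byte length (least significant byte
// first), followed by that many characters.
std::string read_bounded_string(const std::string &wire, std::size_t maxsize)
{
    if (wire.size() < 4)
        throw std::runtime_error("premature end of serialized data");

    std::uint32_t n = 0;
    for (int i = 0; i < 4; ++i)
        n |= static_cast<std::uint32_t>(
                 static_cast<unsigned char>(wire[i])) << (8 * i);

    // Reject the claimed size before reserving any memory for it.
    if (n > maxsize)
        throw std::length_error("sequence size exceeds the maximum");

    if (wire.size() - 4 < n)
        throw std::runtime_error("premature end of serialized data");

    std::string result;
    result.reserve(n);          // Safe: n is known to be at most maxsize
    result.assign(wire, 4, n);
    return result;
}
```

A sender that claims a size of 0xFFFFFFFF is rejected by the maxsize comparison, before reserve() is ever reached.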

The second parameter is allowed only when the object is a container. The serialization iterator also accepts a second parameter, and ignores it. This makes it possible to define a single serialize() template method that handles both serialization and deserialization: when serializing, the sequence size parameter gets ignored; when deserializing, it gets checked.
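A minimal sketch of this pattern, under stated assumptions: toy_writer and toy_reader are hypothetical stand-ins for the serialization and deserialization iterators, the login class is invented for the example, and the shape of serialize() as a member template taking the iterator by reference is assumed rather than taken from the library's headers.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical stand-ins for the two iterators; NOT the library's classes.
// Assumed toy wire format: one length byte, followed by the characters.
struct toy_writer {
    std::string out;

    // The size limit is accepted, but ignored, when serializing.
    toy_writer &operator()(const std::string &s, std::size_t = 0)
    {
        if (s.size() > 255)
            throw std::length_error("toy format holds at most 255 characters");
        out.push_back(static_cast<char>(s.size()));
        out += s;
        return *this;
    }
};

struct toy_reader {
    const std::string &in;
    std::size_t pos = 0;

    explicit toy_reader(const std::string &s) : in(s) {}

    // The size limit is enforced when deserializing.
    toy_reader &operator()(std::string &s, std::size_t maxsize = 255)
    {
        if (pos >= in.size())
            throw std::runtime_error("premature end of serialized data");
        std::size_t n = static_cast<unsigned char>(in[pos++]);
        if (n > maxsize)
            throw std::length_error("sequence size exceeds the maximum");
        if (in.size() - pos < n)
            throw std::runtime_error("premature end of serialized data");
        s.assign(in, pos, n);
        pos += n;
        return *this;
    }
};

// Hypothetical serializable class.
struct login {
    std::string user;

    // One template serves both directions: the writer ignores the 8,
    // the reader checks it.
    template<typename iter_type>
    void serialize(iter_type &iter)
    {
        iter(user, 8);
    }
};
```

Calling serialize() with a toy_writer emits a 16-character name without complaint, while calling it with a toy_reader on those same bytes throws, which is the asymmetry described above.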

It's presumed that a thin API layer handles serialization on the untrusted source side, and provides a meaningful error path that enforces the maximum sequence size. However, since the source is untrusted, the deserialization side needs to enforce this check anyway.

Deserializing any one of several objects


class object_iter;

class Aobject {

// ...
public:

    Aobject(object_iter &);
};

class Bobject {

// ...
};

class Bobject_wrapper : public Bobject {

// ...
public:

    Bobject_wrapper(object_iter &);
};

// ...

class object_iter {

public:

    template<typename iter_type>
    static void classlist(iter_type &iter)
    {
        iter.template serialize<Aobject>();
        iter.template serialize<Bobject, Bobject_wrapper>();
    }

    void deserialized(Aobject &obj)
    {
        // ...
    }

    void deserialized(Bobject &obj)
    {
        // ...
    }
};

// ...

object_iter any_object;

deser_t::any<object_iter> any_deser(deser, any_object);

any_deser();

This is an example of deserializing one of several possible objects. This is needed when the serialized byte stream may contain any one of several different classes of objects, and whichever one arrives needs to be deserialized and handled in some way. The steps to do this are as follows:

  • Define and instantiate a class called an object iterator class. The object iterator class defines an overloaded deserialized() method for each possible class that may be deserialized. The object iterator class also defines a classlist() method.

  • Each class that may be deserialized must have a constructor that takes a reference to the object iterator class as a parameter. Since the same class is presumably used on the serialization side, where there is no deserialization object iterator class, it should have other constructors as well, most likely an explicitly defined default constructor.

  • Instantiate the any template class. The deserialization iterator class defines the any template class, which takes the object iterator class as its template parameter. Its constructor takes two parameters: the deserialization iterator and the object iterator class instance.

  • Each invocation of any's operator() deserializes a class from the deserialization iterator, and invokes the appropriate deserialized() method in the object iterator class.

  • The object iterator class instance and the deserialization iterator must remain in scope as long as the any instance remains in scope.

  • In the object iterator class's classlist() method, a call to serialize() may also specify a second template class, which gets instantiated in place of the first class when deserialization occurs. In the example above, Bobject_wrapper gets instantiated when deserializing a Bobject. Bobject_wrapper is a subclass of Bobject, so presumably Bobject's serialize() gets called to do the deed, while it is Bobject_wrapper that gets instantiated. The object iterator class's overloaded deserialized() may take either a Bobject_wrapper or a Bobject.

More specifically, the requirements of an object iterator class are:

  • A template function named classlist() (see below).

  • When any() determines which object should be deserialized, the templated object gets constructed on the stack. If the second class was given in classlist, the second class gets constructed instead of the first one.

  • The constructed object's serialize() method gets invoked.

  • An overloaded deserialized() method (which can be a template), one for each class iterated in the classlist() method. If two classes were specified in the classlist() method's iteration, either the first class or the second may be used for deserialized().

    any() constructs the appropriate object, and invokes the appropriate deserialized() method. The deserialized object goes out of scope before any() returns.

The object iterator's classlist() template function takes one template parameter, and one argument: a reference to an instance of that template class. Each call to serialize() made from classlist() names one or two classes as template parameters.

The template class instance that's passed as an argument to classlist() will have a public member template function, serialize().

classlist() should iterate over the list of all classes that may be deserialized, by invoking the passed iterator's serialize() template method once for each class. The previous example iterates over two classes, Aobject and Bobject/Bobject_wrapper, and shows the expected implementation pattern for any number of classes that may be deserialized.
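How such a dispatch could work is sketched below. This is an illustration under stated assumptions, not the library's implementation: toy_reader and toy_any are hypothetical stand-ins for the deserialization iterator and its any template class, and the wire format (one tag byte giving the class's position in classlist(), followed by a one-byte payload) is invented for the example. The object iterator class follows the pattern shown earlier.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the deserialization iterator: reads single bytes.
struct toy_reader {
    const std::string &in;
    std::size_t pos = 0;

    explicit toy_reader(const std::string &s) : in(s) {}

    toy_reader &operator()(char &c)
    {
        if (pos >= in.size())
            throw std::runtime_error("premature end of serialized data");
        c = in[pos++];
        return *this;
    }
};

// Hypothetical stand-in for the any template class.
template<typename obj_iter>
class toy_any {
    toy_reader &reader;   // Both must outlive this object
    obj_iter &objs;

    // Passed to classlist(); each serialize<>() call stands for the next tag.
    struct matcher {
        toy_any &self;
        int wanted;
        int index;
        bool found;

        template<typename obj_type, typename inst_type = obj_type>
        void serialize()
        {
            if (index++ != wanted)
                return;
            inst_type obj(self.objs);       // Constructed on the stack
            obj.serialize(self.reader);     // The object reads its own payload
            self.objs.deserialized(obj);    // Overload resolution picks the handler
            found = true;
        }                                   // obj goes out of scope here
    };

public:
    toy_any(toy_reader &r, obj_iter &o) : reader(r), objs(o) {}

    void operator()()
    {
        char tag;
        reader(tag);
        matcher m{*this, tag, 0, false};
        obj_iter::classlist(m);
        if (!m.found)
            throw std::runtime_error("unknown class in serialized data");
    }
};

struct object_iter;

struct Aobject {
    char a = 0;
    Aobject() {}
    explicit Aobject(object_iter &) {}
    template<typename iter_type> void serialize(iter_type &iter) { iter(a); }
};

struct Bobject {
    char b = 0;
    template<typename iter_type> void serialize(iter_type &iter) { iter(b); }
};

struct Bobject_wrapper : Bobject {
    explicit Bobject_wrapper(object_iter &) {}
};

struct object_iter {
    std::string log;   // Records what was handled, for demonstration only

    template<typename iter_type>
    static void classlist(iter_type &iter)
    {
        iter.template serialize<Aobject>();
        iter.template serialize<Bobject, Bobject_wrapper>();
    }

    void deserialized(Aobject &obj) { log += 'A'; log += obj.a; }
    void deserialized(Bobject &obj) { log += 'B'; log += obj.b; }
};
```

In this sketch a Bobject_wrapper is what gets constructed for tag 1, yet it is handed to the deserialized(Bobject &) overload through the ordinary derived-to-base conversion, which matches the statement that either class may be used for deserialized().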