When you have swizzled pointers, you cannot follow them (efficiently) until they are unswizzled.
Imagine you have a bunch of records, each with links (pointers) to other records, forming some arbitrary graph.
One naive way to swizzle these is to take the binary value of each pointer as a UID and serialize that. While we do this, we also maintain a table mapping each record's address to its order in serialization, and we serialize that table last. Call this the swizzle table.
When we deserialize, we load up the data structures, and we build a table of (order in serialization) to (new record address in memory). Then we load up the swizzle table, which is a map from (old address) to (order in serialization).
We merge those two tables and get an (old address) to (new record address in memory) table: the unswizzle table.
Next, we go over our deserialized records and apply this map to each pointer. The old binary value of each address is stored in some pointer; we look it up in the unswizzle table and replace it. Now each pointer holds the address of the record in the new address space.
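#include <cstddef>
#include <cstdint>
#include <functional>
#include <map>
#include <vector>

// OutArch and InArch (shown below) must be defined before node in a real build.
struct node {
    std::vector<node*> links;
    void write( OutArch& out ) const& {
        out.register_swizzle(this); // record this node's position in the stream
        out << links.size();
        for (node* n:links) {
            out << out.swizzle(n); // write the raw address as the link's UID
        }
    }
    static node* read( InArch& in ) {
        auto* r = new node;
        in.register_unswizzle( r );
        std::size_t n;
        in >> n;
        r->links.reserve(n);
        for (std::size_t i = 0; i<n; ++i) {
            std::intptr_t ptr;
            in >> ptr;
            r->links.push_back( reinterpret_cast<node*>(ptr) ); // danger: stale address until do_unswizzle runs
        }
        return r;
    }
    friend void do_unswizzle( InArch& in, node* n ) {
        for (node*& link : n->links ) {
            link = in.unswizzle(link);
        }
    }
};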
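struct OutArch {
    friend void operator<<( OutArch& arch, std::size_t count ); //TODO
    friend void operator<<( OutArch& arch, std::intptr_t ptr ); //TODO
    // The swizzled form of a pointer is just its raw address.
    std::intptr_t swizzle( const void* ptr ) {
        return reinterpret_cast<std::intptr_t>(ptr);
    }
    // Call once per record, in the order the records are written.
    void register_swizzle( const void* ptr ) {
        swizzle_table.insert( {reinterpret_cast<std::intptr_t>(ptr), record_number} );
        ++record_number;
    }
    void save_swizzle_table(); //TODO: write swizzle_table after the last record
private:
    // incremented once per registered record
    std::size_t record_number = 0;
    std::map< std::intptr_t, std::size_t > swizzle_table; // old address -> order in serialization
};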
struct InArch {
    friend void operator>>( InArch& arch, std::size_t& count ); //TODO
    friend void operator>>( InArch& arch, std::intptr_t& ptr ); //TODO
    // Call once per record, in the order the records are read.
    template<class T>
    void register_unswizzle( T* t ) {
        unswizzle_table.insert( {record_number, t} );
        ++record_number;
        // Defer the pointer fix-up until every record and the swizzle table are loaded.
        unswizzle_tasks.push_back([t](InArch* self){
            do_unswizzle( *self, t );
        });
    }
    // Converts implicitly to whatever pointer type the caller assigns it to.
    struct unswizzler_t {
        void* ptr;
        template<class T>
        operator T*()&&{return static_cast<T*>(ptr);}
    };
    // Maps an old address to the new one; unknown addresses become nullptr.
    unswizzler_t unswizzle( void* ptr ) {
        auto p = reinterpret_cast<std::intptr_t>(ptr);
        auto it1 = swizzle_table.find(p);
        if (it1 == swizzle_table.end()) return {nullptr};
        auto it2 = unswizzle_table.find(it1->second);
        if (it2 == unswizzle_table.end()) return {nullptr};
        return { it2->second };
    }
    void load_swizzle_table(); //TODO
    // Run after load_swizzle_table(), once all records have been read.
    void execute_unswizzle() {
        for (auto&& task: unswizzle_tasks) {
            task(this);
        }
    }
private:
    // incremented once per registered record
    std::size_t record_number = 0;
    std::map< std::size_t, void* > unswizzle_table; // order in serialization -> new address
    std::map< std::intptr_t, std::size_t > swizzle_table; // old address -> order in serialization
    std::vector< std::function< void(InArch*) > > unswizzle_tasks;
};
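Note that `unswizzle` above never builds the merged table; it does two lookups per pointer instead. If you would rather build the (old address) to (new address) table once, the merge step might look roughly like this. It is a minimal sketch: `merge_tables` and its parameter names are made up for illustration and are not part of the code above.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>

// Hypothetical helper: joins the two tables on the shared
// "order in serialization" key.
std::map<std::intptr_t, void*> merge_tables(
    const std::map<std::intptr_t, std::size_t>& swizzle_table, // old address -> order
    const std::map<std::size_t, void*>& load_table)            // order -> new address
{
    std::map<std::intptr_t, void*> merged;                     // old address -> new address
    for (const auto& entry : swizzle_table) {
        auto it = load_table.find(entry.second);
        if (it != load_table.end()) {
            merged.emplace(entry.first, it->second);
        }
        // Old addresses whose record was never loaded are left out,
        // so a later lookup for them fails rather than returning junk.
    }
    return merged;
}
```

The trade-off is one extra pass and one extra map in exchange for a single lookup per pointer afterwards.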
There are many ways to swizzle. Instead of saving the binary value of the pointer, you can save the position at which the target record will be serialized, for example. This requires a bit of careful preprocessing (or time travel), as you'll have references to structures you haven't serialized yet.
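One way to do that preprocessing is a two-pass save: number every record first, then write the numbers in place of the addresses. The following is only a sketch under some assumptions: `rec`, `flat_graph`, `flatten` and `unflatten` are hypothetical names, the "stream" is an in-memory vector rather than a real archive, every link target is in the list being saved, and there are no null links.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <vector>

// Hypothetical record type for this sketch; same shape as node above.
struct rec {
    std::vector<rec*> links;
};

// Flattened form: one entry per record, each link stored as the target's index.
using flat_graph = std::vector<std::vector<std::size_t>>;

// Pass 1 gives each record its serialization index. Pass 2 writes indices
// instead of addresses, so forward references are already resolved.
flat_graph flatten(const std::vector<rec*>& records) {
    std::map<const rec*, std::size_t> index_of;
    for (std::size_t i = 0; i < records.size(); ++i) {
        index_of[records[i]] = i;
    }
    flat_graph out(records.size());
    for (std::size_t i = 0; i < records.size(); ++i) {
        for (const rec* target : records[i]->links) {
            out[i].push_back(index_of.at(target)); // throws if target is not in records
        }
    }
    return out;
}

// Loading: allocate every record first, then turn indices back into pointers.
std::vector<std::unique_ptr<rec>> unflatten(const flat_graph& in) {
    std::vector<std::unique_ptr<rec>> records;
    for (std::size_t i = 0; i < in.size(); ++i) {
        records.push_back(std::make_unique<rec>());
    }
    for (std::size_t i = 0; i < in.size(); ++i) {
        for (std::size_t target : in[i]) {
            records[i]->links.push_back(records.at(target).get());
        }
    }
    return records;
}
```

With this scheme no swizzle table needs to be written at all, since the index is the record's position in the stream.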
Or you could generate a GUID, write the GUID out with each record, and keep a swizzle table of {record address} to {GUID} in the old process. As you save records, you check whether each pointer is already in your swizzle table; if not, you add it. Then you write the GUID instead of the pointer. Don't write the swizzle table in this case; the unswizzle table of {GUID} to {record address} can be constructed just from the GUID header on each record. Then, using that unswizzle table, rebuild the links between records on the destination side.
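A rough sketch of the GUID variant follows. To keep it short it uses a plain counter as a stand-in for a real GUID, and an in-memory vector in place of a real archive; `rec`, `record_id`, `saved_rec`, `id_swizzler`, `save` and `load` are all hypothetical names chosen for this example.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <memory>
#include <utility>
#include <vector>

struct rec {
    std::vector<rec*> links;
};

using record_id = std::uint64_t; // assumption: a counter stands in for a GUID

struct saved_rec {
    record_id id;                 // the "GUID header" written with each record
    std::vector<record_id> links; // links written as ids, not addresses
};

// Swizzle table of {record address} -> {id}, filled lazily while saving.
struct id_swizzler {
    std::map<const rec*, record_id> ids;
    record_id next_id = 1;
    record_id id_for(const rec* r) {
        auto it = ids.find(r);
        if (it != ids.end()) return it->second;
        record_id id = next_id++;
        ids.emplace(r, id);
        return id;
    }
};

std::vector<saved_rec> save(const std::vector<rec*>& records) {
    id_swizzler sw;
    std::vector<saved_rec> out;
    for (const rec* r : records) {
        saved_rec s;
        s.id = sw.id_for(r);
        for (const rec* target : r->links) {
            s.links.push_back(sw.id_for(target)); // may name a record not saved yet
        }
        out.push_back(std::move(s));
    }
    return out; // note: the swizzle table itself is not written
}

std::vector<std::unique_ptr<rec>> load(const std::vector<saved_rec>& saved) {
    std::vector<std::unique_ptr<rec>> records;
    std::map<record_id, rec*> by_id; // unswizzle table, built from the headers alone
    for (const saved_rec& s : saved) {
        records.push_back(std::make_unique<rec>());
        by_id[s.id] = records.back().get();
    }
    for (std::size_t i = 0; i < saved.size(); ++i) {
        for (record_id id : saved[i].links) {
            auto it = by_id.find(id);
            // Ids with no matching record (including null links) become nullptr.
            records[i]->links.push_back(it == by_id.end() ? nullptr : it->second);
        }
    }
    return records;
}
```

Because ids are handed out on first sight, a link to a record that has not been saved yet is not a problem: the id is reserved when the link is written and reused when the record itself is reached.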