I need help making a decision. I have to transfer some data in my application and need to choose between these three technologies. I've read a little about each of them (tutorials, documentation) but still can't decide...
How do they compare?
I need metadata support (the ability to receive a file and read it without any additional information/files) and fast read/write operations; the ability to store dynamic data (like Python objects) would be a plus.
Things I already know:
- NumPy is pretty fast but can't store dynamic data (like Python objects). (What about metadata?)
- HDF5 is very fast, supports custom attributes, and is easy to use, but it can't store Python objects. Also, HDF5 serializes NumPy data natively, so, IMHO, NumPy has no advantage over HDF5 (a sketch of what I mean by attributes is below, after this list).
- Google Protocol Buffers support self-describing messages too and are pretty fast (though Python support is poor at the moment: slow and buggy). They CAN store dynamic data. Downsides: self-description doesn't work from Python, and messages >= 1 MB serialize/deserialize rather slowly.
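To make the "custom attributes" point concrete, this is roughly what I have in mind with h5py (the dataset path and attribute names are just placeholders):

```python
import h5py
import numpy as np

data = np.random.rand(1000, 3)  # some NumPy result to transfer

# Write the array plus metadata attributes into one self-contained file.
with h5py.File("results.h5", "w") as f:
    dset = f.create_dataset("simulation/positions", data=data, compression="gzip")
    dset.attrs["units"] = "nm"          # per-dataset metadata
    dset.attrs["timestep"] = 0.002
    f.attrs["description"] = "result of a NumPy/SciPy run"  # file-level metadata

# The receiver needs nothing but the file itself to read array + metadata.
with h5py.File("results.h5", "r") as f:
    dset = f["simulation/positions"]
    print(dset.attrs["units"], dset.shape, dset.dtype)
    arr = dset[...]  # comes back as a NumPy array
```

The same file would also be readable from C/C++ through the HDF5 C API, if that's the route I take.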
PS: the data I need to transfer is the "result of work" of NumPy/SciPy (arrays, arrays of complicated structs, etc.).
UPD: cross-language access (C/C++/Python) is required.
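For clarity, by "arrays of complicated structs" I mean something like the following sketch with a NumPy structured dtype (the field names are made up); as far as I understand, h5py maps this to an HDF5 compound type, which C/C++ readers could also handle:

```python
import h5py
import numpy as np

# Hypothetical "complicated struct" as a NumPy structured dtype.
particle = np.dtype([
    ("id", np.int32),
    ("pos", np.float64, (3,)),  # fixed-size subarray field
    ("energy", np.float64),
])
records = np.zeros(100, dtype=particle)

with h5py.File("records.h5", "w") as f:
    f.create_dataset("particles", data=records)

with h5py.File("records.h5", "r") as f:
    back = f["particles"][...]
    print(back.dtype)  # field names and types round-trip
```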