4

While exploring the FlatBuffers library for fast serialization, I noticed that it has an incredibly fast way to read FlatBuffers vectors into numpy arrays with the generated 'Variable'AsNumpy() accessors, but I have been unable to find (in the source) a corresponding method for serializing a numpy array into a FlatBuffer.

So far, I am seemingly stuck with their example:

for i in reversed(range(0, 10)):
  builder.PrependByte(i)

This is obviously not ideal for large arrays, since it makes one Python call per byte. In the other direction, one can simply call 'Variable'AsNumpy() on most data vectors, and that works great.

Is there something simple I'm missing or is this functionality just not available?

user1519665
  • It may well not be available. I'd recommend pinging @kbrose at https://github.com/google/flatbuffers/pull/4390 or opening a new issue to see if anyone wants to add it. – Aardappel Mar 12 '18 at 17:35
  • Fair. I am currently hoping to explore the use of the "internal" CreateByteVector() function of the builder class in order to build a function that takes the fast serialization of numpy.ndarray.tobytes() and does it that way. Having issues with the nested assertion statements, however. I do plan on requesting the feature but need a hack for something immediately in the meantime. – user1519665 Mar 12 '18 at 17:38

2 Answers

4

It can be done as follows. Suppose we want to write bytesOfImage = testImage.tobytes() into the buffer without calling PrependByte() once per byte.

Follow the steps below:

  1. Start the vector with the generated StartVector helper so the builder is set up correctly.

    Image.ImageStartDataVector(builder, len(bytesOfImage))
    

    This makes sure the buffer has room for at least len(bytesOfImage) bytes and applies any alignment padding. We don't need to worry about the details, because the StartVector API takes care of them. We only need to know where the head is after the StartVector() call.

  2. Move the head to the correct place before writing into the Bytes array.

    builder.head = builder.head - len(bytesOfImage)
    

    The builder fills its buffer back to front, so the head moves toward lower offsets as data is written. Stepping the head back by len(bytesOfImage) from where StartVector() left it puts it at the first byte of the region we are about to fill.

  3. Copy your data into the Bytes array with a single slice assignment.

    builder.Bytes[builder.head : (builder.head + len(bytesOfImage))] = bytesOfImage
    
  4. Call EndVector() so that the vector's length prefix is written and the head ends up in the correct place for future writes.

    data = builder.EndVector(len(bytesOfImage))
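
The head arithmetic in steps 2 and 3 can be illustrated with a small toy model. This is only a sketch: BackToFrontBuffer and its methods are hypothetical names made up for this example and are not part of the flatbuffers package. It shows that stepping the head back once and doing one slice copy leaves the same bytes in the buffer as prepending the payload one byte at a time in reverse order.

```python
# Toy model of a buffer that is filled back to front.
# NOTE: hypothetical class for illustration, NOT the flatbuffers Builder.
class BackToFrontBuffer:
    def __init__(self, size):
        self.data = bytearray(size)
        self.head = size  # offset of the first used byte; starts at the end

    def prepend_byte(self, value):
        # One byte per call, like the PrependByte loop in the question.
        self.head -= 1
        self.data[self.head] = value

    def prepend_block(self, payload):
        # Step the head back once, then copy everything in one slice assignment.
        self.head -= len(payload)
        self.data[self.head:self.head + len(payload)] = payload


payload = bytes(range(10))

# Slow path: prepend in reverse so the bytes end up in forward order.
slow = BackToFrontBuffer(16)
for b in reversed(payload):
    slow.prepend_byte(b)

# Fast path: a single bulk copy.
fast = BackToFrontBuffer(16)
fast.prepend_block(payload)

same_result = slow.data == fast.data and slow.head == fast.head
```

With the real builder, StartVector() and EndVector() additionally take care of alignment and the length prefix, which is why the steps above wrap the copy between those two calls.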
    
Amit Sharma
  • Glad you got this working. I'll accept your answer for the sake of the future, but people should also remain aware of the github issue. – user1519665 May 01 '18 at 14:15
1

See this GitHub issue comment for a workaround, and watch the issue to see whether the feature gets added:

https://github.com/google/flatbuffers/issues/4668#issuecomment-372430117
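
For readers on a newer release: later versions of the flatbuffers Python package include a Builder.CreateNumpyVector() method that writes a one-dimensional numpy array in one call. The following is a minimal sketch, assuming both numpy and a sufficiently recent flatbuffers package are installed; the array contents and buffer size are arbitrary choices for the example.

```python
import flatbuffers
import numpy as np

# Arbitrary initial buffer size; the builder grows it when needed.
builder = flatbuffers.Builder(1024)

# Example payload (assumption: a 1-D uint8 array, e.g. raw image bytes).
arr = np.arange(10, dtype=np.uint8)

# Writes the whole array into the buffer and returns the vector's offset,
# which is then passed to the generated Add<Field>() helper of a table.
vec_offset = builder.CreateNumpyVector(arr)

# After the call, the head points at the 4-byte length prefix,
# and the array data follows it.
start = builder.Head() + 4
written = bytes(builder.Bytes[start:start + arr.nbytes])
```

If the method is missing in the installed version, the direct write into builder.Bytes described in the linked comment remains the fallback.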

user1519665
  • Hi, did you get it working? I'm also looking for an efficient write operation for preparing large blobs, but it is not working for me. :( Can you please share your findings with me? – Amit Sharma Apr 26 '18 at 09:25
  • @AmitSharma As linked, the GitHub issue mentions direct access to the bytes. Yes, it is "hackish", but this is what we went with, and we have had no issues with it over the past 1.5 months. That said, we recently switched to Protocol Buffers for unrelated performance reasons. – user1519665 Apr 26 '18 at 15:27
  • Thanks for your reply. I'm trying the same, but it's not straightforward and not working for me, maybe because I have a complex schema. – Amit Sharma Apr 29 '18 at 16:47