
I'm using Python and the pure-cdb library. I have a large dataset with a .cdb extension whose content is binary images, and I want to read it. Because the dataset is large, I'm reading it in chunks, but when I set the chunk size to 1024 I get the error CDB too small, and when I increase it to 2048 I get struct.error: unpack requires a buffer of 8 bytes. What is the problem? Here is my code:

import cdblib
with open('a.cdb', 'rb') as file:
    while chunk := file.read(2048):
        reader = cdblib.Reader(chunk)
        for key, value in reader.iteritems():
            print(key, value)
            print('+{},{}:{}->{}'.format(len(key), len(value), key, value))

Thank you for your help.

M.rnnnn

1 Answer


Not a direct answer to your question, but an alternative solution: https://github.com/gstrauss/mcdb/ has no such limitations and includes a Python module (see https://github.com/gstrauss/mcdb/blob/master/contrib/python-mcdb/README).

mcdb removes the 4GB limit inherent in cdb.
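
As a side note on the errors themselves: a cdb reader needs random access to the whole file, because the fixed 2048-byte header at the start is a table of absolute offsets into the rest of the file. That is why a 1024-byte chunk fails the size check and a 2048-byte chunk blows up as soon as iteritems() follows an offset past the end of the buffer. Whichever library you choose, hand it the whole file; mmap keeps memory usage low. Here is a minimal sketch with the pure-cdb library you're already using, assuming cdblib.Reader accepts any buffer it can slice (such as an mmap object):

import mmap

import cdblib

with open('a.cdb', 'rb') as f:
    # Map the whole file instead of reading fixed-size chunks; the reader
    # needs to follow offsets anywhere in the file, but mmap avoids loading
    # it all into RAM up front.
    data = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    reader = cdblib.Reader(data)  # assumption: Reader only slices the buffer
    for key, value in reader.iteritems():
        print('+{},{}:{}->{}'.format(len(key), len(value), key, value))

If the file is (or will grow) past 4 GB, that is where mcdb comes in.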

gstrauss