I have some bson files (I don't have the database they came from, just the files, called file1.bson and file2.bson) and I would like to be able to translate them to json. My code is the following:

import json
import bson

to_convert = ["./file1", "./file2"]

for i in to_convert:

    INPUTF = i + ".bson"
    OUTPUTF = i + ".json"

    input_file = open(INPUTF, 'r', encoding='utf-8')
    output_file = open(OUTPUTF, 'w', encoding='utf-8')

    reading = (input_file.read()).encode() #reading = (input_file.read()+'\0').encode()
    datas = bson.BSON.decode(reading)
    json.dump(datas, output_file)

It raises "bson.errors.InvalidBSON: bad eoo", which seems to indicate the NULL char at the end of a file is missing, but even when I add it manually (as in the commented part) the error persists.

How can I fix this?

  • With mongodb comes a tool called [`bsondump`](http://docs.mongodb.org/manual/reference/program/bsondump/) that does just this: convert BSON files to JSON files. Might save you some coding. – Thomas Oct 11 '16 at 13:36
  • I'm not using mongodb, I just have the bson dumps. –  Oct 11 '16 at 13:38

1 Answer

Actually this answered my question. Weird how poorly documented the bson package is.

import json
import bson

to_convert = ["./file1", "./file2"]

for i in to_convert:

    INPUTF = i + ".bson"
    OUTPUTF = i + ".json"

    # BSON is a binary format: read it in 'rb' mode, which takes no encoding argument
    with open(INPUTF, 'rb') as input_file:
        raw = input_file.read()

    # decode_all() returns a list containing every document found in the file
    datas = bson.decode_all(raw)

    with open(OUTPUTF, 'w', encoding='utf-8') as output_file:
        json.dump(datas, output_file)

  • So I think this solution works because your files are not single BSON documents, but multiple unknown ones combined together. `decode_all()` can handle this, but `decode()` is designed for a single BSON document. – Mani Mar 11 '21 at 09:01
  • binary doesn't take encoding arg – Charming Robot Jun 28 '21 at 00:33
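
To make the point about multiple documents concrete: per the BSON spec, each document starts with its total size as a little-endian int32 and ends with a single 0x00 byte, and a dump file is typically just such documents written back to back. Below is a minimal sketch that only splits a buffer along those boundaries using the standard library. The function name `split_bson_documents` is made up for illustration and is not part of the bson package; it does not decode field values, so it is no replacement for `decode_all()`.

```python
import struct
from typing import List


def split_bson_documents(raw: bytes) -> List[bytes]:
    """Split concatenated BSON documents into one bytes object per document."""
    docs = []
    offset = 0
    while offset < len(raw):
        # The smallest valid document is 5 bytes: a 4-byte size plus the 0x00 terminator
        if len(raw) - offset < 5:
            raise ValueError("truncated BSON data at offset %d" % offset)
        # The size prefix counts the whole document, including itself
        (size,) = struct.unpack_from("<i", raw, offset)
        if size < 5 or offset + size > len(raw):
            raise ValueError("bad document size %d at offset %d" % (size, offset))
        doc = raw[offset:offset + size]
        # Every document must end with a 0x00 byte
        if doc[-1] != 0:
            raise ValueError("missing terminator at offset %d" % offset)
        docs.append(doc)
        offset += size
    return docs


# Two hand-built empty documents: size 5, no elements, terminator
empty_doc = struct.pack("<i", 5) + b"\x00"
print(len(split_bson_documents(empty_doc * 2)))  # 2
```

This also hints at why the original attempt failed: reading the file in text mode and re-encoding it can alter the bytes, and a single-document decoder expects the first size prefix to cover the entire buffer.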