I need to keep some binary data in my MongoDB collection. It seems that I get different JSON representations of my documents when retrieving the same record using either the C++ driver or the Java driver. Here is an example. Insert two records into a MongoDB collection using the Mongo shell:
db.binary_test.insert({"name":"Alex", "data" :BinData("0x00", "12345678")})
db.binary_test.insert({"name":"Alex", "data" :BinData("0x80", "12345678")})
The first record uses binary type 0x00 (generic); the second uses 0x80 (user-defined).
Retrieve these records using the Mongo shell:
db.binary_test.find().pretty()
Output:
{
"_id" : ObjectId("51acf66886174308b610d950"),
"name" : "Alex",
"data" : BinData(0,"12345678")
}
{
"_id" : ObjectId("51acf66c86174308b610d951"),
"name" : "Alex",
"data" : BinData(128,"12345678")
}
Note that the tag is represented as a number, not as a hex-string.
Now retrieve the same records using a very simple Java program and convert them to JSON using the strict serializer:
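import com.mongodb.util.JSONSerializers;
import com.mongodb.util.ObjectSerializer;
ObjectSerializer serializer = JSONSerializers.getStrict();
System.out.println(serializer.serialize(doc)); // doc: a DBObject returned by the query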
Here is the output:
{ "_id" : { "$oid" : "51acf66886174308b610d950"} , "name" : "Alex" , "data" : { "$binary" : "12345678" , "$type" : 0}}
{ "_id" : { "$oid" : "51acf66c86174308b610d951"} , "name" : "Alex" , "data" : { "$binary" : "12345678" , "$type" : -128}}
Note that the binary data type is represented as an integer, not a hex string, and that type 0x80 is printed as -128 rather than 128.
Now, for comparison, use the MongoDB C++ driver to retrieve the same two records and print them using the jsonString() method. Here is the output:
{ "_id" : { "$oid" : "51acf66886174308b610d950" }, "name" : "Alex", "data" : { "$binary" : "12345678", "$type" : "00" } }
{ "_id" : { "$oid" : "51acf66c86174308b610d951" }, "name" : "Alex", "data" : { "$binary" : "12345678", "$type" : "80" } }
Now the type is a hex-string, not a number.
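As far as I can tell, all three outputs describe the same one-byte subtype: 0x80 is 128 when read as an unsigned value and -128 when read as a signed Java byte. Here is a minimal Java sketch of converting between the forms. The class and method names are hypothetical and are not part of either driver:

```java
// Hypothetical helper, not part of the MongoDB Java or C++ driver.
final class BinarySubtypeDemo {

    // Reads the signed byte as an unsigned value in the range 0..255,
    // which matches the number the Mongo shell prints.
    static int toUnsigned(byte subtype) {
        return subtype & 0xFF;
    }

    // Formats the subtype as a two-digit lowercase hex string,
    // which matches the "$type" value in the C++ driver output.
    static String toHex(byte subtype) {
        return String.format("%02x", subtype & 0xFF);
    }

    // Parses a two-digit hex string back into a signed Java byte.
    static byte fromHex(String hex) {
        return (byte) Integer.parseInt(hex, 16);
    }

    public static void main(String[] args) {
        byte userDefined = (byte) -128;              // "$type" from the Java output
        System.out.println(toUnsigned(userDefined)); // prints 128
        System.out.println(toHex(userDefined));      // prints 80
        System.out.println(fromHex("80"));           // prints -128
    }
}
```

This only shows that the values are equivalent; it does not make the two drivers emit the same JSON.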
So the same record has different JSON representations depending on whether it was retrieved with the C++ driver or the Java driver. This discrepancy causes problems in mixed environments where some software uses the Java driver and some uses the C++ driver. Any suggestions on how to solve this (other than by changing the driver code)? And which one is correct: the C++ driver, which represents the type as a hex string, or the Java driver, which represents it as an integer? My understanding is that the representation returned by the C++ driver is correct, but can someone confirm this?
The MongoDB HTTP interface also returns the hex-string representation, probably because the backend that serves the REST interface (mongod) is written in C++.
I'm using Java driver version 2.11.1 and C++ driver version 2.4.3.