** Quick summary: C++ app loading data from SQL server using using OTL4, writing to Mongo using mongocxx bulk_write, the strings seem to getting mangled somehow so they don't work in the aggregation pipeline (but appear fine otherwise). **
I have a simple Mongo collection which doesn't seem to behave as expected with an aggregation pipeline when I'm projecting multiple fields. It's a trivial document, no nesting, fields are just doubles and strings.
First 2 queries work as expected:
> db.TemporaryData.aggregate( [ { $project : { ParametersId:1 } } ] )
{ "_id" : ObjectId("5c28f751a531251fd0007c72"), "ParametersId" : 526988617 }
{ "_id" : ObjectId("5c28f751a531251fd0007c73"), "ParametersId" : 526988617 }
{ "_id" : ObjectId("5c28f751a531251fd0007c74"), "ParametersId" : 526988617 }
{ "_id" : ObjectId("5c28f751a531251fd0007c75"), "ParametersId" : 526988617 }
{ "_id" : ObjectId("5c28f751a531251fd0007c76"), "ParametersId" : 526988617 }
> db.TemporaryData.aggregate( [ { $project : { Col1:1 } } ] )
{ "_id" : ObjectId("5c28f751a531251fd0007c72"), "Col1" : 575 }
{ "_id" : ObjectId("5c28f751a531251fd0007c73"), "Col1" : 579 }
{ "_id" : ObjectId("5c28f751a531251fd0007c74"), "Col1" : 616 }
{ "_id" : ObjectId("5c28f751a531251fd0007c75"), "Col1" : 617 }
{ "_id" : ObjectId("5c28f751a531251fd0007c76"), "Col1" : 622 }
But then combining doesn't return both the fields as expected.
> db.TemporaryData.aggregate( [ { $project : { ParametersId:1, Col1:1 } } ] )
{ "_id" : ObjectId("5c28f751a531251fd0007c72"), "ParametersId" : 526988617 }
{ "_id" : ObjectId("5c28f751a531251fd0007c73"), "ParametersId" : 526988617 }
{ "_id" : ObjectId("5c28f751a531251fd0007c74"), "ParametersId" : 526988617 }
{ "_id" : ObjectId("5c28f751a531251fd0007c75"), "ParametersId" : 526988617 }
{ "_id" : ObjectId("5c28f751a531251fd0007c76"), "ParametersId" : 526988617 }
It seems to be specific to the ParametersId field, for instance if I choose 2 other fields it's OK.
> db.TemporaryData.aggregate( [ { $project : { Col1:1, Col2:1 } } ] )
{ "_id" : ObjectId("5c28f751a531251fd0007c72"), "Col1" : 575, "Col2" : "1101-2" }
{ "_id" : ObjectId("5c28f751a531251fd0007c73"), "Col1" : 579, "Col2" : "1103-2" }
{ "_id" : ObjectId("5c28f751a531251fd0007c74"), "Col1" : 616, "Col2" : "1300-3" }
{ "_id" : ObjectId("5c28f751a531251fd0007c75"), "Col1" : 617, "Col2" : "1300-3" }
{ "_id" : ObjectId("5c28f751a531251fd0007c76"), "Col1" : 622, "Col2" : "1400-3" }
For some reason when I include ParametersId field, all hell breaks loose in the pipeline:
> db.TemporaryData.aggregate( [ { $project : { ParametersId:1, Col2:1, Col1:1, Col3:1 } } ] )
{ "_id" : ObjectId("5c28f751a531251fd0007c72"), "ParametersId" : 526988617, "Col1" : 575 }
{ "_id" : ObjectId("5c28f751a531251fd0007c73"), "ParametersId" : 526988617, "Col1" : 579 }
{ "_id" : ObjectId("5c28f751a531251fd0007c74"), "ParametersId" : 526988617, "Col1" : 616 }
{ "_id" : ObjectId("5c28f751a531251fd0007c75"), "ParametersId" : 526988617, "Col1" : 617 }
{ "_id" : ObjectId("5c28f751a531251fd0007c76"), "ParametersId" : 526988617, "Col1" : 622 }
DB version and the data:
> db.version()
4.0.2
> db.TemporaryData.find()
{ "_id" : ObjectId("5c28f751a531251fd0007c72"), "CellId" : 998909269, "ParametersId" : 526988617, "Order" : 1, "Col1" : 575, "Col2" : "1101-2", "Col3" : "CHF" }
{ "_id" : ObjectId("5c28f751a531251fd0007c73"), "CellId" : 998909269, "ParametersId" : 526988617, "Order" : 1, "Col1" : 579, "Col2" : "1103-2", "Col3" : "CHF" }
{ "_id" : ObjectId("5c28f751a531251fd0007c74"), "CellId" : 998909269, "ParametersId" : 526988617, "Order" : 1, "Col1" : 616, "Col2" : "1300-3", "Col3" : "CHF" }
{ "_id" : ObjectId("5c28f751a531251fd0007c75"), "CellId" : 998909269, "ParametersId" : 526988617, "Order" : 36, "Col1" : 617, "Col2" : "1300-3", "Col3" : "CHF" }
{ "_id" : ObjectId("5c28f751a531251fd0007c76"), "CellId" : 998909269, "ParametersId" : 526988617, "Order" : 1, "Col1" : 622, "Col2" : "1400-3", "Col3" : "CHF" }
Update: enquoting the field names makes no difference. I'm typing all the above in the mongo.exe command line, but I see the same behavior in my C++ application with a slightly more complex pipeline (projecting all fields to guarantee order).
This same app is actually creating the data in the first place - does anyone know anything which can go wrong? All using the mongocxx lib.
** update **
Turns out there's something going wrong with my handling of strings. Without the string fields in the data, it's all fine. So I've knackered my strings, somehow, even though they look and behave correctly in other ways they don't play nice with the aggregation pipeline. I'm using mongocxx::collection.bulk_write to write standard std::strings which are being loaded from sql server through the OTL4 header. In-between there's a strncpy_s when they get stored internally. I can't seem to create a simple reproducible example.