2

Within Spoon I have used the mongoDB Input step. for a given document of form..

{"Number": [
    "4700100004"
],
"Random": [
    "unknown"
],
"List_Of_Vals1": [
    "3",
    "2",
    "1",
],
"List_Of_Vals2": [
    "1",
    "2",
    "3",
]}

I am able to unwind one of the arrays using Mongo query from pdi

[{"$unwind":"$List_Of_Vals1"}]

which produces::

Number       Random    List_Of_Vals1    List_Of_Vals2
"4700100004" "unknown" "3"              ["1","2","3"]
"4700100004" "unknown" "2"              ["1","2","3"]
"4700100004" "unknown" "1"              ["1","2","3"]

But ultimately I need to unwind both arrays sequentially which I thought I might be able to do by writing

[{"$unwind":"$List_Of_Vals1"},{"$unwind":"$List_Of_Vals2"}]

but this returns a duplication of "List_Of_Vals1"

Number       Random    List_Of_Vals1    List_Of_Vals2
"4700100004" "unknown" "3"              "1"
"4700100004" "unknown" "3"              "2"
"4700100004" "unknown" "3"              "3"
...
...
...

What I cant seem to figure out how to get is both unwound without duplication as such:

Number       Random    List_Of_Vals1    List_Of_Vals2
"4700100004" "unknown" "3"              "1"
"4700100004" "unknown" "2"              "2"
"4700100004" "unknown" "1"              "3"

Any Help is much appreciated. Thanks

Tyler Cowan
  • 820
  • 4
  • 13
  • 35

1 Answers1

1

You can use the includeArrayIndex option for $unwind introduced in Mongo 3.2 to achieve this. For each document outputted by the $unwind operation, it adds a new field holding the array index.

You can use these fields in $project and $match stages to filter the output to the correct subset of documents, i.e. those where the indexes match.

db.test.aggregate([
  { $unwind: { path: "$List_Of_Vals1", includeArrayIndex : "index1" } },
  { $unwind: { path: "$List_Of_Vals2", includeArrayIndex : "index2" } },
  { $project: { 
         _id : 1,
         Number: 1,
         Random: 1,
         List_Of_Vals1: 1,
         List_Of_Vals2: 1,
         valid: { $eq: ["$index1", "$index2"] } }
   },
  { $match: { valid: true } } 
]);

Note that unwinding 2 arrays of size n will produce n x n results so you may run into problems with limits if both arrays can be large.

There is an upcoming feature for a $zip operator that should enable a better solution.

sheilak
  • 5,833
  • 7
  • 34
  • 43