My goal is to take an input document that contains an array in a subtree and produce an array of copies of the whole document, one copy per array element, with that element set as a single value in its copy.

As an example:

Starting Document:

{
  "config": {
    "activeConfig": {
      "sourceDatabase": "test",
      "targetSites": [
        {
          "siteName": "location1",
          "targetDatabase": "devl",
          "siteShortName": "123"
        },
        {
          "siteName": "location2",
          "targetDatabase": "123",
          "siteShortName": "123"
        }
      ]
    }
  },
  "secondData": {
    "queries": [
      {
        "Tablename": "abc",
        "Query": "123"
      }
    ]
  }
}

Expected output:

[ {
  "config" : {
    "activeConfig" : {
      "sourceDatabase" : "test",
      "targetSites" : [ {
        "siteName" : "location1",
        "targetDatabase" : "devl",
        "siteShortName" : "123"
      }, {
        "siteName" : "location2",
        "targetDatabase" : "123",
        "siteShortName" : "123"
      } ],
      "currentSite" : {
        "siteName" : "location1",
        "targetDatabase" : "devl",
        "siteShortName" : "123"
      }
    }
  },
  "secondData" : {
    "queries" : [ {
      "Tablename" : "abc",
      "Query" : "123"
    } ]
  }
},
 {
  "config" : {
    "activeConfig" : {
      "sourceDatabase" : "test",
      "targetSites" : [ {
        "siteName" : "location1",
        "targetDatabase" : "devl",
        "siteShortName" : "123"
      }, {
        "siteName" : "location2",
        "targetDatabase" : "123",
        "siteShortName" : "123"
      } ],
      "currentSite" : {
        "siteName" : "location2",
        "targetDatabase" : "123",
        "siteShortName" : "123"
      }
    }
  },
  "secondData" : {
    "queries" : [ {
      "Tablename" : "abc",
      "Query" : "123"
    } ]
  }
} ]
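
The mapping from the starting document to the expected output above can be sketched outside of Jolt. This is only a plain-Python illustration of the intended result, not a Jolt spec; `fan_out` is a hypothetical helper name introduced for the example.

```python
import copy


def fan_out(document):
    """Return one deep copy of `document` per targetSites entry, each with
    that entry stored under config.activeConfig.currentSite."""
    sites = document["config"]["activeConfig"]["targetSites"]
    copies = []
    for site in sites:
        doc_copy = copy.deepcopy(document)
        doc_copy["config"]["activeConfig"]["currentSite"] = copy.deepcopy(site)
        copies.append(doc_copy)
    return copies


# The starting document from the question.
document = {
    "config": {
        "activeConfig": {
            "sourceDatabase": "test",
            "targetSites": [
                {"siteName": "location1", "targetDatabase": "devl", "siteShortName": "123"},
                {"siteName": "location2", "targetDatabase": "123", "siteShortName": "123"},
            ],
        }
    },
    "secondData": {"queries": [{"Tablename": "abc", "Query": "123"}]},
}

result = fan_out(document)
for doc in result:
    print(doc["config"]["activeConfig"]["currentSite"]["siteName"])
```

Each element of `result` keeps the full `targetSites` array and gains exactly one `currentSite`, which is the shape the Jolt spec needs to reproduce.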

The JOLT spec I have so far is as follows:

[
  {
    "operation": "shift",
    "spec": {
      "config": {
        "activeConfig": {
          "targetSites": {
            "*": {
              "@4": "[]",
              "@": "[].config.activeConfig.currentSite"
            }
          }
        }
      }
    }
  }
]

This gets me close, but not quite there. The output is:

[ {
  "config" : {
    "activeConfig" : {
      "sourceDatabase" : "test",
      "targetSites" : [ {
        "siteName" : "location1",
        "targetDatabase" : "devl",
        "siteShortName" : "123"
      }, {
        "siteName" : "location2",
        "targetDatabase" : "123",
        "siteShortName" : "123"
      } ]
    }
  },
  "secondData" : {
    "queries" : [ {
      "Tablename" : "abc",
      "Query" : "123"
    } ]
  }
}, {
  "config" : {
    "activeConfig" : {
      "currentSite" : {
        "siteName" : "location1",
        "targetDatabase" : "devl",
        "siteShortName" : "123"
      }
    }
  }
}, {
  "config" : {
    "activeConfig" : {
      "sourceDatabase" : "test",
      "targetSites" : [ {
        "siteName" : "location1",
        "targetDatabase" : "devl",
        "siteShortName" : "123"
      }, {
        "siteName" : "location2",
        "targetDatabase" : "123",
        "siteShortName" : "123"
      } ]
    }
  },
  "secondData" : {
    "queries" : [ {
      "Tablename" : "abc",
      "Query" : "123"
    } ]
  }
}, {
  "config" : {
    "activeConfig" : {
      "currentSite" : {
        "siteName" : "location2",
        "targetDatabase" : "123",
        "siteShortName" : "123"
      }
    }
  }
} ]

This spec creates the structures I am looking for but doesn't merge them, so my final array ends up with 4 items: 2 copies of the original document and the 2 items from the configuration array. My goal is to have those 2 items merged into the document copies, so that I end up with two copies of the original document, each configured with one value.

The only other spec I've gotten close with is:

[
  {
    "operation": "shift",
    "spec": {
      "config": {
        "activeConfig": {
          "targetSites": {
            "*": {
              "@4": "[&]",
              "@": "[&].config.activeConfig.currentSite"
            }
          }
        }
      }
    }
  }
]

This results in two document copies in the final array, but the currentSite section ends up with ALL values from the configuration array in each copy, rather than one per copy:

[ {
  "config" : {
    "activeConfig" : {
      "sourceDatabase" : "test",
      "targetSites" : [ {
        "siteName" : "location1",
        "targetDatabase" : "devl",
        "siteShortName" : "123"
      }, {
        "siteName" : "location2",
        "targetDatabase" : "123",
        "siteShortName" : "123"
      } ],
      "currentSite" : [ {
        "siteName" : "location1",
        "targetDatabase" : "devl",
        "siteShortName" : "123"
      }, {
        "siteName" : "location2",
        "targetDatabase" : "123",
        "siteShortName" : "123"
      } ]
    }
  },
  "secondData" : {
    "queries" : [ {
      "Tablename" : "abc",
      "Query" : "123"
    } ]
  }
}, {
  "config" : {
    "activeConfig" : {
      "sourceDatabase" : "test",
      "targetSites" : [ {
        "siteName" : "location1",
        "targetDatabase" : "devl",
        "siteShortName" : "123"
      }, {
        "siteName" : "location2",
        "targetDatabase" : "123",
        "siteShortName" : "123"
      } ],
      "currentSite" : [ {
        "siteName" : "location1",
        "targetDatabase" : "devl",
        "siteShortName" : "123"
      }, {
        "siteName" : "location2",
        "targetDatabase" : "123",
        "siteShortName" : "123"
      } ]
    }
  },
  "secondData" : {
    "queries" : [ {
      "Tablename" : "abc",
      "Query" : "123"
    } ]
  }
} ]

(As to WHY, the next step on this document will be to split it into two flow files in a NiFi flow, which will allow each file to be configured separately)

Appreciate any input or assistance you can provide.

Update:

Found another interesting behavior that I'm struggling to grasp.

When I use the following spec, I get an output that doesn't make sense to me.

Spec:

[
  {
    "operation": "shift",
    "spec": {
      "config": {
        "activeConfig": {
          "targetSites": {
            "*": {
              "@4": "[&]",
              "@": "[&].config.activeConfig.currentSite&"
            }
          }
        }
      }
    }
  }
]

Output:

[ {
  "config" : {
    "activeConfig" : {
      "sourceDatabase" : "test",
      "targetSites" : [ {
        "siteName" : "location1",
        "targetDatabase" : "devl",
        "siteShortName" : "123"
      }, {
        "siteName" : "location2",
        "targetDatabase" : "123",
        "siteShortName" : "123"
      } ],
      "currentSite0" : {
        "siteName" : "location1",
        "targetDatabase" : "devl",
        "siteShortName" : "123"
      },
      "currentSite1" : {
        "siteName" : "location2",
        "targetDatabase" : "123",
        "siteShortName" : "123"
      }
    }
  },
  "secondData" : {
    "queries" : [ {
      "Tablename" : "abc",
      "Query" : "123"
    } ]
  }
}, {
  "config" : {
    "activeConfig" : {
      "sourceDatabase" : "test",
      "targetSites" : [ {
        "siteName" : "location1",
        "targetDatabase" : "devl",
        "siteShortName" : "123"
      }, {
        "siteName" : "location2",
        "targetDatabase" : "123",
        "siteShortName" : "123"
      } ],
      "currentSite0" : {
        "siteName" : "location1",
        "targetDatabase" : "devl",
        "siteShortName" : "123"
      },
      "currentSite1" : {
        "siteName" : "location2",
        "targetDatabase" : "123",
        "siteShortName" : "123"
      }
    }
  },
  "secondData" : {
    "queries" : [ {
      "Tablename" : "abc",
      "Query" : "123"
    } ]
  }
} ]

I tried changing the output path "@": "[&].config.activeConfig.currentSite&" to use & in two places. This behaves similarly to my second example above where both values end up in both copies, but you can see that in this case, one ends up in currentSite0 and one ends up in currentSite1, in BOTH array indices 0 and 1. This implies that & is behaving like it has values 0 and 1 simultaneously when evaluated within the expression "[&].config.activeConfig.currentSite&". I'm very clearly missing some nuance of the behavior.

Tyler Lee

1 Answer

You have to use two shifts. Generally speaking, when doing "stuff" with arrays, you need one shift operation per "thing" you are trying to do.

In your case, you want to 1) duplicate the content into an output array and 2) copy a specific targetSites entry into each duplicate.
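
A possible reason the single-shift attempts in the question put every site into every copy: assuming (this is not confirmed anywhere in this thread) that shift places references to the input objects into the output rather than deep copies, `"@4": "[&]"` would store the same document object at indices 0 and 1, so a write through one index is visible through the other. The Python sketch below is a toy model of that aliasing only, not Jolt's implementation; `shared_doc` and `output` are illustrative names.

```python
# Toy model of the suspected aliasing; plain Python, not Jolt.
shared_doc = {
    "config": {
        "activeConfig": {
            "targetSites": [
                {"siteName": "location1"},
                {"siteName": "location2"},
            ]
        }
    }
}

sites = shared_doc["config"]["activeConfig"]["targetSites"]

# Analogue of "@4": "[&]": both slots reference the SAME object.
output = [shared_doc for _ in sites]

# Analogue of "@": "[&].config.activeConfig.currentSite&":
# each write goes through a different index but lands in the shared object.
for index, site in enumerate(sites):
    output[index]["config"]["activeConfig"][f"currentSite{index}"] = site

keys_0 = sorted(output[0]["config"]["activeConfig"])
keys_1 = sorted(output[1]["config"]["activeConfig"])
print(keys_0)
print(keys_1)
```

In this model both slots list `currentSite0` and `currentSite1`, which matches the output in the question's update. It is also why a second pass that only reads from the copies and writes into fresh output avoids the problem.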

Spec

[
  // Step 1: Make the copies of the input data, based on the number
  //  of items in the targetSites array.
  {
    "operation": "shift",
    "spec": {
      "config": {
        "activeConfig": {
          "targetSites": {
            "*": { // targetSites array index
              // go back up 4 levels and grab the whole tree "@4"
              // and write it to the output as a top level array
              // indexed by the "targetSites array index"
              "@4": "[&1]"
            }
          }
        }
      }
    }
  },
  {
    // Step 2 : Annoyingly copy everything across, but use the 
    //  value of the top level array index, to copy the "right" 
    //  data out of the targetSites array.
    "operation": "shift",
    "spec": {
      "*": { // top level array index
        "config": {
          "activeConfig": {
            "sourceDatabase": "[&3].config.activeConfig.sourceDatabase", // straight copy across
            "targetSites": {
              "@": "[&4].config.activeConfig.targetSites", // straight copy across
              //
              // Nifty but very rarely used feature.
              // Use "&3" to lookup the "current" value of the top level array index
              //  and then use that as an index into the targetSites array, and copy
              //  that across as "currentSite"
              "&3": "[&4].config.activeConfig.currentSite"
            }
          }
        },
        "secondData": "[&1].secondData" // straight copy across
      }
    }
  }
]
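
For readers less familiar with Jolt's `&` lookups, the two steps of the spec above can be mirrored in plain Python. This is a rough analogue applied to the question's input shape, not a description of how Jolt evaluates the spec; `step1_make_copies` and `step2_rebuild` are hypothetical names.

```python
def step1_make_copies(document):
    """Analogue of the first shift: one output slot per targetSites index."""
    sites = document["config"]["activeConfig"]["targetSites"]
    # The slots may all reference the same object; step 2 only reads from them.
    return [document for _ in sites]


def step2_rebuild(copies):
    """Analogue of the second shift: copy everything across into a fresh
    document and use the top-level index to pick the matching targetSites entry."""
    rebuilt = []
    for index, doc in enumerate(copies):
        active = doc["config"]["activeConfig"]
        rebuilt.append({
            "config": {
                "activeConfig": {
                    "sourceDatabase": active["sourceDatabase"],
                    "targetSites": active["targetSites"],
                    # Same idea as the "&3" key: index into targetSites
                    # with the position of this copy in the output array.
                    "currentSite": active["targetSites"][index],
                }
            },
            "secondData": doc["secondData"],
        })
    return rebuilt


document = {
    "config": {
        "activeConfig": {
            "sourceDatabase": "test",
            "targetSites": [
                {"siteName": "location1", "targetDatabase": "devl", "siteShortName": "123"},
                {"siteName": "location2", "targetDatabase": "123", "siteShortName": "123"},
            ],
        }
    },
    "secondData": {"queries": [{"Tablename": "abc", "Query": "123"}]},
}

final = step2_rebuild(step1_make_copies(document))
for doc in final:
    print(doc["config"]["activeConfig"]["currentSite"]["siteName"])
```

Because the second step builds new containers instead of writing into the copies from the first step, each output document ends up with its own `currentSite`.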
Milo S
  • That explains why I couldn't figure it out... Was trying real hard to do it all in one. This looks to be what I need and also teaches me more about Jolt. I appreciate the explanation comments on top of just providing the answer. Cheers! – Tyler Lee Dec 16 '17 at 05:00