
Is there a way to rename fields by prepending the values of other JSON attributes to the existing attribute names using a JOLT transformation?

Suppose we have this input:

{
  "auth_id": "0000-0000-0000",
  "read_time": "2022-01-10T00:00:00.0",
  "src_name": "REQ-A001",
  "reading_a": "150.18",
  "reading_b": "12.10",
  "reading_c": "3.00",
  "note": 1
}

I want to prepend the values of auth_id and read_time to every existing field name, using a colon (:) as the separator, so that the result becomes:

Expected:

{
  "0000-0000-0000:2022-01-10T00:00:00.0:auth_id": "0000-0000-0000",
  "0000-0000-0000:2022-01-10T00:00:00.0:read_time": "2022-01-10T00:00:00.0",
  "0000-0000-0000:2022-01-10T00:00:00.0:src_name": "REQ-A001",
  "0000-0000-0000:2022-01-10T00:00:00.0:reading_a": "150.18",
  "0000-0000-0000:2022-01-10T00:00:00.0:reading_b": "12.10",
  "0000-0000-0000:2022-01-10T00:00:00.0:reading_c": "3.00",
  "0000-0000-0000:2022-01-10T00:00:00.0:note": 1
}
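
To make the target mapping explicit outside of JOLT, here is a minimal Python sketch of the renaming described above. The function name `prefix_keys` and its parameters are hypothetical and used only for illustration; it is not part of JOLT or NiFi.

```python
def prefix_keys(record, key_fields=("auth_id", "read_time"), sep=":"):
    """Return a new dict whose keys are prefixed with the joined key_fields values."""
    # Build the shared prefix, e.g. "0000-0000-0000:2022-01-10T00:00:00.0"
    prefix = sep.join(str(record[field]) for field in key_fields)
    # Prepend the prefix to every field name; values keep their original types
    return {f"{prefix}{sep}{name}": value for name, value in record.items()}


record = {
    "auth_id": "0000-0000-0000",
    "read_time": "2022-01-10T00:00:00.0",
    "src_name": "REQ-A001",
    "reading_a": "150.18",
    "reading_b": "12.10",
    "reading_c": "3.00",
    "note": 1,
}
result = prefix_keys(record)
```

Every key in `result` starts with the same `auth_id:read_time:` prefix, and `note` stays an integer.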

So far I've come up with this JOLT spec:

[
  {
    "operation": "modify-default-beta",
    "spec": {
      // concatenate field value auth_id and read_time, while using ":" as separator
      "ukey": "=concat(@(1,auth_id),':',@(1,read_time))"
    }
  },
  {
    "operation": "shift",
    "spec": {
      "*": {
        // concatenate field values before field name
        "@": "@(2,ukey):&"
      }
    }
  }
]

But I got this output:

{
  "0000-0000-0000:2022-01-10T00:00:00.0": {
    ":auth_id": "0000-0000-0000",
    ":read_time": "2022-01-10T00:00:00.0",
    ":src_name": "REQ-A001",
    ":reading_a": "150.18",
    ":reading_b": "12.10",
    ":reading_c": "3.00",
    ":note": 1
  }
}

I plan to use the JOLT spec in the JoltTransformJSON processor in NiFi.

Any help or guidance is much appreciated!

Mohammadreza Khedri
Linden HSU
    Thank you! This community has saved my sanity crisis on the job for over five years, and though I haven't answered a question yet, I think I could give out a neat, readable question. – Linden HSU Mar 09 '23 at 06:17

2 Answers


You can use the following transformation spec:

[
  {
    "operation": "modify-overwrite-beta",
    "spec": {
      "ukey": "=concat(@(1,auth_id),':',@(1,read_time),':')"
    }
  },
  {
    "operation": "shift",
    "spec": {
      "*": {
        "$": "@(0)" // reverse key-value pairs
      },
      "ukey": "&"
    }
  },
  {
    "operation": "modify-overwrite-beta",
    "spec": {
      "*": "=concat(@(1,ukey),@(1,&))"
    }
  },
  {
    "operation": "remove",
    "spec": {
      "ukey": ""
    }
  },
  {
    "operation": "shift",
    "spec": {
      "*": {
        "$": "@(0)" // reverse key-value pairs back
      }
    }
  },
  {
    "operation": "modify-overwrite-beta",
    "spec": {
      "*note": "=toInteger"
    }
  }
]

You can try this spec on the demo site http://jolt-demo.appspot.com/.
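
One caveat of the key-value reversal in this first spec, as noted in the comment under this answer, is that fields sharing the same value get merged into an array when the pairs are reversed. The following Python sketch only imitates that behaviour to show why it happens; the `invert` helper and the sample values are hypothetical, and this is not JOLT's actual implementation.

```python
def invert(record):
    """Swap keys and values; names that share a value are collected into a list."""
    inverted = {}
    for name, value in record.items():
        key = str(value)  # object keys are strings, so the original type is lost here
        if key not in inverted:
            inverted[key] = name
        elif isinstance(inverted[key], list):
            inverted[key].append(name)
        else:
            # Second field with the same value: turn the entry into a list
            inverted[key] = [inverted[key], name]
    return inverted


# Hypothetical input in which two fields share the value "0"
sample = {"reading_a": "150.18", "reading_b": "0", "reading_c": "0"}
inverted = invert(sample)
```

Here `inverted["0"]` holds both `reading_b` and `reading_c`, so the two fields can no longer be told apart when the pairs are reversed back. The loss of type when a value becomes a key is also why the spec ends with a `=toInteger` step for `note`.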

An alternative that gives the same result applies two consecutive shift transformations after the modify transformation you already have:

[
  {
    "operation": "modify-default-beta",
    "spec": {
      "ukey": "=concat(@(1,auth_id),':',@(1,read_time))"
    }
  },
  { // nest the attributes under common object with "ukey" key
    "operation": "shift",
    "spec": {
      "*": {
        "@": "@(2,ukey).&"
      }
    }
  },
  {// concatenate upper object and one level inner object keys for the attributes except for "ukey"
    "operation": "shift",
    "spec": {
      "*": {
        "*": {
          "@": "&2:&1" // concatenation 
        },
        "ukey": {      // exception case
          "*": {
            "*": ""
          }
        }
      }
    }
  }
]

This spec can also be tried on the demo site http://jolt-demo.appspot.com/.

Barbaros Özhan
    Though in the first spec, I found that it will **merge field names into an array** on the first key-value reverse if they share the same value (such as `0`, `null` etc.), which I wasn't expecting, the second spec is highly readable and looks nice to people who weren't familiar with the wildcards. Thank you! – Linden HSU Mar 09 '23 at 06:19
    You're welcome @LindenHSU have a nice study and work! Yes, I've thought on case to make it better after the first one. Best wishes. – Barbaros Özhan Mar 09 '23 at 06:19

You can use this shorter spec:

[
  {
    "operation": "shift",
    "spec": {
      "*": "@(1,auth_id).@(1,read_time).&"
    }
  },
  {
    "operation": "shift",
    "spec": {
      "*": {
        "*": {
          "*": "&2:&1:&"
        }
      }
    }
  }
]


Alternatively, you can use this approach:

[
  {
    "operation": "shift",
    "spec": {
      "*": {
        "@(2,auth_id)": "keys[#2]",
        "@(2,read_time)": "keys[#2]",
        "$": "keys[#2]",
        "@": "values"
      }
    }
  },
  {
    "operation": "modify-overwrite-beta",
    "spec": {
      "keys": {
        "*": "=join(':',@0)"
      }
    }
  },
  {
    "operation": "shift",
    "spec": {
      "values": {
        "*": "@(2,keys[#1])"
      }
    }
  }
]

1. Shift operation: create the keys and the values separately.

2. Modify operation: concatenate the parts of each key with a colon (:).

3. Shift operation: pair each joined key with its value to create the desired output.

Note: Please run each spec separately to understand this code better.
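
The three steps above can be mirrored in plain Python to see what the intermediate data looks like. This is only a sketch of the idea, assuming a shortened version of the question's input; the variable names `keys`, `values`, and `joined` are chosen to match the spec and are otherwise arbitrary.

```python
record = {
    "auth_id": "0000-0000-0000",
    "read_time": "2022-01-10T00:00:00.0",
    "src_name": "REQ-A001",
    "note": 1,
}

# Step 1 (shift): collect the key parts and the values in two parallel lists
keys = [[record["auth_id"], record["read_time"], name] for name in record]
values = list(record.values())

# Step 2 (modify): join the parts of each key with ":"
joined = [":".join(parts) for parts in keys]

# Step 3 (shift): pair each joined key with the value at the same index
result = dict(zip(joined, values))
```

Because the values are never turned into keys, fields that share a value do not collide and `note` keeps its integer type.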

Mohammadreza Khedri
    I didn't know using the wildcards in these specs would be this neat! I've read both specs about how the data transform from one operation to another, and wow, such a super move. Though both of you have a great answer, still, I'm accepting this one as the answer. – Linden HSU Mar 09 '23 at 06:20