
I have file1.json and a plain-text file2. For each value in file2, I want to find the case in file1.json whose CaseName matches it and output the corresponding CaseID. The resulting file should contain only those CaseID values. The sample inputs and the expected output are shown below.

I tried to extract the values with awk, but I don't get the expected result:

 awk -F, 'FNR==NR {f2[$1];next} !($0 in f2)' file2 file1

file1.json

{
    "Cases": [{
            "CaseID": "100",
            "CaseUpdatedByUser": "XYZ",
            "Case": {
                "CaseName": "Apple",
                "ID": "1"
            }
        },
        {
            "CaseID": "350",
            "CaseUpdatedByUser": "ABC",
            "Case": {
                "CaseName": "Mango",
                "ID": "1"
            }
        },
        {
            "CaseID": "440",
            "CaseUpdatedByUser": "PQR",
            "Case": {
                "CaseName": "Strawberry",
                "ID": "1"
            }
        }
    ]
}

file2

Apple
Strawberry
Mango

Expected output:

100
350
440
mpx
  • This might help: `jq -r '.Cases[] | "\(.Case.CaseName);\(.CaseID)"' file1` – Cyrus Jul 15 '21 at 16:37
  • @Cyrus My system is configured with a custom repo, so I won't be able to download `jq` – Suhas Poojari Jul 15 '21 at 16:40
  • @Cyrus Does this command compare the values with file2? I don't see anything about file2 in the command – Suhas Poojari Jul 15 '21 at 16:42
  • No. Correct. But this makes it easier to process the output with `awk`. – Cyrus Jul 15 '21 at 17:12
  • @Cyrus I managed to get the result using `jq`, but when I then run `awk` on it, it just prints the contents of the last file argument. I redirected the result into f1 with `jq -r '.Cases[] | "\(.Case.CaseName);\(.CaseID)"' file1 >> f1`, so f1 contains the lines `Apple;100`, `Mango;350` and `Strawberry;440`. If I now run `awk -F, 'FNR==NR {f2[$1];next} !($0 in f2)' file2 f1`, the output is simply the contents of f1 – Suhas Poojari Jul 15 '21 at 17:56
  • Is the order of the numbers important? – Cyrus Jul 15 '21 at 19:24
  • @Cyrus Yes, the output should be sorted in ascending order – Suhas Poojari Jul 15 '21 at 20:01

2 Answers


How about writing an extract.py module that pulls out exactly the information you need?

The module is self-contained, so it can be imported into any project.

I tried it with a long, complex JSON file and it worked fine.

The code of the module is:

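#extract.py

def json_extract(obj, key):
    """Recursively collect every value stored under `key` in nested JSON."""
    arr = []

    def extract(obj, arr, key):
        """Walk dicts and lists, appending matching scalar values to arr."""
        if isinstance(obj, dict):
            for k, v in obj.items():
                if isinstance(v, (dict, list)):
                    # Containers are searched, never appended as values themselves.
                    extract(v, arr, key)
                elif k == key:
                    arr.append(v)
        elif isinstance(obj, list):
            for item in obj:
                extract(item, arr, key)
        return arr

    values = extract(obj, arr, key)
    return values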

For further explanation, see the original post, "Extract Nested Data From Complex JSON".
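Note that json_extract(data, "CaseID") returns every CaseID in the document, regardless of what file2 contains. To answer the question as asked, the lookup also has to filter on CaseName. Below is a minimal sketch of that, using only Python's standard json module (so jq is not needed). The function name case_ids_for_names is made up for illustration, the inline strings stand in for file1.json and file2, and the sort assumes CaseID values are numeric strings as in the sample.

```python
import json


def case_ids_for_names(data, names):
    """Return the CaseID of every case whose CaseName is in names, ascending."""
    wanted = set(names)
    ids = [
        case["CaseID"]
        for case in data.get("Cases", [])
        if case.get("Case", {}).get("CaseName") in wanted
    ]
    # Assumption: CaseID values are numeric strings, so sort them as integers.
    return sorted(ids, key=int)


# Inline stand-ins for file1.json and file2 so the sketch runs on its own.
FILE1_TEXT = """
{"Cases": [
  {"CaseID": "100", "CaseUpdatedByUser": "XYZ", "Case": {"CaseName": "Apple", "ID": "1"}},
  {"CaseID": "350", "CaseUpdatedByUser": "ABC", "Case": {"CaseName": "Mango", "ID": "1"}},
  {"CaseID": "440", "CaseUpdatedByUser": "PQR", "Case": {"CaseName": "Strawberry", "ID": "1"}}
]}
"""
FILE2_TEXT = "Apple\nStrawberry\nMango\n"

data = json.loads(FILE1_TEXT)
# One name per line; blank lines are ignored.
names = [line.strip() for line in FILE2_TEXT.splitlines() if line.strip()]

print("\n".join(case_ids_for_names(data, names)))
# prints:
# 100
# 350
# 440
```

With the real files, the two inline strings would be replaced by reading file1.json with json.load and file2 line by line.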

Pin90

With jq, awk and sort:

jq -r '.Cases[] | "\(.Case.CaseName);\(.CaseID)"' file1 \
  | awk -F ';' 'NR==FNR{array[$1]=$2; next} {print array[$1]}' - file2 \
  | sort -n

Output:

100
350
440
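
For readers less familiar with the NR==FNR idiom, the awk stage above is a two-pass lookup: the first input (the jq output) fills a table keyed by name, and the second input (file2) queries it. A rough Python sketch of the same logic follows; lookup_ids is a hypothetical name, and the lists stand in for the jq output and for file2.

```python
def lookup_ids(pairs, names):
    """Mirror the awk stage: the first input fills a table, the second queries it."""
    table = {}
    for line in pairs:
        # NR==FNR branch: array[$1] = $2, with ';' as the field separator.
        name, _, case_id = line.partition(";")
        table[name] = case_id
    # Second input: print array[$1]; awk prints an empty line for unknown names.
    return [table.get(name, "") for name in names]


pairs = ["Apple;100", "Mango;350", "Strawberry;440"]  # what the jq stage emits
names = ["Apple", "Strawberry", "Mango"]              # contents of file2

# Equivalent of `sort -n`, which treats an empty line as 0.
result = sorted(lookup_ids(pairs, names), key=lambda s: int(s or 0))
print("\n".join(result))
```

This also shows why the attempt in the question prints all of f1: with `-F,` and `!($0 in f2)`, each whole line such as `Apple;100` is tested against keys like `Apple`, never matches, and is therefore printed.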
Cyrus