I am trying to extract data from a large JSON file. As the data is extracted, I need to combine values from the parent objects with values from the nested objects and lists.
I have written dozens of specs for this. The closest 'wrong' solution left the output full of null values. The current spec is very close, but it groups the values for each program into lists, and I need one output object per listing instead. It uses two shifts.
Input:
[
{
"Contents": [
{
"original": "<h4>Hour 1</h4>",
"type": "other"
},
{
"content": {
"artist": "01art1816-01",
"catalog_no": "cat1816-01",
"comp_work": "comp1816-01",
"organ": "org1816-01"
},
"original": "1816-01",
"type": "listing"
},
{
"content": {
"artist": "art1816-02",
"catalog_no": "1816-02",
"comp_work": "1816-02",
"organ": "1816-02"
},
"original": "1816-02",
"type": "listing"
}
],
"filepath": "/listings/2018/1816/index.html",
"program_number": "1816"
},
{
"Contents": [
{
"original": "<h4>Hour 1</h4>",
"type": "other"
},
{
"content": {
"artist": "02art1839-01",
"catalog_no": "1839-01",
"comp_work": "1839-01",
"organ": "1839-01"
},
"original": "1839-01",
"type": "listing"
},
{
"original": "origin-othr",
"type": "other"
}
],
"filepath": "/listings/2018/1839/index.html",
"program_number": "1839"
},
{
"Contents": [
{
"original": "<h4>Part 1</h4>",
"type": "other"
},
{
"content": {
"artist": "03art8843-01",
"catalog_no": "8843-01",
"comp_work": "8843-01",
"organ": "8843-01"
},
"original": "8843-01",
"type": "listing"
},
{
"content": {
"artist": "art8843-02",
"catalog_no": "8843-02",
"comp_work": "8843-02",
"organ": "8843-02"
},
"original": "8843-02",
"type": "listing"
}
],
"filepath": "/listings/1988/8843/index.html",
"program_number": "8843"
}
]
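I need 'program_number' and 'filepath' (renamed to 'show' and 'path') copied into every output object built from a 'content' object, with one output object per listing. Entries of type 'other' should be dropped.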
Expected (the field values are illustrative and do not match the sample input strings exactly):
{
"playlist": [{
"show": "1816",
"path": "/listings/2018/1816/index.html",
"artist": "artist1816-01",
"catalog_no": "cat1816-01",
"comp_work": "comp1816-01",
"organ": "org1816-01"
}, {
"show": "1816",
"path": "/listings/2018/1816/index.html",
"artist": "artist1816-02",
"catalog_no": "cat1816-02",
"comp_work": "comp1816-02",
"organ": "org1816-02"
}, {
"show": "1839",
"path": "/listings/2018/1839/index.html",
"artist": "artist1839-01",
"catalog_no": "cat1839-01",
"comp_work": "comp1839-01",
"organ": "org1839-01"
}, {
"show": "8843",
"path": "/listings/1988/8843/index.html",
"artist": "artist8843-01",
"catalog_no": "cat8843-01",
"comp_work": "comp8843-01",
"organ": "org8843-01"
}, {
"show": "8843",
"path": "/listings/1988/8843/index.html",
"artist": "artist8843-02",
"catalog_no": "cat8843-02",
"comp_work": "comp8843-02",
"organ": "org8843-02"
}]
}
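To pin down the mapping I am after independently of the spec language, here is a minimal Python sketch of the intended result. It is only a reference for the desired output, not a replacement for the spec; the function name `flatten_playlist` and the trimmed `sample` are made up for illustration.

```python
import json


def flatten_playlist(programs):
    """Return {"playlist": [...]} with one entry per 'listing' item.

    Hypothetical helper, written only to describe the desired output.
    """
    playlist = []
    for program in programs:
        for item in program.get("Contents", []):
            # Entries of type "other" (headings etc.) have no content object.
            if item.get("type") != "listing":
                continue
            # Parent-level fields are repeated in every listing entry.
            entry = {
                "show": program["program_number"],
                "path": program["filepath"],
            }
            # Then the fields of the nested content object are copied in.
            entry.update(item["content"])
            playlist.append(entry)
    return {"playlist": playlist}


# Trimmed copy of the first program from the input above.
sample = [
    {
        "Contents": [
            {"original": "<h4>Hour 1</h4>", "type": "other"},
            {
                "content": {
                    "artist": "01art1816-01",
                    "catalog_no": "cat1816-01",
                    "comp_work": "comp1816-01",
                    "organ": "org1816-01",
                },
                "original": "1816-01",
                "type": "listing",
            },
            {
                "content": {
                    "artist": "art1816-02",
                    "catalog_no": "1816-02",
                    "comp_work": "1816-02",
                    "organ": "1816-02",
                },
                "original": "1816-02",
                "type": "listing",
            },
        ],
        "filepath": "/listings/2018/1816/index.html",
        "program_number": "1816",
    }
]

result = flatten_playlist(sample)
print(json.dumps(result, indent=2))
```

The key point is that 'show' and 'path' are repeated for every listing, so a program with two listings produces two separate objects rather than one object holding lists.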
Actual (programs with more than one listing get their values merged into lists):
{
"playlist": [{
"show": "1816",
"path": "/listings/2018/1816/index.html",
"artist": ["artist1816-01", "artist1816-02"],
"catalog_no": ["cat1816-01", "cat1816-02"],
"comp_work": ["comp1816-01", "comp1816-02"],
"organ": ["org1816-01", "org1816-02"]
}, {
"show": "1839",
"path": "/listings/2018/1839/index.html",
"artist": "artist1839-01",
"catalog_no": "cat1839-01",
"comp_work": "comp1839-01",
"organ": "org1839-01"
}, {
"show": "8843",
"path": "/listings/1988/8843/index.html",
"artist": ["artist8843-01", "artist8843-02"],
"catalog_no": ["cat8843-01", "cat8843-02"],
"comp_work": ["comp8843-01", "comp8843-02"],
"organ": ["org8843-01", "org8843-02"]
}]
}
This is the spec that produces the actual output above:
[
  {
    "operation": "shift",
    "spec": {
      "*": {
        "program_number": "playlist.[&1].show",
        "filepath": "playlist.[&1].path",
        "Contents": {
          "*": {
            "type": {
              "listing": {
                "@(2,content)": "playlist.[&5]"
              }
            }
          }
        }
      }
    }
  },
  {
    "operation": "shift",
    "spec": {
      "playlist": {
        "*": {
          "*": {
            "artist": "playlist[&2].artist",
            "catalog_no": "playlist[&2].catalog_no",
            "comp_work": "playlist[&2].comp_work",
            "organ": "playlist[&2].organ",
            "@show": "playlist[&2].show",
            "@path": "playlist[&2].path"
          }
        }
      }
    }
  }
]