0

I am wanting to write a script to

  1. fetch information then returning Json file
  2. filter Json file
  3. then converting that Json to CSV.

I have figured out steps 1 and 2, but am stuck on steps 3. Currently I have to use an online Json to CSV converter to get the desired output.

The Online Json to CSV tool uses python for users to connect to it's API to use the conversation tool. Possibly means that the tool itself is a python module.

Json file to convert

[{
  "matchId":"2068447050405",
  "timestamp":1658361314,
  "clubs": {
    "39335": {
      "toa":"486",
      "details": {
        "name":"Team one",
        "clubId":39335
      }},
    "111655": {
      "toa":"229",
      "details": {
        "name":"Team two",
        "clubId":111655
      }}},
  "players": {
    "39335": {
      "189908959": {
        "position":"defenseMen",
        "toiseconds":"3600",
        "playername":"player one"
      },
      "828715674": {
        "position":"rightWing",
        "toiseconds":"3600",
        "playername":"player two"
      }},
    "111655": {
      "515447555": {
        "position":"defenseMen",
        "toiseconds":"3600",
        "playername":"player three"
      },
      "806370074": {
        "position":"center",
        "toiseconds":"3600",
        "playername":"player four"
      }}}}]

Desired output csv code

"matchId","timestamp","clubs__|","clubs__|__toa","clubs__|__details__name","clubs__|__details__clubId","players__|","players__||","players__||__position","players__||__toiseconds","players__||__playername"
"2068447050405","1658361314","39335","486","Team one","39335","39335","189908959","defenseMen","3600","player one"
"2068447050405","1658361314","111655","229","Team two","111655","39335","828715674","rightWing","3600","player two"
"2068447050405","1658361314","","","","","111655","515447555","defenseMen","3600","player three"
"2068447050405","1658361314","","","","","111655","806370074","center","3600","player four"

How it looks in a spreadsheet Sheet example

Some believe the filter is having an effect on how the csv out put is formatted, here is a link to the full json file and csv output of that file. Code is to long to post on this page.

Original JSON before filter Original JSON

CSV output of original JSON file CSV output

Edit I should have mentioned this, The "Jason file to convert is only a small sample of the actual Json I wish to convert. I assumed I would be able to simple add to the code used to answer, I was wrong. The Json I intend to use has 9 total columns for clubs and 52 columns for Players.

Joseph
  • 5
  • 3
  • 1
    opinion: that's a strange output format: the clubs columns have no relationship to the players columns even though they're in the same row. – glenn jackman Aug 30 '22 at 23:11
  • Seconding that opinion. I think "2)filter Json file" accidentally destroyed those relationships as they were perceived to not materialize directly in the CSV output. – pmf Aug 30 '22 at 23:34
  • @glennjackman I have no idea how the online Json to CSV converter creates the output. There are 4 selections to chose from in how it treats the nested data, "header/detail report style, all on same row(inside separate columns), concatenate (inside shared columns), and matrix style. This one is the matrix style. Also both clubs and players are top objects in the array, therefore I believe that automatically makes them headers. – Joseph Aug 31 '22 at 04:59
  • @pmf an unfiltered Json would produce the same format style, but with a lot more information. – Joseph Aug 31 '22 at 05:01

1 Answers1

0

I'm working hard to really grok jq, so here you go: with no explanation:

jq -r '
    .[]
    | [.matchId, .timestamp] as [$matchId, $timestamp]
    | (.players | [to_entries[] | .key as $id1 | .value | to_entries[] | [$id1, .key, .value.position, .value.toiseconds, .value.playername]]) as $players
    | (.clubs | [to_entries[] | [.key, .value.toa, .value.details.name, .value.details.clubId]]) as $clubs
    | range([$players, $clubs] | map(length) | max)
    | [$matchId, $timestamp] + ($clubs[.] // ["","","",""]) + ($players[.] // ["","","","",""])
    | @csv
' file.json
"2068447050405",1658361314,"39335","486","Team one",39335,"39335","189908959","defenseMen","3600","player one"
"2068447050405",1658361314,"111655","229","Team two",111655,"39335","828715674","rightWing","3600","player two"
"2068447050405",1658361314,"","","","","111655","515447555","defenseMen","3600","player three"
"2068447050405",1658361314,"","","","","111655","806370074","center","3600","player four"

The default value arrays of empty strings needs to be the same size as the amount of "real" data you're grabbing.

Since this is a PITA to keep aligned, an update:

jq -r '
    def empty_strings: reduce range(length) as $i ([]; . + [""]);

    .[]
    | [.matchId, .timestamp] as [$matchId, $timestamp]
    | (.players | [to_entries[] | .key as $id1 | .value | to_entries[] | [$id1, .key, .value.position, .value.toiseconds, .value.playername]]) as $players
    | (.clubs | [to_entries[] | [.key, .value.toa, .value.details.name, .value.details.clubId]]) as $clubs
    | range([$players, $clubs] | map(length) | max)
    | [$matchId, $timestamp]
      + ($clubs[.]   // ($clubs[0] | empty_strings))
      + ($players[.] // ($players[0] | empty_strings))
    | @csv
' file.json
glenn jackman
  • 238,783
  • 38
  • 220
  • 352
  • It does work for the small sample that I used as reference, other than there's no headers. But how do I make changes for more keys and values? I attempted to just add .value.addedkey but it just messed with the whole format. – Joseph Aug 31 '22 at 18:12
  • Thanks, the updated version I was able to make changes to. Then again I could have messed up when trying to edit the the original. My text editor liked to auto add these "/,. When one is used. – Joseph Sep 04 '22 at 04:47