
I have a legacy CLI tool that outputs a structured list whose sub-items are indented with a tab (Stack Overflow won't let me put tabs here, so I replaced them with 4 spaces in this example).

Heading One:
    Sub One: 'Value 1'
    Sub Two: 'Value 2'
Heading Two:
    Sub Three: 'Value 3'
    Sub Four: 'Value 4'
Key One: 'This key has no heading' 

I am trying to get JSON output like this:

{
  "Heading One": {
    "Sub One": "Value 1",
    "Sub Two": "Value 2"
  },
  "Heading Two": {
    "Sub Three": "Value 3",
    "Sub Four": "Value 4"
  },
  "Key One": "This key has no heading"
}

Is this possible with jq, or do I need to write a more complex Python script?

Barbaros Özhan
bam
  • The input provided looks close to a YAML syntax. Can the assumption be made? – Inian Jul 12 '22 at 15:16
  • @Inian sadly, no. The headings sometimes include special characters like quotes and parentheses. AFAIK, YAML does not allow that. – bam Jul 12 '22 at 15:20
  • Can you provide an input that is close to your actual input? From what I was going to post as an answer - https://github.com/mikefarah/yq does understand such special characters and can transform them to JSON. If you can update a real world example, I can try and post an answer – Inian Jul 12 '22 at 15:22
  • But too bad you cannot use Tabs in YAML - [A YAML file cannot contain tabs as indentation](https://stackoverflow.com/q/19975954/5291015) – Inian Jul 12 '22 at 15:35

1 Answer


This is an approach for a deeply nested input. It splits on top-level items using a negative look-ahead regex on tabs following newlines, then separates the head and "unindents" the rest by removing one tab following a newline, which serves as input for a recursive call.

jq -Rs '
  def comp:
    reduce (splits("\n(?!\\t)") | select(length > 0)) as $item ({};
      ($item | index(":")) as $hpos | .[$item[:$hpos]] = (
        $item[$hpos + 1:] | gsub("\n\t"; "\n")
        | if test("\n") then comp else .[index("'\''") + 1: rindex("'\''")] end
      )
    );
  comp
'
{
  "Heading One": {
    "Sub One": "Value 1",
    "Sub Two": "Value 2"
  },
  "Heading Two": {
    "Sub Three": "Value 3",
    "Sub Four": "Value 4"
  },
  "Key One": "This key has no heading"
}
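
For readers who would rather use the Python route mentioned in the question, the same idea can be sketched as follows. This is only an illustrative port of the steps described above, not part of the original jq solution: the function name `parse` and the `sample` string are made up for the example, and it assumes that every leaf value is wrapped in single quotes and that no heading contains a colon.

```python
import json
import re


def parse(text: str) -> dict:
    """Recursively convert tab-indented 'key: value' text into a nested dict."""
    result = {}
    # Split at newlines that are NOT followed by a tab, i.e. between
    # items of the current (top) level.
    for item in re.split(r"\n(?!\t)", text):
        if not item.strip():
            continue  # skip empty chunks, e.g. from a leading or trailing newline
        # Assumption: the first colon separates the key from its body.
        key, _, rest = item.partition(":")
        # "Unindent" the body by removing one tab after each newline.
        rest = rest.replace("\n\t", "\n")
        if "\n" in rest:
            result[key] = parse(rest)  # the body has children: recurse
        else:
            # Leaf (assumed quoted): keep the text between the first
            # and the last single quote.
            result[key] = rest[rest.index("'") + 1 : rest.rindex("'")]
    return result


# Hypothetical sample mirroring the question's input, with real tabs.
sample = (
    "Heading One:\n"
    "\tSub One: 'Value 1'\n"
    "\tSub Two: 'Value 2'\n"
    "Heading Two:\n"
    "\tSub Three: 'Value 3'\n"
    "\tSub Four: 'Value 4'\n"
    "Key One: 'This key has no heading' \n"
)
print(json.dumps(parse(sample), indent=2))
```

As written, an unquoted leaf value raises a `ValueError` from `str.index`, so real-world input may need a more forgiving fallback.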
pmf
  • Oh wow, I did not expect the expression to be this extensive. Thanks for not only answering the question, but also providing a perfectly fine working solution! – bam Jul 13 '22 at 07:20