In the YAML below, I want to iterate over the states and capture each key and value, so that I can create an associative array for each state and then print them.

Note: I'm using yq from the mikefarah repo.

data.yaml

data:
  continent:
    countries:
      - state1:
          capital1: city1
          capital2: city2
      - state2:
          capital2: city2
      - state3:
          capital3: city3
      - state4:
          capital4: city4
          capital5: city5

Expected output:

echo state1[@]
key capital1, value city1
key capital2, value city2

echo state2[@]
key capital2, value city2

and so on

I tried multiple ways but was unable to traverse each state and capture its entries into an array.

Try1

#!/bin/bash

declare -A state1_list

while IFS="=" read -r key value; do state1_list["$key"]=$value; done < <(
  yq '.mft.aw-dev.components[].state1 | to_entries | map([.key, .value] | join("=")) | .[]' data.yaml
)

for key in "${!state1_list[@]}"; do printf "key %s, value %s\n" "$key" "${state1_list[$key]}"; done 

Try2

for i in $(yq '.data.continent.countries[] | keys' data.yaml| sed -e 's/-//g' | sed '/^$/d'); do echo -- $i --; yq '.mft.aw-dev.components[].$i.state[]' data.yaml; done
(For iterating and then finding values)

TIA

jagatjyoti

2 Answers

After a bunch of trial and error:

yq -r '
    .data.continent.countries[]
    | to_entries
    | .[0].key as $state
    | .[0].value
    | to_entries
    | map([$state, .key, .value])
    | @tsv
' data.yaml

outputs

state1  capital1    city1
state1  capital2    city2
state2  capital2    city2
state3  capital3    city3
state4  capital4    city4
state4  capital5    city5

I see that capital2 occurs in both state1 and state2

Then

declare -A seen=()
while IFS=$'\t' read -r state capital city; do
    # First time this state appears: record it and create its associative array.
    if ! [[ -v seen["$state"] ]]; then
        seen["$state"]=''
        declare -A "$state"
    fi
    # Re-point the nameref on every row, so rows need not be grouped by state.
    declare -n s="$state"
    s["$capital"]=$city
done < <(
    yq -r '
        .data.continent.countries[]
        | to_entries
        | .[0].key as $state
        | .[0].value
        | to_entries
        | map([$state, .key, .value])
        | @tsv
    ' data.yaml
)
declare -p "${!seen[@]}"

outputs

declare -A state1=([capital1]="city1" [capital2]="city2" )
declare -A state2=([capital2]="city2" )
declare -A state3=([capital3]="city3" )
declare -A state4=([capital5]="city5" [capital4]="city4" )

$ yq --version
yq (https://github.com/mikefarah/yq/) version v4.34.1
$ bash --version
GNU bash, version 5.2.15(1)-release ...
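
To get from these arrays to the "key ..., value ..." lines requested in the question, one option is to point a nameref at each array in turn. This is only a minimal sketch: the arrays are hard-coded in place of the yq pipeline, and print_state is a hypothetical helper name, not something from the code above. It assumes bash 4.3 or later for namerefs.

```bash
#!/usr/bin/env bash
# Stand-in data: two of the arrays the loop above produces (hard-coded for this sketch).
declare -A state1=([capital1]="city1" [capital2]="city2")
declare -A state2=([capital2]="city2")

# print_state NAME: print every key/value pair of the associative array NAME.
# (Hypothetical helper; needs bash 4.3+ for namerefs.)
print_state() {
    local -n arr_ref=$1
    local key
    # Sort the keys so the order is stable; associative arrays are unordered.
    while IFS= read -r key; do
        printf 'key %s, value %s\n' "$key" "${arr_ref[$key]}"
    done < <(printf '%s\n' "${!arr_ref[@]}" | sort)
}

for state in state1 state2; do
    echo "$state:"
    print_state "$state"
done
```

In the real script the loop would iterate over "${!seen[@]}" instead of a fixed list of names.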

Having multiple arrays may be too difficult to work with. An alternative is to put the data into a single pseudo-multi-dimensional array:

declare -A capitals

while IFS=$'\t' read -r state capital city; do
    capitals["${state},${capital}"]=$city
done < <(
    yq -r '...'
)

In this case:

$ declare -p capitals
declare -A capitals=([state3,capital3]="city3" [state4,capital4]="city4" [state4,capital5]="city5" [state2,capital2]="city2" [state1,capital2]="city2" [state1,capital1]="city1" )

and

$ for key in "${!capitals[@]}"; do
    if [[ $key == "state4",* ]]; then
        echo "$key -> ${capitals["$key"]}"
    fi
done

state4,capital4 -> city4
state4,capital5 -> city5

The drawback is that the separator character (a comma here) may itself be part of the data, which would make the keys ambiguous. Awk uses the character \034 (its default SUBSEP) for this purpose.
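
A minimal sketch of the same single-array approach with \034 as the key separator, so that a comma inside a key causes no ambiguity. The rows come from a hard-coded printf standing in for the yq output, and the variable name sep and the sample key "capital,5" are assumptions made for illustration.

```bash
#!/usr/bin/env bash
# ASCII "file separator" (octal 034), the same character awk uses as its default SUBSEP.
sep=$'\034'
declare -A capitals=()

# Stand-in for the tab-separated yq output (hard-coded for this sketch).
while IFS=$'\t' read -r state capital city; do
    capitals["${state}${sep}${capital}"]=$city
done < <(printf '%s\t%s\t%s\n' \
    state1 capital1 city1 \
    state4 capital4 city4 \
    state4 'capital,5' city5)

# Print every entry that belongs to state4, splitting the key back into its parts.
for key in "${!capitals[@]}"; do
    if [[ $key == "state4${sep}"* ]]; then
        printf '%s / %s -> %s\n' "${key%%"$sep"*}" "${key#*"$sep"}" "${capitals[$key]}"
    fi
done
```

The order in which the state4 entries are printed is not defined, because bash does not keep associative array keys in insertion order.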

glenn jackman
  • Getting the below as output: `$ bash data.sh` prints `declare -A state1=([state2]=$'state3\tstate4\tcapital5\tcity5' )`. `yq --version`: yq (https://github.com/mikefarah/yq/) version 4.28.1. `bash --version`: GNU bash, version 4.4.20(1)-release (x86_64-redhat-linux-gnu). Did not work as expected. – jagatjyoti Jul 31 '23 at 07:44
  • I can't reproduce that output with bash 4.4. – glenn jackman Jul 31 '23 at 11:34

You could first collect all names and their items into a bash array using mapfile and a single call to yq, then use that array to construct one complete declare -A statement. This makes use of yq's @sh converter for names used in a shell.

unset state1 state2 state3 state4

mapfile -t < <(yq '
  .data.continent.countries[][]
  | key + "=(" + (map("[" + (key | @sh) + "]=" + @sh) | join(" ")) + ")"
' data.yaml) && declare -A "${MAPFILE[@]}"

echo "${state2[capital2]}"
city2

Tested with mikefarah/yq version v4.34.1, and GNU bash version 5.0.3.
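
The @sh encoder matters here because declare parses the generated strings as assignments. Where @sh is not available in the installed yq, a more defensive route is to read plain tab-separated rows and validate each array name in bash before declaring it. The following is only a sketch under those assumptions: the rows are hard-coded instead of coming from yq, and is_identifier is a hypothetical helper name. It needs bash 4.3 or later for namerefs.

```bash
#!/usr/bin/env bash
# is_identifier NAME: succeed only if NAME is a valid bash variable name.
# (Hypothetical helper for this sketch.)
is_identifier() {
    [[ $1 =~ ^[A-Za-z_][A-Za-z0-9_]*$ ]]
}

# Stand-in for tab-separated state/key/value rows (hard-coded for this sketch).
rows=$(printf '%s\t%s\t%s\n' \
    state1 capital1 city1 \
    state1 capital2 'city 2' \
    'not-a-name' capital9 city9)

while IFS=$'\t' read -r state key value; do
    if ! is_identifier "$state"; then
        printf 'skipping invalid name: %q\n' "$state" >&2
        continue
    fi
    declare -A "$state"          # create the array (no-op if it already exists)
    declare -n arr_ref="$state"  # point the nameref at this state's array
    arr_ref["$key"]=$value       # plain subscript assignment, not evaluated as code text
done <<< "$rows"
unset -n arr_ref

declare -p state1
```

Only the array name is checked here. Keys and values are stored through an ordinary quoted assignment instead of being spliced into a declare argument.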

pmf
  • Getting an error: `$ bash data1.sh` prints `Error: 3:36: invalid input text "@sh) + \"]=\" + @s..."`, followed by `declare -A BASH_ALIASES=()` and `declare -A BASH_CMDS=()`. Same version as pasted in above answer. – jagatjyoti Jul 31 '23 at 07:47
  • @jagatjyoti Can you post the versions of mikefarah/yq and bash you are using? I suppose you are using a yq version prior to [v4.31.1](https://github.com/mikefarah/yq/releases/tag/v4.31.1) which introduced the `@sh` encoder, and was released on Feb 20, 2023. – pmf Jul 31 '23 at 08:15
  • @jagatjyoti For testing, you could change `(key | @sh) + "]=" + @sh` to `key + "]=" + .` to see if `@sh` is the only reason for it to fail. Omitting it for production, however, opens a vulnerability of your code, facilitating malicious code injection. – pmf Jul 31 '23 at 08:29
  • Thanks, after changing to above it works fine. How do I get the entire stanza for state1 or state4 ? I'm not interested in extracting single values from keys. – jagatjyoti Jul 31 '23 at 09:30
  • @jagatjyoti What exactly do you mean by "the entire stanza for state1 or state4"? What should the output contain if not "single values"? Can you post a (syntactically valid) bash command and its output, as you desire it to work? (`echo state1[@]` does not print any variable contents, it just literally prints `state1[@]`.) – pmf Jul 31 '23 at 10:05
  • @jagatjyoti Rather guessing, but maybe you mean something like this: With `mapfile -t < <(yq '.data.continent.countries[][] | key + "=" + (map("key " + key + ", value " + .) | join("; "))' data.yaml) && declare -A "${MAPFILE[@]}"` executed, `echo "$state1"` outputs `key capital1, value city1; key capital2, value city2`. Don't forget to reinstate `@sh` for production! – pmf Jul 31 '23 at 10:05