I'm testing the feasibility of using PyYAML v3.12 in a RHEL7 environment to parse moderately complex YAML config files: I pass the script a key and get the corresponding value back. The query would look something like python search_yaml.py key_to_search, and the script would print the value. For example:

Desired bash command: python search_yaml.py $servername

Desired response (value only, not key-value): myServer14

So far I've created the following .py:

import sys
import yaml
key = sys.argv[1]

with open("config.yml") as f:
    try:
        data = yaml.safe_load(f)
        for k, v in data.items():
            if data[k].has_key(key):
                print data[k][v]
    
    except yaml.YAMLError as exc:
        print "Error: key not found in YAML"

config.yml:

---
server:
    servername: myServer14
    filename: testfile.zip
    location: http://test-location/1.com
    repo:
        server_name_fqdn: server.name.fqdn.com
        port: 1234

So far, running python search_yaml.py $servername produces a "list index out of range" error, and python search_yaml.py servername produces nothing. I'm new to Python/PyYAML, so I assume I'm passing the variable to the program incorrectly, and sys might not be the Python library I need. I'm hitting a brick wall on how to do this correctly; any input would save my sanity.

  • Hi MissCatAssTrophy, and welcome to Stack Overflow! Are you using Python 2? If so, could you edit the question to reflect that? (Or better yet, consider switching to use Python 3, since Python 2 is now essentially obsolete) Also, when I try to reproduce your result, I get a `TypeError` from running `python2 search_yaml.py servername`. Do you really get no output at all? – David Z Oct 07 '20 at 04:47
  • What if there are multiple/nested keys matching the input? – wim Oct 07 '20 at 05:00
  • A dumb grep / awk is probably going to be way faster and easier than yaml parsing for this task, by the way. – wim Oct 07 '20 at 05:02
  • Sounds like you are reinventing `yq`. Is there a reason you don't want to use existing tools? – tripleee Oct 07 '20 at 05:28

1 Answer

If you know the full path of keys you're traversing, you can pass it as a dot-separated path and walk it like this:

import sys
import yaml

key = sys.argv[1]

with open("config.yml") as f:
    data = yaml.safe_load(f)

# Walk the parsed document one dotted path segment at a time.
res = data
for part in key.split('.'):
    try:
        res = res[part]
    except (KeyError, TypeError):
        # KeyError: the segment is missing at this level.
        # TypeError: the path tries to index into a scalar value.
        print("Error: key not found in YAML")
        sys.exit(1)

# Print the value even if it is falsy (0, False, empty string).
print(res)

Testing

~# python search_yaml.py server.repo.port
1234

~# python search_yaml.py server.servername
myServer14
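
As for the "list index out of range" error in the question: it comes from the shell, not from sys. An unquoted, unset $servername expands to nothing, so the script receives no argument at all and sys.argv[1] does not exist. A minimal sketch of a guard, assuming a hypothetical helper named get_key (not part of the original script):

```python
def get_key(argv):
    """Return the key argument, or None if it is missing or empty."""
    # argv[0] is the script name; the key is expected at argv[1].
    if len(argv) < 2 or not argv[1]:
        return None
    return argv[1]


# Simulated argument lists, standing in for sys.argv:
print(get_key(["search_yaml.py"]))                # what an unset, unquoted $servername produces
print(get_key(["search_yaml.py", "servername"]))  # a literal key
```

In the script itself you would call get_key(sys.argv) and exit with a usage message when it returns None.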

This may have bugs; I wrote the code just to see whether it can easily be done without third-party tools.
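
The script above expects the full dotted path. If you would rather pass a bare key such as servername, as in the question, one option is a recursive walk that collects every match at any depth. This is only a sketch: find_key is a hypothetical helper, and the data dict stands in for what yaml.safe_load returns for the config.yml in the question.

```python
def find_key(node, key):
    """Yield every value stored under `key` at any depth of nested dicts and lists."""
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key:
                yield v
            # Keep descending: the same key may also occur further down.
            for found in find_key(v, key):
                yield found
    elif isinstance(node, list):
        for item in node:
            for found in find_key(item, key):
                yield found


# Stand-in for yaml.safe_load(f) on the config.yml from the question.
data = {
    "server": {
        "servername": "myServer14",
        "filename": "testfile.zip",
        "location": "http://test-location/1.com",
        "repo": {"server_name_fqdn": "server.name.fqdn.com", "port": 1234},
    }
}

print(list(find_key(data, "servername")))  # ['myServer14']
print(list(find_key(data, "port")))        # [1234]
print(list(find_key(data, "missing")))     # []
```

Returning a list of matches leaves it to the caller to decide what to do when a key appears more than once or not at all.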

CLI apps for YAML

You might be interested in the yq program. There are actually two programs with that name: one is implemented in Go, the other is Python-based (and probably more complex than the code above) :-)

The Go-based yq. You can either install the statically compiled yq binary from its GitHub releases, or install it with yum from the commercial GetPageSpeed repository, which makes later updates easier:

sudo yum -y install https://extras.getpagespeed.com/release-latest.rpm
sudo yum -y install yq

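Then you can simply run the following (this is yq version 3 syntax; version 4 and later use yq eval '.server.servername' config.yml instead):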

~# yq read config.yml server.servername
myServer14
Danila Vershinin
  • You, sir, are the hero we needed but didn't deserve. Your answer worked; and I'll also be looking into getting YQ approved as well. Thank you! – MissCatAssTrophy Oct 07 '20 at 22:38