If you have to deal with a YAML document with multiple different tags, and
are only interested in a subset of them, you should still
handle them all. If the elements you are intersted in are nested
within other tagged constructs you at least need to handle all of the "enclosing" tags
properly.
There is however no need to handle all of the tags individually, you
can write a constructor routine that can handle mappings, sequences
and scalars register that to PyYAML's SafeLoader
using:
import yaml
inp = """\
MyEIP:
Type: !Join [ "::", [AWS, EC2, EIP] ]
Properties:
InstanceId: !Ref MyEC2Instance
"""
description = []
def any_constructor(loader, tag_suffix, node):
if isinstance(node, yaml.MappingNode):
return loader.construct_mapping(node)
if isinstance(node, yaml.SequenceNode):
return loader.construct_sequence(node)
return loader.construct_scalar(node)
yaml.add_multi_constructor('', any_constructor, Loader=yaml.SafeLoader)
data = yaml.safe_load(inp)
print(data)
which gives:
{'MyEIP': {'Type': ['::', ['AWS', 'EC2', 'EIP']], 'Properties': {'InstanceId': 'MyEC2Instance'}}}
(inp
can also be a file opened for reading).
As you see above will also continue to work if an unexpected !Join
tag shows up in your code,
as well as any other tag like !Equal
. The tags are just dropped.
Since there are no variables in YAML, it is a bit of guesswork what
you mean by "like to parse the description variable only". If that has
an explicit tag (e.g. !Description
), you can filter out the values by adding 2-3 lines
to the any_constructor
, by matching the tag_suffix
parameter.
if tag_suffix == u'!Description':
description.append(loader.construct_scalar(node))
It is however more likely that there is some key in a mapping that is a scalar description
,
and that you are interested in the value associated with that key.
if isinstance(node, yaml.MappingNode):
d = loader.construct_mapping(node)
for k in d:
if k == 'description':
description.append(d[k])
return d
If you know the exact position in the data hierarchy, You can of
course also walk the data
structure and extract anything you need
based on keys or list positions. Especially in that case you'd be better of
using my ruamel.yaml
, was this can load tagged YAML in round-trip mode without
extra effort (assuming the above inp
):
from ruamel.yaml import YAML
with YAML() as yaml:
data = yaml.load(inp)