I'm using PyYAML, and I'm trying to attach line numbers to dicts in my nested YAML structure, but only at the topmost level, e.g.:

- id: 123
  __line: 1
  links:
    - name: "abc"
    # no line number here
    - name: "def"

There is this answer that shows how to attach line numbers by overriding SafeLoader, but this attaches the __line field to all dicts at any nesting level.

I asked in one of the comments there how to make this apply only to the topmost level of a structure, and @augurar gave an idea for a solution, but I need more help.

Since the constructor performs a depth-first traversal, you could add an attribute to track current depth and override construct_object() to increment/decrement it appropriately. You'd need some extra logic to handle anchors correctly, if needed for your use case.

Here is what I've tried so far:

class SafeLineLoader(SafeLoader):
    def construct_mapping(self, node, deep=True):
        mapping = super(SafeLineLoader, self).construct_mapping(node, deep=deep)
        if node.get("depth") == 0:
            mapping["line"] = node.start_mark.line + 1
        print(node)
        return mapping

    def construct_object(self, node, deep=False, depth=0):
        return super().construct_object(node, deep=deep, depth=depth + 1)

But I get the following error:

    return super().construct_object(node, deep=deep, depth=depth + 1)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: BaseConstructor.construct_object() got an unexpected keyword argument 'depth'

I guess this is because PyYAML's construct_object function does not expect the depth argument. This may end up being a Python noob question about wrapping class methods and adding new arguments, or it may be that I just don't understand the architecture of the PyYAML library.

V. Rubinetti
  • If you would upgrade to `ruamel.yaml`, you could just inspect the `.lc` attribute of the dict after loading. That is a `LineCol` instance which has a `line` and `col` attribute. So you do something like `data['some']['path'].lc.line` after loading into `data` – Anthon May 16 '23 at 18:30

1 Answer

The depth information is not part of the original node attributes, nor is it propagated through the original calls, so you can't just call the original construct_object method with an additional depth argument.
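
A quick way to confirm this, assuming PyYAML is installed, is to inspect the signature of the method being overridden. This is only a small check, not part of the solution; the variable name params is illustrative.

```python
import inspect

from yaml.constructor import BaseConstructor

# List the parameter names that construct_object() actually accepts.
params = list(inspect.signature(BaseConstructor.construct_object).parameters)
print(params)

# There is no "depth" parameter, so forwarding depth=... raises a TypeError.
print("depth" in params)
```

Because the base method has no such parameter, any extra state has to be stored somewhere else, for example on the loader instance or on the nodes.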

Instead, notice that the compose_node method conveniently takes a parent argument, which is None for the root node, so you can override this method to store parent as an attribute on every node. In your example the document root is a sequence, so each top-level mapping node has that root sequence as its parent. (If the document root is itself a mapping, its node has no parent at all, so it is not covered by this check.) To check whether a mapping node is a top-level one, you can therefore check whether its parent node has a parent of None:

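from yaml import SafeLoader

class SafeLineLoader(SafeLoader):
    def compose_node(self, parent, index):
        # Record each node's parent; the root node is composed with parent=None.
        node = super().compose_node(parent, index)
        node.parent = parent
        return node

    def construct_mapping(self, node, deep=False):
        mapping = super().construct_mapping(node, deep=deep)
        # Top-level items are direct children of the root node.
        if node.parent is not None and node.parent.parent is None:
            mapping['__line__'] = node.start_mark.line + 1  # start_mark.line is 0-based
        return mapping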

Demo: https://replit.com/@blhsing/ThankfulBluevioletHarddrive
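
For comparison, the depth-tracking idea quoted in the question can also be applied at composition time, by keeping the counter on the loader instance instead of passing it as an argument. The following is only a sketch under assumptions, not the approach shown above: DepthLineLoader, the _depth counter, the depth node attribute, and the sample document DOC are all hypothetical names, and the sketch assumes the document does not use anchors or aliases.

```python
import yaml
from yaml import SafeLoader


class DepthLineLoader(SafeLoader):  # hypothetical name, for illustration only
    _depth = 0  # nesting depth of the node currently being composed

    def compose_node(self, parent, index):
        depth = self._depth
        self._depth = depth + 1  # children composed inside this call are one level deeper
        try:
            node = super().compose_node(parent, index)
        finally:
            self._depth = depth  # restore the depth on the way back up
        # Assumption: no aliases. An aliased node is shared between places,
        # so its depth would be overwritten here.
        node.depth = depth
        return node

    def construct_mapping(self, node, deep=False):
        mapping = super().construct_mapping(node, deep=deep)
        # Depth 0 is the document root; depth 1 is a direct child of the root.
        if getattr(node, "depth", None) == 1:
            mapping["__line__"] = node.start_mark.line + 1  # start_mark.line is 0-based
        return mapping


# Sample document modeled on the one in the question.
DOC = """\
- id: 123
  links:
    - name: "abc"
    - name: "def"
- id: 456
"""

data = yaml.load(DOC, Loader=DepthLineLoader)
for item in data:
    print(item)
```

Only the two items of the root list should receive a __line__ key here; the mappings under links sit at a greater depth and are left untouched.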

blhsing