
I'm executing the following Python code:

import yaml


foo = {
    'name': 'foo',
    'my_list': [{'foo': 'test', 'bar': 'test2'}, {'foo': 'test3', 'bar': 'test4'}],
    'hello': 'world'
}

print(yaml.dump(foo, default_flow_style=False))

but it prints:

hello: world
my_list:
- bar: test2
  foo: test
- bar: test4
  foo: test3
name: foo

instead of:

hello: world
my_list:
  - bar: test2
    foo: test
  - bar: test4
    foo: test3
name: foo

How can I indent the my_list elements this way?

Anthon
fj123x
  • Both versions are correct, so at worst this is an aesthetic concern. – Aug 03 '14 at 20:05

2 Answers


This ticket suggests the current implementation correctly follows the spec:

The “-”, “?” and “:” characters used to denote block collection entries are perceived by people to be part of the indentation. This is handled on a case-by-case basis by the relevant productions.

On the same thread, there is also this code snippet (modified to fit your example) to get the behavior you are looking for:

import yaml

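class MyDumper(yaml.Dumper):
    # PyYAML passes indentless=True for a block sequence nested in a
    # mapping; forcing it to False makes the dashes indent as well.
    def increase_indent(self, flow=False, indentless=False):
        return super(MyDumper, self).increase_indent(flow, False)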

foo = {
    'name': 'foo',
    'my_list': [
        {'foo': 'test', 'bar': 'test2'},
        {'foo': 'test3', 'bar': 'test4'}],
    'hello': 'world',
}

print(yaml.dump(foo, Dumper=MyDumper, default_flow_style=False))
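As a quick sanity check, the sketch below dumps the same data with the stock dumper and with the override, then loads both results back. The class name `IndentedListDumper` and the reduced `data` dictionary are only illustrative; the override itself is the one shown above, written with Python 3's argument-free `super()`.

```python
import yaml


class IndentedListDumper(yaml.Dumper):
    """Illustrative name; the override matches the snippet above."""

    def increase_indent(self, flow=False, indentless=False):
        # Always pass indentless=False so nested block sequences are indented.
        return super().increase_indent(flow, False)


# Reduced sample data (an assumption, shortened from the question).
data = {"my_list": [{"foo": "test", "bar": "test2"}], "hello": "world"}

flush = yaml.dump(data, default_flow_style=False)
indented = yaml.dump(data, Dumper=IndentedListDumper, default_flow_style=False)

print(flush)
print(indented)

# Both layouts should parse back to the original structure.
same = yaml.safe_load(flush) == yaml.safe_load(indented) == data
print(same)
```

If `same` is true, the difference between the two layouts is purely cosmetic for any conforming YAML parser.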
Shayan Salehian
Jace Browning
  • I ran into this problem generating data files for `OpenCV`. The `OpenCV` YAML parser requires the extra indent. Otherwise it throws an exception. This solution fixes it. I do wish `PyYAML` didn't require so much sub-classing to make things work. JSON is much less complex, except there's no built-in OpenCV parser for it. – orodbhen Sep 01 '17 at 01:58
  • That sounds like an OpenCV bug, because unindented lists (while ugly) are valid YAML. – Marius Gedminas Sep 10 '20 at 07:29
  • It is important to NOT rename `increase_indent()` in the above code. – Gordon Fogus Dec 29 '22 at 23:45

Your output, as shown, is incomplete: print(yaml.dump(...)) gives you an extra empty line after name: foo, because the dumped string already ends with a newline. Building the string and then printing it is also slower and uses more memory than streaming directly to sys.stdout.
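A minimal sketch of the difference, using PyYAML and a shortened dictionary (both choices are assumptions made for illustration): passing a stream as the second argument to `yaml.dump()` writes to it directly and returns `None`, whereas omitting it returns the whole document as a string.

```python
import io
import sys

import yaml

# Shortened sample data, for illustration only.
foo = {"name": "foo", "hello": "world"}

# Without a stream, yaml.dump() returns a string that already ends in "\n",
# so print(text) would add a second newline after it.
text = yaml.dump(foo, default_flow_style=False)

# With a stream, the YAML is written to it directly and None is returned.
result = yaml.dump(foo, sys.stdout, default_flow_style=False)

# The same call against an in-memory buffer, to compare the two outputs.
buffer = io.StringIO()
yaml.dump(foo, buffer, default_flow_style=False)
```

The streamed text and the returned string should be identical; only the trailing newline added by `print()` differs.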

You are probably using PyYAML, which, apart from supporting only the outdated YAML 1.1 specification, gives you very limited control over the dumped YAML.

I suggest you use ruamel.yaml (disclaimer: I am the author of that package), where you can specify indentation separately for mappings and sequences, and also indicate how far to offset the dash within the indent before the sequence element:

import sys
import ruamel.yaml

foo = {
    'name': 'foo',
    'my_list': [{'foo': 'test', 'bar': 'test2'}, {'foo': 'test3', 'bar': 'test4'}],
    'hello': 'world'
}


yaml = ruamel.yaml.YAML()
yaml.indent(sequence=4, offset=2)
yaml.dump(foo, sys.stdout)

which gives:

name: foo
my_list:
  - foo: test
    bar: test2
  - foo: test3
    bar: test4
hello: world

Please note that the order of the keys is implementation-dependent (but can be controlled, as ruamel.yaml can round-trip the above without changes).
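For comparison, if you stay with PyYAML: it sorts mapping keys by default, and newer releases (5.1 and later, as far as I know) accept `sort_keys=False` to keep the dictionary's insertion order instead. A small sketch with a shortened dictionary, assuming such a release is installed:

```python
import yaml

# Shortened sample data, for illustration only.
foo = {
    "name": "foo",
    "my_list": [{"foo": "test", "bar": "test2"}],
    "hello": "world",
}

# Default behaviour: keys are emitted in sorted order.
sorted_text = yaml.dump(foo, default_flow_style=False)

# sort_keys=False keeps the insertion order of the dictionary.
ordered_text = yaml.dump(foo, default_flow_style=False, sort_keys=False)

print(sorted_text)
print(ordered_text)
```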

Anthon
  • While on a whole I do agree with you and I do prefer using *ruamel.yaml* in my own projects (often with a fallback to *PyYAML*), the `print()` call can still easily be corrected to not output an extra linefeed: `print(something, end='')` – blubberdiblub Sep 07 '19 at 05:47