
With

stream = open(afile, 'r')
self.meta = yaml.load(stream)

you can easily read a YAML file in Python, but if the document is closed with `---` at the end, I get an error (the same happens with `...`):

yaml.composer.ComposerError: expected a single document in the stream
  in "El-punt-de-llibre.md", line 2, column 1
but found another document
  in "El-punt-de-llibre.md", line 6, column 1

But the YAML spec allows that:

YAML uses three dashes (“---”) to separate directives from document content. This also serves to signal the start of a document if no directives are present. Three dots ( “...”) indicate the end of a document without starting a new one, for use in communication channels.

So, how do you read this

---
title: "El punt de llibre"
abstract: "Estimar a quina pàgina està el punt de llibre"
keywords: ["when", "activitat", "3/3", "grup", "estimació", "aproximació", "funció lineal - proporcionalitat", "ca"]
comments: true
...

in Python?

somenxavier
  • Is there some reason you can't just omit the beginning `---` and ending `...`? – walrus Dec 01 '17 at 12:50
  • https://stackoverflow.com/questions/42522562/how-to-parse-a-yaml-file-with-multiple-documents – akoeltringer Dec 01 '17 at 12:55
  • It's also worth noting that the spec you linked is for YAML v1.2, while pyyaml only supports YAML v1.1 (not that it should necessarily make a difference, but I can't find any examples in the 1.1 spec of precisely the scenario you describe) – walrus Dec 01 '17 at 13:00
  • I use `---` and `...` because the files are also read by other tools (pandoc, for example). Even if I have just **one** document, I *can* formally close it with `...` – somenxavier Dec 03 '17 at 16:52
  • I found the [frontmatter package](https://python-frontmatter.readthedocs.io/en/latest/), which does this properly without libyaml headaches. – somenxavier Nov 15 '20 at 21:46

3 Answers


Your YAML stream/file appears to have more than one document in it. For example, trying to parse this would give the same error message:

---
title: "El punt de llibre"
abstract: "Estimar a quina pàgina està el punt de llibre"
keywords: ["when", "activitat", "3/3", "grup", "estimació", "aproximació", "funció lineal - proporcionalitat", "ca"]
comments: true
...
---
title: "El punt de llibre"
abstract: "Estimar a quina pàgina està el punt de llibre"
keywords: ["when", "activitat", "3/3", "grup", "estimació", "aproximació", "funció lineal - proporcionalitat", "ca"]
comments: true
...
---
title: "El punt de llibre"
abstract: "Estimar a quina pàgina està el punt de llibre"
keywords: ["when", "activitat", "3/3", "grup", "estimació", "aproximació", "funció lineal - proporcionalitat", "ca"]
comments: true
...

To process such a stream you could use the following approach:

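import yaml

with open('test.yaml') as f_yaml:
    # safe_load_all yields one Python object per YAML document
    for doc in yaml.safe_load_all(f_yaml):
        print(doc)

Which, under Python 2, would show you the following (Python 3 prints the same data as plain `str` values, without the `u` prefixes):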

{'keywords': ['when', 'activitat', '3/3', 'grup', u'estimaci\xf3', u'aproximaci\xf3', u'funci\xf3 lineal - proporcionalitat', 'ca'], 'abstract': u'Estimar a quina p\xe0gina est\xe0 el punt de llibre', 'comments': True, 'title': 'El punt de llibre'}
{'keywords': ['when', 'activitat', '3/3', 'grup', u'estimaci\xf3', u'aproximaci\xf3', u'funci\xf3 lineal - proporcionalitat', 'ca'], 'abstract': u'Estimar a quina p\xe0gina est\xe0 el punt de llibre', 'comments': True, 'title': 'El punt de llibre'}
{'keywords': ['when', 'activitat', '3/3', 'grup', u'estimaci\xf3', u'aproximaci\xf3', u'funci\xf3 lineal - proporcionalitat', 'ca'], 'abstract': u'Estimar a quina p\xe0gina est\xe0 el punt de llibre', 'comments': True, 'title': 'El punt de llibre'}
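
As a small check of the single-document versus multi-document behaviour described above, here is a minimal sketch assuming PyYAML is installed. The strings `one_doc` and `two_docs` are made-up samples, not the asker's file:

```python
import yaml  # PyYAML

# Hypothetical sample data: one document closed by '...'
one_doc = "---\ntitle: El punt de llibre\ncomments: true\n...\n"
# The same stream with a second document started by '---'
two_docs = one_doc + "---\ntitle: second\n"

# A single document closed by '...' loads without error.
print(yaml.safe_load(one_doc))

# A second '---' starts another document, which safe_load rejects.
try:
    yaml.safe_load(two_docs)
except yaml.composer.ComposerError as exc:
    print("ComposerError:", exc.problem)

# safe_load_all returns a generator, so next() gives just the first document.
first = next(yaml.safe_load_all(two_docs))
print(first)
```

So a lone closing `...` should not be a problem by itself; the error appears once the parser sees the start of a second document.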
Martin Evans
  • PyYAML's `load_all` is documented to be potentially unsafe, and its use is seldom (if ever) necessary. You should use `yaml.safe_load_all()` – Anthon Dec 01 '17 at 15:17

If your YAML source contains more than one document, you can get the first document with

list(yaml.safe_load_all(stream))[0]

However, it seems strange that a `...` causes PyYAML to break, and you may want to report that as a bug.
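
If the file is Markdown with a YAML header, as the `.md` name in the traceback suggests, another option is to cut out the text of the first document before handing it to the parser. This is a minimal stdlib-only sketch; `split_front_matter` is a hypothetical helper name (not part of PyYAML), and it assumes the header starts on the first line and that no YAML value contains a line consisting only of `---` or `...`:

```python
def split_front_matter(text):
    """Return (yaml_text, body) for text whose first line is '---'.

    The header ends at the next line that is exactly '---' or '...'.
    Without such a header, yaml_text is '' and body is the whole text.
    """
    lines = text.splitlines(keepends=True)
    if not lines or lines[0].rstrip() != '---':
        return '', text
    for i, line in enumerate(lines[1:], start=1):
        if line.rstrip() in ('---', '...'):
            return ''.join(lines[1:i]), ''.join(lines[i + 1:])
    # Opening marker but no closing marker: treat everything as YAML.
    return ''.join(lines[1:]), ''


# Made-up sample resembling a Markdown file with a YAML header
sample = (
    '---\n'
    'title: "El punt de llibre"\n'
    'comments: true\n'
    '...\n'
    '\n'
    'Body of the Markdown file.\n'
)
header, body = split_front_matter(sample)
print(header)  # this string can then be passed to yaml.safe_load(header)
```

Because the parser only ever sees the header text, the Markdown body is never interpreted as YAML.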

flyx
  • Using `yaml.safe_load_all()` would be better. – Anthon Dec 01 '17 at 15:19
  • @Anthon ah yes, I forgot that [the fix](https://github.com/yaml/pyyaml/commit/7b68405c81db889f83c32846462b238ccae5be80) has not yet been released. – flyx Dec 01 '17 at 17:23

Use ruamel.yaml to handle a YAML file with comments and spaces:

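import ruamel.yaml

yaml_file = 'test.yaml'  # placeholder path; point this at your own file
yaml = ruamel.yaml.YAML()
with open(yaml_file) as f:
    # load_all yields one object per document in the stream
    for doc in yaml.load_all(f):
        print(doc)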
Vipul Sharda