
I have encountered a problem with the yaml-cpp parser. When I try to load the following definition:

DsUniversity:
  university_typ: {type: enum, values:[Fachhochschule, Universitat, Berufsakademie]}
  students_at_university: {type: string(50)}

I get the following error:

Error: yaml-cpp: error at line 2, column 39: end of map flow not found

I checked the YAML on http://yaml-online-parser.appspot.com/ and http://yamllint.com/, and both services report it as valid.

The problem is caused by the missing space after "values:". When the YAML is changed to the following format:

DsUniversity:
  university_typ: {type: enum, values: [Fachhochschule, Universitat, Berufsakademie]}
  students_at_university: {type: string(50)}

everything works as expected.

Is there any way to configure, update, or fix the yaml-cpp parser so that it also processes YAML with a missing space after the colon?

Added: It seems the problem is caused by the requirement for a whitespace character as a separator. When I simplified the test snippet to

DsUniversity:[Fachhochschule, Universitat, Berufsakademie]

the yaml-cpp parser reads it as one scalar value "DsUniversity:[Fachhochschule, Universitat, Berufsakademie]". When a space is added after the colon, yaml-cpp correctly loads the element with its sequence.

Ludek Vodicka
    I figured out that the space after the colon is enforced by regular expressions defined in exp.h. When I updated all of these regexes (EndScalar, EndScalarInFlow, Value, ValueInFlow), yaml-cpp parsed my document correctly. Unfortunately, I'm not sure whether this hack breaks anything else. If there is a better solution, please let me know. – Ludek Vodicka Sep 03 '14 at 13:16

2 Answers


yaml-cpp is correct here, and those online validators are incorrect. From the YAML 1.2 spec:

7.4.2. Flow Mappings

Normally, YAML insists the “:” mapping value indicator be separated from the value by white space. A benefit of this restriction is that the “:” character can be used inside plain scalars, as long as it is not followed by white space. This allows for unquoted URLs and timestamps. It is also a potential source for confusion as “a:1” is a plain scalar and not a key: value pair.

...

To ensure JSON compatibility, if a key inside a flow mapping is JSON-like, YAML allows the following value to be specified adjacent to the “:”. This causes no ambiguity, as all JSON-like keys are surrounded by indicators. However, as this greatly reduces readability, YAML processors should separate the value from the “:” on output, even in this case.

In your example, you're in a flow mapping (meaning a map surrounded by {}), but your key is not JSON-like: you just have a plain scalar (values is unquoted). To be JSON-like, the key needs to be either single- or double-quoted, or it can be a nested flow sequence or map itself.

In your simplified example,

DsUniversity:[Fachhochschule, Universitat, Berufsakademie]

both yaml-cpp and the online validators parse this correctly as a single scalar - in order for it to be a map, as you intend, you need a space after the :.

Why does YAML require that space?

In the simple plain scalar case:

a:b

could be ambiguous: it could be read as either a scalar a:b, or a map {a: b}. YAML chooses to read this as a scalar so that URLs can be easily embedded in YAML without quoting:

http://stackoverflow.com

is a scalar (like you'd expect), not a map {http: //stackoverflow.com}!
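
As a rough illustration of this rule, here is a minimal C++ sketch. It is not yaml-cpp's actual implementation; the helper name findValueIndicator is hypothetical, and it deliberately ignores quoting, comments, and flow indicators. It only checks whether a ":" is followed by white space or by the end of the text, which is the condition under which a plain scalar is split into a key and a value.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of yaml-cpp): returns the index of the first
// ':' that would act as a mapping value indicator in an unquoted line, i.e. a
// ':' followed by a space, a tab, or the end of the text.
// Returns std::string::npos when the whole text stays a single plain scalar.
std::size_t findValueIndicator(const std::string& text) {
    for (std::size_t i = 0; i < text.size(); ++i) {
        if (text[i] != ':') {
            continue;
        }
        const bool atEnd = (i + 1 == text.size());
        if (atEnd || text[i + 1] == ' ' || text[i + 1] == '\t') {
            return i;
        }
    }
    return std::string::npos;
}
```

With this sketch, "a:b" and "http://stackoverflow.com" yield std::string::npos (one scalar), while "a: b" yields the position of the colon (a key and a value).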

In a flow context, there's one case where this isn't ambiguous: when the key is quoted, e.g.:

{"a":b}

This is called JSON-like because it's similar to JSON, which requires quotes around all scalars. In this case, YAML knows that the key ends at the end-quote, and so it can be sure that the value starts immediately.

This behavior is explicitly allowed because JSON itself allows things like

{"a":"b"}

Since YAML 1.2 is a strict superset of JSON, this must be legal in YAML.
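
The JSON-like condition described above can be sketched in C++ as follows. This is only an illustration under simplifying assumptions: the function name isJsonLikeKey is hypothetical (it is not a yaml-cpp API), and it inspects just the first and last characters of the key text rather than validating it.

```cpp
#include <string>

// Hypothetical helper (not part of yaml-cpp): a key is treated as "JSON-like"
// when it is single- or double-quoted, or is itself a flow sequence or flow
// map. Only the outer delimiters are checked; the contents are not validated.
bool isJsonLikeKey(const std::string& key) {
    if (key.size() < 2) {
        return false;
    }
    const char first = key.front();
    const char last = key.back();
    return (first == '"' && last == '"') ||
           (first == '\'' && last == '\'') ||
           (first == '[' && last == ']') ||
           (first == '{' && last == '}');
}
```

Under this check, the key values from the question is not JSON-like, so its value may not be written directly after the ":", whereas "values" (with quotes) would be.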

Jesse Beder
    Thank you for the very detailed answer! It's good to know why these restrictions are defined, and I have to agree with all of them. Unfortunately, our application needs to parse YAML files defined by our customers for the Doctrine/Doctrine2 ORM, and these ORM frameworks have their own parsers, which allow ":" as a separator without a space. Right now I'm running tests on these YAML files to see whether the hack I described above works correctly. – Ludek Vodicka Sep 03 '14 at 15:22

I think it would be beneficial to parse scalars/keys differently immediately inside a flow map ({). If you agree, please vote here:

https://github.com/yaml/yaml-spec/issues/267

Revin