38

How can I load a YAML file and convert it to a Python JSON object?

My YAML file looks like this:

Section:
    heading: Heading 1
    font: 
        name: Times New Roman
        size: 22
        color_theme: ACCENT_2

SubSection:
    heading: Heading 3
    font:
        name: Times New Roman
        size: 15
        color_theme: ACCENT_2
Paragraph:
    font:
        name: Times New Roman
        size: 11
        color_theme: ACCENT_2
Table:
    style: MediumGrid3-Accent2
ReKx

6 Answers

39

The PyYAML library is intended for this purpose. Install it with:

pip install pyyaml

Then load the YAML file and dump it as JSON:

import yaml
import json
with open("example.yaml", 'r') as yaml_in, open("example.json", "w") as json_out:
    yaml_object = yaml.safe_load(yaml_in) # yaml_object will be a list or a dict
    json.dump(yaml_object, json_out)

Note: PyYAML only supports the pre-2009 YAML 1.1 specification.
ruamel.yaml is an option if YAML 1.2 is required:

pip install ruamel.yaml
Vemund Kvam
  • I agree, it's a clearer answer. I'll leave my answer here since it includes the file handling part; although that was not asked for specifically, it is probably needed more often than not. – Vemund Kvam Jun 13 '18 at 21:39
  • I copied the pip install part when you mentioned the other answer being clearer, thanks. – Vemund Kvam Jun 13 '18 at 22:09
  • 2
    PyYAML's `load()` is documented to be unsafe, and there is no excuse for using it instead of `safe_load()` here (or almost anywhere else). You fail to mention that PyYAML only supports the old, pre-2009, YAML 1.1 specification. – Anthon Jun 14 '18 at 07:22
  • 1
    I had no idea, I'll include it in the answer, thanks. – Vemund Kvam Jun 18 '18 at 09:07
  • For tracking the PyYaml specs: https://github.com/yaml/pyyaml/issues/116 – Flair Mar 12 '20 at 18:03
37

You can use PyYAML:

pip install PyYAML

Then, in the IPython console:

In [1]: import yaml

In [2]: document = """Section:
   ...:     heading: Heading 1
   ...:     font: 
   ...:         name: Times New Roman
   ...:         size: 22
   ...:         color_theme: ACCENT_2
   ...: 
   ...: SubSection:
   ...:     heading: Heading 3
   ...:     font:
   ...:         name: Times New Roman
   ...:         size: 15
   ...:         color_theme: ACCENT_2
   ...: Paragraph:
   ...:     font:
   ...:         name: Times New Roman
   ...:         size: 11
   ...:         color_theme: ACCENT_2
   ...: Table:
   ...:     style: MediumGrid3-Accent2"""
   ...:     

In [3]: yaml.safe_load(document)
Out[3]: 
{'Paragraph': {'font': {'color_theme': 'ACCENT_2',
   'name': 'Times New Roman',
   'size': 11}},
 'Section': {'font': {'color_theme': 'ACCENT_2',
   'name': 'Times New Roman',
   'size': 22},
  'heading': 'Heading 1'},
 'SubSection': {'font': {'color_theme': 'ACCENT_2',
   'name': 'Times New Roman',
   'size': 15},
  'heading': 'Heading 3'},
 'Table': {'style': 'MediumGrid3-Accent2'}}
Brown Bear
  • Not only what the previous comment said, but you're using IPython's console, and not plain Python console ;) – Charles David Jun 13 '18 at 21:32
  • 3
    1) Where is the JSON the OP requested? JSON [strings](https://www.json.org/) have double quotes. 2) PyYAML's `load()` is documented to be unsafe, and there is no excuse for using it instead of `safe_load()` here (or almost anywhere else). 3) You fail to mention that PyYAML only supports the old, pre-2009, YAML 1.1 specification. – Anthon Jun 14 '18 at 07:20
  • 1
    1. What do you mean about the JSON type in Python? Maybe you can point me to something to read about it. Is it the dict? The other comments are good and interesting, as is your answer, thank you. – Brown Bear Jun 15 '18 at 13:10
17

There is no such thing as a Python JSON object. JSON is a language-independent data format that has its roots in JavaScript and is supported by many languages.
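
To make the distinction concrete, here is a minimal sketch using only the standard library json module. In Python, JSON shows up either as text (a str) or as ordinary containers such as dict and list; the variable names are illustrative only.

```python
import json

# What a YAML loader hands back: plain Python containers, not "JSON".
data = {"Table": {"style": "MediumGrid3-Accent2"}}

# Serializing produces JSON text, which is an ordinary Python str.
json_text = json.dumps(data)
print(type(json_text).__name__)  # str
print(json_text)                 # {"Table": {"style": "MediumGrid3-Accent2"}}

# Parsing that text gives back a dict equal to the original.
round_tripped = json.loads(json_text)
print(round_tripped == data)     # True
```

So "converting YAML to JSON" in Python usually means loading the YAML into a dict or list and then serializing that to a JSON string or file.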

If your YAML document adheres to the old 1.1 standard, i.e. pre-2009, you can use PyYAML as suggested by some of the other answers.

If it uses the newer YAML 1.2 specification, which made YAML into a superset of JSON, you should use ruamel.yaml (disclaimer: I am the author of that package, which is a fork of PyYAML).

import ruamel.yaml
import json

in_file = 'input.yaml'
out_file = 'output.json'

yaml = ruamel.yaml.YAML(typ='safe')
with open(in_file) as fpi:
    data = yaml.load(fpi)
with open(out_file, 'w') as fpo:
    json.dump(data, fpo, indent=2)

which generates output.json:

{
  "Section": {
    "heading": "Heading 1",
    "font": {
      "name": "Times New Roman",
      "size": 22,
      "color_theme": "ACCENT_2"
    }
  },
  "SubSection": {
    "heading": "Heading 3",
    "font": {
      "name": "Times New Roman",
      "size": 15,
      "color_theme": "ACCENT_2"
    }
  },
  "Paragraph": {
    "font": {
      "name": "Times New Roman",
      "size": 11,
      "color_theme": "ACCENT_2"
    }
  },
  "Table": {
    "style": "MediumGrid3-Accent2"
  }
}

ruamel.yaml, apart from supporting YAML 1.2, has many PyYAML bugs fixed. You should also note that PyYAML's load() is documented to be unsafe if you don't have full control over the input at all times. PyYAML also loads the scalar number 021 as the integer 17 instead of 21 (YAML 1.1 reads a leading zero as octal), and converts scalar strings like on, yes, and off to boolean values (True, True, and False respectively).
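
The YAML 1.1 behaviour described above can be checked with a short sketch, assuming PyYAML is installed. The sample document and its key names are made up for illustration.

```python
import yaml  # PyYAML, which implements YAML 1.1

# Hypothetical sample document; the key names are illustrative only.
document = """
leading_zero: 021
switch_on: on
answer_yes: yes
switch_off: off
quoted: "on"
"""

data = yaml.safe_load(document)

print(data["leading_zero"])  # 17, because 021 is read as octal
print(data["switch_on"])     # True
print(data["answer_yes"])    # True
print(data["switch_off"])    # False
print(data["quoted"])        # on (quoting keeps it a string)
```

If such values are meant to be strings, quoting them in the YAML source is one way to keep PyYAML from converting them.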

Anthon
  • Thanks for developing a modern and higher-quality package. PyYAML is not abandonware, the latest release (5.4.1) is Jan 2021. Did you submit PRs only to have them rejected, or did you fork without trying to fix the original? – Dave Apr 03 '21 at 16:11
  • @Dave PRs were not rejected, there was just no answer for several years, and then the project was moved and open PRs and issues were dropped in the process. I haven't seen anything that IMO warrants a major version number change since 3.12, and PyYAML is still on the standard that was superseded in 2009. – Anthon Apr 03 '21 at 20:57
  • Thanks. For some reason I cannot install pyyaml with pip... It looks like it works fine but cannot import it in any scripts... but ruamel.yaml works fine and can import it so going with that ;) - PS: I really wish you did not name it with a dot in the name... messes with IDEs no end – AustEcon Jan 23 '22 at 10:06
  • @AustEcon What IDE are you using that cannot handle Python namespaces? Maybe you should post a question here on StackOverflow, there might be a workaround. (I myself develop using kakoune and its LSP under Linux, and don't have any problems.) – Anthon Jan 23 '22 at 15:25
  • PyCharm (JetBrains) which is a very popular, very good IDE. It could be something in my environment to do with recently installing anaconda and messing up my PATHs etc... But I don't have any issues with any other packages and I code a lot on many different projects daily.. (and purged Anaconda from PATH) – AustEcon Jan 24 '22 at 04:58
5

In Python 3 you can use PyYAML.

$ pip3 install pyyaml

Then load your YAML file and dump it as JSON:

import yaml, json

with open('./file.yaml') as f:
    print(json.dumps(yaml.safe_load(f)))

Output:

{"Section": null, "heading": "Heading 1", "font": {"name": "Times New Roman", "size": 22, "color_theme": "ACCENT_2"}, "SubSection": {"heading": "Heading 3", "font": {"name": "Times New Roman", "size": 15, "color_theme": "ACCENT_2"}}, "Paragraph": {"font": {"name": "Times New Roman", "size": 11, "color_theme": "ACCENT_2"}}, "Table": {"style": "MediumGrid3-Accent2"}}
Yuancheng
  • 2
    PyYAML's `load()` is documented to be unsafe, and there is no excuse for using it instead of `safe_load()` here (or almost anywhere else). Like many others you fail to mention that PyYAML only supports the old, pre-2009, YAML 1.1 specification. – Anthon Jun 14 '18 at 07:24
1

For what it's worth, here is a shell alias based on ruamel.yaml that works as a filter:

pip3 install ruamel.yaml
alias yaml2json="python3 -c 'import json, sys, ruamel.yaml as Y; print(json.dumps(Y.YAML(typ=\"safe\").load(sys.stdin), indent=2))'"

Usage:

yaml2json < foo.yaml > foo.json
m000
0

Here is a simple way to do it without saving the JSON to a file:

import yaml
import json

with open("your_yaml_file.yaml") as f:
    yaml_obj = yaml.safe_load(f)
    json_str = json.dumps(yaml_obj)
    json_dict = json.loads(json_str)
    print(json_dict)
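
One effect of the dumps/loads round trip above is that the result contains only JSON-compatible types. As a minimal standard-library sketch, with a hand-written dict standing in for loaded YAML data (an assumption for illustration), integer mapping keys come back as strings:

```python
import json

# Stand-in for data returned by a YAML loader; YAML mappings may have
# non-string keys, which JSON does not allow.
yaml_obj = {1: "one", 2: "two"}

# Serialize to JSON text, then parse it back into Python containers.
json_dict = json.loads(json.dumps(yaml_obj))
print(json_dict)  # {'1': 'one', '2': 'two'}
```

If the loaded data already uses only strings as keys, the round trip returns an equal dict and can be skipped.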
tsveti_iko