
I got this output from a process started with subprocess.Popen():

    { about: 'RRDtool xport JSON output',
  meta: {
    start: 1401778440,
    step: 60,
    end: 1401778440,
    legend: [
      'rta_MIN',
      'rta_MAX',
      'rta_AVERAGE'
          ]
     },
  data: [
    [ null, null, null ],
    [ null, null, null ],
    [ null, null, null ],
    [ null, null, null ],
    [ null, null, null ],
    [ null, null, null  ]
  ]
}

This doesn't look like valid JSON to me. I have tried ast.literal_eval() and json.loads(), but neither works. Can someone point me in the right direction? Thanks in advance.

dotslash

2 Answers


Indeed, older versions of RRDtool export ECMAScript object notation, not JSON. According to this Debian bug report, upgrading to 1.4.8 should give you proper JSON. Also see the project CHANGELOG:

JSON output of xport is now actually json compilant by its keys being properly quoted now.
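
To see why both standard-library parsers reject the older output, here is a minimal sketch. The `sample` string is a shortened stand-in for the xport output, and `try_parse` is a hypothetical helper written only for this illustration:

```python
import ast
import json

# Shortened sample in the same style as the xport output (assumption:
# bare keys, single-quoted strings, and null values, as in the question).
sample = "{ about: 'RRDtool xport JSON output', data: [ [ null, null ] ] }"


def try_parse(parser, text):
    """Return (True, result) on success, or (False, exception name) on failure."""
    try:
        return True, parser(text)
    except (ValueError, SyntaxError) as exc:
        return False, type(exc).__name__


# json.loads requires double-quoted keys and strings.
print(try_parse(json.loads, sample))

# ast.literal_eval only accepts Python literals; bare names such as
# `about` and `null` are not literals.
print(try_parse(ast.literal_eval, sample))
```

Both calls fail: `json.loads` stops at the first unquoted key, and `ast.literal_eval` refuses the bare names.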

If you cannot upgrade, you have two options: either repair the text by quoting the object key identifiers, or use a more lenient parser that accepts ECMAScript object notation.

The latter can be done with the external demjson library:

>>> import demjson
>>> demjson.decode('''\
... { about: 'RRDtool xport JSON output',
...   meta: {
...     start: 1401778440,
...     step: 60,
...     end: 1401778440,
...     legend: [
...       'rta_MIN',
...       'rta_MAX',
...       'rta_AVERAGE'
...           ]
...      },
...   data: [
...     [ null, null, null ],
...     [ null, null, null ],
...     [ null, null, null ],
...     [ null, null, null ],
...     [ null, null, null ],
...     [ null, null, null  ]
...   ]
... }''')
{u'about': u'RRDtool xport JSON output', u'meta': {u'start': 1401778440, u'step': 60, u'end': 1401778440, u'legend': [u'rta_MIN', u'rta_MAX', u'rta_AVERAGE']}, u'data': [[None, None, None], [None, None, None], [None, None, None], [None, None, None], [None, None, None], [None, None, None]]}

Repairing can be done using a regular expression; I am going to assume that all identifiers are on a new line or directly after the opening { curly brace. Single quotes in the list will have to be changed to double quotes; this will only work if there are no embedded single quotes in the values too:

import re
import json

yourtext = re.sub(r'(?:^|(?<={))\s*(\w+)(?=:)', r' "\1"', yourtext, flags=re.M)
yourtext = re.sub(r"'", r'"', yourtext)
data = json.loads(yourtext)
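
As a rough sketch, the repair can be wrapped in a function that tries strict JSON first, so the same code keeps working after an RRDtool upgrade. The name `loads_lenient` and the shortened `sample` are made up for this example; the same assumptions apply (keys at the start of a line or directly after `{`, no embedded single quotes in values):

```python
import json
import re


def loads_lenient(text):
    """Parse strict JSON if possible; otherwise quote bare keys and retry."""
    try:
        return json.loads(text)
    except ValueError:
        pass  # fall through to the repair step
    # Quote bare keys found at the start of a line or right after "{".
    repaired = re.sub(r'(?:^|(?<={))\s*(\w+)(?=:)', r' "\1"', text, flags=re.M)
    # Turn single-quoted strings into double-quoted ones.
    repaired = repaired.replace("'", '"')
    return json.loads(repaired)


# Shortened sample in the style of the older xport output.
sample = """{ about: 'RRDtool xport JSON output',
  meta: {
    start: 1401778440,
    step: 60,
    legend: [
      'rta_MIN',
      'rta_MAX'
          ]
     },
  data: [
    [ null, null ],
    [ null, null  ]
  ]
}"""

data = loads_lenient(sample)
print(data["meta"]["legend"])
```

If the repaired text still is not valid JSON, the second `json.loads` call raises as usual, so malformed input is not silently accepted.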
Martijn Pieters
  • Is there any advantage to using `demjson` over `yaml`? Both give me the proper result. – dotslash Jun 03 '14 at 07:54
  • @Shark: YAML claims to be a superset of JSON, and this syntax happens to work for YAML too. demjson explicitly allows the syntax used here because it treats it as ECMAScript. There could be *some* syntax that RRDtool produces that one of the two tools doesn't parse, but my gut instinct is to use demjson here. – Martijn Pieters Jun 03 '14 at 07:59
  • @Shark One possible advantage is that your output explicitly claims to be JSON. It only *happens* to be valid YAML, so a lenient JSON parser might be less likely to get tripped up by it at some point down the road. – Zero Piraeus Jun 03 '14 at 08:02

It is indeed not valid JSON. It is, however, valid YAML, so the third-party PyYAML library might help you out:

>>> import yaml
>>> yaml.safe_load(text)  # safe_load: recent PyYAML requires an explicit Loader for yaml.load()
{
    'about': 'RRDtool xport JSON output',
    'meta': {
        'start': 1401778440,
        'step': 60,
        'end': 1401778440,
        'legend': [
            'rta_MIN',
            'rta_MAX',
            'rta_AVERAGE'
        ]
    },
    'data': [
        [None, None, None],
        [None, None, None],
        [None, None, None],
        [None, None, None],
        [None, None, None],
        [None, None, None]
    ]
}
Zero Piraeus