1

I have a Python dictionary that I would like to serialize to JSON and then convert to a proper C string literal, so that the literal contains a valid JSON string corresponding to my input dictionary. I'm using the result to autogenerate a line in a C source file. Here's an example:

>>> import json
>>> mydict = {'a':1, 'b': 'a string with "quotes" and \t and \\backslashes'}
>>> json.dumps(mydict)
'{"a": 1, "b": "a string with \\"quotes\\" and \\t and \\\\backslashes"}'
>>> print(json.dumps(mydict))
{"a": 1, "b": "a string with \"quotes\" and \t and \\backslashes"}

What I need to generate is the following C string:

"{\"a\": 1, \"b\": \"a string with \\\"quotes\\\" and \\t and \\\\backslashes\"}"

In other words, I need to escape the backslashes and double quotes in the result of calling json.dumps(mydict). At least I think I do... Will the following work, or am I missing an obvious corner case?

>>> s = '"'+json.dumps(mydict).replace('\\','\\\\').replace('"','\\"')+'"'
>>> print(s)
"{\"a\": 1, \"b\": \"a string with \\\"quotes\\\" and \\t and \\\\backslashes\"}"
Jason S
  • You don't want to define a dictionary in C. You want a string that is syntactic C that could be parsed to generate your dictionary? Why not finesse the whole problem by storing the result in a file and loading the file at run-time? Then you don't have to escape stuff. – hughdbrown Oct 22 '10 at 20:33
  • I'm not parsing it in C. It's an embedded system where I'm transmitting the string verbatim to a PC, where the string is parsed. – Jason S Oct 22 '10 at 20:37
  • ...and actually it gets better than that, it's also for a DWARF directive that stashes the string in a symbol file. I can't do this in runtime; there is no file system accessible to my embedded system. – Jason S Oct 22 '10 at 20:38

3 Answers

3

Your original suggestion and the answer from hughdbrown both look correct to me, but I've found a slightly shorter answer:

c_string = json.dumps( json.dumps(mydict) )

test script:

>>> import json
>>> mydict = {'a':1, 'b': 'a string with "quotes" and \t and \\backslashes'}
>>> c_string = json.dumps( json.dumps(mydict) )
>>> print( c_string )
"{\"a\": 1, \"b\": \"a string with \\\"quotes\\\" and \\t and \\\\backslashes\"}"

which looks like exactly the proper C string you want.
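
As a quick sanity check, the sketch below compares the double json.dumps() call with the manual escaping from the question and decodes the result twice to recover the dictionary. It assumes the default ensure_ascii=True, so the inner JSON text contains only printable ASCII; the names json_text, manual and round_trip are illustrative only.

```python
import json

mydict = {'a': 1, 'b': 'a string with "quotes" and \t and \\backslashes'}

# Inner call: dictionary -> JSON text.
# Outer call: JSON text -> a quoted string with backslashes and quotes escaped.
json_text = json.dumps(mydict)
c_string = json.dumps(json_text)

# The manual escaping from the question, for comparison.
manual = '"' + json_text.replace('\\', '\\\\').replace('"', '\\"') + '"'

# Decoding twice should give back the original dictionary.
round_trip = json.loads(json.loads(c_string))

print(c_string == manual)
print(round_trip == mydict)
```

Both comparisons should print True for this input, which suggests the two approaches agree whenever the inner JSON text is plain printable ASCII.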

(Fortunately, Python's json.dumps() passes forward slashes straight through unchanged, unlike some JSON encoders that prefix each forward slash with a backslash, such as the one described in "Processing escaped url strings within json using python".)

David Cary
  • interesting! That's pretty clever. I'm no longer working on this project and no longer have access to the software + test cases, so I can't easily try it out though. – Jason S Apr 23 '13 at 02:07
2

A C string literal starts with a quote and ends with a quote, has no embedded nulls, has every embedded quote escaped with a backslash, and has every embedded backslash doubled.

So take your string, double the backslashes and escape the quotes with a backslash. I think your code is exactly what you need:

s = '"' + json.dumps(mydict).replace('\\', r'\\').replace('"', r'\"') + '"'
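
To probe the corner cases asked about in the question, here is a minimal sketch that wraps the one-liner in a function and feeds it control characters, a NUL and a non-ASCII letter. The names json_c_literal and unescape_c_literal are hypothetical, and the decoder only understands the two escapes produced here, so it is not a general C parser. It also assumes the default ensure_ascii=True.

```python
import json

def json_c_literal(obj):
    # Hypothetical helper wrapping the one-liner above.
    # With the default ensure_ascii=True, json.dumps emits printable ASCII only;
    # control characters and non-ASCII text arrive as \n, \uXXXX and similar.
    text = json.dumps(obj)
    return '"' + text.replace('\\', '\\\\').replace('"', '\\"') + '"'

def unescape_c_literal(lit):
    # Minimal inverse that handles only the two escapes produced above.
    body = lit[1:-1]
    out = []
    i = 0
    while i < len(body):
        if body[i] == '\\':
            i += 1  # skip the escaping backslash, keep the character after it
        out.append(body[i])
        i += 1
    return ''.join(out)

tricky = {'k': 'line1\nline2 \x00 caf\u00e9 "q" \\ end'}
lit = json_c_literal(tricky)

# No raw control or non-ASCII characters should reach the C source line.
print(all(32 <= ord(ch) < 127 for ch in lit))
# Undoing the C escaping and parsing the JSON should recover the input.
print(json.loads(unescape_c_literal(lit)) == tricky)
```

One case this sketch does not cover: a compiler with trigraphs enabled would read the sequence ??/ inside a literal as a backslash, so text containing it may need separate handling.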

Alternatively, you could go for this slightly less robust version:

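import json

def c_string(s):
    # Start from an identity map over the 256 single-byte characters,
    # then override the two characters that need escaping in a C literal.
    all_chars = (chr(x) for x in range(256))
    trans_table = dict((c, c) for c in all_chars)
    trans_table.update({'"': r'\"', '\\': r'\\'})
    # Characters outside the table (code points above 255) pass through unchanged.
    return "".join(trans_table.get(c, c) for c in s)

def dwarf_string(d):
    # Wrap the escaped JSON text in the quotes of a C string literal.
    return '"' + c_string(json.dumps(d)) + '"'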

I'd love to use string.maketrans(), but in Python 2 its translation table can map a character to at most a single character.
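
On Python 3 this limitation does not apply to str.maketrans(): when given a single dict, the values may be strings of any length. A minimal sketch, reusing the function names from this answer and assuming Python 3 (C_ESCAPES is an illustrative name):

```python
import json

# Python 3 only: a one-argument str.maketrans() takes a dict whose values
# may be multi-character strings.
C_ESCAPES = str.maketrans({'"': r'\"', '\\': r'\\'})

def c_string(s):
    # Escape double quotes and backslashes; every other character is unchanged.
    return s.translate(C_ESCAPES)

def dwarf_string(d):
    # Wrap the escaped JSON text in the quotes of a C string literal.
    return '"' + c_string(json.dumps(d)) + '"'

mydict = {'a': 1, 'b': 'a string with "quotes" and \t and \\backslashes'}
print(dwarf_string(mydict))
```

The translation runs in a single pass over the string, so the order of the two replacements no longer matters.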

hughdbrown
0

Maybe this is what you want:

repr(json.dumps(mydict))
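
Whether repr() fits depends on the quoting it picks. The sketch below, assuming Python 3, prints the repr() result next to the double json.dumps() result so the two can be compared; py_literal and c_literal are illustrative names.

```python
import json

mydict = {'a': 1, 'b': 'a string with "quotes" and \t and \\backslashes'}
json_text = json.dumps(mydict)

# repr() produces a Python string literal for the JSON text.
py_literal = repr(json_text)
# A second json.dumps() produces a double-quoted literal instead.
c_literal = json.dumps(json_text)

print(py_literal)
print(c_literal)
print(py_literal == c_literal)
```

For a string that contains double quotes but no single quotes, repr() delimits the literal with single quotes and leaves the inner double quotes unescaped, so the result is a Python literal rather than a C string literal.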
kanaka