I have a text file containing strings that behave like C string literals. For example:

something = "some text\nin two lines\tand tab";
somethingElse = "some text with \"quotes\"";

Fetching the text between the quotes is not a problem. The problem is that I process this string later, and the backslash escapes make that hard.

I'd like to decode these strings, process them, then encode them back to C-string literals.

So from this raw input:

some text\\with line wrap\nand \"quote\"

I need:

some text\with line wrap
and "quote"

and vice versa.

What I've tried:

I've found an API for processing Python string literals (the string_escape codec). It is close to what I need, but since I'm processing C-strings it is not usable as is. I've tried to find other codecs that match my problem, but no luck so far.

Marek R

1 Answer

I'm also looking for a simple solution, and the json module seems to be the easiest one. The following is my quick hack. Note that quotes can still cause trouble (an unescaped double quote (") in the input cannot be decoded), and I suspect you will have issues with Unicode characters.

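import json

def c_decode(in_str: str) -> str:
    """Decode the body of a C string literal (text between the quotes)."""
    # Always wrap in double quotes: JSON does not accept single-quoted strings.
    # Only escapes that JSON shares with C (\\, \", \n, \r, \t, \b, \f, \uXXXX) work.
    return json.loads('"' + in_str + '"')

def c_encode(in_str: str) -> str:
    """Encode a string as the body of a C string literal."""
    # json.dumps adds the surrounding double quotes; strip them again.
    return json.dumps(in_str)[1:-1]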
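
As a quick check of the json approach, a round trip on the sample from the question might look like the sketch below. It calls json directly so it runs on its own; the variable names raw, decoded and encoded are only illustrative.

```python
import json

# Raw text as it appears between the quotes in the file (backslashes are literal).
raw = r'some text\\with line wrap\nand \"quote\"'

# Wrap the body in double quotes so it parses as a JSON string, then decode it.
decoded = json.loads('"' + raw + '"')
print(decoded)

# Encode it back and strip the double quotes that json.dumps adds.
encoded = json.dumps(decoded)[1:-1]
print(encoded == raw)
```

The first print shows the text with a real backslash, a real line break and plain quotes; the second confirms that encoding restores the original raw text for this sample.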

Note also that if in_str is "AB\n\r\tYZ", an alternative is:

("%r" % in_str.join('""'))[2:-2]

giving: 'AB\\n\\r\\tYZ'

This is almost the same as c_encode above, except that repr leaves embedded double quotes unescaped.
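
The json trick only covers the escapes that JSON shares with C. A minimal table-driven sketch that also handles \a, \v, \xHH and octal escapes might look like the following. The names c_unescape, c_escape, SIMPLE_ESCAPES and ESCAPE_RE are made up for illustration, unknown escapes are passed through unchanged, and non-ASCII characters are left as they are on the assumption that the file encoding can carry them.

```python
import re

# Single-character C escapes and the characters they stand for.
SIMPLE_ESCAPES = {
    "a": "\a", "b": "\b", "f": "\f", "n": "\n", "r": "\r",
    "t": "\t", "v": "\v", "\\": "\\", "'": "'", '"': '"', "?": "?",
}

# One escape: \xH..., up to three octal digits, \uXXXX, \UXXXXXXXX,
# or a backslash followed by any single character.
ESCAPE_RE = re.compile(
    r"\\(x[0-9A-Fa-f]+|[0-7]{1,3}|u[0-9A-Fa-f]{4}|U[0-9A-Fa-f]{8}|.)",
    re.DOTALL,
)


def c_unescape(body):
    """Decode the body of a C string literal (hypothetical helper)."""
    def replace(match):
        esc = match.group(1)
        if esc[0] in "xuU" and len(esc) > 1:
            return chr(int(esc[1:], 16))   # hex or universal character name
        if esc[0] in "01234567":
            return chr(int(esc, 8))        # octal escape such as \101 or \0
        # Assumption: unknown escapes are kept as written instead of rejected.
        return SIMPLE_ESCAPES.get(esc, "\\" + esc)
    return ESCAPE_RE.sub(replace, body)


def c_escape(text):
    """Encode text as the body of a C string literal (hypothetical helper)."""
    # Reverse table; ' and ? do not need escaping inside a double-quoted literal.
    reverse = {v: "\\" + k for k, v in SIMPLE_ESCAPES.items() if k not in "'?"}
    out = []
    for ch in text:
        if ch in reverse:
            out.append(reverse[ch])
        elif ord(ch) < 32 or ord(ch) == 127:
            # Three-digit octal, so a following digit is never absorbed.
            out.append("\\%03o" % ord(ch))
        else:
            out.append(ch)
    return "".join(out)
```

For the sample in the question, c_unescape turns the raw text into the two-line form and c_escape turns it back, so the pair can be used around whatever processing happens in between.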

Here's hoping that someone has a nicer solution.

NevilleDNZ