14

I've got many thousands of lines of Python code that use Python 2.7+ style string formatting (i.e. without indices in the {}s):

"{} {}".format('foo', 'bar')

I need to run this code under Python 2.6, which requires the indices.

I'm wondering if anyone knows of a painless way to allow Python 2.6 to run this code. It'd be great if there was a from __future__ import blah solution to the problem, but I don't see one. Something along those lines would be my first choice.

A distant second would be some script that can automate the process of adding the indices, at least in the obvious cases:

"{0} {1}".format('foo', 'bar')
PhilR
Captain Midday
  • Might not be an actual solution, but one could try to monkey-patch a `format` method on `str` that does a nasty replacement of `{}` with `{index}` and then calls the 'real' `format` with the added indices. – akaIDIOT Dec 11 '13 at 16:12
  • @akaIDIOT: I don't think that will work, for two reasons. First, `TypeError: can't set attributes of built-in/extension type 'str'`. Second, that wouldn't affect string literals, which don't actually call `str`. – DSM Dec 11 '13 at 16:13
  • Can you install Python 2.7 on the machine? There are Python distributions that can be installed and run without root permissions... – YXD Dec 11 '13 at 16:28
  • Any possible reason for using an outdated version of Python? – K DawG Dec 11 '13 at 16:30
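DSM's first objection is easy to check. The following is only a minimal sketch: the `try_patch` helper is a hypothetical name used for illustration. It attempts the assignment and hands back whatever exception the interpreter raises; the exact wording of the message differs between Python versions.

```python
def try_patch():
    """Try to replace str.format; return the exception raised, or None."""
    try:
        # Built-in types do not allow their attributes to be reassigned.
        str.format = lambda self, *args, **kwargs: self
    except TypeError as exc:
        return exc
    return None


error = try_patch()
print("%s: %s" % (type(error).__name__, error))
```

Even if the assignment were allowed, the second objection still stands: a wrapper would have to rewrite the format string at every call, whereas rewriting the source once has no runtime cost.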

4 Answers

7

It doesn't quite preserve the whitespacing and could probably be made a bit smarter, but it will at least identify Python strings (apostrophes/quotes/multi line) correctly without resorting to a regex or external parser:

import re
import tokenize
from itertools import count

with open('your_file') as fin:
    output = []
    tokens = tokenize.generate_tokens(fin.readline)
    # Keep only (type, string) for each token; untokenize then chooses its own spacing.
    for num, val in (token[:2] for token in tokens):
        if num == tokenize.STRING:
            # A fresh count() per string literal numbers each bare {} starting from 0.
            val = re.sub(r'\{\}', lambda m, c=count(): '{{{0}}}'.format(next(c)), val)
        output.append((num, val))

print tokenize.untokenize(output) # write to file instead...

Example input:

s = "{} {}".format('foo', 'bar')
if something:
    do_something('{} {} {}'.format(1, 2, 3))

Example output (note slightly iffy whitespacing):

s ="{0} {1}".format ('foo','bar')
if something :
    do_something ('{0} {1} {2}'.format (1 ,2 ,3 ))
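If the altered spacing is a problem, one possible variation is to splice the rewritten literals back into the original text at the positions `tokenize` reports, instead of calling `untokenize`. The sketch below makes several assumptions that are not part of the answer above: the names `number_fields` and `add_indices` are hypothetical, it is meant to be run once under Python 3 as a conversion tool (the output is what has to run on 2.6), it only touches literals that are directly followed by `.format`, and it leaves escaped braces (`{{`, `}}`) alone while also numbering fields such as `{:>5}` and `{!r}`.

```python
import io
import re
import tokenize

# "{{" and "}}" are escaped braces. A "{" directly followed by "}", ":" or "!"
# opens an automatically numbered field.
_FIELD = re.compile(r"\{\{|\}\}|\{(?=[}:!])")


def number_fields(literal):
    """Return the string literal with explicit indices added to empty fields."""
    counter = [0]  # a list, so the nested function can update it

    def repl(match):
        if match.group(0) != "{":
            return match.group(0)  # escaped brace: keep as is
        index = counter[0]
        counter[0] += 1
        return "{%d" % index

    return _FIELD.sub(repl, literal)


def add_indices(source):
    """Rewrite literals followed by .format, leaving all other text untouched."""
    tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))

    # Absolute offset at which each source row starts (rows are 1-based).
    line_starts = [0]
    for line in source.splitlines(True):
        line_starts.append(line_starts[-1] + len(line))

    pieces = []
    cursor = 0
    for i, tok in enumerate(tokens):
        kind, text, (srow, scol), (erow, ecol) = tok[:4]
        if kind != tokenize.STRING:
            continue
        # Only rewrite when the next two tokens are "." and "format".
        if [t[1] for t in tokens[i + 1:i + 3]] != [".", "format"]:
            continue
        begin = line_starts[srow - 1] + scol
        end = line_starts[erow - 1] + ecol
        pieces.append(source[cursor:begin])  # untouched text before the literal
        pieces.append(number_fields(text))
        cursor = end
    pieces.append(source[cursor:])
    return "".join(pieces)


if __name__ == "__main__":
    print(add_indices("s = \"{} {}\".format('foo', 'bar')\n"))
```

Known gaps in this sketch: format strings that are stored in a variable and formatted later are not found, and implicitly concatenated literals (`"{} " "{}".format(a, b)`) are not numbered across the pieces.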
Jon Clements
  • Thanks all. I have no choice but to maintain 2.6 compatibility. Otherwise I'd require an upgrade. tokenize is definitely the magic I was looking for. – Captain Midday Dec 12 '13 at 15:05
0

You could define a function to re-format your format strings:

def reformat(s):
    parts = s.split("}")
    # Number each "{" piece in order, then re-attach the text after the last "}".
    return "".join("".join((x, str(i), "}"))
                   for i, x in enumerate(parts[:-1])) + parts[-1]
jonrsharpe
0

Maybe a good old sed-regex like:

sed -e 's/{}/%s/g; s/\.format(/ % (/' source.py

your example would get changed to something like:

"%s %s" % ('foo', 'bar')

Granted, you lose the fancy new-style .format(), but IMHO it's almost never needed for trivial value insertions.
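Before applying a blanket rewrite like this, it may help to see where the two spellings agree and where they don't. The snippet below is only an illustration with made-up values, not an exhaustive list of differences:

```python
# The straightforward case: both spellings produce the same text.
new_style = "{} {}".format('foo', 'bar')
old_style = "%s %s" % ('foo', 'bar')
print(new_style == old_style)

# A literal percent sign is fine with .format() but must be doubled for %.
print("{}% done".format(50))
print("%s%% done" % (50,))

# A single tuple argument is unpacked by %, so the mechanical rewrite
# of "{}".format(pair) into "%s" % (pair) raises TypeError.
pair = (1, 2)
print("{}".format(pair))
try:
    print("%s" % (pair))
except TypeError as exc:
    print("TypeError: %s" % exc)
```

So lines that contain a literal `%`, or that pass a lone tuple, would need a manual look after the sed pass.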

Don Question
0

A conversion script could be pretty simple. You can find strings to replace with regex:

import re

fmt = "['\"][^'\"]*{}.*?['\"]\\.format"

str1 = "x; '{} {}'.format(['foo', 'bar'])"
str2 = "This is a function; 'First is {}, second is {}'.format(['x1', 'x2']); some more code"
str3 = "This doesn't have anything but a format. format(x)"
str4 = "This has an old-style format; '{0} {1}'.format(['some', 'list'])"
str5 = "'{0}'.format(1); '{} {}'.format(['x', 'y'])"

def add_format_indices(instr):
    text = instr.group(0)
    i = 0
    while '{}' in text:
        text = text.replace('{}', '{%d}'%i, 1)
        i = i+1
    return text

def reformat_text(text):
    return re.sub(fmt, add_format_indices, text)

reformat_text(str1)
"x; '{0} {1}'.format(['foo', 'bar'])"
reformat_text(str2)
"This is a function; 'First is {0}, second is {1}'.format(['x1', 'x2']); some more code"
reformat_text(str3)
"This doesn't have anything but a format. format(x)"
reformat_text(str4)
"This has an old-style format; '{0} {1}'.format(['some', 'list'])"
reformat_text(str5)
"'{0}'.format(1); '{0} {1}'.format(['x', 'y'])"

I think you could throw a whole file through this. You can probably find a faster implementation of add_format_indices, and obviously it hasn't been tested a whole lot.
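The "throw a whole file through this" step might look something like the sketch below. `rewrite_file` and its `backup_suffix` parameter are hypothetical names, `transform` stands for any text-to-text function such as `reformat_text` above, and UTF-8 source files are assumed.

```python
import io
import shutil


def rewrite_file(path, transform, backup_suffix=".bak"):
    """Apply transform (text -> text) to the file at path.

    A backup copy is written only when the text actually changes.
    Returns True if the file was modified.
    """
    with io.open(path, "r", encoding="utf-8") as handle:
        original = handle.read()

    updated = transform(original)
    if updated == original:
        return False

    # Keep the untouched original next to the rewritten file.
    shutil.copy2(path, path + backup_suffix)
    with io.open(path, "w", encoding="utf-8") as handle:
        handle.write(updated)
    return True
```

A call such as `rewrite_file('source.py', reformat_text)` would then rewrite one file in place. Diffing each file against its `.bak` copy before deleting the backups is a cheap way to catch any string the regex matched by accident.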

Too bad there isn't an import __past__, but in general that's not something usually offered (see the 2to3 script for an example), so this is probably your next best option.

Corley Brigman