
Is there any way to find (even as a best guess) the "printed" length of a string in Python? E.g. 'potaa\bto' has a len() of 8 but is only 6 characters wide when printed on a tty.

Expected usage:

s = 'potato\x1b[01;32mpotato\x1b[0;0mpotato'
len(s)   # 32
plen(s)  # 18
Thomas Dickey
wim
  • What is `plen` of `"abc "`? How about `"123\t456"`? `"12345\r67"`? `"123456 \n789"`? `"123456 \r78\n9abcd"`? Essentially, you have to decide on the rules for your character set and write an algorithm. – Mark Tolonen Feb 15 '13 at 06:56
  • This is really a hard one. I tried different approaches, including some `subprocess.Popen(...).communicate()` tries, but to no avail. – Thorsten Kranz Feb 15 '13 at 07:14

3 Answers


At least for ANSI TTY escape sequences of the form ESC [ ... letter (such as colour codes), this works:

import re
strip_ANSI_pat = re.compile(r"""
    \x1b     # literal ESC
    \[       # literal [
    [;\d]*   # zero or more digits or semicolons
    [A-Za-z] # a letter
    """, re.VERBOSE).sub

def strip_ANSI(s):
    return strip_ANSI_pat("", s)

s = 'potato\x1b[01;32mpotato\x1b[0;0mpotato'

print(s, len(s))      # raw string, escape sequences included
s1 = strip_ANSI(s)
print(s1, len(s1))    # escape sequences removed

Prints:

potato[01;32mpotato[0;0mpotato 32
potatopotatopotato 18

For backspaces (\b), vertical tabs, or \r versus \n, it depends on how and where the string is printed, no?
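One possible set of rules can be sketched by tracking a cursor column. This is only a best guess under stated assumptions, not a general solution: the string stays on one line, a backspace moves the cursor left without erasing, \r returns to column 0, every printable character is one column wide, and all other control characters (including tabs) take no space. `plen` and `CSI` are hypothetical names, written for Python 3.

```python
import re

# Same shape as the pattern above: ESC, '[', digits/semicolons, one letter.
CSI = re.compile(r"\x1b\[[;\d]*[A-Za-z]")

def plen(s):
    """Best-guess printed width of a single-line string (hypothetical helper)."""
    col = 0    # current cursor column
    width = 0  # right-most column reached so far
    for ch in CSI.sub("", s):
        if ch == "\b":
            col = max(col - 1, 0)  # assumption: backspace moves left, no erase
        elif ch == "\r":
            col = 0                # assumption: carriage return goes to column 0
        elif ch.isprintable():
            col += 1               # assumption: one column per printable character
            width = max(width, col)
        # assumption: any other control character takes no space
    return width

print(plen('potaa\bto'))
print(plen('potato\x1b[01;32mpotato\x1b[0;0mpotato'))
```

A real terminal may treat tabs, wide characters, and other escape sequences differently, so the rules would need adjusting for the target tty.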

dawg
  • I'm looking for a more general solution ... there are many other non-printing characters than in my example. Yes it depends how and where, I guess... this is just for pretty-printing / tabulation so it's not too drastic if it gets them wrong sometimes – wim Feb 15 '13 at 06:53
  • You might wade into [curses](http://docs.python.org/2/library/curses.html) then... – dawg Feb 15 '13 at 06:56

The bash shell had exactly the same need, in order to know when the user's typed input wraps to the next line, in the presence of non-printable characters in the prompt string. Their solution was to not even try - instead, they require that anyone setting a prompt string put \[ and \] around non-printing portions of the prompt. The printed length is calculated to be the length of the string, with these special sequences and all text between them filtered out. (The special sequences are omitted on output, of course.)
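The same convention could be imitated in Python. The following is a rough sketch, assuming the caller wraps every non-printing run in a literal \[ ... \] pair as described above; `marked_len` and `strip_markers` are hypothetical names, not part of bash or any library.

```python
import re

# Everything from a literal \[ up to the next literal \] is non-printing.
NONPRINTING = re.compile(r"\\\[.*?\\\]", re.DOTALL)
# The two-character markers themselves: a backslash followed by [ or ].
MARKERS = re.compile(r"\\[\[\]]")

def marked_len(s):
    """Printed length: drop the markers and everything between them."""
    return len(NONPRINTING.sub("", s))

def strip_markers(s):
    """Text to actually send to the terminal: drop only the markers."""
    return MARKERS.sub("", s)

s = (r"potato\[" + "\x1b[01;32m" + r"\]potato"
     + r"\[" + "\x1b[0;0m" + r"\]potato")

print(marked_len(s))
print(repr(strip_markers(s)))
```

The trade-off is the same as in bash: nothing is guessed, but the result is only as good as the markers the caller supplies.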

jasonharper

The printed length of a string depends on the type of the string.

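Normal strings (str) in Python 2.x are byte strings, here assumed to be UTF-8 encoded. len() of such a string is the number of bytes, not the number of characters. Once the string is decoded to unicode, len() returns the number of characters, so the padding can be corrected by the difference: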

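value = 'abcäöücdf'                        # UTF-8 encoded byte string (Python 2 str)
len_value  = len(value)                    # number of bytes
len_uvalue = len(unicode(value,'utf-8'))   # number of characters
# self['size'] comes from surrounding code that is not shown; presumably the column width.
# Widen it by the number of extra bytes so that ljust() pads to the intended visible width.
size = self['size'] + len_value-len_uvalue
print value[:min(len(value),size)].ljust(size)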
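In Python 3, str is already a sequence of Unicode code points, so the byte/character correction in the snippet above is not needed. A minimal sketch, assuming each code point occupies exactly one terminal column (which does not hold for combining marks or wide CJK characters); `pad` is a hypothetical helper.

```python
def pad(value, size):
    """Truncate and left-justify to `size` characters (hypothetical helper)."""
    return value[:size].ljust(size)

value = "abc\u00e4\u00f6\u00fccdf"     # 'abcäöücdf' written with escapes
n_chars = len(value)                    # counts characters (code points)
n_bytes = len(value.encode("utf-8"))    # counts encoded bytes

print(n_chars, n_bytes)
print(repr(pad(value, 12)))
```

The gap between `n_bytes` and `n_chars` is exactly the correction that the Python 2 code has to add to the field width by hand.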
volker