Filtering out ANSI escape sequences

Question

I have a Python script that interprets a trace of data written to stdout and read from stdin. The problem is that this data is riddled with ANSI escapes I don't care about. These escapes are JSON encoded, so they look like "\033[A" and "\033]0;". I don't need to interpret the codes, but I do need to know how many characters each one occupies (notice that the first sequence is 6 characters while the second is 7). Is there a straightforward way to filter these codes out of the strings I have?

The `colcrt` program already does this. It's not in Python, but if that's a requirement, it could be ported or wrapped. — tripleee, Nov 22 '12 at 05:01

score 15 · Answer 1 · answered Nov 25 '15 at 20:09

The complete regexp for Control Sequences (aka ANSI Escape Sequences) is

/(\x9B|\x1B\[)[0-?]*[ -\/]*[@-~]/

Refer to ECMA-48, Section 5.4, and the Wikipedia article "ANSI escape code".
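As a rough Python sketch of the pattern above (the helper name `strip_csi` is made up for illustration), the expression can be compiled with `re` and applied to text that contains real escape characters. It assumes the input has already been decoded from the JSON-style `\033` spelling into actual ESC characters. It also covers only sequences that start with CSI, so an OSC introducer such as ESC `]` from the question is left in place.

```python
import re

# Python spelling of the pattern above: CSI (0x9B, or ESC followed by "["),
# then parameter bytes 0x30-0x3F, intermediate bytes 0x20-0x2F,
# and one final byte 0x40-0x7E.
CSI_RE = re.compile(r'(?:\x9B|\x1B\[)[0-?]*[ -/]*[@-~]')


def strip_csi(text):
    """Remove CSI control sequences from text (hypothetical helper name)."""
    return CSI_RE.sub('', text)


sample = 'plain \x1b[1;31mred\x1b[0m \x1b[2Kdone'
print(strip_csi(sample))
```

With this sample the colour and erase-line sequences are dropped and only the visible text remains.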

Jeff

you should not copy your answers from one question to another. – Jean-François Fabre Feb 16 '19 at 09:13

score 10 · Answer 2 · answered Apr 03 '13 at 06:51

Another variant:

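import re

def strip_ansi_codes(s):
    """
    >>> import blessings
    >>> term = blessings.Terminal()
    >>> foo = 'hidden'+term.clear_bol+'foo'+term.color(5)+'bar'+term.color(255)+'baz'
    >>> strip_ansi_codes(foo)
    'hiddenfoobarbaz'
    """
    # Final byte is "m" (colour/style) or "K" (erase in line). The original
    # class was [m|K]; inside brackets "|" is a literal, so it is dropped here.
    return re.sub(r'\x1b\[([0-9,A-Z]{1,2}(;[0-9]{1,2})?(;[0-9]{3})?)?[mK]?', '', s)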

score 3 · Answer 3 · answered Feb 07 '13 at 07:46

#!/usr/bin/env python
import os
import re
import sys

# CSI sequences only: ESC "[", then digits and semicolons, then a final letter.
# A raw string keeps the backslashes intact for the regex engine.
ansi_pattern = r'\033\[((?:\d|;)*)([a-zA-Z])'
ansi_eng = re.compile(ansi_pattern)

def strip_escape(string=''):
    # Collect the matches first, then delete them back to front so that
    # the offsets of earlier matches stay valid while the string shrinks.
    matches = list(ansi_eng.finditer(string))
    matches.reverse()
    for match in matches:
        string = string[:match.start()] + string[match.end():]
    return string

if __name__ == '__main__':
    fname = os.path.basename(__file__)
    # Expect the file to filter as the last command-line argument.
    if len(sys.argv) > 1:
        with open(sys.argv[-1], 'r') as fd:
            for line in fd:
                print(strip_escape(line).rstrip())
    else:
        print('%s <filename>' % fname)
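The loop in `strip_escape` removes every match by slicing the string from the end. As a minimal sketch, assuming nothing else is needed from the match objects, the same result can be had from a single `sub` call on the compiled pattern. The name `strip_escape_sub` is hypothetical and not part of the answer.

```python
import re

# Same pattern as in the answer above, written as a raw string.
ANSI_RE = re.compile(r'\033\[((?:\d|;)*)([a-zA-Z])')


def strip_escape_sub(text):
    """Delete every match in one pass (hypothetical alternative name)."""
    return ANSI_RE.sub('', text)


line = '\x1b[1mbold\x1b[0m and \x1b[2Aup'
print(strip_escape_sub(line))
```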

score 2 · Answer 4 · answered Mar 21 '17 at 07:56

This worked for me:

re.sub(r'\x1b\[[\d;]+m', '', s)
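A short check of what this pattern does and does not cover may help when deciding whether it is enough. The sketch below uses made-up sample strings. It suggests the pattern removes colour/style (SGR) sequences that carry at least one parameter, and leaves parameterless sequences and other final bytes untouched.

```python
import re

SGR_RE = re.compile(r'\x1b\[[\d;]+m')

# Colour codes with parameters are removed.
colored = '\x1b[1;32mok\x1b[0m'
print(SGR_RE.sub('', colored))

# A reset without parameters and a cursor-up sequence are not matched,
# so both strings come back unchanged.
print(repr(SGR_RE.sub('', '\x1b[mreset')))
print(repr(SGR_RE.sub('', 'up\x1b[A')))
```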

Elliot Chance

score 1 · Answer 5 · answered Nov 22 '12 at 05:37

It's far from perfect, but this regex may get you somewhere:

import re
text = r'begin \033[A middle \033]0; end'
print(re.sub(r'\\[0-9]+(\[|\])[0-9]*;?[A-Z]?', '', text))

It already removes your two examples correctly.
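Since the question also asks how many characters each escape occupies, the same pattern can be used with `finditer` to record the offset and length of every match before removing it. This is only a sketch on the question's two examples. The variable names are illustrative.

```python
import re

# Pattern from the answer above. It matches the literal, JSON-style
# spelling (a backslash followed by digits), not a real ESC character.
ESCAPED_RE = re.compile(r'\\[0-9]+(\[|\])[0-9]*;?[A-Z]?')

text = r'begin \033[A middle \033]0; end'

# Record (offset, length) for each escape found in the text.
spans = [(m.start(), len(m.group(0))) for m in ESCAPED_RE.finditer(text)]
print(spans)

# Then strip the escapes themselves.
print(ESCAPED_RE.sub('', text))
```

For this input the recorded lengths are 6 and 7, matching the counts given in the question.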

BoppreH

score 0 · Answer 6 · answered Nov 25 '12 at 02:38

FWIW, this Python regex seemed to work for me. I don't actually know if it's accurate, but empirically it seems to work:

r'\\033[\[\]]([0-9]{1,2}([;@][0-9]{0,2})*)*[mKP]?'
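Because the answer is explicitly empirical, a quick check against the question's two examples seems worthwhile. The sketch below only measures what the pattern matches. On these inputs it takes all 7 characters of the second example but only the first 5 of `\033[A`, leaving the final `A` behind.

```python
import re

# Pattern from the answer above, applied to the literal "\033" spelling.
PATTERN = re.compile(r'\\033[\[\]]([0-9]{1,2}([;@][0-9]{0,2})*)*[mKP]?')

examples = [r'\033[A', r'\033]0;']

for example in examples:
    match = PATTERN.match(example)
    # Show each example, the text that matched, and how long the match is.
    print(example, repr(match.group(0)), len(match.group(0)))
```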

rivenmyst137
