I'm trying to find and remove any \command (\textit, etc.) from each line loaded from a .tex file, or similar commands from LilyPond files such as \clef, \key, and \time.

How could I do that?

What I've tried

import re
with open('example.tex') as f:
    lines = f.readlines()

pattern = '^\\*([a-z]|[0-9])' # this is the wrong regex!!
clean = []
for line in lines:
    remove = re.match(pattern, line)
    if remove:
        clean.append(remove.group())

print(clean)

Example

Input

#!/usr/bin/latex

\item More things
\subitem Anything

Expected output

More things
Anything
Martin Thoma
arnaldo

2 Answers

You could use a simple regex substitution with the pattern ^\\[^\s]*, which matches a backslash at the start of a line followed by any run of non-whitespace characters.

Sample code in Python:

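import re

# Match a backslash at the start of a line plus the non-whitespace run after it
p = re.compile(r"^\\[^\s]*", re.MULTILINE)

# Raw string so the backslashes in the sample are kept literally
text = r'''
\item More things
\subitem Anything
'''

subst = ""

print(re.sub(p, subst, text))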

The result would be (each line keeps the space that followed the command):

 More things
 Anything
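
The substitution above leaves behind the space that followed each command. A minimal sketch of a variant that also drops that space, assuming the command always sits at the start of the line; the name strip_leading_command is hypothetical and not part of the original answer:

```python
import re

# Backslash at line start, the command name, then any spaces or tabs after it.
# [ \t]* is used instead of \s* so the match never swallows a newline.
LEADING_COMMAND = re.compile(r"^\\\S+[ \t]*", re.MULTILINE)


def strip_leading_command(text):
    """Remove a leading \\command (and the blanks after it) from every line."""
    return LEADING_COMMAND.sub("", text)


sample = "#!/usr/bin/latex\n\n\\item More things\n\\subitem Anything\n"
print(strip_leading_command(sample))
```

Lines that do not start with a backslash, such as the #!/usr/bin/latex line in the question, pass through unchanged.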
Caio Oliveira

This will work:

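r'\\\w+\s'

It matches a backslash, then one or more word characters (letters, digits, or underscore), then a single whitespace character. In Python, write it as a raw string as shown: without the r prefix the string literal collapses to the regex \\w+\s, which matches a backslash followed by literal w characters.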
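
Because this pattern is not anchored to the start of the line, it can remove commands anywhere in a line, which may suit LilyPond input. A minimal sketch, assuming the trailing whitespace is made optional (\s?) so a command at the very end of a line is also removed; the name remove_commands is hypothetical:

```python
import re

# \\ matches the backslash, \w+ the command name, and \s? an optional
# single whitespace character after it (assumption: optional, so a
# command at the very end of a line is removed too).
COMMAND = re.compile(r"\\\w+\s?")


def remove_commands(line):
    """Remove every \\command found anywhere in the line."""
    return COMMAND.sub("", line)


print(remove_commands(r"\clef treble \time 4/4"))  # prints: treble 4/4
```

This sketch only strips the command name; arguments in braces, as in \textit{word}, are left in place.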

Peter Mortensen
el3ien