I have some well-behaved xml files I want to reformat (NOT PARSE!) using regex. The goal is to have every <trkpt>
pairs as oneliners.
The following code works, but I'd like to get the operations performed in a single regex substitution instead of the loop, so that I don't need to concatenate the strings back.
import re
xml = """
<trkseg>
<trkpt lon="-51.2220657617" lat="-30.1072524581">
<time>2012-08-25T10:20:44Z</time>
<ele>0</ele>
</trkpt>
<trkpt lon="-51.2220657617" lat="-30.1072524581">
<time>2012-08-25T10:20:44Z</time>
<ele>0</ele>
</trkpt>
<trkpt lon="-51.2220657617" lat="-30.1072524581">
<time>2012-08-25T10:20:44Z</time>
<ele>0</ele>
</trkpt>
</trkseg>
"""
for trkpt in re.findall('<trkpt.*?</trkpt>', xml, re.DOTALL):
print re.sub('>\s*<', '><', trkpt, re.DOTALL)
An answer using sed
would also be welcome.
Thanks for reading