Sed Command delete before first instance and after last

Question

I'm looking for a sed command to clean up some KML files I have. Each file is a single line and looks like this:

<some text><kml><Document><name> Name </name><Placemark><name> Hotel 01 </name></Placemark><Placemark><name> Hotel 02 </name></Placemark><Placemark><name> Hotel 03 </name></Placemark></Document></kml>

Ideally I want only the part from the first <Placemark> element through the last </Placemark> element (both included), with these sections from all the KML files output to a single file.

I'd be happy with a command that either deletes all text before the first <Placemark> and all text after the last </Placemark>, or extracts the content after the first <Placemark> and before the last </Placemark>.

The command I've managed to cobble together so far is:

find . -name 'kmlFiles00*' -exec sed -r 's/^.{879}/ /' {} \; | sed -e 's/<\/Document><\/kml>//g' > placemarks_`date +%d-%m-%Y`.list

which works by stripping the first 879 characters and then removing every instance of </Document></kml> before writing everything to the final file, but it is pretty messy, so I'm looking for a cleaner command. I have also tried:

sed -e 's/^.*<Placemark> //' -e 's/<\/Placemark>.*$//'

which I know is getting closer but still fails.

score 2 · Answer 1 · answered May 05 '13 at 17:05

awk NF=NF FPAT='<Placemark>.*</Placemark>'

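FPAT (GNU awk 4.0 or later) defines a field by its content rather than by its separators. Because the regex is greedy, the single field runs from the first <Placemark> to the last </Placemark>.
NF=NF forces awk to rebuild the line from its fields, and since the assignment evaluates to a nonzero value, the rebuilt line is printed.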
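
FPAT is specific to GNU awk, so with other awks (mawk, BusyBox awk) the one-liner above does not perform the extraction. A minimal portable sketch of the same idea, assuming only a POSIX awk, uses match(), whose leftmost-longest match likewise spans from the first <Placemark> to the last </Placemark>. The function name extract_placemarks is made up for illustration and is not part of the original answer.

```shell
#!/bin/sh
# extract_placemarks is a hypothetical helper name, not part of the original answer.
# Reads single-line KML from stdin (or from files given as arguments) and prints,
# for each line that contains a match, the text from the first <Placemark>
# through the last </Placemark>.
extract_placemarks() {
    awk 'match($0, /<Placemark>.*<\/Placemark>/) {
        # RSTART and RLENGTH describe the leftmost-longest match
        print substr($0, RSTART, RLENGTH)
    }' "$@"
}

# Example input modelled on the line shown in the question
printf '%s\n' '<some text><kml><Document><name> Name </name><Placemark><name> Hotel 01 </name></Placemark><Placemark><name> Hotel 02 </name></Placemark></Document></kml>' |
    extract_placemarks
```

Lines without any <Placemark> block produce no output, which mirrors the behaviour of the FPAT version.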

Zombo

score 0 · Answer 2 · answered May 05 '13 at 23:28

This might work for you (GNU sed):

sed -r 's/<Placemark>/\n&/;s/.*\n(.*<\/Placemark>).*/\1/' file
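
To cover the other half of the question (collecting the result from every file into one dated list), the same two substitutions can be run through find. This is only a sketch: it assumes GNU sed (for -r and for \n in the replacement), reuses the kmlFiles00* name pattern and the output file name from the question, and the function name collect_placemarks is invented for illustration. Adding -n and the p flag prints only those lines where a <Placemark> block was actually found.

```shell
#!/bin/sh
# collect_placemarks is a hypothetical helper name.
# $1: directory to search, $2: output file.
collect_placemarks() {
    # First substitution marks the first <Placemark> with a newline;
    # the second keeps everything from that marker through the last </Placemark>.
    find "$1" -name 'kmlFiles00*' -type f -exec \
        sed -n -r 's/<Placemark>/\n&/; s/.*\n(.*<\/Placemark>).*/\1/p' {} + > "$2"
}

# Usage, with the file name format from the question:
# collect_placemarks . "placemarks_$(date +%d-%m-%Y).list"
```

The order in which find visits the files is not guaranteed, so the sections may not appear in file-name order.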

potong