-1

How can I remove the following HTML tag (and its contents) with sed?

Example:

<div id="header"><span id="navbar">... Content ...</span></div>

What I tried:

sed 's!<div id=\"header\">.*\?</div>!!g'

In my mind this should work, according to this regex reference.

user1263513

4 Answers

2

This might work for you:

sed '
    /<div id="header"><span id="navbar">/{ # search for start tags
    s//\n/                                 # replace start tags with newline
    :a                                     # label a
    /\n<\/span><\/div>/bb                  # search for end tags and if so goto label b
    s/\n./\n/                              # end tags not found bump along a character
    ta                                     # goto label a if last substitution ok
    :b                                     # label b
    s///                                   # delete end tags and newline
    /^$/d                                  # check for empty line and if so delete
    }' file

N.B. This expects start/end tags to be on the same line.

potong
0

sed doesn't support .*? (non-greedy matching).
You can try ssed (super sed).
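
If installing ssed is not an option, a common workaround can be sketched in plain sed: temporarily replace the closing tag with a newline, which lets a negated bracket expression stand in for the non-greedy match. This is only a sketch. It assumes GNU sed (for \n in the replacement and inside brackets), the sample line is made up, and it assumes there is no nested div inside the header.

```shell
# Hypothetical sample line: the header div followed by a second div.
line='<p>a</p><div id="header"><span id="navbar">... Content ...</span></div><div id="main">b</div>'

# 1. Turn every </div> into a newline marker (a line read by sed holds no newline).
# 2. Delete from the header's opening tag up to the first marker.
# 3. Turn the remaining markers back into </div>.
result=$(printf '%s\n' "$line" |
  sed 's!</div>!\n!g; s!<div id="header">[^\n]*\n!!g; s!\n!</div>!g')

printf '%s\n' "$result"
```

With the sample above, only the header div is removed and the second div is left intact.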

kev
  • Given the sources, would I be able to compile them using Android NDK? As the site says it doesn't require support libraries. And if I have the compiled binary, will the sed command work as intended? – user1263513 Apr 23 '12 at 12:07
0

If your goal is to remove the html tag and its contents from a file you could try the following command.

NOTE: All of the following commands edit the file in place. The file will be changed immediately when you run the command. PLEASE BACK UP YOUR FILE BEFORE TESTING.

If the tag is all on one line you could try the following.

sed -i 's/<div id=\"header\"><span id=\"navbar\".*<\/span><\/div>//g' /yourfile

If the tag is on multiple lines like the example below try the command that follows.

<div id="header"><span id="navbar">
    ... Content ...
</span></div>

sed -i '/<div id=\"header\"><span id=\"navbar\">/,/<\/span><\/div>/d' /yourfile

NOTE: If you are working on OS X you will need to change the (sed -i) to (sed -i '')
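
To try the multi-line case safely, here is a small sketch that runs against a throwaway file. The file name and contents are made up for the demo. It uses -i.bak (suffix attached to the flag), which both GNU sed and the OS X sed accept and which keeps a backup copy, and it uses the d command to delete every line in the range.

```shell
# Hypothetical sample file, created in a temporary directory for the demo.
workdir=$(mktemp -d)
page="$workdir/page.html"
printf '%s\n' \
  '<html>' \
  '<div id="header"><span id="navbar">' \
  '    ... Content ...' \
  '</span></div>' \
  '<p>body</p>' \
  '</html>' > "$page"

# -i.bak edits in place and keeps the original as page.html.bak.
# The range address selects each line from the opening tags through the
# closing tags, and d deletes those lines.
sed -i.bak '/<div id="header"><span id="navbar">/,/<\/span><\/div>/d' "$page"

result=$(cat "$page")
printf '%s\n' "$result"
```

Keep in mind that the range delete removes whole lines, so anything else sitting on the same line as the opening or closing tags is removed too.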

E1Suave
0

With sed it would be:

testers='<div id="header"><span id="navbar">... Content ...</span> some stuff </div>'
echo "$testers" | sed -E 's|<[[:alnum:]_ ="/]+>||g'

I solved it with ssed (super sed) instead of sed; it is easy to install on any POSIX system. Here it goes:

testers='<div id="header"><span id="navbar">... Content ...</span> some stuff </div>'
echo "$testers" | ssed -R -e 's|<[\w ="/]+>||g'

The result was (the tags are stripped, the content is kept):

... Content ... some stuff

Cheers.

mahmoh