
I want to remove the text between the end of one HTML tag and the beginning of another.

The text between the tags varies from block to block, and there are of course multiple blocks to delete on the page.

</h1>
Section: ab (1)<br>Updated: 2015-05-01<br><a href="file:///home/gareththomasnz/Desktop/VirtualBoxShare/merged.html#2_index">Index</a>
<a href="file:///man/man2html">Return to Main Contents</a><hr>

<p>
<a name="2_lbAB">&nbsp;</a>
</p><h2>

Everything between each </h1> and the following <h2> tag, throughout the whole page, must be deleted.

I've tried a few things but can't get it to work. Any suggestions?

Ken White
Gareth Thomas
    We don't add SOLVED to titles here when a question is answered. You indicate it was solved by either accepting the answer someone provided you or by writing your own answer (in the space provided below) and accepting it as the solution. – Ken White Feb 13 '16 at 03:47

2 Answers


http://sundstedt.se/blog/delete-specific-text-blocks-between-two-characters/

This is a solution.

It deletes an arbitrary text block between any two characters without using regex.
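
For readers who cannot open the link, here is a minimal sketch of one regex-free way to do this with plain string searches. It is not necessarily the method the linked post uses; the function name `delete_between` and the sample string are made up for illustration.

```python
def delete_between(text: str, start: str, end: str) -> str:
    """Drop whatever sits between each `start` marker and the next `end`
    marker, keeping the markers themselves. (Hypothetical helper.)"""
    parts = []
    pos = 0
    while True:
        s = text.find(start, pos)
        if s == -1:
            break  # no more start markers
        e = text.find(end, s + len(start))
        if e == -1:
            break  # unmatched start marker: leave the rest untouched
        parts.append(text[pos:s + len(start)])  # keep text up to and including start
        pos = e  # resume at the end marker, skipping the block in between
    parts.append(text[pos:])
    return "".join(parts)


# Hypothetical sample with two blocks to remove.
sample = "<h1>a</h1>xx<h2>b</h2>yy<h1>c</h1>zz<h2>d</h2>"
result = delete_between(sample, "</h1>", "<h2>")
print(result)
```

The markers are matched literally and case-sensitively, so a page that mixes `</H1>` and `</h1>` would need normalising first.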

Gareth Thomas

Turn on DOTALL and use a reluctant quantifier:

Search: (?s)(?<=</h1>).*?(?=<h2>)
Replace: <blank>

Note: I'm not familiar with PowerGREP. It may use "slash-delimited" regex syntax, in which case the slash inside </h1> has to be escaped:

/(?<=<\/h1>).*?(?=<h2>)/s
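
To see the same pattern at work outside PowerGREP, here is a small sketch using Python's `re` module. The sample markup and the name `strip_between_headings` are assumptions made for the demo, and the `re.IGNORECASE` flag is an addition of mine in case the page mixes `</H1>` and `</h1>`.

```python
import re

# Hypothetical markup shaped like the page in the question.
SAMPLE = """<h1>ab</h1>
Section: ab (1)<br>Updated: 2015-05-01<br>
<p>
<a name="2_lbAB">&nbsp;</a>
</p><h2>NAME</h2>
<h1>second</h1>
junk to drop<hr>
<h2>SYNOPSIS</h2>"""

# re.DOTALL plays the role of (?s): "." also matches newlines.
# ".*?" is reluctant, so each match stops at the nearest <h2>.
PATTERN = re.compile(r"(?<=</h1>).*?(?=<h2>)", re.DOTALL | re.IGNORECASE)


def strip_between_headings(html: str) -> str:
    """Delete everything between each </h1> and the following <h2>."""
    return PATTERN.sub("", html)


cleaned = strip_between_headings(SAMPLE)
print(cleaned)
```

The lookbehind and lookahead only assert that the tags are there, so the tags themselves survive the replacement. With a greedy `.*` instead, a single match would run from the first `</h1>` to the last `<h2>` on the page.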
Bohemian