7
<?php
    $str = "word <a href=\"word\">word</word>word word";
    $str = preg_replace("/word(?!([^<]+)?>)/i","repl",$str);
    echo $str;
    # repl <a href="word">repl</word>repl repl
?>

source: http://pureform.wordpress.com/2008/01/04/matching-a-word-characters-outside-of-html-tags/

Unfortunately, my project needs semantic libraries that are available only for Java...

// Thanks Celso

Mikepote
celsowm

3 Answers

13

Use the String.replaceAll() method:

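class Test {
  public static void main(String[] args) {
    String str = "word <a href=\"word\">word</word>word word";
    // (?i) is Java's inline equivalent of the trailing /i in the PHP pattern
    str = str.replaceAll("(?i)word(?!([^<]+)?>)", "repl");
    System.out.println(str); // repl <a href="word">repl</word>repl repl
  }
}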

Hope this helps.

kolrie
3

To translate that regex for use in Java, all you have to do is get rid of the / delimiters and change the trailing i to an inline modifier, (?i). But it's not a very good regex; I would use this instead:

(?i)word(?![^<>]++>)

According to RegexBuddy's Debug feature, when it tries to match the word in <a href="word">, the original regex requires 23 steps to reject it, while this one takes only seven steps. The actual Java code is

str = str.replaceAll("(?i)word(?![^<>]++>)", "repl");
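One difference is worth checking before swapping the patterns: `[^<>]++` needs at least one character, so a word that sits directly before `>` (as in `</word>`) is no longer protected. The sketch below runs both patterns on the string from the question. The third pattern, with `*+` in place of `++`, is my own assumption about how to keep the original behaviour; it is not part of the answer above, and the class and constant names are only illustrative.

```java
import java.util.regex.Pattern;

public class LookaheadComparison {
    // Pattern from the question, with PHP's trailing /i written as (?i).
    static final Pattern ORIGINAL = Pattern.compile("(?i)word(?!([^<]+)?>)");
    // Possessive variant suggested in this answer.
    static final Pattern POSSESSIVE = Pattern.compile("(?i)word(?![^<>]++>)");
    // Hypothetical tweak: *+ allows zero characters between the word and '>'.
    static final Pattern POSSESSIVE_STAR = Pattern.compile("(?i)word(?![^<>]*+>)");

    // Replaces every match of the given pattern with "repl".
    static String replace(Pattern p, String input) {
        return p.matcher(input).replaceAll("repl");
    }

    public static void main(String[] args) {
        String str = "word <a href=\"word\">word</word>word word";
        System.out.println(replace(ORIGINAL, str));        // repl <a href="word">repl</word>repl repl
        System.out.println(replace(POSSESSIVE, str));      // repl <a href="word">repl</repl>repl repl
        System.out.println(replace(POSSESSIVE_STAR, str)); // repl <a href="word">repl</word>repl repl
    }
}
```

If tag names can never contain the search term, the `++` form is fine as written. Otherwise the `*+` form appears to be the closer translation of the PHP pattern.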
Alan Moore
1

Before providing a further answer: are you trying to parse an HTML document? If so, don't use regexes; use an HTML parser.

Zak
  • My tool "generates" XHTML by replacing terms in a text file with new tags, using each term as a value inside the tag. I am using the replaceAll approach because some terms can be composite, like "Celso Araujo Fontes". Example: how do I replaceAll myTerm in this situation: myTerm is cool friend – celsowm Jul 22 '10 at 00:47
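
For the composite-term case described in this comment, a minimal sketch might look like the following. It assumes the term should be wrapped in an element that carries the term as an attribute value. The `<term value="...">` element, the `TermTagger` class and the `tagTerm` method are hypothetical names chosen for illustration, not something the commenter specified. It also assumes the term contains no characters that need escaping in XHTML (such as `"` or `&`).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TermTagger {
    // Wraps each occurrence of `term` that is not inside a tag in a
    // hypothetical <term value="..."> element (placeholder names).
    static String tagTerm(String text, String term) {
        // Pattern.quote makes the term literal, so spaces and regex
        // metacharacters in a composite term are matched as-is.
        Pattern p = Pattern.compile(
                Pattern.quote(term) + "(?![^<>]*+>)",
                Pattern.CASE_INSENSITIVE);
        Matcher m = p.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String hit = m.group();
            String replacement = "<term value=\"" + hit + "\">" + hit + "</term>";
            // quoteReplacement stops '$' and '\' in the term from being
            // interpreted as group references in the replacement.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(tagTerm("Celso Araujo Fontes is cool friend", "Celso Araujo Fontes"));
        System.out.println(tagTerm("<a title=\"myTerm\">myTerm</a> is cool friend", "myTerm"));
    }
}
```

The lookahead only skips matches inside a tag's angle brackets. Text between an opening and closing tag is still matched, so running `tagTerm` twice on the same input would nest the wrappers.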