js regex: replace words not in a span tag

Question

For example:

var htmlString = "It's a <span title='mark'>nice day</span> and also a <span title=''>sunny day</span>, it's day for surfing.";

I want to replace the last two occurrences of "day" with "night", and skip the first one, which is inside the span tag with title "mark".

var replaceString = "day";
var reg=new RegExp("(?!title=\'mark\'>).*"+replaceString+".*(?!<\/span>)","gi")    
var bb=htmlString.replace(reg,"night");    
alert(bb) 

// I can not get the right result with the above code
// Final result wanted: "It's a <span title='mark'>nice day</span> and also a <span title=''>sunny night</span>, it's night for surfing.";

UPDATE: The following works, but it only handles 3 occurrences of "day" in a sentence. How can I make it match an arbitrary number of them?

alert(htmlString.replace(/(<span.*?'(?!mark)'>.*?)day(.*?<\/span>)|(?!>)day/gi, "$1night$2"));

Thanks.

You get additional penalty points for trying to parse HTML with regex in *JavaScript*, when you **have a DOM parser** at your fingertips. — Niet the Dark Absol, Oct 12 '15 at 11:44
At least once a week, someone wants to use regex with HTML or XML... *Don't do it for god sakes!* — meskobalazs, Oct 12 '15 at 11:45
Actually, there is no hint in [RegEx match open tags except XHTML self-contained tags](http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags) how to parse (X)HTML in JS. — Wiktor Stribiżew, Oct 12 '15 at 12:10
@NiettheDarkAbsol where is the link that you have provided to OP, for improving his knowledge ? — Anonymous0day, Oct 12 '15 at 12:58
@stribizhev I posted an updated script that works, but only for 3 occurrences of the word in a text. Can it be made to handle an arbitrary number of occurrences? — jdleung, Oct 12 '15 at 14:49
@stribizhev I know nothing about the DOM ;-( . Actually, I just want to search for the word in the text and wrap it in a span tag; if the word has already been wrapped, skip it. The replaceString comes from a loop over something like: ["nice day", "good day", "day", "night"]. If "nice day" is in the text, it will be matched twice: by "nice day" and by "day". So I want to detect whether the tag has already been added and, when "day" comes up, skip adding it again. — jdleung, Oct 12 '15 at 15:18
Are you working with an HTML as a string, or do you need to modify the opened document in a browser? — Wiktor Stribiżew, Oct 12 '15 at 15:20
@stribizhev It's an iOS app; the text comes from SQL and is displayed in a web view. If I modify the text in iOS so that it contains tags and "'", JavaScript cannot read it. So I chose to add the tags via JS. — jdleung, Oct 12 '15 at 15:25

score 0 · Accepted Answer · answered Oct 15 '15 at 19:47

Here is how you can achieve that with a DOM-based approach:

function textNodesUnder(el){
  // Visit every text node under el.
  var n, walk=document.createTreeWalker(el,NodeFilter.SHOW_TEXT,null,false);
  while(n=walk.nextNode())
  {
    // Skip text whose direct parent is <span title="mark">.
    var parent = n.parentNode;
    var isMarkedSpan = parent.nodeName.toLowerCase() === 'span' &&
                       parent.getAttribute("title") === 'mark';
    if (!isMarkedSpan)
      n.nodeValue = n.nodeValue.replace(/\bday\b/g, "night");
  }
  // el is the document fragment; its first child is the wrapper element.
  return el.firstChild.innerHTML;
}

function replaceTextNotInSpecificTag(s) {
  var doc = document.createDocumentFragment();
  var wrapper = document.createElement('myelt');
  wrapper.innerHTML = s;
  doc.appendChild( wrapper );
  return textNodesUnder(doc);
}

var s = "It's a <span title='mark'>nice day</span> and also a <span title=''>sunny day</span>, it's day for <span>surfing day</span>.";
console.log(replaceTextNotInSpecificTag(s));

Result:

It's a <span title="mark">nice day</span> and also a <span title="">sunny night</span>, it's night for <span>surfing night</span>.

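First, we create a document fragment, then create a wrapper element myelt, set its innerHTML to the input string so that the browser parses it, and append the wrapper to the document fragment. This lets us access real DOM nodes through the browser's own parser instead of working on raw text.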

Then, using document.createTreeWalker with the NodeFilter.SHOW_TEXT filter, we get access to all text nodes. We walk the nodes, and unless a node's parent is a span tag whose title attribute equals "mark", we perform the search and replace on its nodeValue.
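Note that the check above only looks at the direct parent of each text node. If a marked span could contain nested elements (for example `<span title='mark'><b>nice day</b></span>`), one possible refinement is to walk up the ancestor chain instead. The following is only a sketch; the helper name `isInsideMarkedSpan` is hypothetical and not part of the original answer.

```javascript
// Hypothetical helper: returns true when `node` has an ancestor
// <span title="mark">, however deeply the node is nested inside it.
function isInsideMarkedSpan(node) {
  for (var p = node.parentNode; p; p = p.parentNode) {
    // Only element nodes (nodeType 1) carry attributes; this skips
    // document fragments and documents higher up the chain.
    if (p.nodeType === 1 &&
        p.nodeName.toLowerCase() === 'span' &&
        p.getAttribute('title') === 'mark') {
      return true;
    }
  }
  return false;
}

// Possible use inside the tree-walker loop:
//   if (!isInsideMarkedSpan(n))
//     n.nodeValue = n.nodeValue.replace(/\bday\b/g, "night");
```

The loop stops when `parentNode` is null, which is the case at the top of a detached fragment as well as at the document root.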

It works great! I use variables for my needs: `replace(keyword, ""+newKeyword+"");`, and two things need to be improved. 1. I fail to use a variable like `"/"+keyword+"/gi"`. 2. The tags `<` and `>` become `&lt;` and `&gt;`; the browser can still render them, but it looks a little ugly in `alert(s)`. Thanks. — jdleung, Oct 16 '15 at 02:54
To build a dynamic regex you need a RegExp constructor, `RegExp(keyword, "gi")`. You also now seem to be adding tags to text elements but you cannot. Instead, you need to create element nodes, and set nodeValue with the keyword. — Wiktor Stribiżew, Oct 16 '15 at 06:17
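
As a rough illustration of the two points in the last comment, here is a minimal sketch. All function names (`escapeRegExp`, `buildKeywordRegex`, `splitByKeyword`, `wrapMatches`) are hypothetical, the `title="mark"` attribute simply follows the convention from the question, and keywords are assumed to start and end with word characters so that `\b` behaves as expected. The first three functions are plain string handling; only `wrapMatches` needs a browser DOM.

```javascript
// Escape regex metacharacters so the keyword is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Build a global, case-insensitive, whole-word regex from a variable.
function buildKeywordRegex(keyword) {
  return new RegExp('\\b' + escapeRegExp(keyword) + '\\b', 'gi');
}

// Split text into matching and non-matching pieces, in order.
function splitByKeyword(text, keyword) {
  if (!keyword) return [{ text: text, isMatch: false }];
  var re = buildKeywordRegex(keyword);
  var parts = [], last = 0, m;
  while ((m = re.exec(text)) !== null) {
    if (m.index > last) {
      parts.push({ text: text.slice(last, m.index), isMatch: false });
    }
    parts.push({ text: m[0], isMatch: true });
    last = re.lastIndex;
  }
  if (last < text.length) {
    parts.push({ text: text.slice(last), isMatch: false });
  }
  return parts;
}

// Browser only: replace one text node with a mix of text nodes and
// <span title="mark"> element nodes, instead of writing tag strings
// into nodeValue (which would be escaped as &lt; and &gt;).
function wrapMatches(textNode, keyword) {
  var frag = document.createDocumentFragment();
  splitByKeyword(textNode.nodeValue, keyword).forEach(function (part) {
    var piece = document.createTextNode(part.text);
    if (part.isMatch) {
      var span = document.createElement('span');
      span.setAttribute('title', 'mark');
      span.appendChild(piece);
      frag.appendChild(span);
    } else {
      frag.appendChild(piece);
    }
  });
  textNode.parentNode.replaceChild(frag, textNode);
}

console.log("Monday is a nice day, Day one.".replace(buildKeywordRegex('day'), 'night'));
// → "Monday is a nice night, night one."
```

Because `wrapMatches` removes the text node it is given, it is probably safer to collect the text nodes into an array first and call it afterwards, rather than calling it while the TreeWalker is still positioned on that node.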
