10

I am looking to do something similar to this plugin http://johannburkard.de/blog/programming/javascript/highlight-javascript-text-higlighting-jquery-plugin.html

But the problem I am facing is that the above plugin does not allow you to highlight words within html.

So if you are looking for "my text" inside HTML like

this is <a href="#">my</a> text that needs highlighting

You will not get any highlighting.

Is there a way to highlight text while ignoring any html tags in between?

Mike

3 Answers

7

I put together a regex that allows HTML tags wherever the search term contains whitespace:

<div id="test">this is <a href="#">my</a> text that needs highlighting</div>

JavaScript:

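// Read the element's markup as a string.
var src_str = $("#test").html();
var term = "my text";
// Allow any number of tags on either side of the first whitespace run in the term.
term = term.replace(/(\s+)/,"(<[^>]+>)*$1(<[^>]+>)*");
var pattern = new RegExp("("+term+")", "i");

// Wrap the first case-insensitive match in <mark>.
src_str = src_str.replace(pattern, "<mark>$1</mark>");
// If the match contains tags, close <mark> before them and reopen it after them.
src_str = src_str.replace(/(<mark>[^<>]*)((<[^>]+>)+)([^<>]*<\/mark>)/,"$1</mark>$2<mark>$4");

$("#test").html(src_str);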

Try it here: http://jsfiddle.net/UPs3V/

Julien Schmidt
  • That is looking good, but if you add a space between "my" and "term" it does not highlight. I am referring to the JSFiddle. – Mike Feb 06 '12 at 23:58
  • I fixed that bug, but a new one turned up: since the script outputs something like `my text`, the browser engine auto-corrects it to `my text`. Any ideas? :/ – Julien Schmidt Feb 07 '12 at 00:06
  • Yikes, that looks like some serious regex. I will run it through the wringer and see how it performs. Speed is a concern for me. I do have the option of doing this on the server side, so it might make sense to convert this to PHP too. I did some reading and some people recommend a PHP DOM parser, but I am not sure where to start with that. – Mike Feb 07 '12 at 03:29
  • The RegEx looks heavier than it is. I think it would be possible to use a callback function instead of the 2nd `src_str.replace`. I'll take a look later – Julien Schmidt Feb 07 '12 at 09:46
  • If you replace `my text` with `href` in the second line of the JavaScript, it breaks the logic. Do you think you have time to look over it? It would help me greatly. – Adrian Marinica May 13 '13 at 10:16
  • A string doesn't have the method `highlight()`. Under what circumstance would this answer above the word **Edit** actually work? – Archonic Sep 13 '13 at 22:10
  • You are right, this answer is complete nonsense. I removed it. Like this http://jsbin.com/etepEzu/1/edit it would work, though. – Julien Schmidt Sep 14 '13 at 07:25
  • @AdrianMar do you still need it? – Julien Schmidt Sep 14 '13 at 07:26
  • Personally, I changed my whole approach. :) For posterity though, if you would have time to fix it, you would be a great person. :) – Adrian Marinica Sep 14 '13 at 23:35
  • @JulienSchmidt Hey, did you manage to find a solution to this problem? I am also working with the Johann Burkard plugin, and it breaks on words that span HTML tags. Do you still have the solution? I badly need it. Here is what I have been trying to do; I got some responses on Stack Overflow but still no good solution: [click here](http://stackoverflow.com/questions/23931579/jquery-clone-node-error/) I have tried your jsfiddle example, but it breaks when there is a space between words. Any help would be appreciated. Thanks. – ronish May 29 '14 at 21:01
  • @AdrianMar please see my above comment, i need some help. – ronish May 29 '14 at 21:22
  • Although the effort is deeply appreciated, it must be noted that parsing HTML with regex has a tendency to open up the gates of hell and begin a flood of flesh-eating monsters who destroy all that is good and beautiful. – Teekin Aug 02 '14 at 17:45
  • If you try to highlight "is my text" it won't work; basically, if the link is in the middle of the highlighted text, it won't work. Any idea why, and how to fix this? – omar Sep 03 '15 at 03:10
2

I would have left this as a comment, but Stack Overflow doesn't allow me to (or I can't find the button for it), so I am posting it as an answer. There is a problem with the script above. For example, highlighting fox in the following string:

<img src="brown fox.jpg" title="The brown fox" /><p>some text containing fox.</p>

Would break the img tag:

var term = "fox";
term = term.replace(/(\s+)/,"(<[^>]+>)*$1(<[^>]+>)*");
var pattern = new RegExp("("+term+")", "i");
var src_str = '<img src="brown fox.jpg" title="The brown fox" />'
    + '<p>some text containing fox.</p>';
src_str = src_str.replace(pattern, "<mark>$1</mark>");
src_str = src_str.replace(/(<mark>[^<>]*)((<[^>]+>)+)([^<>]*<\/mark>)/,"$1</mark>$2<mark>$4");
// The first "fox" sits inside the src attribute, so <mark> lands inside the <img> tag.
src_str

I was working on a highlight script and solved that problem, without realizing the script might also have to highlight across multiple tags. Here is how I stopped it from destroying tags by highlighting content inside them. This script also highlights multiple instances within the content, but not across multiple tags:

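var str = '<img src="brown fox.jpg" title="The brown fox" />'
    + '<p>some text containing fox. And another fox.</p>';
var word = "fox";
// Escape regex metacharacters in the search word and require word boundaries.
word = "(\\b"
    + word.replace(/([{}()[\]\\.?*+^$|=!:~-])/g, "\\$1")
    + "\\b)";
var r = new RegExp(word, "igm");
// Only process text that follows a ">", so attribute values inside tags are left alone.
var result = str.replace(/(>[^<]+)/igm, function(a){
    return a.replace(r, "<span class='hl'>$1</span>");
});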

I am not sure how to combine these scripts so that highlighting works across multiple tags, but I will keep an eye on this thread.
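One possible way to combine the two ideas, offered as a rough sketch and not a tested drop-in: split the markup into tags and text runs, search only the concatenated text, then wrap whichever part of each text run overlaps a match. The function name `highlightAcrossTags` is hypothetical, and the sketch assumes simple markup with no stray "<" or ">" inside attribute values, and no comments, scripts or styles that would need to be skipped.

```javascript
// Hypothetical helper, not part of the plugin or of the scripts above.
// Assumes simple markup: no "<" or ">" inside attribute values or text.
function highlightAcrossTags(html, term) {
    // Split into tag tokens and text tokens; tags are later emitted verbatim.
    var tokens = html.match(/<[^>]+>|[^<]+/g) || [];
    var plain = "";
    var segments = tokens.map(function (tok) {
        var isTag = tok.charAt(0) === "<";
        // "start" is the offset of this token within the tag-free text.
        var seg = { text: tok, isTag: isTag, start: plain.length };
        if (!isTag) { plain += tok; }
        return seg;
    });

    // Escape regex metacharacters, then let any whitespace run match \s+.
    var source = term
        .replace(/[.*+?^${}()|[\]\\]/g, "\\$&")
        .replace(/\s+/g, "\\s+");
    var re = new RegExp(source, "gi");

    // Collect [start, end) ranges of every match in the tag-free text.
    var ranges = [];
    var m;
    while ((m = re.exec(plain)) !== null) {
        if (m[0].length === 0) { re.lastIndex++; continue; } // guard against empty terms
        ranges.push([m.index, m.index + m[0].length]);
    }

    // Rebuild the markup, wrapping the overlapping part of each text run.
    return segments.map(function (seg) {
        if (seg.isTag) { return seg.text; }
        var out = "";
        var pos = 0;
        ranges.forEach(function (r) {
            var from = Math.max(r[0] - seg.start, pos);
            var to = Math.min(r[1] - seg.start, seg.text.length);
            if (from < to) {
                out += seg.text.slice(pos, from)
                    + "<mark>" + seg.text.slice(from, to) + "</mark>";
                pos = to;
            }
        });
        return out + seg.text.slice(pos);
    }).join("");
}
```

Calling `highlightAcrossTags('this is <a href="#">my</a> text that needs highlighting', "my text")` returns `this is <a href="#"><mark>my</mark></a><mark> text</mark> that needs highlighting`. Each text run gets its own `<mark>`, so the result stays properly nested, and the `img` attributes from the earlier example are left untouched because tags are never searched.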

HMR
1

I tried the accepted solution and it seems to break on complex pages such as this one (infinite recursion). Also, I don't really see the benefit of relying on jQuery, so I rewrote the code to be independent of jQuery (it's also substantially shorter and doesn't support the fancy options).

It's easy enough to add the options back in and use classes if you like.

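// Wraps every match of `re` in the text nodes under `node` in a highlighted <span>.
// `re` must not use the g flag: match() only sets match.index for non-global regexes.
function highlight(re, node, depth, maxdepth){
    node = node || document.body;
    depth = depth || 0;
    maxdepth = maxdepth || 10;

    if( node.nodeType === 3 ){ // text node
        var match = node.data.match(re);
        if(match){
            var span = document.createElement('span');
            span.style.backgroundColor = '#ffa';
            // Split the text node into [before][match][after] and wrap the middle part.
            var wordNode = node.splitText(match.index);
            wordNode.splitText(match[0].length);
            var wordClone = wordNode.cloneNode(true);
            span.appendChild(wordClone);
            wordNode.parentNode.replaceChild(span, wordNode);
            return 1; // tells the caller to skip over the span that was just inserted
        }
    } else if( depth === maxdepth ){
        console.log( 'Reached max recursion depth!', node );
    } else if( (node.nodeType === 1 && node.childNodes) && !/(script|style)/i.test(node.tagName) ) {
        // Element node: recurse into children, leaving <script> and <style> alone.
        for( var i = 0; i < node.childNodes.length; i++ ){
            i += highlight(re, node.childNodes[i], depth + 1, maxdepth);
        }
    }
    return 0;
}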
podperson