Suppose I have an HTML page that looks something like this:

<html><body>
00123
<input value="00123">
00456
</body></html>

And I want to use JavaScript/jQuery to make it look like this:

<html><body>
<a href="#00123">00123</a>
<input value="00123">
<a href="#00456">00456</a>
</body></html>

Essentially, I want to wrap every plain-text string in the document that matches a regular expression in an HTML anchor tag. In this example, I want to do something like:

$('body').html($('body').html().replace(/(00\d+)/, '<a href="#$1">$1</a>'));

See the jsFiddle with this example: http://jsfiddle.net/NATnr/2/

The problem with this solution is that the text inside the input element is matched and replaced.
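
To illustrate why this happens: `.html()` returns the serialized markup as a single string, so the regular expression cannot tell text from attribute values. Below is a minimal sketch that uses a plain string in place of `$('body').html()`. The `markup` variable is illustrative, and the `g` flag is added here (it is not in the snippet above) so that every occurrence shows up:

```javascript
// Stand-in for the string that $('body').html() would return (assumed content).
var markup = '00123\n<input value="00123">\n00456';

// Same replacement as in the question, plus the g flag to replace all matches.
var replaced = markup.replace(/(00\d+)/g, '<a href="#$1">$1</a>');

// The value attribute of the input now contains anchor markup as well.
console.log(replaced);
```

Because the replacement runs on the string, the `value="00123"` attribute is rewritten along with the two plain-text numbers.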

Does anyone know how to match and replace only the plain text in a document in this manner using JavaScript/jQuery?

Rusty Fausak

4 Answers

Try filtering the body's contents() by nodeType to get only the Text Nodes, then replace them with jQuery-generated anchor elements (any extra text in these nodes will be kept as Text Nodes):

$('body').contents().filter(function() {
    return this.nodeType === 3;
}).each(function() {
    $(this).replaceWith($(this).text().replace(/(00\d+)/g, '<a href="#$1">$1</a>'));
});

Fiddle

As you know, most often it's not a good idea to parse HTML with Regex (look out for the ponies, they are evil), but if you isolate a part of the HTML you want to parse and it follows a relatively simple pattern, it is a viable option.

edit: Included the g flag (global modifier) in your Regex to allow for matching multiple anchors inside a single Text Node.
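
A small sketch of what the g flag changes, using a plain string rather than a real Text Node (the `text` and `template` names are made up for illustration):

```javascript
// Illustrative text containing two matches, as a single Text Node might.
var text = '00123 and 00456';
var template = '<a href="#$1">$1</a>';

// Without g, String.prototype.replace stops after the first match.
var firstOnly = text.replace(/(00\d+)/, template);

// With g, every match in the string is replaced.
var allMatches = text.replace(/(00\d+)/g, template);

console.log(firstOnly);
console.log(allMatches);
```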

Fabrício Matté

The final solution ended up looking like this:

jQuery.fn.linker = function () {
    $(this).contents()
        .filter(function() { return this.nodeType != Node.TEXT_NODE; })
        .each(function () { $(this).linker(); });
    $(this).contents()
        .filter(function() { return this.nodeType == Node.TEXT_NODE; })
        .each(function () {
            $(this).replaceWith(
                $(this).text().replace(/(00\d+)/g, '<a href="#$1">$1</a>')
            );
        });
}
$(document).ready(function () {
    $('body').linker();
});

See the jsFiddle at work here: http://jsfiddle.net/fr4AL/4/
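
The recursion can be illustrated without a browser. This is only a sketch: the plain objects below are a hypothetical stand-in for DOM nodes (they are not a real DOM API), and `linkText` is an illustrative name. Unlike `replaceWith`, which parses the HTML string into real elements, the sketch just stores the resulting string:

```javascript
// Hypothetical node shape: nodeType 1 = element (attrs, children),
// nodeType 3 = text node (data). These mirror the DOM's numeric constants.
var ELEMENT_NODE = 1, TEXT_NODE = 3;

function linkText(node) {
    node.children.forEach(function (child) {
        if (child.nodeType === TEXT_NODE) {
            // Only text data is rewritten; attributes are never touched.
            child.data = child.data.replace(/(00\d+)/g, '<a href="#$1">$1</a>');
        } else if (child.nodeType === ELEMENT_NODE) {
            // Descend into nested elements, like the first half of linker().
            linkText(child);
        }
    });
}

var tree = {
    nodeType: ELEMENT_NODE, attrs: {}, children: [
        { nodeType: TEXT_NODE, data: '00123' },
        { nodeType: ELEMENT_NODE, attrs: { value: '00123' }, children: [] },
        { nodeType: ELEMENT_NODE, attrs: {}, children: [
            { nodeType: TEXT_NODE, data: 'nested 00456' }
        ] }
    ]
};

linkText(tree);
console.log(JSON.stringify(tree, null, 2));
```

The text at both nesting levels is rewritten, while the `value` attribute of the input-like element stays as it was.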

Thanks to:

Rusty Fausak
  • The `.isbnlink()` in your fiddle should be `.linker()` I guess (or you forgot to include that extension), and wrapping whole text Nodes may come with extra white space, so take your time to review those. +1 for the awesome plugin code though. You can use [`$.trim`](http://api.jquery.com/jQuery.trim/), e.g. `$.trim($(this).text())`, for the `href` value if you need to remove that extra white space. – Fabrício Matté Jul 26 '12 at 04:05
  • @FabrícioMatté Thanks for the tips. I actually was working on solving that whitespace issue. I think I have it figured out now. I edited my answer with what I ended up with. I had to construct a new DOM element in order to only alter the matched substring. – Rusty Fausak Jul 26 '12 at 04:23
  • Looks good. `=]` Just one thing though, duplicate IDs (multiple `span`s with `id="link"`) are invalid markup (means it won't validate on w3c validator and you may run in problems if you ever try a `#link` selector with jQuery or CSS), and also jQuery automatically does the parsing for you when you supply a HTML string to [`replaceWith`](http://api.jquery.com/replaceWith/). I removed the duplicated IDs and simplified the DOM parsing part: [fiddle](http://jsfiddle.net/ult_combo/fr4AL/2/). – Fabrício Matté Jul 26 '12 at 05:21
  • @FabrícioMatté Great work with replaceWith parsing the HTML properly! You can actually simplify it even further and remove the `<span>` element entirely. I have updated my answer. – Rusty Fausak Jul 26 '12 at 16:40
  • Yup, very nicely done. I assume the first `$(this).contents().filter` is to make the function recursive or something? Awesome work. – Fabrício Matté Jul 26 '12 at 23:20

This is from an answer by bobince to a related question:

You're right to not want to be processing HTML with regex. It's also bad news to be assigning huge chunks of .html(); apart from the performance drawbacks of serialising and reparsing a large amount of HTML, you'll also lose unserialisable data like event listeners, form data and JS properties/references.

Here's a plain JavaScript/DOM one that allows a RegExp pattern to match. jQuery doesn't really give you much help here since selectors can only select elements, and the ‘:contains’ selector is recursive so not too useful to us.

// Find text in descendants of an element, in reverse document order
// pattern must be a regexp with global flag
//
function findText(element, pattern, callback) {
    for (var childi= element.childNodes.length; childi-->0;) {
        var child= element.childNodes[childi];
        if (child.nodeType==1) {
            findText(child, pattern, callback);
        } else if (child.nodeType==3) {
            var matches= [];
            var match;
            while (match= pattern.exec(child.data))
                matches.push(match);
            for (var i= matches.length; i-->0;)
                callback.call(window, child, matches[i]);
        }
    }
}

findText(document.body, /\bBuyNow\b/g, function(node, match) {
    var span= document.createElement('span');
    span.className= 'highlight';
    node.splitText(match.index+6);
    span.appendChild(node.splitText(match.index+3));
    node.parentNode.insertBefore(span, node.nextSibling);
});
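
As a rough illustration of why the matches are collected first and then handled from last to first, here is a sketch on a plain string instead of a Text Node. The helper names `collectMatches` and `wrapMatches` are invented for this example, and the match length is computed from `match[0].length` rather than hard-coded:

```javascript
// Collect all matches up front, as findText does (pattern needs the g flag).
function collectMatches(text, pattern) {
    var matches = [], match;
    pattern.lastIndex = 0; // start from the beginning of the string
    while ((match = pattern.exec(text)) !== null) {
        matches.push(match);
    }
    return matches;
}

// Wrap each match in brackets, working from the last match to the first.
// Editing the tail first leaves the indices of earlier matches valid.
function wrapMatches(text, pattern) {
    var matches = collectMatches(text, pattern);
    for (var i = matches.length; i-- > 0;) {
        var start = matches[i].index;
        var end = start + matches[i][0].length;
        text = text.slice(0, start) + '[' + text.slice(start, end) + ']' + text.slice(end);
    }
    return text;
}

var result = wrapMatches('BuyNow or BuyNow later', /\bBuyNow\b/g);
console.log(result);
```

Going front to back would shift every later `match.index` by the length of the inserted text, which is presumably why the original walks both the child nodes and the matches in reverse.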
Moin Zaman

Give this a whirl.... Much cleaner!! ;)

$('input').each(function() {
    var temp = $(this).val();
    $(this).before('<a href="#' + temp + '">' + temp + '</a>');
});
$('body').contents().filter(function() {return this.nodeType == 3;}).remove();
thesanerone