I use this as one of the many parts of my jQuery language translation script.

This part grabs the text of a node as I loop through all the nodes on a web page.

However, it also grabs a lot of the hidden JavaScript as text nodes.

So is there a way to modify this so that it only gets the HTML side? And can it also trim unneeded whitespace?

Here is the original code.

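// Getter/setter for a node's text: pass txt to write, omit it to read.
var content = function (node, txt) {
    if (txt) {
        // set the text on whichever property the node supports
        if (node.textContent) {
            node.textContent = txt;
        } else if (node.nodeValue) {
            node.nodeValue = txt;
        }
    } else {
        // read the text, falling back to nodeValue
        return node.textContent ? node.textContent : node.nodeValue;
    }
};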

The following shows the context in which this code is used.

// recursive tree walker
(function walk(parent) {
    var childs = parent.childNodes;
    // if childs object has data
    if (childs && childs.length) {
        var i = childs.length;
        while (i--) {
            // assign node variable to the current child (declared locally, not as a global)
            var node = childs[i];
            // text node found, do the replacement
            if (node.nodeType === 3) {
                // assign the current value to a variable
                var value = content(node);
            } else {
                // recurse by name instead of the deprecated arguments.callee
                walk(node);
            }
        }
    }
})(document.body);

All of this is how my language translation code works. I just want to tweak the input so that it grabs the text but not the JavaScript code that is in the source of the page.

crosenblum
  • Could you post some of the HTML you're working with as well? – jpea Mar 01 '11 at 22:26
  • Are you using [`.text()`](http://api.jquery.com/text) or [`.html()`](http://api.jquery.com/html)? You should be using the former; I would expect that to omit ` – Matt Ball Mar 02 '11 at 04:45
  • I am looping through each text node on a page and using the code above to get the value of that text node. But when I look at the results, it grabs any JavaScript code on the page as well, and I want to skip that part. – crosenblum Mar 02 '11 at 14:47

1 Answer

I'm not quite sure where the function you've posted is being called from (how you're using it). Check out this question though, which does something like what you want. The key is:

nodeType == 3

That's how you check whether a DOM node is a text node. Beyond that, you may have to handle script tags specially, but you could use `:not(script)` in your jQuery selector to get rid of them.
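
If changing the selector is not an option, one way to handle script tags specially is inside a tree walker like the one in the question. The following is only a rough sketch under some assumptions: `collectText` and `SKIPPED_TAGS` are made-up names, skipping `style` as well as `script` is my own addition, and whitespace is collapsed and trimmed with a simple regex that may be more aggressive than you want.

```javascript
// Illustrative sketch: collectText and SKIPPED_TAGS are hypothetical names,
// not part of the code in the question.
var TEXT_NODE = 3;
var ELEMENT_NODE = 1;

// Elements whose text is code or styling rather than readable page text.
// Including STYLE here is an assumption; the question only mentions scripts.
var SKIPPED_TAGS = { SCRIPT: true, STYLE: true };

function collectText(parent, out) {
    out = out || [];
    var children = parent.childNodes || [];
    for (var i = 0; i < children.length; i++) {
        var node = children[i];
        if (node.nodeType === TEXT_NODE) {
            // collapse runs of whitespace into one space and trim the ends
            var text = (node.nodeValue || "").replace(/\s+/g, " ").trim();
            // keep only text nodes that still have content after trimming
            if (text) {
                out.push(text);
            }
        } else if (node.nodeType === ELEMENT_NODE &&
                   !SKIPPED_TAGS[String(node.nodeName).toUpperCase()]) {
            // recurse into ordinary elements, but never into script/style
            collectText(node, out);
        }
    }
    return out;
}
```

In a browser you would call it as `collectText(document.body)` and get back an array of trimmed strings. Because the check is on the element before recursing, the text nodes inside a script element are never visited at all, so nothing has to be stripped out of the results afterwards.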

Melv
  • `+1` for the `:not(script)` suggestion, but `-0.5` for a lazy answer. – Matt Ball Mar 02 '11 at 04:46
  • I'm sorry you found it lazy, I found it hard to be more complete/specific without some more HTML and script being posted by the original user. It certainly wasn't my intention. – Melv Mar 02 '11 at 05:14
  • How do I apply this? I can't change my selector. I can either change how I get the text, or try to strip out the JavaScript from the results. – crosenblum Mar 02 '11 at 16:21
  • You can test jQuery nodes after the fact. If you want to determine whether a node is not a script node, you can use `if (!node.is("script"))`. Alternatively you can use filter, which would be better performance-wise since you can apply it to your entire node set. To do this, `nodes = nodes.filter(":not(script)");`. I'd really recommend against trying to regex-parse out the JavaScript or something like that. It will be more difficult than you think and probably very error-prone. – Melv Mar 02 '11 at 21:34