I am parsing a website and I get an element like this:

<td>
                                <span class="label">Hometown/High School:</span>
"                                                                                                  
                                                                                                            Austin, TX
                                                        /
                                                                                        Westwood
                                                                                                    "</td>

The problem is that when I manipulate the text nodes, I get a node like this: "

                                                                                                            Austin, TX
                                                        /
                                                                                        Westwood

"

Its parent comes back as null. I want to split this text on '/' and wrap each part in a tag, like <sometag>Austin, TX</sometag> <sometag>Westwood</sometag>

But I am not able to do it because the text node's parent is null, so I cannot compute its XPath.

EDIT: the code I'm using to split and replace the text node:

    let parent = textnodeStr.parentElement; // textnodeStr is the text node
    if (parent != null) {
        parent.innerHTML = '';
        let elements = [];
        // arr holds the substrings I get after splitting the above text node on '/', i.e. ['Austin, Tx', 'Westwood']
        for (let j = 0; j < arr.length; j++) {
            elements[j] = document.createElement("rtechContainer");
            let newText = document.createTextNode(arr[j]); // declared locally instead of leaking a global
            elements[j].appendChild(newText);
            parent.appendChild(elements[j]);
        }
    }

ADDITIONAL INFO: I am using createTreeWalker to access the text nodes. Here is an outline of what I am doing.

  1. Use createTreeWalker to access the text nodes.
  2. Based on some condition, store selected text nodes in an array (say, selectedTextNodes).
  3. When the TreeWalker finishes, call another function that accesses that array (selectedTextNodes).
  4. Inside that function, iterate over the array and try to access the parentNode of each item. Here is what happens.

For text node text1 in

<td><span> "text1" </span></td>

I get the parent node in my function.

For text node text2 in

<td>" text2 "</td>

I get parent null in my function.

However, I get the correct parents when I access parentNode of these two text nodes inside the createTreeWalker loop itself.

Tripti Rawat

1 Answer

To do that you do not need the parent element. You just need the textContent of your "rtechcontainer" elements. Also, parentElement is not supported in every browser; parentNode should be.

Explanations are in comments:

<html>
    <head>
        <script>
            //Just an event bound to load, for testing
            window.onload = function(){
                //Grabbing all elements with the tag name 'rtechcontainer'
                for(var tL=document.querySelectorAll('rtechcontainer'), i=0, j=tL.length; i<j; i++){
                    var tText = tL[i].textContent; //Holds the textcontent of the element
                    console.log('textcontent', tText);

                    //What we want: <sometag>Austin,Tx</sometag> <sometag>Westwood</sometag>
                    //First we clear the element
                    tL[i].innerHTML = '';

                    //Second we split the textcontent and loop through it
                    for(var tS=tText.split('/'), m=0, n=tS.length; m<n; m++){
                        var tAnyElement = tL[i].appendChild(document.createElement('sometag'));
                        tAnyElement.textContent = tS[m].trim(); //Assigning the trimmed part of the textcontent
                    }
                }
            }
        </script>
    </head>

    <body>
        <div>
            <span class="label">Hometown/High School:</span>
            <rtechcontainer>Austin, TX / Westwood</rtechcontainer>
        </div>
    </body>
</html>

Update:

For the problem with the parentNode mentioned in the comment, I need to know how you fetch the node itself. One possible way would be like this:

    //Just an event bound to load, for testing
    window.onload = function(){
        var tNode, //Is going to be the current node
            tWalker = document.createTreeWalker(document.body, NodeFilter.SHOW_TEXT, null, false); //All textnodes within the body

        //Check all textnodes in the list
        while(tNode = tWalker.nextNode()){
            console.log('textnode', tNode);
            console.log('content of textnode', tNode.textContent);
            console.log('parent of textnode', tNode.parentNode);

            //We only need the ones containing slashes
            if(tNode.textContent && tNode.textContent.indexOf('/') !== -1){
                console.log('this one we need', tNode);
                //create elements, split, just like above
            }
        }
    };
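
The "create elements, split" step could be sketched roughly as below. This is only a sketch under assumptions: `splitOnSlash` and `wrapParts` are hypothetical helper names, and `'sometag'` stands in for whatever tag you actually want. Unlike clearing the parent with innerHTML, it swaps out just the one text node, so siblings such as the label span are left alone.

```javascript
// Split text such as "Austin, TX / Westwood" into trimmed, non-empty parts.
function splitOnSlash(text) {
    return text.split('/').map(function (part) {
        return part.trim();
    }).filter(function (part) {
        return part.length > 0;
    });
}

// Replace a single text node with one element per part.
// tagName is an assumption ('sometag' in the question).
function wrapParts(textNode, tagName) {
    var parent = textNode.parentNode;
    if (!parent) { return false; } // node is detached, nothing to replace

    var doc = textNode.ownerDocument;
    var fragment = doc.createDocumentFragment();
    var parts = splitOnSlash(textNode.textContent);

    for (var i = 0; i < parts.length; i++) {
        var element = doc.createElement(tagName);
        element.textContent = parts[i];
        fragment.appendChild(element);
    }

    // Only the text node is swapped out; its siblings stay in place
    parent.replaceChild(fragment, textNode);
    return true;
}
```

It would be called as `wrapParts(tNode, 'sometag')`. It is probably safest to collect the matching nodes first and call it once the walk has finished, since replacing the walker's current node mid-traversal can cut the walk short.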

Update 2:

I adjusted my example according to your edit and it still works fine. Maybe the text nodes in your array get tampered with between the time you push them and the time you access them?

<html>
    <head>
        <script>
            var selectedTextNodes = []; //The textnodes from treewalker get stored here

            //Just an event bound to load, for testing
            window.onload = function(){
                var tNode, //Is going to be the current node
                    tWalker = document.createTreeWalker(document.body, NodeFilter.SHOW_TEXT, null, false); //All textnodes within the body

                //Check all textnodes in the list
                while(tNode = tWalker.nextNode()){
                    //Adding the textnode to the list depending on some condition
                    (tNode.textContent && tNode.textContent.trim()) && selectedTextNodes.push(tNode)
                };

                useTextNodes(selectedTextNodes)
            };

            //Function to use the text nodes in any way
            function useTextNodes(listOfTextNodes){
                if(listOfTextNodes && listOfTextNodes.length){
                    for(var i=0, j=listOfTextNodes.length; i<j; i++){
                        console.log(i, listOfTextNodes[i].textContent, listOfTextNodes[i].parentNode)
                    }
                }
            }
        </script>
    </head>

    <body>
        <div>
            <span class="label">Hometown/High School:</span>
            Austin, TX / Westwood
        </div>
    </body>
</html>
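
One way the stored nodes could get tampered with, offered as a guess rather than a diagnosis: the code in the question runs `parent.innerHTML = ''`, which removes every child of that parent. Any text node in the array that was a direct child of the cleared element then reports a null parentNode, while a text node nested inside a removed span still has the span as its parent. The sketch below shows this. `stillAttached` and `showDetach` are hypothetical names, and the demonstration part only runs in a browser.

```javascript
// Keep only the nodes that still have a parent.
function stillAttached(nodes) {
    return nodes.filter(function (node) {
        return node.parentNode !== null;
    });
}

// Browser-only demonstration; doc is expected to be the page's document.
function showDetach(doc) {
    var td = doc.createElement('td');
    td.innerHTML = '<span class="label">Hometown/High School:</span> Austin, TX / Westwood';

    // Collect the text nodes first, as in the example above
    var walker = doc.createTreeWalker(td, NodeFilter.SHOW_TEXT);
    var nodes = [], node;
    while ((node = walker.nextNode())) { nodes.push(node); }

    td.innerHTML = ''; // removes the <span> and the bare text node from td

    // nodes[0] sits inside the (now removed) <span>, so it keeps that parent;
    // nodes[1] was a direct child of td, so its parentNode is now null
    return nodes.map(function (n) { return n.parentNode; });
}

if (typeof document !== 'undefined') {
    console.log(showDetach(document));
}
```

If this is what happens on your side, filtering with `stillAttached(selectedTextNodes)` before each pass, or replacing only the text node instead of clearing its parent, should avoid the null parent.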
Lain
  • Hi Lain, thanks for such a thoroughly explained answer. However, `rtechContainer` is not already present in the HTML (see my example code; I am creating it using document.createElement('rtechContainer')). I did mention this in the last code block in the question, but my bad, I had put up the wrong HTML in the first snippet. I have updated it. The first HTML snippet in the question is now how I receive the HTML. I receive the text node as " Austin, Tx/Westwood". Its parentElement is null, and its parentNode is also null (though it is inside ) – Tripti Rawat Apr 24 '18 at 10:07
  • How do you receive or fetch that textnode? I do not mean the content string but the node itself. – Lain Apr 24 '18 at 11:38
  • Made a small update and I can access the parentNode without any problem. At least from the example provided. – Lain Apr 24 '18 at 11:59
  • Hi @Lain, pls refer to the `Additional Info` update in my question where I have explained what's going on on my side. I hope the explanation is clear. – Tripti Rawat Apr 25 '18 at 10:04
  • Are you sure that textnode2 is really "text2". Did you ever log the textContent of it?. TreeWalker also includes other textnodes, like – Lain Apr 25 '18 at 11:05
  • Made another example with your mentioned structure. Yet it still works correctly here. – Lain Apr 25 '18 at 11:18
  • omg you are correct! I was running two for loops to manage and split different types of text nodes (held in two arrays) that I collected in the first treeWalker traversal. I have since separated the two into different functions, each with its own traversal before it, and it works perfectly. Thanks for that. I still don't know why it was only affecting the text nodes of the second type and not the first. But thanks anyway. You can add this line to your answer if you want, and I'll accept it. – Tripti Rawat Apr 25 '18 at 11:33