I'm playing with the following incoming HTML structure that I don't control:

<div id="someRandom1">
    <div id="someRandom2">
        <div id="someRandom3">
            <div id="someRandom4">
                 ...
                     <p>Actual content</p>
                     <ul>
                         <li>This is a thing I need too</li>
                     </ul>
                     <a href="">And this</a>
                     <p>Some more content</p>
                 ...
            </div>
        </div>
    </div>
    <p id="additionalGarbage">Don't need this</p>
</div>

What I'm trying to accomplish is to end up with the following:

<p>Actual content</p>
<ul>
    <li>This is a thing I need too</li>
</ul>
<a href="">And this</a>
<p>Some more content</p>

I don't know how many divs there will be, but I do know that each div has only one child div and that the contents of the last (deepest) div are what I need. The logic should probably be: check for a child div; if there is one, descend into it and check again; otherwise return the current div's content. Every loop I've written so far crashes Chrome, so I'm obviously writing it wrong. Please advise.

EDIT: After all the comments, I'll try to make this more concise in some bullets.

  • There's an unknown number of nested divs. (I don't have any control of this).
  • The child div may or may not be the first element inside the parent div.
  • The HTML structure in the deepest div needs to be kept intact.

Bonus: minimal lines of code.

Donnie D'Amato
  • Why are there so many wrappers? – Naftali Apr 14 '16 at 14:57
  • @Neal does it matter? He said he doesn't have control... – brso05 Apr 14 '16 at 14:57
  • If the IDs are random, how are you targeting the content in the first place? And are you saying you need to actually "unwrap" the inner content, meaning leave it in the DOM, but remove its ancestors? –  Apr 14 '16 at 15:02
  • And you want all the text content of the last `div` or just what's inside the `p`? – Mike B Apr 14 '16 at 15:02
  • If there can be other paragraph tags, we need more information. How do you decipher the one you want from another? – Kevin B Apr 14 '16 at 15:03
  • Possible duplicate of [Using jQuery is there a way to find the farthest (deepest, or most nested) child element?](http://stackoverflow.com/questions/19259029/using-jquery-is-there-a-way-to-find-the-farthest-deepest-or-most-nested-child) – brso05 Apr 14 '16 at 15:03
  • The content comes in from an API, that's how I get it. – Donnie D'Amato Apr 14 '16 at 15:03
  • http://stackoverflow.com/questions/3787924/select-deepest-child-in-jquery – brso05 Apr 14 '16 at 15:03
  • An API is sending you HTML like that @fauxserious? I am really doubting your name. – Naftali Apr 14 '16 at 15:04
  • @fauxserious: So you're saying that you're getting some HTML to be placed in the DOM, and that's what you need unwrapped? And are you sure about the tag names being divs and the content not being divs? Some of these details are fuzzy. –  Apr 14 '16 at 15:05
  • I would contact the API devs and tell them to make a better API that sends you actual data instead of nonsense HTML – ndugger Apr 14 '16 at 15:06
  • Are all the "garbage" elements empty? – Kevin B Apr 14 '16 at 15:09
  • No there could be things inside of them. – Donnie D'Amato Apr 14 '16 at 15:10
  • Then make your example match the input you're working with, otherwise you'll get nothing but solutions that don't actually work for your real input. – Kevin B Apr 14 '16 at 15:10
  • I don't understand why that matters, it's not a div. – Donnie D'Amato Apr 14 '16 at 15:10
  • Right, but if you have three paragraph tags, and you want a specific one, we need to know what makes the specific one "special". – Kevin B Apr 14 '16 at 15:11
  • No I don't want a specific paragraph tag, I want ANYTHING in the last div. – Donnie D'Amato Apr 14 '16 at 15:12
  • Does the last div have any siblings? Do any of the divs have siblings? Until you post the real data the questions won't stop. – Kevin B Apr 14 '16 at 15:12
  • `Every loop I've written so far crashes` I don't see these... But you can approach by using CSS queries in `document.querySelectorAll(' ur_css_here ')`. – KarelG Apr 14 '16 at 15:13
  • @fauxserious can you show what javascript code you tried? Also can you show some real examples of this html API response? – Naftali Apr 14 '16 at 15:14
  • Also @fauxserious even if someone came up with a solution, the last elements would be the list items in the unordered list. I am very confused.. – Naftali Apr 14 '16 at 15:17
  • Of course the last div has siblings, but they aren't divs (then it wouldn't be the last div). – Donnie D'Amato Apr 14 '16 at 15:33

2 Answers

Assuming...

  • you have the top level (since you said you're getting it from an API)
  • you only need to remove the outermost divs (by tag name)
  • the divs targeted for removal will be the first div among its siblings (though there may be other elements with different names around it)

...you can do this:

// Assumes you have a handle on the root level
var node = $("#someRandom1");
var div = node.children("div");

// Keep descending while the current node has a direct child div
while (div.length) {
  node = div.first();
  div = node.children("div");
}

// now node.children() will be the content

alert(node.children().map(function(i, n) { return n.nodeName; }).toArray());
<script src="https://ajax.googleapis.com/ajax/libs/jquery/1.11.1/jquery.min.js"></script>
<div id="someRandom1">
    <p class="garbage"></p>
    <div id="someRandom2">
        <p class="garbage"></p>
        <div id="someRandom3">
            <p class="garbage"></p>
            <div id="someRandom4">
                 ...
                     <p>Actual content</p>
                     <ul></ul>
                     <a href=""></a>
                     <p></p>
                 ...
            </div>
        </div>
        <p class="garbage"></p>
    </div>
    <p id="additionalGarbage"></p>
</div>

This simply starts with the outermost div and, as long as the current node has at least one child div, descends into the first one. You end up with node being the innermost nested div, and node.children() holds its content nodes.
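
For anyone who wants the same descent without jQuery, here is a minimal sketch. It assumes each level has at most one direct child div (as stated in the question) and that the div may sit anywhere among its siblings. The function name findInnermostDiv is hypothetical, and the code relies only on the standard children and tagName properties of elements.

```javascript
// Hypothetical helper: follows direct child divs downward until none is left.
// Assumption: each level contains at most one direct child div.
function findInnermostDiv(root) {
  var node = root;
  for (;;) {
    // Pick the first direct child that is a div, wherever it sits among its siblings.
    // In an HTML document, tagName is reported in upper case.
    var childDiv = Array.prototype.filter.call(node.children, function (el) {
      return el.tagName === "DIV";
    })[0];
    if (!childDiv) {
      return node; // no further child div: this is the innermost one
    }
    node = childDiv;
  }
}
```

In a browser, something like findInnermostDiv(document.getElementById("someRandom1")).innerHTML should then give the markup of the deepest div with its structure intact.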

  • I'm not confident that `.firstElementChild` is going to work here, since I'm not sure if the `div` will always be the first element inside the parent `div`. I'd need something to identify the `div`, not assume it's the first element. Unless I'm reading the spec for `.firstElementChild` wrong? – Donnie D'Amato Apr 14 '16 at 15:36
  • @fauxserious: Well, that's the problem with your question. As we look at the question, we can find many different possible interpretations if the real DOM structure isn't exactly like the one you gave. So you get answers based on what was given, but then you add to the criteria. –  Apr 14 '16 at 15:39
  • ...so what are the real criteria? Does it need to be the first `div` found in each? –  Apr 14 '16 at 15:39
  • I didn't realize there would be so many interpretations of this. The only div found in each. – Donnie D'Amato Apr 14 '16 at 15:41
  • @fauxserious: Then at each level, instead of using `.firstElementChild`, you'll need to traverse the `.children` to get to the first div. Actually I think this could be done with `querySelector` in a loop. –  Apr 14 '16 at 15:42
  • Perfect, exactly the solution I was looking for. Thank you. – Donnie D'Amato Apr 14 '16 at 15:55
  • @fauxserious: You're welcome. –  Apr 14 '16 at 15:57
  • @fauxserious: FYI, you may be able to replace the loop with `var node = $("#someRandom1").find("div:not(:has(> div))").first()`. The only thing is that it assumes there's at least one child div in the outermost div. If that's not the case, then you can just use the `.children()` of the outermost div. –  Apr 14 '16 at 16:09
  • Yo, that line is just as good! Great work! – Donnie D'Amato Apr 14 '16 at 16:23

This shall do the trick:

document.getElementById( 'someRandom1' ).querySelector( ':not(div)' ).parentNode.innerHTML
jAndy
  • Wouldn't this potentially grab the garbage element not in the final div? – Donnie D'Amato Apr 14 '16 at 15:32
  • No, `querySelector` works inside-out. – jAndy Apr 14 '16 at 15:32
  • @fauxserious: It retrieves elements in document order, which is a depth-first traversal, so as long as the `<p>` comes after its `div` sibling, jAndy's solution should work. Just note that his and my solutions rely on the `div` elements always being the first element child of their parent. –  Apr 14 '16 at 15:37