
I'm using PyQuery to process this HTML:

<div class="container">
    <strong>Personality: Strengths</strong>
    <br />
    Text
    <br />
    <br />
    <strong>Personality: Weaknesses</strong>
    <br />
    Text
    <br />
    <br />
</div>

Now that I have a variable e pointing to .container, I'm looping through its children:

for c in e.iterchildren():
    print c.tag

But this way I don't get the text nodes (the two Text strings).

How can I loop over an element's children, including the text nodes?

wong2

1 Answer


You can do it like this:

        for c in e.children():
            p = PyQuery(c)   # wrap the child element
            print(str(p))    # serialized markup of this child
            # strip the HTML tags here, e.g. with re.sub

This prints the raw markup of each child. If you want to distinguish the children that carry text from the others:

            raw = str(p).strip()
            a = raw.rfind(">")
            if a + 1 != len(raw):  # something follows the last ">"
                print('is text')
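
As an alternative to string matching, here is a minimal sketch that reads the text nodes directly. It relies on the ElementTree data model, where text before the first child is stored in the parent's .text and text after a child is stored in that child's .tail. lxml elements expose the same two attributes. To stay self-contained, the sketch parses the markup from the question with the standard library instead of PyQuery, which works here only because that markup is well-formed XML. The helper name iter_nodes is made up for this illustration.

```python
import xml.etree.ElementTree as ET

# The markup from the question; it happens to be well-formed XML.
HTML = """<div class="container">
    <strong>Personality: Strengths</strong>
    <br />
    Text
    <br />
    <br />
    <strong>Personality: Weaknesses</strong>
    <br />
    Text
    <br />
    <br />
</div>"""


def iter_nodes(element):
    """Yield child elements and non-blank text nodes in document order.

    Hypothetical helper for this example, not part of PyQuery or lxml.
    """
    # Text before the first child is stored on the parent's .text
    if element.text and element.text.strip():
        yield element.text.strip()
    for child in element:
        yield child
        # Text following a child is stored on that child's .tail
        if child.tail and child.tail.strip():
            yield child.tail.strip()


container = ET.fromstring(HTML)
# Show elements by tag name and text nodes by their content
nodes = [n if isinstance(n, str) else n.tag for n in iter_nodes(container)]
print(nodes)
# ['strong', 'br', 'Text', 'br', 'br', 'strong', 'br', 'Text', 'br', 'br']
```

With an lxml element, the same generator should work unchanged, since only .text, .tail and iteration over children are used.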
eminia