I have some XML that is formatted like this:
<Paragraph Type="Character">
<Text>
TED
</Text>
</Paragraph>
<Paragraph Type="Dialogue">
<Text>
I thought we had a rule against that.
</Text>
</Paragraph>
<Paragraph Type="Character">
<Text>
ANNIE
</Text>
</Paragraph>
<Paragraph Type="Dialogue">
<Text>
...oh.
</Text>
</Paragraph>
I'm trying to extract the data so that it looks like this:
Character    Dialogue
TED          I thought we had a rule against that.
ANNIE        ...oh.
I've been trying with:
soup.find(Type = "Character").get_text()
soup.find(Type = "Dialogue").get_text()
which each return only the first match. When I try to get all of the matches with soup.find_all, i.e.:
soup.find_all(Type = "Character").get_text()
I get the error:
AttributeError: ResultSet object has no attribute 'get_text'. You're probably treating a list of items like a single item. Did you call find_all() when you meant to call find()?
I understand that find_all() returns a list of elements (thanks to this previous answer: https://stackoverflow.com/a/21997788/8742237), and that I should select a single element from that list, but I would like to get all of the elements in the list into the format I showed above.
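To make the goal concrete, here is a rough sketch of the kind of result I am after. It is not a confirmed solution. It assumes the sample above is wrapped in a single root element (the <Script> wrapper is made up for illustration), that the "xml" parser is used (which needs lxml installed), and that every Character paragraph is followed by exactly one Dialogue paragraph, so the two lists can be paired by position.

```python
from bs4 import BeautifulSoup

# Sample data from the question, wrapped in a hypothetical <Script> root so
# that it forms one well-formed XML document (the wrapper is an assumption).
xml = """<Script>
<Paragraph Type="Character">
<Text>
TED
</Text>
</Paragraph>
<Paragraph Type="Dialogue">
<Text>
I thought we had a rule against that.
</Text>
</Paragraph>
<Paragraph Type="Character">
<Text>
ANNIE
</Text>
</Paragraph>
<Paragraph Type="Dialogue">
<Text>
...oh.
</Text>
</Paragraph>
</Script>"""

# The "xml" parser keeps the capitalised tag and attribute names; it needs lxml.
soup = BeautifulSoup(xml, "xml")

# find_all() returns a ResultSet (a list of tags), so get_text() has to be
# called on each tag in turn rather than on the list itself.
characters = [p.get_text(strip=True)
              for p in soup.find_all("Paragraph", Type="Character")]
dialogues = [p.get_text(strip=True)
             for p in soup.find_all("Paragraph", Type="Dialogue")]

# Pair by position; assumes a strict Character/Dialogue alternation.
rows = list(zip(characters, dialogues))

print("Character\tDialogue")
for character, dialogue in rows:
    print(f"{character}\t{dialogue}")
```

If the alternation assumption does not hold (for example, a character with two dialogue paragraphs in a row), pairing by position would misalign the rows and the paragraphs would need to be walked in document order instead.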