I have an XML file whose structure is similar to the following:
<?xml version="1.0" encoding="UTF-8"?>
<drugbank xmlns="http://www.drugbank.ca" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.drugbank.ca http://www.drugbank.ca/docs/drugbank.xsd" version="5.0" exported-on="2017-12-20">
  <drug type="biotech" created="2005-06-13" updated="2017-11-06">
    <drugbank-id primary="true">DB00001</drugbank-id>
    <drugbank-id>BTD00024</drugbank-id>
    <drugbank-id>BIOD00024</drugbank-id>
    <cas-number>138068-37-8</cas-number>
    <name>Lepirudin</name>
  </drug>
  <drug type="biotech" created="2005-06-13" updated="2017-11-06">
    <drugbank-id primary="true">DB00045</drugbank-id>
    <drugbank-id>BTD00054</drugbank-id>
    <drugbank-id>BIOD00054</drugbank-id>
    <cas-number>205923-56-4</cas-number>
    <name>Lyme disease vaccine (recombinant OspA)</name>
  </drug>
</drugbank>
I am trying to use the cElementTree module in Python 3. I would like to extract the name of each drug in this XML, and I have written the following code to do so:
import xml.etree.cElementTree as ET
tree = ET.parse('fulldatabase.xml')
drugbank = tree.getroot()
print(drugbank.tag)
for drug in drugbank:
    print(drug.find('name').text)
This fails with the following error: AttributeError: 'NoneType' object has no attribute 'text'
I have also tried checking this, but the answer the OP wrote there did not work for me. Is there any way to get the name and cas-number fields out of each drug? I have tried a few variations, such as removing findall() from the for loop condition, but none of them worked.
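For reference, here is a minimal sketch of one approach that may work, assuming the lookup fails because the root element declares the default namespace http://www.drugbank.ca, so every child tag is actually stored as {http://www.drugbank.ca}name rather than plain name. The `db` prefix, the `SAMPLE` string, and the `extract_drugs` helper are illustrative names, not part of the DrugBank file or of any library; the sketch also uses `xml.etree.ElementTree` and an inline string instead of `fulldatabase.xml` so that it runs on its own.

```python
import xml.etree.ElementTree as ET

# Trimmed-down copy of the sample above (illustrative only).
SAMPLE = """<drugbank xmlns="http://www.drugbank.ca" version="5.0">
  <drug type="biotech">
    <drugbank-id primary="true">DB00001</drugbank-id>
    <cas-number>138068-37-8</cas-number>
    <name>Lepirudin</name>
  </drug>
  <drug type="biotech">
    <drugbank-id primary="true">DB00045</drugbank-id>
    <cas-number>205923-56-4</cas-number>
    <name>Lyme disease vaccine (recombinant OspA)</name>
  </drug>
</drugbank>"""

# Map an arbitrary prefix ("db" is a made-up choice) to the namespace URI
# declared on the root element.
NS = {"db": "http://www.drugbank.ca"}


def extract_drugs(root):
    """Return a list of (name, cas_number) tuples, one per <drug> element."""
    results = []
    for drug in root.findall("db:drug", NS):
        # findtext returns None instead of raising if the child is missing.
        name = drug.findtext("db:name", default=None, namespaces=NS)
        cas = drug.findtext("db:cas-number", default=None, namespaces=NS)
        results.append((name, cas))
    return results


root = ET.fromstring(SAMPLE)
drugs = extract_drugs(root)
for name, cas in drugs:
    print(name, cas)
```

With the real file, `ET.parse('fulldatabase.xml').getroot()` would take the place of `ET.fromstring(SAMPLE)`. Printing `root.tag` is a quick way to check the assumption: if it shows the URI in curly braces before `drugbank`, the namespace is the likely cause.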