I am writing a Python script to extract data from the XML I get back from an API request, which is sent as a POST using the requests library.
Currently I send the request and print the response like so:
req = requests.post(url + '/endpoint', headers = headers, params = {'search': searchQuery}, verify = False)
print(req.text)
This leaves req.text holding the XML response, which is structured like so:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xml" href="/static/atom.xsl"?>
<feed>
  <!-- Feed elements -->
  <entry>
    <!-- Other Elements -->
    <content type="text/xml">
      <s:dict>
        <!-- Other keys. -->
        <s:key name="sid">DATA I WANT HERE</s:key>
        <!-- Other keys. -->
      </s:dict>
      <!-- Lots of other dicts here. -->
    </content>
  </entry>
  <!-- Other entries -->
</feed>
My goal is to print the text of every s:key element whose name attribute is sid. There are hundreds of entries per feed, and each entry contains exactly one s:key with a name of sid (it's a service identifier I need to obtain).
My issue is that I'm not sure how to extract it. Right now I'm trying to use ElementTree (imported as ET) like so, but it is not returning the results I want:
print(req.text)
results = ET.fromstring(req)
for job in results.findall('s:key'):
    print(job.get('name'))
I also tried:
for node in results.findall('s:key'):
    if node.attrib['name'] == "sid":
        print(node)
which also does not give me the info I want.
What am I doing wrong and how do I fix it? I'm somewhat unfamiliar with Python and very new to XML parsing so I would appreciate some insights into this problem.
Addendum:
To add, currently it seems to just print every s:key line that has a name attribute, which I do not want.
For example, a sample of the current output is:
<s:key name="a">74993868</s:key>
<s:key name="b">0</s:key>
<s:key name="c">date</s:key>
<s:key name="d">6000</s:key>
<s:key name="e">600</s:key>
<s:key name="f">text</s:key>
<s:key name="sid">data I actually want</s:key>
<!-- Etc -->
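One possible approach, as a minimal sketch rather than a definitive fix. It rests on a few assumptions. The real feed must declare the s: prefix somewhere (typically on the feed element), and the URI used below is only a placeholder because the sample above does not show it. The helper name extract_sids and the SAMPLE_XML string are hypothetical stand-ins for the real code and for req.text. Two things in the attempts above are likely to matter: ET.fromstring expects the XML text (req.text or req.content) rather than the response object, and findall('s:key') looks only at direct children of the root and needs a prefix-to-URI mapping to resolve s:. The sketch sidesteps the unknown URI by walking every element and comparing only the local part of the tag.

```python
import xml.etree.ElementTree as ET

# Stand-in for req.text. The xmlns:s URI is a placeholder (assumption);
# the real feed declares its own URI for the s: prefix.
SAMPLE_XML = """<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns:s="http://example.com/ns/s">
  <entry>
    <content type="text/xml">
      <s:dict>
        <s:key name="a">74993868</s:key>
        <s:key name="sid">first-sid</s:key>
      </s:dict>
    </content>
  </entry>
  <entry>
    <content type="text/xml">
      <s:dict>
        <s:key name="b">0</s:key>
        <s:key name="sid">second-sid</s:key>
      </s:dict>
    </content>
  </entry>
</feed>"""


def extract_sids(xml_text):
    """Return the text of every key element whose name attribute is 'sid'."""
    root = ET.fromstring(xml_text)
    sids = []
    # iter() walks the whole tree, not just the root's direct children.
    for elem in root.iter():
        # ElementTree stores namespaced tags as "{uri}local";
        # keep only the local part so the namespace URI need not be known.
        local_name = elem.tag.rsplit("}", 1)[-1]
        if local_name == "key" and elem.get("name") == "sid":
            sids.append(elem.text)
    return sids


if __name__ == "__main__":
    for sid in extract_sids(SAMPLE_XML):
        print(sid)
```

With the real response, the call would be extract_sids(req.text). If the namespace URI is known, passing a prefix map to findall (for example results.findall(".//s:key[@name='sid']", {"s": uri})) is an alternative that avoids the manual tag comparison.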