
How do you get the "sources" values of each "lab" element (identified by its "symbol") in the "group" whose "name" is 457?

<?xml version="1.0" encoding="UTF-8"?>
<tabgroup1 name1="CRDT" revision="19531">
  <client name="123" group="457" />

  <group name="457">
    <lab symbol="xyz/ght" sources="SBS,TCS,DDT" />
    <lab symbol="pqr/xyz" sources="RRT,QQR,XXC" />

      </group>

  <group name="345">
    <lab symbol="xyz/ght" sources="GHB,GRT,BNM" />

      </group>


</tabgroup1>

I tested using the steps in "Python: access nested children in xml file parsed with ElementTree", but the output prints all attributes.

1 Answer

You can use BeautifulSoup:

from bs4 import BeautifulSoup

data = '''<?xml version="1.0" encoding="UTF-8"?>
<tabgroup1 name1="CRDT" revision="19531">
  <client name="123" group="457" />
  <group name="457">
    <lab symbol="xyz/ght" sources="SBS,TCS,DDT" />
    <lab symbol="pqr/xyz" sources="RRT,QQR,XXC" />
      </group>
  <group name="345">
    <lab symbol="xyz/ght" sources="GHB,GRT,BNM" />
      </group>
</tabgroup1>'''

soup = BeautifulSoup(data, 'lxml')

result = [j['sources'] for i in soup.find_all('group', {'name': '457'}) for j in i.find_all('lab')]
result
#['SBS,TCS,DDT', 'RRT,QQR,XXC']

And this would do the same using the standard library's xml.etree.ElementTree:

import xml.etree.ElementTree as ET

data = '''<?xml version="1.0" encoding="UTF-8"?>
<tabgroup1 name1="CRDT" revision="19531">
  <client name="123" group="457" />
  <group name="457">
    <lab symbol="xyz/ght" sources="SBS,TCS,DDT" />
    <lab symbol="pqr/xyz" sources="RRT,QQR,XXC" />
      </group>
  <group name="345">
    <lab symbol="xyz/ght" sources="GHB,GRT,BNM" />
      </group>
</tabgroup1>'''

tree = ET.fromstring(data)
result = [j.attrib['sources'] for i in tree.findall('group') if i.attrib['name'] == '457' for j in i.findall('lab')]
result
#['SBS,TCS,DDT', 'RRT,QQR,XXC']
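
Since the question asks for the sources per "lab symbol", here is a minimal sketch of one way to key the result by symbol. It relies on the attribute predicate that ElementTree's limited XPath support accepts in findall. The helper name sources_by_symbol is hypothetical (it is not part of any library), and the XML declaration is left out of the sample string for brevity.

```python
import xml.etree.ElementTree as ET

data = '''<tabgroup1 name1="CRDT" revision="19531">
  <client name="123" group="457" />
  <group name="457">
    <lab symbol="xyz/ght" sources="SBS,TCS,DDT" />
    <lab symbol="pqr/xyz" sources="RRT,QQR,XXC" />
  </group>
  <group name="345">
    <lab symbol="xyz/ght" sources="GHB,GRT,BNM" />
  </group>
</tabgroup1>'''

root = ET.fromstring(data)


def sources_by_symbol(root, group_name):
    """Hypothetical helper: map each lab symbol in the named group
    to the list of its comma-separated sources."""
    # [@name='...'] selects only <group> elements with that attribute value.
    labs = root.findall(f"./group[@name='{group_name}']/lab")
    return {lab.attrib['symbol']: lab.attrib['sources'].split(',') for lab in labs}


by_symbol = sources_by_symbol(root, '457')
print(by_symbol)
# {'xyz/ght': ['SBS', 'TCS', 'DDT'], 'pqr/xyz': ['RRT', 'QQR', 'XXC']}
print(by_symbol['xyz/ght'])
# ['SBS', 'TCS', 'DDT']
```

A group name that does not exist simply yields an empty dict, because findall returns an empty list when nothing matches.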
zipa