
I'm trying to adapt a recipe to parse an XML feed (http://data.gov.uk/dataset/car-parks).

I can parse the document using objectify but the result comes out like this...

u"{http://www.transportdirect.info/carparking}CarPark = None [ObjectifiedElement]\n {http://www.transportdirect.info/carparking}CarParkRef = 4401 [IntElement]\n {http://www.transportdirect.info/carparking}CarParkName = 'Wallace Street' [StringElement]\n {http://www.transportdirect.info/carparking}Location = 'Galston' [StringElement]\n {http://www.transportdirect.info/carparking}Address = 'Henrietta Street--Galston--East Ayrshire' [StringElement]\n {http://www.transportdirect.info/carparking}Postcode = 'KA4 8HP' [StringElement]\n...

I then used this code:

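data = []  # one dict per CarPark element
for elt in root.CarPark:
    el_data = {}
    for child in elt.getchildren():
        # note: this stores the element object itself, not its text
        el_data[child.tag] = child
    data.append(el_data)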

This returns something like [{'{http://www.transportdirect.info/carparking}AccessPoints': <Element {http://www.transportdirect.info/carparking}AccessPoints at 0x1150a8cd0>, '{http://www.transportdirect.info/carparking}Address': 'Nunnery Lane--York--Yorkshire', '{http://www.transportdirect.info/carparking}CarParkAdditionalData': <Element {http://www.transportdirect.info/carparking}CarParkAdditionalData at 0x1150a8d20>, '{http://www.transportdirect.info/carparking}CarParkName': 'Nunnery Lane',

...

But when I try to drop it into a DataFrame, I get:

{http://www.transportdirect.info/carparking}AccessPoints [[[, , ], [, , ], [, , ]], [[, , ], [, , ], [,... {http://www.transportdirect.info/carparking}Address [[[Nunnery Lane--York--Yorkshire]]] {http://www.transportdirect.info/carparking}CarParkAdditionalData [[[]]] {http://www.transportdirect.info/carparking}CarParkName [[[Nunnery Lane]]] {http://www.transportdirect.info/carparking}CarParkOperator [[[]]] {http://www.transportdirect.info/carparking}CarParkRef [[[3]]] {http://www.transportdirect.info/carparking}DateRecordLastUpdated [[[2013-06-18]]]

What's the extra step I'm missing to clean it up so it works?

mzjn
elksie5000
  • What exactly is the problem? What is it that you need to clean up so it works? – mzjn Jul 16 '13 at 16:27
  • you need to make sure that you are adding strings (even though they print like strings), see: http://stackoverflow.com/questions/16922432/parsing-xml-to-pandas-data-frame-throws-memory-error, something like ``el_data[child.tag] = [ c.text for c in child ]`` – Jeff Jul 16 '13 at 16:35
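
Following up on Jeff's comment, here is a minimal sketch of the "store strings, not elements" idea. It is an illustration rather than a tested fix for the real feed: it swaps in the standard-library `xml.etree.ElementTree` instead of `lxml.objectify` so that it runs without extra packages, and the root element name `CarParkDataImport`, the `AccessPoint`/`Name` children, and the helpers `local_name` and `carpark_records` are all hypothetical. Only the namespace and the leaf values are taken from the output in the question. The likely cause of the `[[[...]]]` cells is that each value is still an element object, which is sequence-like, so the sketch keeps only each leaf's text and strips the `{namespace}` prefix from the keys.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample modelled on the output in the question; the root name
# and the AccessPoint/Name structure are assumptions, not the real schema.
SAMPLE = """<CarParkDataImport xmlns="http://www.transportdirect.info/carparking">
  <CarPark>
    <CarParkRef>3</CarParkRef>
    <CarParkName>Nunnery Lane</CarParkName>
    <Address>Nunnery Lane--York--Yorkshire</Address>
    <AccessPoints><AccessPoint><Name>Entrance</Name></AccessPoint></AccessPoints>
  </CarPark>
  <CarPark>
    <CarParkRef>4401</CarParkRef>
    <CarParkName>Wallace Street</CarParkName>
    <Address>Henrietta Street--Galston--East Ayrshire</Address>
  </CarPark>
</CarParkDataImport>"""


def local_name(tag):
    """Drop the '{namespace}' prefix from a tag name."""
    return tag.rsplit("}", 1)[-1]


def carpark_records(root):
    """Return one flat dict of plain strings per CarPark element."""
    records = []
    for elt in root:
        if local_name(elt.tag) != "CarPark":
            continue
        row = {}
        for child in elt:
            # Nested blocks (e.g. AccessPoints) have sub-elements; they are
            # skipped here and would need their own flattening rule.
            if len(child):
                continue
            # Store the text, not the element object itself.
            row[local_name(child.tag)] = (child.text or "").strip()
        records.append(row)
    return records


records = carpark_records(ET.fromstring(SAMPLE))
for row in records:
    print(row)
```

A list of flat dicts like `records` can be passed straight to `pandas.DataFrame(records)`. With `lxml.objectify` the same idea should carry over by storing `child.text` instead of `child` in the original loop; note that values come back as strings, so numeric columns such as `CarParkRef` would still need converting afterwards.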

0 Answers