
I'm trying to get relevant information from a SOAP service from the Dutch government land register (WSDL here) with PySimpleSoap. So far I managed to connect and request information about a specific property with the following code:

from pysimplesoap.client import SoapClient

klantReferentie = 'xxx'  # placeholder: any reference string of our own choosing
client = SoapClient(wsdl='http://www1.kadaster.nl/1/schemas/kik-inzage/20141101/verzoekTotInformatie-2.1.wsdl', username='xxx', password='xxx', trace=True)

response = client.VerzoekTotInformatie(
    Aanvraag={
        'berichtversie': '4.7',  # Refers to the schema version
        'klantReferentie': klantReferentie,  # A reference we can set ourselves.
        'productAanduiding': '1185',  # a four-digit code referring to whether the response should be in "XML" (1185), "PDF" (1191) or "XML and PDF" (1057).
        'Ingang': {
            'Object': {
                'IMKAD_KadastraleAanduiding': {
                    'gemeente': 'ARNHEM AC',  # municipality
                    'sectie': 'AC',  # section code
                    'perceelnummer': '1234'  # Lot number
                }
            }
        }
    }
)

This "kinda" works. I set trace=True so I get extensive log messages, and in those log messages I see a very large XML response (paste here) which includes pretty much all the info I requested. However, I also get this traceback:

Traceback (most recent call last):
  File "<input>", line 1, in <module>
    'perceelnummer': perceelnummer
  File "/Library/Python/2.7/site-packages/pysimplesoap/client.py", line 181, in <lambda>
    return lambda *args, **kwargs: self.wsdl_call(attr, *args, **kwargs)
  File "/Library/Python/2.7/site-packages/pysimplesoap/client.py", line 346, in wsdl_call
    return self.wsdl_call_with_args(method, args, kwargs)
  File "/Library/Python/2.7/site-packages/pysimplesoap/client.py", line 372, in wsdl_call_with_args
    resp = response('Body', ns=soap_uri).children().unmarshall(output)
  File "/Library/Python/2.7/site-packages/pysimplesoap/simplexml.py", line 433, in unmarshall
    value = children and children.unmarshall(fn, strict)
  File "/Library/Python/2.7/site-packages/pysimplesoap/simplexml.py", line 433, in unmarshall
    value = children and children.unmarshall(fn, strict)
  File "/Library/Python/2.7/site-packages/pysimplesoap/simplexml.py", line 433, in unmarshall
    value = children and children.unmarshall(fn, strict)
  File "/Library/Python/2.7/site-packages/pysimplesoap/simplexml.py", line 380, in unmarshall
    raise TypeError("Tag: %s invalid (type not found)" % (name,))
TypeError: Tag: IMKAD_Perceel invalid (type not found)

As far as I understand, this means that the simplexml parser cannot make sense of the IMKAD_Perceel tag, which (I'm guessing) is because it could not read or find the definition of this tag in the WSDL file.

So I checked the (enormous amount of) log messages from parsing the wsdl file, and that shows these lines:

DEBUG:pysimplesoap.helpers:Parsing Element element: IMKAD_Perceel
DEBUG:pysimplesoap.helpers:Processing element IMKAD_Perceel element
DEBUG:pysimplesoap.helpers:IMKAD_Perceel has no children!
DEBUG:pysimplesoap.helpers:complexContent/simpleType/element IMKAD_Perceel = IMKAD_Perceel
DEBUG:pysimplesoap.helpers:Parsing Element complexType: IMKAD_Perceel
DEBUG:pysimplesoap.helpers:Processing element IMKAD_Perceel complexType
DEBUG:pysimplesoap.helpers:complexContent/simpleType/element IMKAD_Perceel = IMKAD_OnroerendeZaak
DEBUG:pysimplesoap.helpers:Processing element IMKAD_Perceel complexType

I guess these lines mean that the IMKAD_Perceel definition is empty. So I used SoapUI to introspect the WSDL file, where I found a URL to this .xsd file, which contains a definition of IMKAD_Perceel:

<xs:element name="IMKAD_Perceel" 
    substitutionGroup="ipkbo:IMKAD_OnroerendeZaak" 
    type="ipkbo:IMKAD_Perceel"
    />

The tag indeed seems to be closing itself, which means it is empty. Is this the reason that pysimplesoap thinks that IMKAD_Perceel is not defined? Why can't it simply interpret the xml and return it back as a dict? (as said before, the full xml output I receive is in this paste).

Does anybody know how I can make pysimplesoap interpret the xml and convert it to a dict, regardless whether it adheres to the wsdl?

All tips are welcome!

ljk321
kramer65
  • We can't reproduce the error since we are not authorized to use this service. I think it's better for you to contact the provider of this web service. – ljk321 May 16 '15 at 01:28
  • @skyline75489 - I added a paste of the xml I receive back: http://pastebin.com/eamQzGSt . Does this help to debug? – kramer65 May 18 '15 at 16:59
  • The response you got seems to be OK, even though it complains about a `type error`. Is there any problem with the response itself? – ljk321 May 19 '15 at 03:00
  • @skyline75489 - The response is fine. It is just that I can't do anything with the response because pysimplesoap gives that error. Any idea how I can solve that error? I guess the problem lies somewhere in the simplexml implementation. The line where it goes wrong is this: https://github.com/pysimplesoap/pysimplesoap/blob/master/pysimplesoap/simplexml.py#L380 . I tried setting `strict` to `False`, but then `IMKAD_Perceel` becomes `'\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n'` – kramer65 May 19 '15 at 06:48
  • `I can't do anything with the response` Doing what exactly will cause this error? – ljk321 May 19 '15 at 08:48

1 Answer


It seems that pysimplesoap is not capable of dealing with substitutionGroup in XML Schema.

You can see that in the xsd file:

<xs:element name="IMKAD_Perceel" 
substitutionGroup="ipkbo:IMKAD_OnroerendeZaak" 
type="ipkbo:IMKAD_Perceel"
/>

The substitutionGroup attribute means that IMKAD_Perceel may appear wherever the schema references IMKAD_OnroerendeZaak. The two elements are substitutable, but they are still distinct tags with distinct names.

In the soap schema, this particular part of response is defined as:

<xs:complexType name="BerichtGegevens">
 <xs:annotation>
   <xs:documentation>Inhoud van het bericht.</xs:documentation>    
 </xs:annotation>
 <xs:sequence>
   <xs:element ref="ipkbo:IMKAD_OnroerendeZaak" minOccurs="1" maxOccurs="1"/>
   <xs:element ref="ipkbo:Recht" minOccurs="1" maxOccurs="1"/>
   <xs:element ref="ipkbo:IMKAD_Stuk" minOccurs="0" maxOccurs="unbounded"/>
   <xs:element ref="ipkbo:IMKAD_Persoon" minOccurs="1" maxOccurs="unbounded"/>
   <xs:element ref="ipkbo:GemeentelijkeRegistratie" minOccurs="0" maxOccurs="unbounded"/>
 </xs:sequence>
</xs:complexType>

However, the actual response looks like this:

<ipkbo:BerichtGegevens>
  <ipkbo:IMKAD_Perceel>...</ipkbo:IMKAD_Perceel>
  <ipkbo:Recht>...</ipkbo:Recht>
  <ipkbo:IMKAD_AangebodenStuk>...</ipkbo:IMKAD_AangebodenStuk>
  <ipkbo:IMKAD_Persoon>...</ipkbo:IMKAD_Persoon>
</ipkbo:BerichtGegevens>

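pysimplesoap then seems to get confused: the schema tells it to expect IMKAD_OnroerendeZaak, it receives IMKAD_Perceel instead, and it fails to find the type for that part of the response.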
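
To make the substitution idea concrete, here is a minimal sketch of how a parser could resolve a substituted tag back to the head element it stands in for. This is not pysimplesoap code. The schema fragment is a trimmed-down stand-in for the real .xsd, the namespace `urn:example:ipkbo` is a placeholder, and `substitution_heads`, `resolve` and `known` are hypothetical names used only for illustration.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical, trimmed-down schema fragment modelled on the declarations above.
# The target namespace is a placeholder, not the real Kadaster namespace.
XSD = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:ipkbo="urn:example:ipkbo"
           targetNamespace="urn:example:ipkbo">
  <xs:element name="IMKAD_OnroerendeZaak" type="ipkbo:IMKAD_OnroerendeZaak"/>
  <xs:element name="IMKAD_Perceel"
              substitutionGroup="ipkbo:IMKAD_OnroerendeZaak"
              type="ipkbo:IMKAD_Perceel"/>
</xs:schema>
"""


def substitution_heads(xsd_text):
    """Map each element name to the head element it may substitute for."""
    root = ET.fromstring(xsd_text)
    heads = {}
    for el in root.iter("{%s}element" % XS):
        group = el.get("substitutionGroup")
        if group:
            # Drop the namespace prefix ("ipkbo:") and keep the local name.
            heads[el.get("name")] = group.split(":")[-1]
    return heads


def resolve(tag, known_types, heads):
    """Follow substitution-group links until a tag with a known type is found."""
    seen = set()
    while tag not in known_types and tag in heads and tag not in seen:
        seen.add(tag)
        tag = heads[tag]
    return tag if tag in known_types else None


heads = substitution_heads(XSD)
known = {"IMKAD_OnroerendeZaak": dict}  # stand-in for a parser's type table
print(heads)
print(resolve("IMKAD_Perceel", known, heads))
```

A parser that did this lookup before giving up on an unknown tag would treat IMKAD_Perceel as an IMKAD_OnroerendeZaak. Whether the same mapping can be patched into pysimplesoap's own type table is a separate question that this sketch does not answer.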

ljk321
  • Thank you for your thorough analysis and explanation of the source of the error. Would you possibly also have any clue how I could actually solve it? The source of simplexml is included with pysimplesoap (https://github.com/pysimplesoap/pysimplesoap/blob/master/pysimplesoap/simplexml.py). Would you have any idea how I can edit that to get `substitutionGroup`s working? – kramer65 May 19 '15 at 10:20
  • You can try hacking it using `types['IMKAD_Perceel'] = types['IMKAD_OnroerendeZaak']`. Just add this to the `unmarshall` function at the beginning of it . – ljk321 May 19 '15 at 10:30
  • If I try that I get a Key Error: `File "pysimplesoap/simplexml.py", line 325, in unmarshall types['IMKAD_Perceel'] = types['IMKAD_OnroerendeZaak'] KeyError: u'IMKAD_OnroerendeZaak'`. `IMKAD_OnroerendeZaak` is supposedly also not loaded in the types. Any other idea? – kramer65 May 19 '15 at 12:49
  • Well, you can't just do this. Because this function is called for EVERY SimpleXMLElement. I would recommend adding `if 'IMKAD_OnroerendeZaak' in types` – ljk321 May 19 '15 at 12:59
  • Since that results in many things not being rendered, I decided to just return the raw xml as a string and use [xmltodict](https://github.com/martinblech/xmltodict) to convert it and then clean it further. It's not pretty, but it works for now. Thanks a million for helping me understand a lot more! – kramer65 May 19 '15 at 16:06
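
The workaround in the last comment, converting the raw XML to a dict without consulting the WSDL at all, can also be sketched with only the standard library. This is an illustration under assumptions, not the code kramer65 used: the sample XML is made up (the `naam` element and the `urn:example:ipkbo` namespace are placeholders), `local_name` and `element_to_dict` are hypothetical helpers, and XML attributes are ignored. To my knowledge pysimplesoap keeps the last raw response on the client as `xml_response`, but check that against your installed version before relying on it.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace}' prefix ElementTree puts in front of tag names."""
    return tag.rsplit("}", 1)[-1]


def element_to_dict(element):
    """Convert an element to nested dicts; repeated child tags become lists."""
    children = list(element)
    if not children:
        # Leaf element: return its text content (attributes are ignored here).
        return (element.text or "").strip()
    result = {}
    for child in children:
        key = local_name(child.tag)
        value = element_to_dict(child)
        if key in result:
            # Second or later occurrence of this tag: collect values in a list.
            if not isinstance(result[key], list):
                result[key] = [result[key]]
            result[key].append(value)
        else:
            result[key] = value
    return result


# Made-up sample shaped like the response above; names and values are placeholders.
RAW_XML = """
<ipkbo:BerichtGegevens xmlns:ipkbo="urn:example:ipkbo">
  <ipkbo:IMKAD_Perceel><ipkbo:perceelnummer>1234</ipkbo:perceelnummer></ipkbo:IMKAD_Perceel>
  <ipkbo:IMKAD_Persoon><ipkbo:naam>A</ipkbo:naam></ipkbo:IMKAD_Persoon>
  <ipkbo:IMKAD_Persoon><ipkbo:naam>B</ipkbo:naam></ipkbo:IMKAD_Persoon>
</ipkbo:BerichtGegevens>
"""

root = ET.fromstring(RAW_XML)
data = {local_name(root.tag): element_to_dict(root)}
print(data["BerichtGegevens"]["IMKAD_Perceel"]["perceelnummer"])
```

Because this never looks at the schema, a substituted tag such as IMKAD_Perceel is just another key. The trade-off is that every value comes back as a string and a tag that happens to occur once is a dict rather than a one-item list, so some cleanup afterwards is still needed.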