
I am retrieving a SOAP envelope from an endpoint and then trying to pull the data out of its fields. I'm using simplified_scrapy, which has worked correctly except for empty values, where it throws an error. The SOAP/XML string is loaded into a variable, and I then parse it looking for my specified head tag. I'm writing the values to a CSV using the code below. If a tag does not have a value in it, I get this error:

  File "C:/Users/syorke/PycharmProjects/DMBStart/API/GoodScripts/DebtsGetDebts.py", line 45, in <module>
    , c.SettlementStatus.text
AttributeError: 'List' object has no attribute 'text'

In the code below the tag SettlementStatus is the field that has no value:

    fh.write(cols)
    for c in Categories:
        fh.write('%s\n' % [(c.FileNumber.text
                            , c.DebtId.text
                            , c.VendorId.text
                            , c.DebtType.text
                            , c.SettlementStatus.text
                            # , c.AccountStatus.text
                            , c.IsStatementIncluded.text
                            , c.PrimaryName.text
                            , c.ApplicantType.text
                            , c.OriginalBalance.text
                            , c.MinimumPayment.text
                            )])
    fh.close()

I've tried adding an if around the tag, essentially saying: if there is a value, write the value, else write '', but this did not work. Thanks for any pointers or assistance.

Stephen Yorke
  • The SimplifiedDoc library treats attribute names ending in s as requests for a list of elements. In this case, call a method instead of using attribute access, such as ele.getElementByTag('tag') – dabingsou Jan 30 '20 at 03:38
  • Thank you, this worked for the tag above. But it does not work for all of the tags; I'm actually getting the same kind of error even when there is data within the tags. This doesn't make sense. Maybe I am not pulling the XML correctly, or there is a better way? The error is the same as before but on a different field, one that always appears to have a value: File "f.py" line 66, in <module> , c.getElementByTag('VendorId').text AttributeError: 'NoneType' object has no attribute 'text'. If there is a better way to pull data from a SOAP envelope, I am open to change. Thanks again – Stephen Yorke Jan 31 '20 at 16:24
  • c.getElementByTag('VendorId').text AttributeError: 'NoneType' object has no attribute 'text': this means that there is no such data. You can add a check: c.VendorId.text if c.VendorId else "" – dabingsou Feb 01 '20 at 00:04
  • You can also use the select method, which returns None when the value is not found instead of raising an error: [(c.select('FileNumber>text()'), c.select('DebtId>text()'), c.select('VendorId>text()'), c.select('DebtType>text()'), c.select('SettlementStatus>text()') # , c.select('AccountStatus>text()') , c.select('IsStatementIncluded>text()'), c.select('PrimaryName>text()'), c.select('ApplicantType>text()'), c.select('OriginalBalance>text()'), c.select('MinimumPayment>text()'))] – dabingsou Feb 01 '20 at 00:24
  • Here are more examples of SimplifiedDoc, https://github.com/yiyedata/simplified-scrapy-demo/tree/master/doc_examples – dabingsou Feb 01 '20 at 00:41
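
The guard suggested in the comments above could be factored into a small helper so that it does not have to be repeated for every field. This is only a sketch: `safe_text` and `FakeElement` are hypothetical names introduced for illustration, and the helper relies on nothing more than the object exposing (or not exposing) a `text` attribute, which matches both errors quoted in the question.

```python
class FakeElement:
    """Hypothetical stand-in for a parsed element that has a .text attribute."""

    def __init__(self, text):
        self.text = text


def safe_text(node, default=""):
    """Return node.text, or default when node is None, has no .text, or .text is None."""
    text = getattr(node, "text", None)
    return default if text is None else text


print(safe_text(FakeElement("Open")))  # prints: Open
print(repr(safe_text(None)))           # prints: '' (the 'NoneType' case)
print(repr(safe_text([])))             # prints: '' (an object without .text, like the 'List' case)
```

Inside the loop from the question, each field would then be wrapped, for example `safe_text(c.getElementByTag('SettlementStatus'))`, assuming that call returns either an element or None as the comments describe.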

1 Answer


I'm posting this as an answer so that other users can find it.

Upgrade to the latest version of the library, or use the select method instead.

    [(c.select('FileNumber>text()')
      , c.select('DebtId>text()')
      , c.select('VendorId>text()')
      , c.select('DebtType>text()')
      , c.select('SettlementStatus>text()')
      , c.select('AccountStatus>text()')
      , c.select('IsStatementIncluded>text()')
      , c.select('PrimaryName>text()')
      , c.select('ApplicantType>text()')
      , c.select('OriginalBalance>text()')
      , c.select('MinimumPayment>text()')
      )]
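
Since the question is open to other ways of reading the envelope, here is a minimal sketch of the same idea using only the Python standard library rather than SimplifiedDoc. It is not the library's method. The sample XML, the record tag `Debt`, and the function names are all assumptions made for illustration; a real SOAP response will usually carry namespaces on its payload elements, which would need to be included in the tag names. `findtext` returns the given default when a child element is missing and an empty string when the element is present but empty, so neither case raises.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical sample; the real envelope, tag names and namespaces will differ.
SAMPLE = """<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <Debts>
      <Debt>
        <FileNumber>1001</FileNumber>
        <DebtId>7</DebtId>
        <SettlementStatus/>
      </Debt>
      <Debt>
        <FileNumber>1002</FileNumber>
        <DebtId>8</DebtId>
        <SettlementStatus>Settled</SettlementStatus>
      </Debt>
    </Debts>
  </soap:Body>
</soap:Envelope>"""

FIELDS = ["FileNumber", "DebtId", "VendorId", "SettlementStatus"]


def rows_from_envelope(xml_text, record_tag="Debt", fields=FIELDS):
    """Yield one list of field values per record; missing or empty fields become ''."""
    root = ET.fromstring(xml_text)
    for record in root.iter(record_tag):
        yield [record.findtext(name, default="") for name in fields]


def write_csv(xml_text, out, record_tag="Debt", fields=FIELDS):
    """Write a header row and one row per record to the file-like object out."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(fields)
    for row in rows_from_envelope(xml_text, record_tag, fields):
        writer.writerow(row)


buffer = io.StringIO()
write_csv(SAMPLE, buffer)
print(buffer.getvalue())
```

Using `csv.writer` instead of formatting a list with `'%s\n'` also takes care of quoting values that contain commas, which may or may not matter for this data.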

Here are more examples of SimplifiedDoc: https://github.com/yiyedata/simplified-scrapy-demo/tree/master/doc_examples

dabingsou