I'm trying to write a Python script that goes through an XML file and removes every container element whose child node holds a particular value. For instance, my tree looks like:

<collection shelf="New Arrivals">
  <ECUC-NUMERICAL-PARAM-VALUE>
    <SHORT-NAME>RTE_ABC</SHORT-NAME>
    <DEFINITION-REF DEST="ECUC-BOOLEAN-PARAM-DEF">/AUTOSAR/EcucDefs/Com/ComConfig/ComIPdu/ComIPduCancellationSupport</DEFINITION-REF>
  </ECUC-NUMERICAL-PARAM-VALUE>
  <ECUC-NUMERICAL-PARAM-VALUE>
    <SHORT-NAME>RTE_ABC</SHORT-NAME>
    <DEFINITION-REF DEST="ECUC-BOOLEAN-PARAM-DEF">/AUTOSAR/EcucDefs/Com/ComConfig/ComIPdu/xyz</DEFINITION-REF>
  </ECUC-NUMERICAL-PARAM-VALUE>
  <ECUC-NUMERICAL-PARAM-VALUE>
    <SHORT-NAME>RTE_ABC</SHORT-NAME>
    <DEFINITION-REF DEST="ECUC-BOOLEAN-PARAM-DEF">/AUTOSAR/EcucDefs/Com/ComConfig/ComIPdu/ComIPduCancellationSupport</DEFINITION-REF>
  </ECUC-NUMERICAL-PARAM-VALUE>
  <ECUC-NUMERICAL-PARAM-VALUE>
    <SHORT-NAME>RTE_ABC</SHORT-NAME>
    <DEFINITION-REF DEST="ECUC-BOOLEAN-PARAM-DEF">/AUTOSAR/EcucDefs/Com/ComConfig/ComIPdu/xyz</DEFINITION-REF>
  </ECUC-NUMERICAL-PARAM-VALUE>
  <ECUC-NUMERICAL-PARAM-VALUE>
    <SHORT-NAME>RTE_ABC</SHORT-NAME>
    <DEFINITION-REF DEST="ECUC-BOOLEAN-PARAM-DEF">/AUTOSAR/EcucDefs/Com/ComConfig/ComIPdu/ComIPduCancellationSupport</DEFINITION-REF>
  </ECUC-NUMERICAL-PARAM-VALUE>
</collection>

Q1

The whole <ECUC-NUMERICAL-PARAM-VALUE> container should be removed if the text of its child node <DEFINITION-REF DEST="ECUC-BOOLEAN-PARAM-DEF"> equals: /AUTOSAR/EcucDefs/Com/ComConfig/ComIPdu/ComIPduCancellationSupport

The script I have written is:

import xml.etree.ElementTree as ET
tree = ET.parse('autosar1.xml')
root = tree.getroot()
for child in root.findall(".//ECUC-NUMERICAL-PARAM-VALUE"):
    for z in child.findall(".//DEFINITION-REF[@DEST='ECUC-BOOLEAN-PARAM-DEF']"):
        if z.text == "/AUTOSAR/EcucDefs/Com/ComConfig/ComIPdu/ComIPduCancellationSupport":
            child.remove(z)         
tree.write('output.xml')

But I am not getting the intended results. The result I am getting is:

<collection shelf="New Arrivals">
<ECUC-NUMERICAL-PARAM-VALUE>
<SHORT-NAME>RTE_ABC</SHORT-NAME>
</ECUC-NUMERICAL-PARAM-VALUE>

<ECUC-NUMERICAL-PARAM-VALUE>
<SHORT-NAME>RTE_ABC</SHORT-NAME>
</ECUC-NUMERICAL-PARAM-VALUE>

<ECUC-NUMERICAL-PARAM-VALUE>
<SHORT-NAME>RTE_ABC</SHORT-NAME>
</ECUC-NUMERICAL-PARAM-VALUE>

<ECUC-NUMERICAL-PARAM-VALUE>
<SHORT-NAME>RTE_ABC</SHORT-NAME>
</ECUC-NUMERICAL-PARAM-VALUE>

<ECUC-NUMERICAL-PARAM-VALUE>
<SHORT-NAME>RTE_ABC</SHORT-NAME>
</ECUC-NUMERICAL-PARAM-VALUE>
</collection>

The result I want to get:

<collection shelf="New Arrivals">
  <ECUC-NUMERICAL-PARAM-VALUE>
    <SHORT-NAME>RTE_ABC</SHORT-NAME>
    <DEFINITION-REF DEST="ECUC-BOOLEAN-PARAM-DEF">/AUTOSAR/EcucDefs/Com/ComConfig/ComIPdu/xyz</DEFINITION-REF>
  </ECUC-NUMERICAL-PARAM-VALUE>
  <ECUC-NUMERICAL-PARAM-VALUE>
    <SHORT-NAME>RTE_ABC</SHORT-NAME>
    <DEFINITION-REF DEST="ECUC-BOOLEAN-PARAM-DEF">/AUTOSAR/EcucDefs/Com/ComConfig/ComIPdu/xyz</DEFINITION-REF>
  </ECUC-NUMERICAL-PARAM-VALUE>
</collection>

Q2

Instead of hardcoding the full value in the if condition, is it possible to take user input (at the command prompt, say) such as "ComIPduCancellationSupport", rather than the whole value "/AUTOSAR/EcucDefs/Com/ComConfig/ComIPdu/ComIPduCancellationSupport", and still get the desired output?

Thanks a lot.

Parfait
Gopala Krishna

1 Answer

Consider the third-party library lxml, a feature-rich and easy-to-use library for processing XML and HTML in Python. You can install it with pip or from a binary file for Windows. I recommend it because it can run fully W3C-conformant XPath 1.0 and XSLT 1.0, and XSLT is what is useful for you here.

XSLT is a special-purpose language for transforming XML files, which includes removing nodes conditionally. Specifically, in XSLT we run the identity transform (which copies the entire document as is) and then add an empty template that matches the node we intend to remove. Notice the use of contains() to check for a string anywhere in the text of that node. This approach needs no for loop or if logic.
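As a rough sketch of that idea applied to the XML from the question, the snippet below runs an identity transform plus one empty template over a trimmed copy of the sample (two containers instead of five). The variable names `xml_src`, `xsl_src` and `remaining` are illustrative only, and the match pattern assumes that DEFINITION-REF is a direct child of the container, as it is in the posted sample.

```python
import lxml.etree as et

# Trimmed copy of the question's sample XML (two containers instead of five).
xml_src = b'''<collection shelf="New Arrivals">
  <ECUC-NUMERICAL-PARAM-VALUE>
    <SHORT-NAME>RTE_ABC</SHORT-NAME>
    <DEFINITION-REF DEST="ECUC-BOOLEAN-PARAM-DEF">/AUTOSAR/EcucDefs/Com/ComConfig/ComIPdu/ComIPduCancellationSupport</DEFINITION-REF>
  </ECUC-NUMERICAL-PARAM-VALUE>
  <ECUC-NUMERICAL-PARAM-VALUE>
    <SHORT-NAME>RTE_ABC</SHORT-NAME>
    <DEFINITION-REF DEST="ECUC-BOOLEAN-PARAM-DEF">/AUTOSAR/EcucDefs/Com/ComConfig/ComIPdu/xyz</DEFINITION-REF>
  </ECUC-NUMERICAL-PARAM-VALUE>
</collection>'''

xsl_src = b'''<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Identity transform: copy every node and attribute unchanged -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Empty template: a matching container produces no output at all -->
  <xsl:template match="ECUC-NUMERICAL-PARAM-VALUE[
      DEFINITION-REF[@DEST='ECUC-BOOLEAN-PARAM-DEF'
                     and contains(., 'ComIPduCancellationSupport')]]"/>
</xsl:stylesheet>'''

# Compile the stylesheet and apply it to the parsed input
transform = et.XSLT(et.fromstring(xsl_src))
result = transform(et.fromstring(xml_src))

# Containers that survived the transform
remaining = result.getroot().findall('ECUC-NUMERICAL-PARAM-VALUE')
print(str(result))
```

The empty template wins over the identity template for matching containers because a pattern with a predicate has a higher default priority than the generic `node()` pattern.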

With Python's lxml we can build the XSLT script dynamically from a string (an XSLT script is itself an XML file) and pass a search string such as CoolantTemp_T into contains(). That string can come from user input. Notice the {0} placeholder, which the string's .format() fills in.
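One thing to watch when the search string comes from a user is that it is pasted straight into the stylesheet text. The sketch below is an assumption-laden illustration rather than part of the script in this answer: `XSL_TEMPLATE` and `build_xsl` are hypothetical names, and the match pattern uses the element names from the question. It rejects quotes and empty input (an empty string would make contains() true for every container) and XML-escapes the rest, then checks that the generated stylesheet is still well-formed XML.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Stylesheet text with a {0} placeholder for the search term (hypothetical name)
XSL_TEMPLATE = '''<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
  <xsl:template match="ECUC-NUMERICAL-PARAM-VALUE[contains(DEFINITION-REF, '{0}')]"/>
</xsl:stylesheet>'''


def build_xsl(term):
    """Return the stylesheet text with `term` placed inside contains()."""
    # A quote would end the XPath string literal or the XML attribute early;
    # an empty term would match (and therefore remove) every container.
    if not term or "'" in term or '"' in term:
        raise ValueError('search term must be non-empty and contain no quotes')
    # escape() turns &, < and > into entities so the XML stays well-formed
    return XSL_TEMPLATE.format(escape(term))


term = 'ComIPduCancellationSupport'   # could come from input() or sys.argv[1]
xsl_text = build_xsl(term)

# Sanity check: raises xml.etree.ElementTree.ParseError if not well-formed
ET.fromstring(xsl_text)
print(xsl_text)
```

If the stylesheet ever contains literal curly braces (for example attribute value templates), they would need to be doubled for .format() to leave them alone.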

Python

import lxml.etree as et
doc = et.parse('Input.xml')

xsl_str='''<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                                         xmlns:doc="http://autosar.org/3.0.2">
  <xsl:output indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- IDENTITY TRANSFORM -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- EMPTY TEMPLATE -->
  <xsl:template match="INTEGER-TYPE[descendant::COMPU-METHOD-REF/@DEST='COMPU-METHOD' and 
                                    contains(descendant::COMPU-METHOD-REF, '{0}')]">    
  </xsl:template>

</xsl:stylesheet>'''

# LOAD DYNAMIC XSL STRING (PASSING BELOW STRING INTO ABOVE)
xsl = et.fromstring(xsl_str.format('CoolantTemp_T'))

transform = et.XSLT(xsl)
result = transform(doc)

# OUTPUT TO SCREEN
print(result)    
# OUTPUT TO FILE
with open('output.xml', 'wb') as f:
    f.write(result)

Output

<?xml version="1.0"?>
<TOP-LEVEL-PACKAGES>
  <AR-PACKAGE>
    <SHORT-NAME>DataType</SHORT-NAME>
    <ELEMENTS>
      <INTEGER-TYPE>
        <SHORT-NAME>EngineSpeed_T</SHORT-NAME>
        <SW-DATA-DEF-PROPS>
          <COMPU-METHOD-REF DEST="COMPU-METHOD">/DataType/DataTypeSemantics/EngineSpeed_T</COMPU-METHOD-REF>
        </SW-DATA-DEF-PROPS>
        <LOWER-LIMIT INTERVAL-TYPE="CLOSED">0</LOWER-LIMIT>
        <UPPER-LIMIT INTERVAL-TYPE="CLOSED">65535</UPPER-LIMIT>
      </INTEGER-TYPE>
      <INTEGER-TYPE>
        <SHORT-NAME>VehicleSpeed_T</SHORT-NAME>
        <SW-DATA-DEF-PROPS>
          <COMPU-METHOD-REF DEST="COMPU-METHOD">/DataType/DataTypeSemantics/VehicleSpeed_T</COMPU-METHOD-REF>
        </SW-DATA-DEF-PROPS>
        <LOWER-LIMIT INTERVAL-TYPE="CLOSED">0</LOWER-LIMIT>
        <UPPER-LIMIT INTERVAL-TYPE="CLOSED">65535</UPPER-LIMIT>
      </INTEGER-TYPE>
      <INTEGER-TYPE>
        <SHORT-NAME>Percent_T</SHORT-NAME>
        <SW-DATA-DEF-PROPS>
          <COMPU-METHOD-REF DEST="COMPU-METHOD">/DataType/DataTypeSemantics/Percent_T</COMPU-METHOD-REF>
        </SW-DATA-DEF-PROPS>
        <LOWER-LIMIT INTERVAL-TYPE="CLOSED">0</LOWER-LIMIT>
        <UPPER-LIMIT INTERVAL-TYPE="CLOSED">255</UPPER-LIMIT>
      </INTEGER-TYPE>
    </ELEMENTS>
    <SUB-PACKAGES>
      <AR-PACKAGE>
        <SHORT-NAME>DataTypeSemantics</SHORT-NAME>
        <ELEMENTS>
          <COMPU-METHOD>
            <SHORT-NAME>EngineSpeed_T</SHORT-NAME>
            <UNIT-REF DEST="UNIT">/DataType/DataTypeUnits/rpm</UNIT-REF>
            <COMPU-INTERNAL-TO-PHYS>
              <COMPU-SCALES>
                <COMPU-SCALE>
                  <COMPU-RATIONAL-COEFFS>
                    <COMPU-NUMERATOR>
                      <V>0</V>
                      <V>1</V>
                    </COMPU-NUMERATOR>
                    <COMPU-DENOMINATOR>
                      <V>8</V>
                    </COMPU-DENOMINATOR>
                  </COMPU-RATIONAL-COEFFS>
                </COMPU-SCALE>
              </COMPU-SCALES>
            </COMPU-INTERNAL-TO-PHYS>
          </COMPU-METHOD>
          <COMPU-METHOD>
            <SHORT-NAME>VehicleSpeed_T</SHORT-NAME>
            <UNIT-REF DEST="UNIT">/DataType/DataTypeUnits/kph</UNIT-REF>
            <COMPU-INTERNAL-TO-PHYS>
              <COMPU-SCALES>
                <COMPU-SCALE>
                  <COMPU-RATIONAL-COEFFS>
                    <COMPU-NUMERATOR>
                      <V>0</V>
                      <V>1</V>
                    </COMPU-NUMERATOR>
                    <COMPU-DENOMINATOR>
                      <V>64</V>
                    </COMPU-DENOMINATOR>
                  </COMPU-RATIONAL-COEFFS>
                </COMPU-SCALE>
              </COMPU-SCALES>
            </COMPU-INTERNAL-TO-PHYS>
          </COMPU-METHOD>
          <COMPU-METHOD>
            <SHORT-NAME>Percent_T</SHORT-NAME>
            <UNIT-REF DEST="UNIT">/DataType/DataTypeUnits/Percent</UNIT-REF>
            <COMPU-INTERNAL-TO-PHYS>
              <COMPU-SCALES>
                <COMPU-SCALE>
                  <COMPU-RATIONAL-COEFFS>
                    <COMPU-NUMERATOR>
                      <V>0</V>
                      <V>0.4</V>
                    </COMPU-NUMERATOR>
                    <COMPU-DENOMINATOR>
                      <V>1</V>
                    </COMPU-DENOMINATOR>
                  </COMPU-RATIONAL-COEFFS>
                </COMPU-SCALE>
              </COMPU-SCALES>
            </COMPU-INTERNAL-TO-PHYS>
          </COMPU-METHOD>
          <COMPU-METHOD>
            <SHORT-NAME>CoolantTemp_T</SHORT-NAME>
            <UNIT-REF DEST="UNIT">/DataType/DataTypeUnits/DegreeC</UNIT-REF>
            <COMPU-INTERNAL-TO-PHYS>
              <COMPU-SCALES>
                <COMPU-SCALE>
                  <COMPU-RATIONAL-COEFFS>
                    <COMPU-NUMERATOR>
                      <V>-40</V>
                      <V>1</V>
                    </COMPU-NUMERATOR>
                    <COMPU-DENOMINATOR>
                      <V>2</V>
                    </COMPU-DENOMINATOR>
                  </COMPU-RATIONAL-COEFFS>
                </COMPU-SCALE>
              </COMPU-SCALES>
            </COMPU-INTERNAL-TO-PHYS>
          </COMPU-METHOD>
        </ELEMENTS>
      </AR-PACKAGE>
      <AR-PACKAGE>
        <SHORT-NAME>DataTypeUnits</SHORT-NAME>
        <ELEMENTS>
          <UNIT>
            <SHORT-NAME>rpm</SHORT-NAME>
            <DISPLAY-NAME>rpm</DISPLAY-NAME>
          </UNIT>
          <UNIT>
            <SHORT-NAME>kph</SHORT-NAME>
            <DISPLAY-NAME>kph</DISPLAY-NAME>
          </UNIT>
          <UNIT>
            <SHORT-NAME>Percent</SHORT-NAME>
            <DISPLAY-NAME>Percent</DISPLAY-NAME>
          </UNIT>
          <UNIT>
            <SHORT-NAME>DegreeC</SHORT-NAME>
            <DISPLAY-NAME>DegreeC</DISPLAY-NAME>
          </UNIT>
        </ELEMENTS>
      </AR-PACKAGE>
    </SUB-PACKAGES>
  </AR-PACKAGE>
</TOP-LEVEL-PACKAGES>
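For comparison, if installing lxml is not an option, something similar can be done with the standard library alone. The original script in the question calls `child.remove(z)`, which removes the DEFINITION-REF from its container; to drop the container itself, it has to be removed from its own parent. The sketch below is only one possible way to do that, and `remove_containers` is a hypothetical helper name. It uses a substring test so that a short user-supplied term such as "ComIPduCancellationSupport" is enough.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the question's sample XML (two containers instead of five).
xml_src = '''<collection shelf="New Arrivals">
  <ECUC-NUMERICAL-PARAM-VALUE>
    <SHORT-NAME>RTE_ABC</SHORT-NAME>
    <DEFINITION-REF DEST="ECUC-BOOLEAN-PARAM-DEF">/AUTOSAR/EcucDefs/Com/ComConfig/ComIPdu/ComIPduCancellationSupport</DEFINITION-REF>
  </ECUC-NUMERICAL-PARAM-VALUE>
  <ECUC-NUMERICAL-PARAM-VALUE>
    <SHORT-NAME>RTE_ABC</SHORT-NAME>
    <DEFINITION-REF DEST="ECUC-BOOLEAN-PARAM-DEF">/AUTOSAR/EcucDefs/Com/ComConfig/ComIPdu/xyz</DEFINITION-REF>
  </ECUC-NUMERICAL-PARAM-VALUE>
</collection>'''


def remove_containers(root, term):
    """Remove each ECUC-NUMERICAL-PARAM-VALUE whose DEFINITION-REF text contains term."""
    removed = 0
    # ElementTree elements have no parent pointer, so visit every possible
    # parent; list() snapshots the tree before anything is modified.
    for parent in list(root.iter()):
        # findall() returns a list, so removing children inside the loop is safe
        for child in parent.findall('ECUC-NUMERICAL-PARAM-VALUE'):
            ref = child.find("DEFINITION-REF[@DEST='ECUC-BOOLEAN-PARAM-DEF']")
            if ref is not None and term in (ref.text or ''):
                parent.remove(child)
                removed += 1
    return removed


root = ET.fromstring(xml_src)
count = remove_containers(root, 'ComIPduCancellationSupport')
print(count, 'container(s) removed')
print(ET.tostring(root, encoding='unicode'))
```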
Parfait
  • Thanks for the answer. But I am getting an error. I tried to install the file lxml‑4.0.0‑cp35‑cp35m‑win_amd64.whl from the link you mentioned and got the following message in the command prompt: Requirement already satisfied: lxml==4.0.0 from file */lxml-4.0.0-cp35-cp35m-win_amd64.whl in c:*\python\python35\lib\site-packages. When I run the Python script I get the following error: AttributeError: module 'xml.etree.ElementTree' has no attribute 'XSLT' – Gopala Krishna Sep 27 '17 at 18:46
  • You did not import the correct etree. Remove the built-in module `import xml.etree.ElementTree` and only use lxml: `import lxml.etree as et`. – Parfait Sep 27 '17 at 18:49
  • That was the error. It is working now. Thanks a lot. But I tried your solution on another input file, with some changes to your script, and the output file does not change at all. Can you please have a look? Here's the link to the input file and script: https://drive.google.com/open?id=0Bxt5bddXF4ctYUxKWDh5amRmOTA – Gopala Krishna Sep 27 '17 at 20:14
  • Unlike this simplified example, the other XML has a [default namespace](https://www.w3.org/TR/1999/REC-xml-names-19990114/#defaulting), where the root has an *xmlns* without a colon-separated name. This changes the setup somewhat. I often advise posters to always include a good example of their XML with *all* namespaces. – Parfait Sep 27 '17 at 22:44
  • I deleted the container that has that xmlns and ran the script, but there is still no change in the output. – Gopala Krishna Sep 28 '17 at 17:14
  • See update. Your XPath was not accurate since the text you search for is not a direct child of `INTEGER-TYPE` as your simple XML assumed. Use `descendant::*`. And there is no need to delete the root. The XSLT now defines the default namespace with a prefix, *doc* (entirely chosen by me), in its root. Note: the source XML is perfectly valid. – Parfait Sep 28 '17 at 17:45
  • Yeah man, this time it worked. Thanks a lot, buddy. Thank you, sir. – Gopala Krishna Sep 28 '17 at 19:14
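The namespace point from the comments can be sketched as follows. When the source XML declares a default namespace, unprefixed element names in an XSLT match pattern refer to elements in no namespace and so match nothing; binding the same URI to a prefix in the stylesheet and using that prefix on every element step is one way around it. The sample document here is hypothetical: it reuses element names from the output above and assumes the namespace URI `http://autosar.org/3.0.2` from the stylesheet; the real input file may differ.

```python
import lxml.etree as et

NS = 'http://autosar.org/3.0.2'  # assumed URI, taken from the stylesheet above

# Hypothetical input: default namespace declared on the root element
xml_src = b'''<ELEMENTS xmlns="http://autosar.org/3.0.2">
  <INTEGER-TYPE>
    <SHORT-NAME>EngineSpeed_T</SHORT-NAME>
    <SW-DATA-DEF-PROPS>
      <COMPU-METHOD-REF DEST="COMPU-METHOD">/DataType/DataTypeSemantics/EngineSpeed_T</COMPU-METHOD-REF>
    </SW-DATA-DEF-PROPS>
  </INTEGER-TYPE>
  <INTEGER-TYPE>
    <SHORT-NAME>CoolantTemp_T</SHORT-NAME>
    <SW-DATA-DEF-PROPS>
      <COMPU-METHOD-REF DEST="COMPU-METHOD">/DataType/DataTypeSemantics/CoolantTemp_T</COMPU-METHOD-REF>
    </SW-DATA-DEF-PROPS>
  </INTEGER-TYPE>
</ELEMENTS>'''

# The prefix "doc" is bound to the same URI and used on every element step.
# Attributes such as DEST carry no namespace, so they stay unprefixed.
xsl_src = b'''<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:doc="http://autosar.org/3.0.2">
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="doc:INTEGER-TYPE[descendant::doc:COMPU-METHOD-REF[
      @DEST='COMPU-METHOD' and contains(., 'CoolantTemp_T')]]"/>
</xsl:stylesheet>'''

transform = et.XSLT(et.fromstring(xsl_src))
result = transform(et.fromstring(xml_src))

# Names of the INTEGER-TYPE elements that are still present
names = result.getroot().xpath('doc:INTEGER-TYPE/doc:SHORT-NAME/text()',
                               namespaces={'doc': NS})
print(names)
```

The prefix itself is arbitrary; only the URI has to match the one declared in the source document.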