1

I am calling Xero's API and then, using Get data from XML step. How can I extract Depreciation Expense - 218.8? I've tried /Rows/Row/Cells/Cell/Attributes/. and Rows/Row/Cells/Cell/Value - among other options but they didn't work. And another question, if I have multiple accounts and I need to extract exactly 'Depreciation Expense', I've tried playing with [] to extract Nth element but somehow it didn't work. Is it Pentaho specifics?

          <RowType>Section</RowType>
          <Title>Less Operating Expenses</Title>
          <Rows>
            <Row>
              <RowType>Row</RowType>
              <Cells>
                <Cell>
                  <Value>Depreciation Expense</Value>
                  <Attributes>
                    <Attribute>
                      <Value>f14d778f842543feafca2fdcf0437cf7</Value>
                      <Id>account</Id>
                    </Attribute>
                    <Attribute>
                      <Value>f14d778f842543feafca2fdcf0437cf7</Value>
                      <Id>groupID</Id>
                    </Attribute>
                  </Attributes>
                </Cell>
                <Cell>
                  <Value>218.16</Value>
                  <Attributes>
                    <Attribute>
                      <Value>f14d778f842543feafca2fdcf0437cf7</Value>
                      <Id>account</Id>
                    </Attribute>
                    <Attribute>
                      <Value>f14d778f842543feafca2fdcf0437cf7</Value>
                      <Id>groupID</Id>
                    </Attribute>
                  </Attributes>
                </Cell>
              </Cells>
            </Row>
eponkratova
  • 467
  • 7
  • 20

1 Answers1

0

With complex XML structures like this one, it's often best to use nested Get Data from XML steps in Pentaho.

In your sample (which misses a root element and closing /rows btw) it looks like the XML represents an Excel-like "rows with cells" structure. The cells likely belong to a column depending on their order. For this answer, I'll assume this order is indeed fixed in the XML and there are no missing cells. Verify that!

The first XML step should extract each XML "row" into a Pentaho row and give back the XML node, not just a value. For that, you can use the Loop XPath setting /YourRoot/Rows/Row and get a field with XPath "Cells" and Result type "Single node". Including a rownum field might be nice, select that option if you need it.

The second XML step can then use the output field from the first, extracting from Loop XPath /Cells/Cell and getting all the fields you need using the Get Fields button.

Once you have the fields, use a Select Values step to remove the original XML fields, then use a Row Flattener (only works for fixed Cell order).

Screenshot of a sample transformation for this

Cyrus
  • 2,135
  • 2
  • 11
  • 14