1

I have the following in a SQL Server varchar column (so, have to CAST AS XML I guess):

<List>
    <ObjectAttribute displayName="Unique Identity Number" editMode="ReadOnly" name="Uin" namedColumn="true" type="string"/>
    <ObjectAttribute displayName="Enterprise Account Name" editMode="ReadOnly" name="EnterpriseAccountName" namedColumn="true" type="string">
        <AttributeSource name="enterpriseAccountName">
            <ApplicationRef>
                <Reference class="sailpoint.object.Application" name="Active Directory"/>
            </ApplicationRef>
        </AttributeSource>
    </ObjectAttribute>
    <ObjectAttribute displayName="Termination Date" editMode="ReadOnly" name="TerminationDate" namedColumn="true" type="string">
        <AttributeSource name="TERMINATION_DATE">
            <ApplicationRef>
                <Reference class="sailpoint.object.Application" name="HR APPLICATION 1"/>
            </ApplicationRef>
        </AttributeSource>
        <AttributeSource name="TerminationDate">
            <ApplicationRef>
                <Reference class="sailpoint.object.Application" name="HR APPLICATION 2"/>
            </ApplicationRef>
        </AttributeSource>
    </ObjectAttribute>
    <ObjectAttribute displayName="Country" editMode="ReadOnly" name="Country" namedColumn="true" type="string">
        <AttributeSource name="AppRule: HR APPLICATION 1 Mapping Rule">
            <ApplicationRef>
                <Reference class="sailpoint.object.Application" name="HR APPLICATION 1"/>
            </ApplicationRef>
            <RuleRef>
                <Reference class="sailpoint.object.Rule" name="HR APPLICATION 1 Mapping Rule"/>
            </RuleRef>
        </AttributeSource>
        <AttributeSource name="CountryTwoLetter">
            <ApplicationRef>
                <Reference class="sailpoint.object.Application" name="HR APPLICATION 2"/>
            </ApplicationRef>
        </AttributeSource>
        <AttributeTargets>
            <AttributeTarget name="c" provisionAllAccounts="true">
                <ApplicationRef>
                    <Reference class="sailpoint.object.Application" name="Active Directory"/>
                </ApplicationRef>
            </AttributeTarget>
            <AttributeTarget name="country" provisionAllAccounts="true">
                <ApplicationRef>
                    <Reference class="sailpoint.object.Application" name="Destination App 1"/>
                </ApplicationRef>
            </AttributeTarget>
        </AttributeTargets>
    </ObjectAttribute>
    <ObjectAttribute displayName="Email" editMode="ReadOnly" name="email" standard="true" type="string">
        <AttributeSource name="AppRule: HR APPLICATION 1 Mapping Rule">
            <ApplicationRef>
                <Reference class="sailpoint.object.Application" name="HR APPLICATION 1"/>
            </ApplicationRef>
            <RuleRef>
                <Reference class="sailpoint.object.Rule" name="HR APPLICATION 1 Mapping Rule"/>
            </RuleRef>
        </AttributeSource>
        <AttributeSource name="AppRule: HR APPLICATION 2 Mapping Rule">
            <ApplicationRef>
                <Reference class="sailpoint.object.Application" name="HR APPLICATION 2"/>
            </ApplicationRef>
            <RuleRef>
                <Reference class="sailpoint.object.Rule" name="HR APPLICATION 2 Mapping Rule"/>
            </RuleRef>
        </AttributeSource>
        <AttributeTargets>
            <AttributeTarget name="email" provisionAllAccounts="true">
                <ApplicationRef>
                    <Reference class="sailpoint.object.Application" name="Destination App 1"/>
                </ApplicationRef>
            </AttributeTarget>
            <AttributeTarget name="Email" provisionAllAccounts="true">
                <ApplicationRef>
                    <Reference class="sailpoint.object.Application" name="Destination App 2"/>
                </ApplicationRef>
            </AttributeTarget>
            <AttributeTarget name="Email" provisionAllAccounts="true">
                <ApplicationRef>
                    <Reference class="sailpoint.object.Application" name="HR APPLICATION 2"/>
                </ApplicationRef>
            </AttributeTarget>
            <AttributeTarget name="EmailEffectiveDate" provisionAllAccounts="true">
                <ApplicationRef>
                    <Reference class="sailpoint.object.Application" name="HR APPLICATION 2"/>
                </ApplicationRef>
                <RuleRef>
                    <Reference class="sailpoint.object.Rule" name="HR APPLICATION 2 EmailEffectiveDate TargetTransformation Rule"/>
                </RuleRef>
            </AttributeTarget>
        </AttributeTargets>
    </ObjectAttribute>
</List>

Noteworthy: complex, inconsistent, variable number and type of elements in subordinate nodes.

What I would like to do is extract this into a table/dataset, in the following form:

|-------------|----------|------|-------------|------|----------------------|------------------------------------|-----------------------------------------------------|
| Displayname | editMode | name | namedColumn | type | SourceName           | Source App(s)                      | Target App(s)                                       |
|-------------|----------|------|-------------|------|----------------------|------------------------------------|-----------------------------------------------------|
|             |          |      |             |      |(AttributeSource.name)|(AttributeSource.ApplicationRef.name|AttributeTargets.AttributeTarget.name                |
|             |          |      |             |      |                      | AttributeSource.RuleRef.name       |AttributeTargets.AttributeTarget.ApplicationRef.name |
|             |          |      |             |      |                      |                                    |AttributeTargets.AttributeTarget.RuleRef.name        |
|-------------|----------|------|-------------|------|----------------------|------------------------------------|-----------------------------------------------------|

Basically, one row per ObjectAttribute, and then in the rightmost two columns ( Source App(s), Target App(s) ), "subrows" within the cells for each of the elements within the respective node.

My intuition is that this is extremely difficult, particularly in SQL server, so I have also included python/pandas in the tags (if anyone answers it from a python perspective, feel free to skip the database access part).

Is this possible?

Charlieface
  • 52,284
  • 6
  • 19
  • 43
tbone
  • 5,715
  • 20
  • 87
  • 134

1 Answers1

1

You can use XQuery .nodes in SQL to shred the nodes into separate rows. You use then use .value to get each value that you want.

Since you want to combine values into a single value with newlines, it's slightly more complicated, requiring the use of concat .query and .value, as well as a for iteration over the nodes you want to query

SELECT
  Displayname = x.attr.value('@displayName', 'varchar(100)'),
  editMode = x.attr.value('@editMode', 'varchar(100)'),
  name = x.attr.value('@name', 'varchar(100)'),
  namedColumn = '???',
  type = x.attr.value('@type', 'varchar(100)'),
  SourceName = x.attr.query('for $i in AttributeSource/@name return concat(./@name, "&#xa;")').value('text()[1]','nvarchar(1000)'),
  [Source App(s)] = x.attr.query('for $i in AttributeSource/*/Reference/@name return concat(@name, "&#xa;")').value('text()[1]','nvarchar(max)'),
  [Target App(s)] = x.attr.query('for $i in
      ( for $t in AttributeTargets/AttributeTarget
        return ($t/@name, $t/*/Reference/@name)
      )
      return concat(./@name, "&#xa;")
      ').value('text()[1]','nvarchar(max)')
FROM @xml.nodes('List/ObjectAttribute') x(attr);

db<>fiddle

The final one is the most complicated. Steps are as follows:

  • For each ObjectAttribute node that we pulled out using .nodes...
  • Take all AttributeTargets nodes and retrieve their /AttributeTarget child nodes.
  • Iterate over those in $t, returning for each the @name attribute, as well as descending to any child nodes -> Reference and returning also those @name attributes.
  • Iterating over those results in $i, concat it with a new line
  • Return all of that as XML using .query then convert it to nvarchar using .value.

If you didn't need the newlines, you could for example just do

SourceName = x.attr.query('AttributeSource/@name').value('text()[1]','nvarchar(1000)'),
Charlieface
  • 52,284
  • 6
  • 19
  • 43
  • Wow this looks great.....I have it about half working but will have to take another shot at it later today....the parts you note are indeed tricky. – tbone Jul 13 '22 at 17:07
  • 1
    If you wanted to return each of those values as separate rows, rather than a single value with newlines then it would be much easier. – Charlieface Jul 13 '22 at 19:35
  • that would be PERFECTLY adequate for my purposes, I haven't had time to take another shot at this but if you could show me how to do that I would be thrilled! – tbone Jul 16 '22 at 03:38
  • is it possible to ping people on here?? Apparently not, seems like they now strip usernames. :( – tbone Jul 16 '22 at 19:57
  • 1
    Yes you can, you just ping someone like this @tbone . To get it as separate rows you `CROSS APPLY` each `.nodes` call from the previous, in order to further shred the XML, see eg https://dbfiddle.uk/?rdbms=sqlserver_2019&fiddle=b98cf7eb0640ebf151a27320b3962fbf It's not necessary to do multiple `.nodes` if you only want to access one level of XML, only if you need data from multiple levels at once. In your case it's hard to know what you want, as you have multiple `ApplicationSource` and multiple `ApplicationTarget` nodes and it's not clear how you want to correlate them – Charlieface Jul 16 '22 at 23:53
  • Thanks for that update @Charlieface....agree, how to present this is problematic, I think your latest example shows how to do it for /AttributeTargets, taking that and doing a *seperate* query for /AttributeSource (and then I could join or union the two) I think would achieve the goal. I am going to take a look later today if I have time, but CROSS APPLY is getting beyond my knowledge, let alone the other more complicated things you're doing! – tbone Jul 18 '22 at 19:07
  • 1
    Yes sounds like you should do a separate query for `AttributeTargets`. Using `CROSS APPLY` is quite straightforward: basically you just do `CROSS APPLY previousLevelOfXML.nodes( 'newLevel/descendantWhatever') AS x2(newLevel)` or similar and then you can access nodes at both levels at the same time. (Do not put `/` at the beginning of the XPath or it will jump back to the root.) – Charlieface Jul 18 '22 at 19:16