Using the following XML, can anyone tell me how, in Groovy (GPath or XPath), I can select on the innermost elements (job_information) and also keep a reference back to the correct parent element?
<CompoundEmployee>
<person>
<person_id_external>21554</person_id_external>
<employment_information>
<start_date>2014-02-27</start_date>
<job_information><end_date>2013-04-21</end_date><event>H</event><start_date>2012-09-28</start_date></job_information>
<job_information><end_date>2013-04-26</end_date><event>5</event><start_date>2013-04-22</start_date></job_information>
<job_information><end_date>9999-12-31</end_date><event>R</event><start_date>2014-02-27</start_date></job_information>
</employment_information>
</person>
<person>
<person_id_external>8265</person_id_external>
<employment_information>
<start_date>2000-10-02</start_date>
<job_information><end_date>2014-10-24</end_date><event>5</event><start_date>2014-05-22</start_date></job_information>
<job_information><end_date>2014-05-21</end_date><event>H</event><start_date>2000-10-02</start_date></job_information>
<job_information><end_date>9999-12-31</end_date><event>5</event><start_date>2014-10-25</start_date></job_information>
</employment_information>
</person>
<execution_timestamp>2015-05-05T08:17:51.000Z</execution_timestamp>
<version_id>1502P0</version_id>
</CompoundEmployee>
The select statement written in English is:
"Start date of the job_information record is earlier than the employment_information start date AND the job_information event type is Hire (H) or Rehire (R)"
Each result must include person_id_external from the enclosing person element along with start_date from job_information.
So far I have tried:
def xml = """ xml from above """
def list = new XmlSlurper().parseText(xml)
def x = list.'**'.findAll { person ->
    person.event.text() in ['H','R'] && person.start_date.text() < list.person.employment_information.start_date.text()
}
x.each { l -> println "Type -> ${l.event}, Start Date -> ${l.start_date}, End Date -> ${l.end_date}" }
This works when there is only one person in the input file. When there are multiple employees the results are incorrect, because "list.person.employment_information.start_date" is evaluated from the document root rather than from the person that owns the job_information record, i.e. the parent and child nodes are not related.
With the XML above, the output is:
Type -> H, Start Date -> 2012-09-28, End Date -> 2013-04-21
Type -> R, Start Date -> 2014-02-27, End Date -> 9999-12-31
Type -> H, Start Date -> 2000-10-02, End Date -> 2014-05-21
when in fact it should return only one row:
Type -> H, Start Date -> 2012-09-28, End Date -> 2013-04-21
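A plausible explanation for the extra rows, shown on a cut-down copy of the XML: a GPath expression that starts at the root matches the start_date of every person, and text() on a multi-node result joins the values together, so every job start date ends up being compared against one combined string. The names sample, root and allStartDates are made up for this sketch, and the import assumes Groovy 3 or later.

```groovy
import groovy.xml.XmlSlurper  // assumption: Groovy 3+; on Groovy 2.x drop this import

// Cut-down copy of the input: two persons, each with its own employment start date
def sample = '''
<CompoundEmployee>
  <person>
    <employment_information><start_date>2014-02-27</start_date></employment_information>
  </person>
  <person>
    <employment_information><start_date>2000-10-02</start_date></employment_information>
  </person>
</CompoundEmployee>
'''
def root = new XmlSlurper().parseText(sample)

// Evaluated from the root, the path matches the start_date of BOTH persons
def allStartDates = root.person.employment_information.start_date
println allStartDates.size()   // number of matched nodes
println allStartDates.text()   // text of all matched nodes, joined into one string
```

With two persons this prints 2 and then 2014-02-272000-10-02, which would explain why all three H/R rows pass the "less than" test.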
As you can see, I am nearly there, but I can't work out how to reference and return the logically correct parent employment_information record.
Any ideas anyone?
Thanks, Greg
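One possible approach, offered as a sketch rather than a definitive answer: instead of searching the whole tree with '**', walk the document person by person, so that each job_information record is only ever compared with the start_date of the employment_information that contains it. The names root, rows, employmentStart and the map keys are invented for illustration. The sketch assumes dates are always in yyyy-MM-dd form (so plain string comparison orders them correctly) and that the import matches Groovy 3 or later.

```groovy
import groovy.xml.XmlSlurper  // assumption: Groovy 3+; on Groovy 2.x drop this import

def xml = '''
<CompoundEmployee>
<person>
<person_id_external>21554</person_id_external>
<employment_information>
<start_date>2014-02-27</start_date>
<job_information><end_date>2013-04-21</end_date><event>H</event><start_date>2012-09-28</start_date></job_information>
<job_information><end_date>2013-04-26</end_date><event>5</event><start_date>2013-04-22</start_date></job_information>
<job_information><end_date>9999-12-31</end_date><event>R</event><start_date>2014-02-27</start_date></job_information>
</employment_information>
</person>
<person>
<person_id_external>8265</person_id_external>
<employment_information>
<start_date>2000-10-02</start_date>
<job_information><end_date>2014-10-24</end_date><event>5</event><start_date>2014-05-22</start_date></job_information>
<job_information><end_date>2014-05-21</end_date><event>H</event><start_date>2000-10-02</start_date></job_information>
<job_information><end_date>9999-12-31</end_date><event>5</event><start_date>2014-10-25</start_date></job_information>
</employment_information>
</person>
</CompoundEmployee>
'''
def root = new XmlSlurper().parseText(xml)

def rows = []
root.person.each { person ->
    person.employment_information.each { employment ->
        // start_date that is a direct child of THIS employment_information only
        def employmentStart = employment.start_date.text()
        employment.job_information.findAll { job ->
            // Hire or Rehire, and strictly earlier than the owning employment start date
            job.event.text() in ['H', 'R'] && job.start_date.text() < employmentStart
        }.each { job ->
            rows << [personId : person.person_id_external.text(),
                     event    : job.event.text(),
                     startDate: job.start_date.text(),
                     endDate  : job.end_date.text()]
        }
    }
}

rows.each { r ->
    println "Person -> ${r.personId}, Type -> ${r.event}, Start Date -> ${r.startDate}, End Date -> ${r.endDate}"
}
```

Because the outer closures keep person and employment in scope, the person_id_external can be emitted alongside each matching job_information record without navigating back up the tree. On the sample data this yields a single row, for person 21554 with start date 2012-09-28.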