I have a list of about 950 integers in a CSV file and an XML file containing deeply nested data (each entry contains several nested elements). Each integer i in the CSV file corresponds to the entry in the XML file whose key is i, that is, <entry><key>i</key>. I would like to extract a pre-specified set of columns from the XML file for each i listed in the CSV file.
Here is an example of a set of extraction 'columns', for lack of a better word (targets are surrounded by double asterisks):
<entry>
  <key>55</key>
  <cd language="**en**">
    <title>**Ride The Lightning**</title>
    <band>Metallica</band>
  </cd>
  <tabbook language="**en**">
    <title>**Ride The Lightning Tab**</title>
    <author>Who J. Ever</author>
  </tabbook>
</entry>
Should I just load the CSV file's values into a variable in a script, or is there an existing and better way to do this?
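For reference, here is a minimal sketch of what the "load the CSV values into a script" approach could look like in Python, using only the standard library. The sample data, the root element name catalog, the output field names, and the helper names load_keys and extract_rows are all assumptions made for illustration; the real files would be opened from disk instead of from strings.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical stand-ins for the real CSV and XML files.
CSV_TEXT = "55\n56\n"
XML_TEXT = """<catalog>
  <entry>
    <key>55</key>
    <cd language="en">
      <title>Ride The Lightning</title>
      <band>Metallica</band>
    </cd>
    <tabbook language="en">
      <title>Ride The Lightning Tab</title>
      <author>Who J. Ever</author>
    </tabbook>
  </entry>
  <entry>
    <key>99</key>
    <cd language="en"><title>Other</title><band>Someone</band></cd>
  </entry>
</catalog>"""


def load_keys(csv_file):
    """Collect every non-empty CSV cell as a string key."""
    return {
        cell.strip()
        for row in csv.reader(csv_file)
        for cell in row
        if cell.strip()
    }


def extract_rows(xml_file, keys):
    """Yield one dict of target fields per <entry> whose <key> is in keys."""
    root = ET.parse(xml_file).getroot()
    for entry in root.iter("entry"):
        key = (entry.findtext("key") or "").strip()
        if key not in keys:
            continue
        cd = entry.find("cd")
        tab = entry.find("tabbook")
        # Missing elements produce None instead of raising.
        yield {
            "key": key,
            "cd_language": cd.get("language") if cd is not None else None,
            "cd_title": cd.findtext("title") if cd is not None else None,
            "tabbook_language": tab.get("language") if tab is not None else None,
            "tabbook_title": tab.findtext("title") if tab is not None else None,
        }


keys = load_keys(io.StringIO(CSV_TEXT))
rows = list(extract_rows(io.StringIO(XML_TEXT), keys))
for row in rows:
    print(row)
```

Keeping the keys in a set makes each membership test cheap, so the XML is walked only once regardless of how many integers the CSV holds.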
Edit:
Presently I'm trying to use BaseX. As a starter query, I'm trying:
for $e in collection("catalog")//entry where //entry/cd/title contains text "lightning" return //title
which I take to mean (or rather hope means): for each entry element anywhere in the collection named "catalog", if that entry's cd child has a title child containing the text "lightning", return the full title.
Damn, that's confusing... I have been told to use concat() rather than return. The query seems to be incorrect. I'll continue to study and post again when I've come up with the proper syntax.
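To pin down what I want the starter query to do, here is the same intent written as a small Python sketch. It is not a translation of the XQuery: it uses a plain case-insensitive substring test, which only approximates contains text, and the sample data, the root element name catalog, and the function name cd_titles_containing are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; the real data lives in the BaseX collection.
XML_TEXT = """<catalog>
  <entry>
    <key>55</key>
    <cd language="en">
      <title>Ride The Lightning</title>
      <band>Metallica</band>
    </cd>
    <tabbook language="en">
      <title>Ride The Lightning Tab</title>
      <author>Who J. Ever</author>
    </tabbook>
  </entry>
  <entry>
    <key>56</key>
    <cd language="en">
      <title>Master Of Puppets</title>
      <band>Metallica</band>
    </cd>
  </entry>
</catalog>"""


def cd_titles_containing(root, needle):
    """Return the cd titles, per entry, that contain needle (case-insensitive)."""
    needle = needle.lower()
    titles = []
    for entry in root.iter("entry"):
        # Only look at this entry's own cd/title children, not the whole document.
        for title in entry.findall("cd/title"):
            text = title.text or ""
            if needle in text.lower():
                titles.append(text)
    return titles


root = ET.fromstring(XML_TEXT)
matches = cd_titles_containing(root, "lightning")
print(matches)
```

The point of the sketch is that both the test and the returned title are taken relative to the current entry, and the tabbook title is left out because only cd/title is inspected.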