1

Here's an example of code from the database I'm using:

<mondial>
<mountain id="mount-Sajama" country="BOL" type="volcano">
<name>Sajama</name>
<mountains>Andes</mountains>
<located country="BOL" province="prov-BOL-2"/>
<elevation>6542</elevation>
<longitude>-68.9</longitude>
<latitude>-18.1</latitude>
</mountain>
<mountain id="mount-Licancabur" country="BOL RCH" type="volcano">
<name>Licancabur</name>
<mountains>Andes</mountains>
<located country="BOL" province="prov-BOL-3"/>
<located country="RCH" province="prov-Chile-2"/>
<elevation>5920</elevation>
<longitude>-67.9</longitude>
<latitude>-22.8</latitude>
</mountain>
<country car_code="N" area="324220" capital="cty-Norway-Oslo" memberships="org-AfDB org-AsDB org-BIS org-CE org-CBSS org-CCC org-ECE org-EBRD org-EFTA org-CERN org-ESA org-FAO org-IADB org-IAEA org-IBRD org-ICC org-ICAO org-ICFTU org-Interpol org-IDA org-IEA org-IFRCS org-IFC org-IFAD org-ILO org-IMO org-Inmarsat org-IMF org-IOC org-IOM org-ISO org-ICRM org-ITU org-Intelsat org-MTCR org-NAM org-NC org-NIB org-ANC org-NATO org-EN org-NSG org-OECD org-OSCE org-PCA org-UN org-UNAVEM-III org-UNCRO org-UNESCO org-UNIDO org-UNITAR org-UNIFIL org-MINURSO org-UNHCR org-UNPREDEP org-UNPROFOR org-UNTSO org-UPU org-WEU org-WHO org-WIPO org-WMO org-WTrO org-ZC">
<name>Norway</name>
<population>4383807</population>
<population_growth>0.48</population_growth>
<infant_mortality>4.9</infant_mortality>
<gdp_total>106200</gdp_total>
<gdp_agri>2.9</gdp_agri>
<gdp_ind>34.7</gdp_ind>
<gdp_serv>62.4</gdp_serv>
<inflation>2.5</inflation>
<indep_date from="S">1905-10-26</indep_date>
<government>constitutional monarchy</government>
<encompassed continent="europe" percentage="100"/>
<ethnicgroup percentage="82.5">Norwegian</ethnicgroup>
<ethnicgroup percentage="1.5">Sami</ethnicgroup>
<religion percentage="86.7">Protestant</religion>
<religion percentage="1">Roman Catholic</religion>
<religion percentage="1.8">Muslim</religion>
<language percentage="99">Norwegian</language>
<border country="R" length="167"/>
<border country="SF" length="729"/>
<border country="S" length="1619"/>
<province id="lteil-OS-N" capital="cty-Norway-Oslo" country="N">
<name>Oslo</name>
<population>449337</population>
<city id="cty-Norway-Oslo" is_country_cap="yes" is_state_cap="yes" country="N" province="lteil-OS-N">
<name>Oslo</name>
<longitude>10.7333</longitude>
<latitude>59.9333</latitude>
<population year="87">449337</population>
<located_at watertype="sea" sea="sea-Skagerrak"/>
</city>
</province>
<province id="lteil-AK-N" capital="cty-Norway-Oslo" country="N">
<name>Akershus</name>
<population>393217</population>
</province>
</mondial>

You can find the full XML file here: https://www.kth.se/social/files/54f4817bf27654358032133f/mondial.xml

I got myself a bit of a problem because in my database there exists country codes (car_codes in Country and country in Mountain) where multiple countries can be written in the same country code for a mountain as you can see an example of in mount-Licancabur.

So now I want to select the highest mountain for each continent so I wrote the following code:

let $mondial := db:open('mondial')
for $country in $mondial/mondial/country/encompassed/@continent
for $mountains in $mondial/mondial/mountain
where contains($country/../../@car_code, $mountains/@country)

(: Tokenize ev för att lösa berg i två länder :)
let $elevation := $mountains/elevation
group by $country

return ($country, max($elevation))

This gives the output:

europe
5642
asia
8167
africa
5895
australia
5775
america
6962

These can seem correct at first but Mount Everest should be asias biggest mountain with a height of 8848 meters.

So my question is to you, how can I also compare to the country codes with two country codes in their car_code/country? I have been reading up a bit about tokenize but haven't been able to make it work so far.

Hope someone out there could help me, been stuck on this for a while :)

Dreamus
  • 45
  • 1
  • 9

1 Answers1

1

From reading your question and looking at the xml it looks like you are trying to address the issue of mount elements that looks like this:

<mountain id="mount-Matterhorn" country="CH I">
 <name>Matterhorn</name>
 <mountains>Alps</mountains>
 <located country="CH" province="prov-Switzerland-24"/>
 <located country="I" province="prov-Italy-9"/>
 <elevation>4478</elevation>
 <longitude>7.56</longitude>
 <latitude>46</latitude>
</mountain>

Where the @country has more than one country in it.

you can do this with tokenize by changing the where to this

where $mountains[tokenize(@country, " ") eq $country/../../@car_code]

There are some performance concerns with what you are doing here. Each mountain element would be tokenized for each country. This could be written more performant and I can go over that with you if you like.

update:

Here is another way to do it that will give you the highest elevation in each Continent. With the way that the XML is you have to use tokenize on the mountains to compare the counties but at lest this way it only does it to every mountain for each continent.

for $continent in fn:distinct-values($mondial/mondial/country/encompassed/@continent)
let $counties := $mondial/mondial/country[encompassed/@continent eq $continent]/@car_code
let $highest :=
(
    for $elevation in $mondial/mondial/mountain[fn:tokenize(@country, " ") = $counties]/elevation/text()
    order by xs:int($elevation) descending
    return $elevation
)[1]
return ($continent, $highest)

the output is

europe
7010
asia
8848
africa
5895
australia
4884
america
6962
Tyler Replogle
  • 1,339
  • 7
  • 13
  • That would be really good, I'm quite new to Xquery so all the help I can get I greatly appriciate :) But first I would need a slightly changed version of your solution because as it is now, if I just change the where to your suggestion I get following error: Item expected, sequence found which I'm guessing is because it's trying to compare two substring to the @carcode which Xquery can't handle but I'm not sure about it. – Dreamus Mar 25 '15 at 08:57
  • Which Xquery engine are you using? tokenize splits the first parameter into a sequence based on the 2nd parameter. Xquery is built it handle sequences so @carcode should be able be compared to the output of the tokenize function, maybe the engine doesn't like the "eq" so change that to '='. Some engines are able to turn function mapping on and off and maybe thats what's giving you the error. I updated the answer to have alternative way of getting the highest elevation for each continent. – Tyler Replogle Mar 25 '15 at 11:29
  • I'm using Basex and turns out the problem was really simple so the first solution was valid, thanks a lot for the help. Will invite your solution as soon as I reach 15 rep :) – Dreamus Mar 25 '15 at 12:49