First of all, the best way to optimize queries in eXist is to store the XML locally and with indexes. Please use the built in documentation for how to setup indexes.
However, your code suffers from fetching the same data over and over from the network. Let's take care of that and the other issue, your use of in-memory XML querying, another optimization bottleneck.
The most important first step is to have the XML you're querying in the local database. Queries of nodes in the database are faster and use less memory than queries against in-memory XML nodes. (At least, that was that case with versions up to 2.2.)
So, here is a method to cache the data locally, refreshing the cache after the latest update is more than 5 minutes old.
xquery version "3.0";
declare namespace bus="http://docs.gijon.es/sw/busgijon.asmx";
(: Store the XML data in the collection /db/busgijon/data :)
declare variable $COL := "/db/busgijon/data";
declare variable $INFO-FILE := "busgijoninfo.xml";
declare variable $TR-FILE := "busgijontr.xml";
(: Fetch a page from cache or from web site, updating the cache :)
declare function local:fetchPage($filename) {
(: If the page was fetched more than 5 minutes ago, refresh it :)
let $expire := current-dateTime() - xs:dayTimeDuration('PT5M')
let $page := doc($COL || "/" || $filename)/page
return if (exists($page))
then if ($page/xs:dateTime(@timestamp) ge $expire)
then $page
else (update replace $page/* with doc("http://datos.gijon.es/doc/transporte/" || $filename)/*
, update value $page/@timestamp with current-dateTime()
, $page)
else doc(xmldb:store($COL, $filename, <page timestamp="{current-dateTime()}">{doc("http://datos.gijon.es/doc/transporte/" || $filename)/*}</page>))/page
};
declare function local:getBusesPorLinea($linea)
{
(: Get the two pages from the database cache for querying :)
let $info := local:fetchPage($INFO-FILE)/bus:BusGijonInfo
let $tr := local:fetchPage($TR-FILE)/bus:BusGijonTr
let $numero:=$linea
let $nBuses:=count($tr//bus:llegada[bus:idlinea=$numero])
return
if($nBuses=0)
then(<p>No hay ningun bus en esta linea</p>)
else(
<div>
<h2>Numero de buses funcionando en la linea {$numero} : {$nBuses}</h2>
<table class="table table-hover">
<thead>
<tr>
<th>Parada</th>
<th>Minutos hasta la llegada</th>
</tr>
</thead>
<tbody>
{
(: Loop through the TR page - fetched just once from cache :)
for $l in $tr//bus:llegada[bus:idlinea=$numero]
(: Loop through the Info page - fetched just once from cache :)
for $parada in $info//paradas/bus:parada[bus:idparada=$l/bus:idparada]
return <tr>
<td>{$parada/bus:descripcion}</td>
<td>{$l/bus:minutos}</td></tr>
}
</tbody>
</table>
</div>
)
};
local:getBusesPorLinea(1)
The only part that I changed in the local:getBusesPorLinea function was fetching the two documents at the top from the cache and using those in the embedded loops.
The local:fetchPage function is where most of the speedup occurs. Here's what it does:
- Set expiration time to 5 minutes in the past.
- Try to fetch the specified page from the cache.
- If the page exists, compare the fetched timestamp to the expiration timestamp.
- If the page's timestamp is less than 5 minutes ago (greater than the expiration timestamp), return that page.
- If the page's timestamp is greater than 5 minutes ago, refetch it, updating the contents of the page with the refreshed document, update the page's timestamp, and return the new page.
- If the page does not yet exist, save the page to the specified collection with the current timestamp, returning the page element.
The first person to access this XQuery after 5 minutes have passed will have about an extra 5-10 seconds as the cache gets refreshed. This allows the cache to be passive, so you don't have to manually refresh it every five minutes.
Hope this helps.