
Let's say we have these folders:

not_my_files
    collections
        collection1.xml
        collection2.xml
        collection3.xml
        etc...
my_files
    my_documents
        mydoc1.xml
        mydoc2.xml
        mydoc3.xml
        etc...

This is the structure of the XML files:

collection1.xml (same structure for collection2.xml, collection3.xml, etc...)

<collection xml:id="name_of_collection_1">
    <ref id="id_of_ref_1">
        <title>This is title 1 of first document in this collection</title>
    </ref>
    <ref  id="id_of_ref_2">
        <title>This is title 2 of second document in this collection</title>
    </ref>  
</collection>

mydoc1.xml (same structure for mydoc2.xml, mydoc3.xml, etc...)

<mydoc id="my_doc_id_1">
    <tag1>
        <tag2>
            <reference_tag>
                <my_title>This is title 1 of my documents</my_title>
            </reference_tag>
        </tag2>
    </tag1>
</mydoc>

So: 1) the XML files in the two folders have different structures, and 2) collection1.xml can contain many titles, while mydoc1.xml can contain only one title at a time.

I want to get all titles from both collections/collection1.xml (etc.) AND my_documents/mydoc1.xml (etc.). This is the desired result:

<doc>
    <folder>Not my files</folder>
    <title>This is title 1 of first document in this collection</title>
</doc>
<doc>
    <folder>Not my files</folder>
    <title>This is title 2 of second document in this collection</title>
</doc>
<doc>
    <folder>My files</folder>
    <title>This is title 1 of my documents</title>
</doc>

My current XQuery:

xquery version "3.1";

for $doc_not_my_files in collection("/not_my_files/collections")
   let $folder_not_my_files := "Not my files"

for $ref in $doc_not_my_files//ref
   let $title_not_my_files := $ref/title/text()

for $doc_my_files in collection("/my_files/my_documents")
   let $folder_my_files := "My files"
    let $title_my_files := $doc_my_files//reference_tag/my_title/text()

return
        if ($folder_my_files="My files") 
            then
                <doc>
                    <folder>{$folder_my_files}</folder>
                    <title>{$title_my_files}</title>
                </doc>
        else 
                <doc>
                    <folder>{$folder_not_my_files}</folder>
                    <title>{$title_not_my_files}</title>
                </doc>

My current result:

<doc>
    <folder>My files</folder>
    <title>This is title 1 of my documents</title>
</doc>
<doc>
    <folder>My files</folder>
    <title>This is title 1 of my documents</title>
</doc>
<doc>
    <folder>My files</folder>
    <title>This is title 1 of my documents</title>
</doc>
<doc>
    <folder>My files</folder>
    <title>This is title 1 of my documents</title>
</doc>
**etc... 1000 times** 
<doc>
    <folder>Not my files</folder>
    <title>This is title 1 of first document in this collection</title>
</doc>
<doc>
    <folder>Not my files</folder>
    <title>This is title 1 of first document in this collection</title>
</doc>
<doc>
    <folder>Not my files</folder>
    <title>This is title 1 of first document in this collection</title>
</doc>
**etc... another 1000 times**

So, I am looking for some kind of alternative to SQL's "UNION" in XQuery. I have a feeling this is a very basic question, but I'm new to XQuery, so forgive me :)

ag_1812
  • Does the order of the result matter? Is there any order in the inputs, and are the two different inputs related, e.g. is `mydoc1.xml` associated with `collection1.xml`? – Martin Honnen Oct 21 '20 at 19:05
  • No, the order of the result doesn't matter. And no, unfortunately, there is no possible association between mydoc1.xml and collection1.xml (so it is impossible to make a simple join on some criterion…) – ag_1812 Oct 21 '20 at 19:15

1 Answer

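It might work with `for-each-pair`, which calls the function once for each pair of documents taken from the two collections by position: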

for-each-pair(
  collection("/my_files/my_documents"),
  collection("/not_my_files/collections"),
  function($doc, $col) {
    $doc//reference_tag/my_title/text() !
      <doc>
        <folder>My files</folder>
        <title>{.}</title>
      </doc>,
    $col//ref/title/text() !
      <doc>
        <folder>Not my files</folder>
        <title>{.}</title>
      </doc>
  }
)
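
Since the order of the results does not matter, plain sequence concatenation with the comma operator may be the closest XQuery counterpart to SQL's UNION ALL. The following is only an alternative sketch, assuming the same collection paths and element names as in the question. Unlike `for-each-pair`, which pairs items by position and ignores the surplus items of the longer sequence, it processes each collection on its own. It also avoids chaining several `for` clauses in one FLWOR expression, which behave like nested loops and evaluate `return` once for every combination of documents.

```xquery
xquery version "3.1";

(: Two independent FLWOR expressions joined by the comma operator.
   Each one yields a sequence of <doc> elements; the comma simply
   concatenates the two sequences, keeping any duplicates. :)
(
  for $title in collection("/not_my_files/collections")//ref/title
  return
    <doc>
      <folder>Not my files</folder>
      <title>{ $title/text() }</title>
    </doc>,

  for $title in collection("/my_files/my_documents")//reference_tag/my_title
  return
    <doc>
      <folder>My files</folder>
      <title>{ $title/text() }</title>
    </doc>
)
```

Each branch is a complete query in its own right, so extra fields or a `where` clause can be added to one branch without touching the other.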
Martin Honnen
  • Thanks a lot! It works well! Is it possible to add additional data to this function? Let's say I want to add `id`. It will be `let $id := data($doc//ref/@id)` for `collection1.xml` and `let $id := data($col/mydoc/@id)` for `mydoc1.xml`. Any chance to add this code to your function? Many thanks in advance for any help! – ag_1812 Oct 21 '20 at 19:39
  • Never mind, I understood how to add these `id`s to the function. But I have one more question: if I want to add a `where` clause (to be able to filter or select results), where do I have to put it in your function? Many thanks in advance! – ag_1812 Oct 21 '20 at 19:55
  • @ag_1812, take your time to try to understand the approach and to adapt it to your needs. If new problems arise, please ask a new question with the details as to which condition you need on which items (input items or result items). Anywhere the above uses an expression you can certainly use a FLWOR expression, e.g. instead of `collection("/my_files/my_documents")` you can use `for $colDocs in collection("/my_files/my_documents") where ... return $colDocs`. – Martin Honnen Oct 21 '20 at 20:31
  • Thank you, it works fine for the `where` clause. You are right, each language needs some time to master. I wish I had enough time to learn them all, but right now I have only a few days for XQuery queries, and afterwards I have to move on to another project. Fortunately, Stack Overflow is here to help! Anyway, thanks again for all your help! – ag_1812 Oct 21 '20 at 20:49
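
The suggestion from the comments, replacing an input expression with a FLWOR expression that carries the `where` clause, could look roughly like the sketch below. The filter string "title 1" is a hypothetical condition chosen only for illustration; the paths and element names are those from the question.

```xquery
xquery version "3.1";

for-each-pair(
  (: A FLWOR expression in place of the plain collection() call.
     The predicate in the where clause keeps only documents that have
     a my_title containing the (hypothetical) string "title 1". :)
  (
    for $colDocs in collection("/my_files/my_documents")
    where $colDocs//reference_tag/my_title[contains(., "title 1")]
    return $colDocs
  ),
  collection("/not_my_files/collections"),
  function($doc, $col) {
    $doc//reference_tag/my_title/text() !
      <doc>
        <folder>My files</folder>
        <title>{.}</title>
      </doc>,
    $col//ref/title/text() !
      <doc>
        <folder>Not my files</folder>
        <title>{.}</title>
      </doc>
  }
)
```

Note that filtering one input shortens that sequence, and because `for-each-pair` pairs items by position, surplus items in the other sequence are then ignored.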