I have some noisy XML input data that I would like to cast to xs:gYear, since all of the values are dates.

let $dates := 
<date>
    <a>-1234</a>
    <b/>    
    <c>1911</c> 
    <d>786</d>
    <e>-90</e>
    <f>0</f>
    <g>0302</g>
    <h>-0987</h>
</date>

First I thought: let's use cast as:

for $n in $dates/*
return if ($n castable as xs:gYear) then ($n cast as xs:gYear) 
else ("boo")

which returns only the values that are already valid as xs:gYear, not quite what I wanted. So I wrote a function to pad the values:

declare function local:isodate ($string as xs:string)  as xs:string* {
        if (empty($string)) then ()
        else if (starts-with($string, "-")) then (concat('-',(concat (string-join((for $i in (string-length(substring($string,2)) to 3) return '0'),'') , substring($string,2)))))
        else (concat (string-join((for $i in (string-length($string) to 3) return '0'),'') , $string))
    };
    for $s in ("-1234", "", "1911", "786", "-90", "0", "0302", "-0987")
    return local:isodate($s)

This works except for the year '0'. How do I get that to return "", since 0000 is not a valid year either? The data contains historical dates, but none of it uses the Julian calendar or any other format containing a year 0.

Or was my first idea on the right track, and should cast as actually convert e.g. 123 into 0123?

duncdrum

2 Answers

XSD 1.0 says that there is no year zero; XSD 1.1 falls into line with ISO 8601 and says that there is. This follows the convention used by astronomers rather than the convention used by historians: see https://en.wikipedia.org/wiki/0_(year) for background.

For XQuery it's implementation-defined whether the XSD 1.0 or XSD 1.1 rules are used. I don't know which one eXist-DB follows.
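
If you are unsure which rules your processor applies, a quick probe along these lines may help. This is only a sketch: the result strings are illustrative, and it assumes the processor reports an invalid lexical value through `castable` instead of failing in some other way.

```xquery
(: Probe: does this processor accept year zero as an xs:gYear? :)
if ("0000" castable as xs:gYear)
then "year 0000 accepted (XSD 1.1 / ISO 8601 rules)"
else "year 0000 rejected (XSD 1.0 rules)"
```

Running it in both eXist-DB and Saxon (with and without XSD 1.1 enabled) should show which convention each configuration follows.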

Michael Kay
  • Mike, I am not sure it was intentionally thought about in eXist: Trying to construct an xs:gYear("0000") causes an err:FORG0001, but that is really just catching an exception when we try and construct an org.apache.xerces.jaxp.datatype.XMLGregorianCalendarImpl from the string value. So I guess we are XSD 1.0. I note that Saxon-PE 9.5.1.3 also throws an err:FORG0001, so perhaps we are okay? – adamretter Sep 15 '14 at 22:25
  • @Michael, that actually cleared up some of my confusion when going through the XSD and W3C/ISO specs trying to figure out whether 0000 should be valid or not. As far as I can tell, TEI points to ISO 8601 but doesn't like 0000. Kind of ironic, come to think of TEI's agenda :) – duncdrum Sep 15 '14 at 22:48
  • With Saxon, it depends on whether you enable XSD 1.1 support or not. – Michael Kay Sep 16 '14 at 16:50

How about something like this?

declare function local:as-year($year as xs:string) as xs:gYear? {
    let $y := number($year)
    return
        if($y lt 0)then
            concat("-", substring(string(10000 + $y * -1), 2)) cast as xs:gYear
        else if($y gt 0)then
            substring(string(10000 + $y), 2) cast as xs:gYear
        else()
};


let $dates := 
    <date>
        <a>-1234</a>
        <b/>    
        <c>1911</c> 
        <d>786</d>
        <e>-90</e>
        <f>0</f>
        <g>0302</g>
        <h>-0987</h>
    </date>
return
    for $n in $dates/*
    return   
        local:as-year($n)
adamretter
  • Thanks @adam, works like a charm. What, if any, is the difference between number($year) and xs:integer($year)? – duncdrum Sep 15 '14 at 18:05
  • Well I should at least give some credit to @michael-kay here as the padding code is actually an adaption of some code he posted as part of some code-golf a while ago when I was trying to pad the month aspect of a date: https://gist.github.com/adamretter/11361643. – adamretter Sep 15 '14 at 22:27
  • Regards `fn:number` vs `xs:integer`: fn:number returns an *xs:double* as opposed to an integer. – adamretter Sep 15 '14 at 22:28
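
To make the `fn:number` versus `xs:integer` difference concrete, here is a minimal sketch. The sample inputs are made up for illustration. The point is that `number()` never raises an error: input it cannot parse becomes NaN, and since both `NaN lt 0` and `NaN gt 0` are false, an empty element such as `<b/>` falls through to the empty-sequence branch of `local:as-year`. The `xs:integer()` constructor would raise err:FORG0001 on the same input, so it is guarded with `castable` here.

```xquery
(: Compare how number() and xs:integer() treat unparsable strings.
   The inputs below are hypothetical sample values. :)
let $inputs := ("786", "", "abc")
for $s in $inputs
return
    if ($s castable as xs:integer)
    then concat("'", $s, "' -> xs:integer gives ", xs:integer($s))
    else concat("'", $s, "' -> number() gives ", string(number($s)),
                "; xs:integer() would raise err:FORG0001")
```

If you would rather work with integers, the same guard (`castable as xs:integer`) can replace the reliance on NaN comparisons.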