I need to print consecutively dated time offs of the same type as a single row. The total units should be the sum of the units of each time-off row in the run, the start date should be the date of the earliest time off in the run, and the end date should be the date of the latest time off in the run. The input is the XML below.

--XML--

<?xml version='1.0' encoding='UTF-8'?>
<Data>
    <Worker>
        <Worker_ID>12</Worker_ID>
        <Time_Off>
            <Type>Compassionate Leave</Type>
            <Date>2018-02-09-08:00</Date>
            <Units>1</Units>
        </Time_Off>
        <Time_Off>
            <Type>Compassionate Leave</Type>
            <Date>2018-02-08-08:00</Date>
            <Units>1</Units>
        </Time_Off>
        <Time_Off>
            <Type>Compassionate Leave</Type>
            <Date>2018-02-02-08:00</Date>
            <Units>1</Units>
        </Time_Off>
        <Time_Off>
            <Type>Compassionate Leave</Type>
            <Date>2018-02-01-08:00</Date>
            <Units>1</Units>
        </Time_Off>
        <Time_Off>
            <Type>Statutory Holiday</Type>
            <Date>2018-02-07-08:00</Date>
            <Units>1</Units>
        </Time_Off>
        <Time_Off>
            <Type>Statutory Holiday</Type>
            <Date>2018-02-06-08:00</Date>
            <Units>1</Units>
        </Time_Off>
    </Worker>
    <Worker>
        <Worker_ID>09</Worker_ID>
        <Time_Off>
            <Type>Sick Leave</Type>
            <Date>2018-02-10-08:00</Date>
            <Units>1</Units>
        </Time_Off>
    </Worker>
    <Worker>
        <Worker_ID>13</Worker_ID>
        <Time_Off>
            <Type>Vacation</Type>
            <Date>2018-02-11-08:00</Date>
            <Units>1</Units>
        </Time_Off>
        <Time_Off>
            <Type>Vacation</Type>
            <Date>2018-01-10-08:00</Date>
            <Units>1</Units>
        </Time_Off>
    </Worker>
</Data>

--Desired Output--

    EmployeeID,TimeOff Type,TimeOff Start Date,TimeOff End Date,Total Units 
    12,Compassionate Leave,2018-02-08,2018-02-09,2
    12,Compassionate Leave,2018-02-01,2018-02-02,2
    12,Statutory Holiday,2018-02-06,2018-02-07,2
    09,Sick Leave,2018-02-10,2018-02-10,1
    13,Vacation,2018-02-11,2018-02-11,1
    13,Vacation,2018-01-10,2018-01-10,1
Arthi
  • What have you tried so far? aka as "where's your code"? – zx485 Feb 22 '18 at 01:15
  • We want to help you accomplish your vision... key word is "help". Whatcha got? What are you having trouble with? – sorak Feb 22 '18 at 01:21
  • What should be the output for a scenario in which `Worker_ID = 12` has another set of `Compassionate Leave` for a different date range, say `2018-02-01` to `2018-02-02`? How will the grouping work in that scenario? The current output shared does not match the shared XML data. Please check and correct it too. – Aniket V Feb 22 '18 at 05:29
  • Hi Aniket. Thanks for your response. I have updated my post to cover for the scenario you questioned. For consecutive time offs, I need to print one row with the sum of the units of all the consecutive time offs and the start date being the time off date in the oldest time off and the end date being the time off date in the latest time off. So to your question, I need - 12, Compassionate Leave, 2018-02-01(start Date), 2018-02-02(end date), total of both(units). See also last two rows in my output where it is not consecutive. – Arthi Feb 22 '18 at 13:32

1 Answer

A similar question, "How to group consecutive dates in XSLT?", has some suggestions on solving it with either XSLT or XQuery.

https://xqueryfiddle.liberty-development.net/pPgCcoj/1 is an attempt to use the window clause in XQuery 3 to solve your problem:

declare function local:date($input as xs:string) as xs:date {
    xs:date(substring($input, 1, 10))
};


string-join((
    'EmployeeID,TimeOff Type,TimeOff Start Date,TimeOff End Date,Total Units',
for $worker in Data/Worker
for $time-off in $worker/Time_Off
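(: group by worker as well as type so that equal types of different workers are not merged :)
group by $worker-id := data($worker/Worker_ID), $type := data($time-off/Type)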
return
    let $times := for $time in $time-off
                  order by local:date($time/Date)
                  return $time
    return 
        for tumbling window $line in $times
        start $s when true()
        end $e next $n when empty($n) or local:date($n/Date) - local:date($e/Date) ne xs:dayTimeDuration('P1D')
        return string-join(
            ($worker/Worker_ID, $type, local:date($s/Date), local:date($e/Date), sum($line/Units)), ',')), '&#10;')
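
The effect of the tumbling window can be illustrated outside XQuery. The following is a minimal Python sketch, not the answer's method itself: it assumes the dates of one worker and one time-off type are already extracted, and the function name `split_runs` is hypothetical. A run is closed whenever the next date is not exactly one day after the current one, which mirrors the `end ... when` condition above.

```python
from datetime import date, timedelta


def split_runs(days):
    """Split dates into runs of consecutive calendar days, oldest first."""
    runs = []
    for day in sorted(days):
        # Extend the current run only if this day is exactly one day
        # after the last day of that run; otherwise start a new run.
        if runs and day - runs[-1][-1] == timedelta(days=1):
            runs[-1].append(day)
        else:
            runs.append([day])
    return runs


# Compassionate Leave dates of worker 12 from the sample XML.
compassionate = [date(2018, 2, 9), date(2018, 2, 8), date(2018, 2, 2), date(2018, 2, 1)]
for run in split_runs(compassionate):
    # Start date, end date, number of days in the run.
    print(run[0], run[-1], len(run))
# prints:
# 2018-02-01 2018-02-02 2
# 2018-02-08 2018-02-09 2
```

Note that the runs come out in ascending date order, so the row order can differ from the desired output in the question, which lists the later run first.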

With XSLT you could use

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:mf="http://example.com/mf"
    exclude-result-prefixes="xs mf"
    version="3.0">

  <xsl:output method="text"/>
  <xsl:strip-space elements="*"/>

  <xsl:function name="mf:date" as="xs:date">
      <xsl:param name="input" as="xs:string"/>
      <xsl:sequence select="xs:date(substring($input, 1, 10))"/>
  </xsl:function>

  <xsl:function name="mf:line" as="xs:string">
      <xsl:param name="group" as="element(Time_Off)*"/>
      <xsl:value-of 
        select="$group[1]/../Worker_ID, 
                $group[1]/Type, 
                mf:date($group[1]/Date),
                mf:date($group[last()]/Date),
                sum($group/Units)"
                separator=","/>      
  </xsl:function>

  <xsl:template match="Worker">
      <xsl:for-each-group select="Time_Off" group-by="Type">
          <xsl:variable name="sorted-times" as="element(Time_Off)*">
              <xsl:perform-sort select="current-group()">
                  <xsl:sort select="mf:date(Date)"/>
              </xsl:perform-sort>
          </xsl:variable>
          <xsl:for-each-group select="$sorted-times" group-by="mf:date(Date) - xs:dayTimeDuration('P1D') * position()">
              <xsl:value-of select="mf:line(current-group()) || '&#10;'"/>
          </xsl:for-each-group>
      </xsl:for-each-group>
  </xsl:template>

</xsl:stylesheet>

The stylesheet is online at https://xsltfiddle.liberty-development.net/pPgCcov/1. For an XSLT 2 processor you would need to use <xsl:value-of select="concat(mf:line(current-group()), '&#10;')"/> instead of <xsl:value-of select="mf:line(current-group()) || '&#10;'"/>.
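
The inner `group-by` key, `mf:date(Date) - xs:dayTimeDuration('P1D') * position()`, works because subtracting the 1-based position from each sorted date gives the same value for every date in a consecutive run. The Python sketch below only illustrates that arithmetic; the function name `group_by_shifted_key` is hypothetical, and it assumes the dates of one type are distinct (with duplicate dates, the keys of unrelated rows could collide).

```python
from datetime import date, timedelta


def group_by_shifted_key(days):
    """Group sorted dates by (date - position days), like the XSLT group-by key."""
    groups = {}
    for pos, day in enumerate(sorted(days), start=1):
        # Consecutive dates grow by one day while the position grows by one,
        # so the difference stays constant within a run.
        key = day - timedelta(days=pos)
        groups.setdefault(key, []).append(day)
    return groups


# Compassionate Leave dates of worker 12 from the sample XML.
compassionate = [date(2018, 2, 9), date(2018, 2, 8), date(2018, 2, 2), date(2018, 2, 1)]
for key, run in group_by_shifted_key(compassionate).items():
    print(key, [str(d) for d in run])
# prints:
# 2018-01-31 ['2018-02-01', '2018-02-02']
# 2018-02-05 ['2018-02-08', '2018-02-09']
```

The key values themselves (2018-01-31 and 2018-02-05 here) have no meaning beyond identifying a run; only the first and last date of each group are used for the output row.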

Martin Honnen