0

How can I get the 5 minute time window of an event 'X' when it reaches its maximum value(say in last one month or last 7 days or in a day) and fetch the time of event also when It was max..

Saahil Gupta
  • 209
  • 2
  • 6
  • Your question isn't entirely clear. Let me try to rephrase: “Given a set of event documents, perhaps filtered by a date range (or keyword or anything else), find the event with the highest value for a given property over the filtered set. Return that event as well as the other events within the 2.5 minutes before and after the time at which the maximum event occurred.” Is that correct? – Justin Makeig Jan 18 '16 at 17:49
  • No Justin.. in simple terms its like I am inserting records in database countinuously..so I need to know that how many records are getting ingested every 5 min. and when the max. no. of records got ingested(i.e, during which 5 min. interval), say doing this task for last one month or a given time range – Saahil Gupta Jan 19 '16 at 08:34

2 Answers2

1

If I understand correctly, you have documents with a "created" element (or whatever it's called). Let's suppose you have a dateTime range index on that element.

What you're looking to do is bucket those times with 5-minute intervals, then find the most populated bucket. The Search and REST APIs each provide bucketing as features. That is typically used with smaller numbers of buckets, but give it a try.

If that doesn't work well, you could do something similar more manually, just building the buckets that you need.

(: untested :)
let $buckets := map:map()
let $populate :=
  for $dt in cts:values(cts:element-reference(xs:QName("created")), (), (), $query)
  let $bucket := local:bucket($dt)
  return 
    map:put(
      $buckets, 
      $bucket, 
      (map:get($buckets, $bucket), 0)[1] + cts:frequency($dt)
let $inverse := -$buckets
let $max-count := fn:max(map:keys($inverse))
let $max-bucket := map:get($inverse, $max-count)

$max-bucket will be a sequence (more than one if there's a tie) of buckets with the highest number of values. You'll use $query to narrow down to your target month (or other time range). You'll need to implement local:bucket() --

declare function local:bucket($dt as xs:dateTime)

Figure out some unique key to identify a 5-minute interval and return that.

Dave Cassel
  • 8,352
  • 20
  • 38
  • thanks.. but can you please provide a simple solution like where we mention start date and end date first, then we use access:dateTime function and provide a 5 minute window and keep on increasing it by 5 minutes putting in a loop till it matches the end date and keeps on counting the docs inserted per iteration. Then it takes the maximum out of those buckets and shows the max. count and that bucket.. It would be rally helpful if you can provide something using built-In bucket functions of marklogic – Saahil Gupta Jan 20 '16 at 08:56
0

Building out buckets of 5-minute durations isn’t MarkLogic-specific. Here is some XQuery that gives you 10,000 5-minute buckets starting at midnight on January 1, 2000. (That’s about 35 days):

let $ranges as xs:dateTime+ := 
  for $d in (0 to 10000) 
  return xs:dateTime("2000-01-01T00:00:00") + (xs:dayTimeDuration("PT300S") * $d)
return $ranges

I’ll leave creating start and end boundaries as an exercise for the user.

Justin Makeig
  • 2,097
  • 15
  • 29
  • I have reached here so far: for $d in (0 to 10000) let $ranges := xs:dateTime("2014-01-01T00:00:00") + (xs:dayTimeDuration("PT300S") * $d) return for $a in $ranges let $b := $a let $c := $b + (xs:dayTimeDuration("PT300S")) – Saahil Gupta Jan 22 '16 at 11:31
  • let $count := fn:count(cts:search(fn:collection(),cts:and-query( ( cts:properties-query(cts:element-range-query(xs:QName("dbts:TxStmp"),">=",xdmp:wallclock-to-timestamp("2015-12-20T00:00:00"))), cts:properties-query(cts:element-range-query(xs:QName("dbts:TxStmp"),">=",xdmp:wallclock-to-timestamp("2016-01-20T00:00:00"))) ) ) ,("unfiltered") ) ) return "Start Time::" || $b || " End Time::" || $c || " Count::" || $count – Saahil Gupta Jan 22 '16 at 11:31
  • and I am getting output like this: – Saahil Gupta Jan 22 '16 at 11:32
  • and I am getting output like this: Start Time::2016-01-14T12:05:00 End Time::2016-01-14T12:10:00 Count::0........................ Start Time::2016-01-14T12:10:00 End Time::2016-01-14T12:15:00 Count::5........................... Start Time::2016-01-14T12:15:00 End Time::2016-01-14T12:20:00 Count::10........ and so on. Can you please tell me how to find the max. count out of it and corresponding start and end time... – Saahil Gupta Jan 22 '16 at 12:05