I am storing XBRL JSON using elasticsearch.
This xBRL-JSON OIM spec describes the oim:period property:
Otherwise, an ISO 8601 time interval representing the {interval} property, expressed in one of the following forms:
<start>/<end>
<start>/<duration>
<duration>/<end>
Where <start> and <end> are valid according to the xsd:dateTime datatype, and <duration> is valid according to xsd:duration.
Examples from arelle's plugin look like this:
- 2016-01-01T00:00:00/PT0S
- 2015-01-01T00:00:00/P1Y
I notice that arelle's plugin exclusively produces this format:
- <start>/<duration>
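To make the shape of these values concrete, here is a minimal Python sketch of splitting a `<start>/<duration>` string into its two parts. The function name `split_period` and the regex are my own illustration, not something from arelle or the OIM spec, and the duration check is deliberately loose (no negative durations, and no time zone handling on `<start>`).

```python
import re
from datetime import datetime

# Loose check for an xsd:duration such as PT0S or P1Y.
# Assumption: negative durations are not needed here.
DURATION_RE = re.compile(
    r"^P(?=\d|T\d)(?:\d+Y)?(?:\d+M)?(?:\d+D)?"
    r"(?:T(?=\d)(?:\d+H)?(?:\d+M)?(?:\d+(?:\.\d+)?S)?)?$"
)


def split_period(period: str) -> tuple[datetime, str]:
    """Split '<start>/<duration>' into a parsed start and the raw duration text."""
    start_text, sep, duration = period.partition("/")
    if not sep or not DURATION_RE.match(duration):
        raise ValueError(f"not a <start>/<duration> period: {period!r}")
    # The examples above carry no time zone, so fromisoformat is enough here.
    return datetime.fromisoformat(start_text), duration


if __name__ == "__main__":
    for value in ("2016-01-01T00:00:00/PT0S", "2015-01-01T00:00:00/P1Y"):
        print(split_period(value))
```

Only the `<start>` half is turned into a real datetime; the duration is kept as text, which is roughly what I would like the index to do for me.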
My question
Is there a way to save at least the <start> part as a date type in elasticsearch?
Ideas I had:
elasticsearch only (my preference)
- Use a custom date format which anticipates the /<duration> part, but ignores it
  - I haven't checked Joda yet; will it ignore characters in the date format if they aren't one of its special pattern characters? Like the "/" delimiter or the "P" which precedes any duration value (like PT0S and P1Y above)?
  - EDIT: So the single-quote character escapes literals; this works: the format yyyy'/P' will accept a value '2015/P'. However, the rest of the duration could be more dynamic
  - Re: dynamic; will Joda accept a regex or a wildcard such as "\d" or a "+" quantifier, so I can ignore all the possible variations following the P?
- Use a character filter to strip out the /<duration> part before saving only <start> as datetime. But I don't know if character filters happen before saving as type: date. If they don't, the '/' part isn't stripped, and I wouldn't be passing valid date strings.
- Don't use date type: Use a pattern tokenizer to split on /, and at least the two parts will be saved as separate tokens. Can't use date math, though.
- Use a transformation; although it seems like this is deprecated. I read about using copy_to instead, but that seems to combine terms, and I want to break this term apart
- Some sort of plugin? Maybe a plugin which will fully support this "interval" datatype described by the OIM spec... maybe a plugin which will store its separate parts...?
change my application (I prefer to use elasticsearch-only techniques if possible)
- I could edit this plugin or produce my own plugin which uses exclusively <start> and <end> parts, and saves both into separate fields
  - But this breaks the OIM spec, which says they should be combined in a single field
  - Moreover it can be awkward to express an "instant" fact (with no duration; the PT0S examples above); I guess I just use the same value for the end property as for the start property... Not more awkward than a 0-length duration (PT0S) I guess.
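For the application-side option, a rough Python sketch of turning one `<start>/<duration>` value into two separate date fields might look like the following. The field names `period_start` and `period_end` are hypothetical, the parser only handles non-negative durations with whole-number components, and clamping the day when a month is shorter (for example Jan 31 plus one month) is my own assumption about how the end date should be computed.

```python
import re
from calendar import monthrange
from datetime import datetime, timedelta

# Matches '<start>/<duration>' and captures each duration component.
# Assumption: whole-number components only, no negative durations.
PERIOD_RE = re.compile(
    r"^(?P<start>[^/]+)/P(?=\d|T\d)"
    r"(?:(?P<years>\d+)Y)?(?:(?P<months>\d+)M)?(?:(?P<days>\d+)D)?"
    r"(?:T(?=\d)(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+)S)?)?$"
)


def period_to_fields(period: str) -> dict:
    """Return hypothetical period_start / period_end fields for one period value."""
    m = PERIOD_RE.match(period)
    if m is None:
        raise ValueError(f"not a <start>/<duration> period: {period!r}")
    start = datetime.fromisoformat(m["start"])
    n = {k: int(v or 0) for k, v in m.groupdict().items() if k != "start"}

    # Years and months are calendar units, so add them by counting months.
    month_index = start.year * 12 + (start.month - 1) + n["years"] * 12 + n["months"]
    year, month = divmod(month_index, 12)
    month += 1
    # Clamp the day if the target month is shorter than the start month.
    day = min(start.day, monthrange(year, month)[1])

    end = start.replace(year=year, month=month, day=day) + timedelta(
        days=n["days"], hours=n["hours"], minutes=n["minutes"], seconds=n["seconds"]
    )
    return {"period_start": start.isoformat(), "period_end": end.isoformat()}


if __name__ == "__main__":
    print(period_to_fields("2015-01-01T00:00:00/P1Y"))
    print(period_to_fields("2016-01-01T00:00:00/PT0S"))
```

With this approach an instant (PT0S) simply ends up with identical start and end values, and both fields could then be mapped as plain dates.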