1

I have a time series in a SQLite Database and want to analyze it.

The important part of the time series consists of a column with different but not unique string values. I want to do something like this:

Value  concat  countValue

A      A       1
A      A,A     1
B      A,A,B   1
B      A,B,B   2
B      B,B,B   3
C      B,B,C   1
B      B,C,B   2

I don't know how to get the countValue column. It should count all Values of the partition equal to the current rows Value.

I tried this but it just counts all Values in the partition and not the Values equal to this rows Value.

SELECT 
    Value,
    group_concat(Value) OVER wind AS concat,
    Sum(Case When Value Like Value Then 1 Else 0 End) OVER wind AS countValue
FROM TimeSeries
WINDOW
    wind AS (ORDER BY date ROWS BETWEEN 2 PRECEDING AND CURRENT ROW)
ORDER BY 
    date
;

The query is also limited by these factors:

  1. The query should work with any amount of unique Values
  2. The query should work with any Partition Size (ROWS BETWEEN n PRECEDING AND CURRENT ROW)

Is this even possible using only SQL?

GMB
  • 216,147
  • 25
  • 84
  • 135

1 Answers1

0

Here is an approach using string functions:

select 
    value,
    group_concat(value) over wind as concat,
    (
        length(group_concat(value) over wind) - length(replace(group_concat(value) over wind, value, ''))
    ) / length(value) cnt_value
from timeseries
window wind as (order by date rows between 2 preceding and current row)
order by date;
GMB
  • 216,147
  • 25
  • 84
  • 135
  • This does the trick, thank you. I think I understand why this works. It works by counting the number of occurrences of the Value string in the concat string right? But is there no way to reference the current row Value inside a window function? – Mini Cooper 246 Oct 25 '20 at 15:56