suppose I have a dataset like this (in an Oracle 11g db environment)
CHANGE_DATE VALUE
------------------ ----------
03-NOV-13 06.56.01 3027.97
03-NOV-13 06.57.01 3030.59
03-NOV-13 06.58.01 3032.33
03-NOV-13 06.59.01 3047.41
03-NOV-13 07.00.02 3045.82
03-NOV-13 07.01.01 3046.63
03-NOV-13 07.02.01 3020.29
03-NOV-13 07.03.02 3019.38
03-NOV-13 07.04.01 3020.76
03-NOV-13 07.05.01 3008.53
what I would be interested in is a select statement which only displays values on a sufficiently large change, e.g. 0.1%. In the above dataset the desired output would be
03-NOV-13 06.56.01 3027.97
03-NOV-13 06.58.01 3032.33
03-NOV-13 06.59.01 3047.41
03-NOV-13 07.04.01 3020.29
03-NOV-13 07.05.01 3008.53
EDIT: to explain the aim: the first row is the first reference value. Any subsequent row value should be compared to this. If the change with respect to the reference values does not exceed x%, continue. If the value does exceed the threshold, select this row and keep this new value as the reference to compare the next rows to.
I know how I can achieve something like this in case I am only flipping between integer values along the lines of what has been discussed here: Select rows where column value has changed
I tried to implement something along those lines using:
with t as (
select to_date('03-NOV-13 06.56.01','dd/mm/yyyy hh24:mi:ss') change_date, 3027.97 value from dual union all
select to_date('03-NOV-13 06.57.01','dd/mm/yyyy hh24:mi:ss'), 3030.59 from dual union all
select to_date('03-NOV-13 06.58.01','dd/mm/yyyy hh24:mi:ss'), 3032.33 from dual union all
select to_date('03-NOV-13 06.59.01','dd/mm/yyyy hh24:mi:ss'), 3047.41 from dual union all
select to_date('03-NOV-13 07.00.02','dd/mm/yyyy hh24:mi:ss'), 3045.82 from dual union all
select to_date('03-NOV-13 07.01.01','dd/mm/yyyy hh24:mi:ss'), 3046.63 from dual union all
select to_date('03-NOV-13 07.02.01','dd/mm/yyyy hh24:mi:ss'), 3020.29 from dual union all
select to_date('03-NOV-13 07.03.02','dd/mm/yyyy hh24:mi:ss'), 3019.38 from dual union all
select to_date('03-NOV-13 07.04.01','dd/mm/yyyy hh24:mi:ss'), 3020.76 from dual union all
select to_date('03-NOV-13 07.05.01','dd/mm/yyyy hh24:mi:ss'), 3008.53 from dual )
, x as ( select value, ROUND(value,-1) round_value, change_date, ROW_NUMBER() OVER (ORDER BY change_date) as rn from t order by change_date) select x.value, x.change_date from x join x y on x.rn = y.rn+1 and x.round_value <> y.round_value;
which gives
3047.41 03-NOV-13
3020.29 03-NOV-13
3008.53 03-NOV-13
which is not too far off the mark but the comparison is always made just to previous value not the first not-suppressed value. Obviously this simply does a rounding and does not look for any percentage change.
I also tried to play with lag like this
with t as (
select to_date('03-NOV-13 06.56.01','dd/mm/yyyy hh24:mi:ss') change_date, 3027.97 value from dual union all
select to_date('03-NOV-13 06.57.01','dd/mm/yyyy hh24:mi:ss'), 3030.59 from dual union all
select to_date('03-NOV-13 06.58.01','dd/mm/yyyy hh24:mi:ss'), 3032.33 from dual union all
select to_date('03-NOV-13 06.59.01','dd/mm/yyyy hh24:mi:ss'), 3047.41 from dual union all
select to_date('03-NOV-13 07.00.02','dd/mm/yyyy hh24:mi:ss'), 3045.82 from dual union all
select to_date('03-NOV-13 07.01.01','dd/mm/yyyy hh24:mi:ss'), 3046.63 from dual union all
select to_date('03-NOV-13 07.02.01','dd/mm/yyyy hh24:mi:ss'), 3020.29 from dual union all
select to_date('03-NOV-13 07.03.02','dd/mm/yyyy hh24:mi:ss'), 3019.38 from dual union all
select to_date('03-NOV-13 07.04.01','dd/mm/yyyy hh24:mi:ss'), 3020.76 from dual union all
select to_date('03-NOV-13 07.05.01','dd/mm/yyyy hh24:mi:ss'), 3008.53 from dual )
select value, change_date, case when abs( lag(value,1,0) over(order by change_date) - value ) / value > 0.001 then value else lag(value,1,0) over(order by change_date) end start_of_group from t;
which results in
VALUE CHANGE_DA START_OF_GROUP
---------- --------- --------------
3027.97 03-NOV-13 3027.97
3030.59 03-NOV-13 3027.97
3032.33 03-NOV-13 3030.59
3047.41 03-NOV-13 3047.41
3045.82 03-NOV-13 3047.41
3046.63 03-NOV-13 3045.82
3020.29 03-NOV-13 3020.29
3019.38 03-NOV-13 3020.29
3020.76 03-NOV-13 3019.38
3008.53 03-NOV-13 3008.53
which also seems to be a step in the right direction but has the same problem that the comparison is not done to the 'start_of_group' column but the 'value' column
I would appreciate any hints on how to achieve this. Please let me know if the question is sufficiently clear or if I should add any information.
P.S. first time poster, hope I managed to post the question in a meaningful way