
Let's say we have a column with a number that increases a bit on a daily basis, but we cannot predict the increase with good precision. For example, the value on day_x is:

day_1 = 10, 
day_2 = 20, 
day_3 = 35, 
day_4 = 22, (a sudden decrease here) 
day_5 = 41 
...etc 

So we know that in general there is an upward trend, with a different percentage every time. How can we get the current ratio, or even better "predict" the next increase? Can deequ train itself with some accuracy?

Thank you!

1 Answer


Use the anomaly detection feature in deequ.

Several anomaly detection strategies can be used. The most likely fit is "RelativeRateOfChangeStrategy", which lets you set a minimum and/or maximum allowed ratio between a new value and the previous one.

An example of such a detector is here: https://github.com/awslabs/deequ/blob/master/src/main/scala/com/amazon/deequ/examples/anomaly_detection_example.md

If you're looking for an unexpectedly low value, for example a decrease of more than 20% relative to the previous value, set:

maxRateDecrease=0.8
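To make the rule concrete, here is a minimal plain-Python sketch of a relative rate-of-change check applied to the values from the question. It is not deequ's API: the function name `relative_rate_anomalies` and its parameters are hypothetical, and it assumes the strategy compares the ratio new value / previous value against the configured bounds.

```python
from typing import List, Optional, Tuple


def relative_rate_anomalies(
    values: List[float],
    max_rate_decrease: Optional[float] = None,
    max_rate_increase: Optional[float] = None,
) -> List[Tuple[int, float]]:
    """Return (index, ratio) pairs whose ratio to the previous value is out of bounds.

    Hypothetical helper for illustration only; it mimics the idea of a
    relative rate-of-change check, not deequ's actual implementation.
    """
    anomalies = []
    for i in range(1, len(values)):
        previous, current = values[i - 1], values[i]
        if previous == 0:
            # The ratio is undefined when the previous value is zero; skip it.
            continue
        ratio = current / previous
        too_low = max_rate_decrease is not None and ratio < max_rate_decrease
        too_high = max_rate_increase is not None and ratio > max_rate_increase
        if too_low or too_high:
            anomalies.append((i, ratio))
    return anomalies


# Values for day_1 .. day_5 from the question.
daily_values = [10, 20, 35, 22, 41]

# Flag any day whose value fell below 80% of the previous day's value.
flagged = relative_rate_anomalies(daily_values, max_rate_decrease=0.8)
print(flagged)  # only index 3 (day_4) is flagged: 22 / 35 is roughly 0.63
```

With these numbers, only day_4 is flagged, because 22 is less than 80% of 35. Adding an upper bound (the `max_rate_increase` argument in this sketch) would likewise flag days that grow faster than expected.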
Deb
Gkaisin