I'm trying to implement a rolling window value-at-risk aggregation, and wonder if it's possible in Atoti: my main problem is I don't know how to filter vectors by index for each of the "observation windows".
- my data is 10 years of historical simulated PL
- I'd like to compute percentile for each observation window, each observation is 250 consecutive historical days.
In the documentation I found how to create sub-vectors based on static indexes, here: https://docs.atoti.io/0.3.1/tutorial/07-arrays.html#Sub-arrays - but I need the indexes to change depending on which "observation window" I'm looking at.
My input data will look as follows, where each vector contains 2500 values, and we need to compute percentile for overlapping sub-vectors each having 250 consecutive values:
Book Vectors
A [877.30;137.33;-1406.62;-156.48;-915.56;1702.2...
B [2182.98;394.09;-845.23;-422.25;-2262.86;-2010...
C [9.94;972.31;1266.79;178.33;-102.00;508.13;-23...
And I want to be able to display VaR for each of the observation windows, for example:
WindowIndex VaR
0 -98.8
1 -1000.9
2 -500.88
... ...
2250 -088.7
Or, better:
WindowStartDate VaR
2011-05-17 -98.8
2011-05-18 -1000.9
2011-05-19 -500.88
... ...
2019-12-31 -088.7
This code reproduces the use case - the "VaR Vector" is where I'm struggling to pass the indexes:
# sample data
import pandas as pd
import random
history_size = 2500 # 10 years of data
var_window_length = 250
df =pd.DataFrame(data = {
'Book': ['A', 'B', 'C'],
'Vectors': [[';'.join(["{0:.2f}".format(random.gauss(0,1000)) for x in range(history_size)])] for y in range(3)]
})
# atoti part
import atoti as tt
session = tt.create_session()
store = session.read_pandas(
df, keys=["Book"], store_name="Store With Arrays", array_sep=";"
)
cube = session.create_cube(store, "Cube")
lvl = cube.levels
m = cube.measures
# historical dates:
historical_dates = pd.bdate_range(periods = history_size - var_window_length + 1, end = pd.Timestamp('2019-12-31'), freq='B')
historical_dates
# This measure aggreates vectors across positions:
cube.query(m["Vectors.SUM"])
# This measure will display vectors for a given window - but how can I pass the right indexes for each observation window?
m["VaR Vector"] = m["Vectors.SUM"][???]
# This measure will compute VaR from each subvector:
m["95 percentile"] = tt.array.percentile(m["VaR Vector"], 0.05, "simple")