I am having trouble understanding the cutoff_dates concept. What I am really looking for is a way to calculate features over a time window that goes back, say, 60 days from each transaction (excluding the current transaction itself). The cutoff dates in the examples look like hard-coded dates. I am using a time index for each row (A_time below), and according to the docs (what_is_cutoff_datetime):
The time index is defined as the first time that any information from a row can be used. If a cutoff time is specified when calculating features, rows that have a later value for the time index are automatically ignored.
So it is not clear to me whether, if I don't pass a cutoff date, each feature is still calculated only up to that row's time index value.
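To make my reading of that sentence concrete, here is a plain-Python sketch of how I understand it. It does not use featuretools and is not its actual implementation; `rows_visible_at` is a hypothetical helper and the rows are made up.

```python
from datetime import datetime

# Toy rows standing in for tableA; the values are made up for illustration.
rows = [
    {"paymentIndex": 1, "A_time": datetime(2019, 1, 1)},
    {"paymentIndex": 2, "A_time": datetime(2019, 2, 1)},
    {"paymentIndex": 3, "A_time": datetime(2019, 3, 1)},
]


def rows_visible_at(rows, cutoff):
    """Keep rows whose time index is not later than the cutoff (hypothetical helper)."""
    return [r for r in rows if r["A_time"] <= cutoff]


# With a cutoff of 2019-02-01, row 3 has a later time index and is ignored.
visible = rows_visible_at(rows, cutoff=datetime(2019, 2, 1))
print([r["paymentIndex"] for r in visible])
```

If that reading is right, the open question is what plays the role of `cutoff` when none is passed.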
Here is my entityset definition:
import featuretools as ft

es = ft.EntitySet('payment')

# Base entity: one row per payment, with A_time as the time index.
es = es.entity_from_dataframe(entity_id='tableA',
                              dataframe=tableA_dfpd,
                              index='paymentIndex',
                              time_index='A_time')

# Parent entities split out of tableA.
es.normalize_entity(base_entity_id='tableA',
                    new_entity_id='tableB',
                    index='B_index',
                    additional_variables=['B_x', 'B_time'],
                    make_time_index='B_time')

es.normalize_entity(base_entity_id='tableA',
                    new_entity_id='tableC',
                    index='C_index',
                    additional_variables=["C_x", "C_date"],
                    make_time_index="C_date")

# tableD has no time index of its own.
es.normalize_entity(base_entity_id='tableA',
                    new_entity_id='tableD',
                    index='D_index',
                    additional_variables=["D_x"],
                    make_time_index=False)
Entityset: payment
  Entities:
    tableA [Rows: 310083, Columns: 8]
    tableB [Rows: 30296, Columns: 3]
    tableC [Rows: 206565, Columns: 3]
    tableD [Rows: 18493, Columns: 2]
  Relationships:
    tableA.B_index -> tableB.B_index
    tableA.C_index -> tableC.C_index
    tableA.D_index -> tableD.D_index
How exactly can I do the window calculation? Do I need to pass cutoff dates to the dfs method or not? I want all window calculations to be based on the A_time variable, using a 60-day window up to the current transaction, so the cutoff date for every transaction is effectively that transaction's A_time value, isn't it?
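To pin down the result I am after, here is a plain-Python sketch of the semantics (again, not featuretools code). The data is made up, `amount` is a hypothetical column, `window_features` is a hypothetical helper, and grouping by B_index is just one example of a parent entity to aggregate over.

```python
from datetime import datetime, timedelta

# Made-up transactions; "amount" is a hypothetical column, not one from my real data.
transactions = [
    {"paymentIndex": 1, "B_index": "b1", "A_time": datetime(2019, 1, 1), "amount": 10.0},
    {"paymentIndex": 2, "B_index": "b1", "A_time": datetime(2019, 1, 20), "amount": 20.0},
    {"paymentIndex": 3, "B_index": "b1", "A_time": datetime(2019, 3, 10), "amount": 5.0},
    {"paymentIndex": 4, "B_index": "b2", "A_time": datetime(2019, 3, 10), "amount": 7.0},
]

WINDOW = timedelta(days=60)


def window_features(transactions, window=WINDOW):
    """For each transaction, aggregate earlier rows with the same B_index
    whose A_time falls in [A_time - window, A_time)."""
    features = {}
    for current in transactions:
        cutoff = current["A_time"]  # the cutoff is the transaction's own A_time
        history = [
            t for t in transactions
            if t["B_index"] == current["B_index"]
            # The strict "<" excludes the current transaction itself
            # (and any other row sharing exactly the same timestamp).
            and cutoff - window <= t["A_time"] < cutoff
        ]
        features[current["paymentIndex"]] = {
            "count_60d": len(history),
            "sum_amount_60d": sum(t["amount"] for t in history),
        }
    return features


for payment_index, feats in window_features(transactions).items():
    print(payment_index, feats)
```

For example, for payment 3 only payment 2 should count, because payment 1 is more than 60 days older and payment 3 itself is excluded. This is the per-row behaviour I would like dfs to reproduce.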