I'm trying to use featuretools to generate a feature matrix so that I can train on past data and predict future data. This is my setup:
import featuretools as ft
import pandas as pd
df_hotel = pd.DataFrame({
    'hotel_id': [1, 2],
})
df_bookings = pd.DataFrame({
    'bookings_id': [1, 2, 3, 4, 5, 6, 7, 8],
    'time': [1, 2, 3, 4, 1, 2, 3, 4],
    'hotel_id': [1, 1, 1, 1, 2, 2, 2, 2],
    'bookings': [1, 2, 3, 4, 5, 6, 7, 8]
})
es = ft.EntitySet()
es = es.entity_from_dataframe(
    entity_id='bookings',
    dataframe=df_bookings,
    index='bookings_id',
    time_index='time'
)
es = es.entity_from_dataframe(
    entity_id='hotels',
    dataframe=df_hotel,
    index='hotel_id'
)
es = es.add_relationship(
    ft.Relationship(
        es['hotels']['hotel_id'],
        es['bookings']['hotel_id'],
    )
)
And I generate a feature matrix as follows:
feature_matrix, feature_defs = ft.dfs(
    entityset=es,
    target_entity='bookings',
    cutoff_time=3,
    agg_primitives=["mean"]
)
feature_matrix
However, this gives me two rows (those where time is 4, after the cutoff) in which all values are NaN. The desired behaviour is to fill in the values for these rows as well, computing the aggregations based only on past data. Is this possible with featuretools?
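To make the desired output concrete, here is a rough plain-pandas sketch (not featuretools) of the kind of values I would expect for every row. It assumes "past data" means bookings of the same hotel with a strictly earlier time; the column name `hotel_mean_past_bookings` and the variable names are made up for illustration.

```python
import pandas as pd

df_bookings = pd.DataFrame({
    'bookings_id': [1, 2, 3, 4, 5, 6, 7, 8],
    'time': [1, 2, 3, 4, 1, 2, 3, 4],
    'hotel_id': [1, 1, 1, 1, 2, 2, 2, 2],
    'bookings': [1, 2, 3, 4, 5, 6, 7, 8],
})

# Order each hotel's bookings chronologically.
df_sorted = df_bookings.sort_values(['hotel_id', 'time'])

# For each row, average the bookings of the same hotel at strictly earlier
# times: shift(1) drops the current row, expanding().mean() averages the rest.
past_mean = (
    df_sorted.groupby('hotel_id')['bookings']
    .transform(lambda s: s.shift(1).expanding().mean())
)

# Hypothetical column name, chosen only for this illustration.
expected = (
    df_sorted.assign(hotel_mean_past_bookings=past_mean)
    .set_index('bookings_id')
)
print(expected)
```

Under that assumption, the rows at time 4 would get the mean of each hotel's bookings at times 1 to 3 instead of NaN, and only the first booking of each hotel would be NaN because it has no history.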