Below is the example code from the official docs:
import featuretools as ft

es = ft.demo.load_mock_customer(return_entityset=True)
feature_matrix, feature_defs = ft.dfs(
    entityset=es,
    target_entity="customers",
    agg_primitives=["sum", "mode"],
    trans_primitives=["cum_max", "month", "cum_count"],
    max_depth=2,
)
feature_defs
>>
[<Feature: zip_code>,
....
<Feature: MODE(sessions.device)>,
<Feature: MODE(transactions.sessions.device)>,
...
]
After inspecting how each feature is calculated with graph_feature(), it looks like MODE(sessions.device) and MODE(transactions.sessions.device) are the same, even though they are calculated in different ways. If I'm right, why does dfs calculate this redundantly?
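To make the comparison concrete, here is a minimal sketch in plain Python of how I understand the two calculations. It does not use featuretools: the records, the `mode` helper, and both function names are hypothetical, and the data is made up. The assumption is that MODE(sessions.device) counts one device value per session, while MODE(transactions.sessions.device) counts one device value per transaction, taken from the transaction's parent session.

```python
from collections import Counter

# Made-up data: each session belongs to one customer and has one device.
sessions = {
    1: {"customer_id": 1, "device": "desktop"},
    2: {"customer_id": 1, "device": "desktop"},
    3: {"customer_id": 1, "device": "mobile"},
    4: {"customer_id": 2, "device": "tablet"},
}

# Made-up data: each entry is the parent session id of one transaction.
transaction_session_ids = [1, 2, 3, 3, 3, 4, 4]


def mode(values):
    """Most frequent value (hypothetical helper; ties are not handled specially)."""
    return Counter(values).most_common(1)[0][0]


def mode_sessions_device(customer_id):
    # One vote per session that belongs to the customer.
    return mode(
        s["device"] for s in sessions.values() if s["customer_id"] == customer_id
    )


def mode_transactions_sessions_device(customer_id):
    # One vote per transaction, using the device of its parent session.
    return mode(
        sessions[sid]["device"]
        for sid in transaction_session_ids
        if sessions[sid]["customer_id"] == customer_id
    )


for cid in (1, 2):
    print(cid, mode_sessions_device(cid), mode_transactions_sessions_device(cid))
```

Under that assumption, the two values agree for customer 2 but differ for customer 1, whose single mobile session holds most of the transactions. So whether the two features match would depend on how transactions are spread across sessions.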