I want to calculate second-order features (max_depth = 2). Because of the entity structure, the feature matrix calculation has to evaluate so many combinations that it takes "years".
Is there a way to specify, via rules or settings, the list of features to be calculated?
I have a Customer table (cid, ...) with fixed data (e.g. birth date) for each customer. Additionally, I have two different tables, A and B (cid, MonthlyReportPeriod, ...), derived from monthly customer behavior (e.g. #orders) on different products. The second-order feature I want is, for instance, the time-weighted sum of orders for each behavior table.
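To make the wanted feature concrete, here is a minimal plain-Python sketch of a time-weighted sum of orders for a single customer. The rows, the cutoff date and the weight function `1 / (1 + months)` are assumptions for illustration only; my actual `WeightTimeUntil` primitive may weight differently.

```python
from datetime import date

def months_between(start: date, end: date) -> int:
    """Whole calendar months from start to end."""
    return (end.year - start.year) * 12 + (end.month - start.month)

def time_weighted_sum(rows, cutoff, weight=lambda m: 1.0 / (1 + m)):
    """Sum of value * weight(months until cutoff) over (period, value) rows.

    The default weight is a hypothetical choice: recent months count more.
    """
    return sum(value * weight(months_between(period, cutoff))
               for period, value in rows)

# Hypothetical rows for one customer in table A: (MonthlyReportPeriod, #orders)
rows_a = [(date(2019, 1, 1), 4), (date(2019, 2, 1), 2), (date(2019, 3, 1), 6)]
cutoff = date(2019, 3, 1)

weighted_orders = time_weighted_sum(rows_a, cutoff)
```

In featuretools terms this corresponds roughly to a transform on MonthlyReportPeriod, multiplied by the orders column, then summed per customer.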
When I calculate the features using just one behavior table (A), more than 1000 features are calculated. I think that when I use both tables in one shot, there are too many combinations.
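A back-of-the-envelope count shows why I suspect the number of combinations. The sketch below assumes that every unordered pair of numeric columns is multiplied once and that every original column and every product is then aggregated by each aggregation primitive. This is only a rough estimate under those assumptions, not how featuretools actually enumerates features, and the function name is hypothetical.

```python
from math import comb

def estimated_feature_count(n_numeric: int, n_agg: int) -> int:
    """Rough estimate: aggregations over columns plus pairwise products."""
    pairwise_products = comb(n_numeric, 2)  # unordered pairs of numeric columns
    return n_agg * (n_numeric + pairwise_products)

# Growth is quadratic in the number of numeric columns.
for n in (10, 20, 40):
    print(n, estimated_feature_count(n, n_agg=2))
```

Under these assumptions, doubling the number of numeric columns roughly quadruples the number of features.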
import featuretools as ft
from featuretools.primitives import MultiplyNumeric

# Customer, A and B are pandas DataFrames loaded beforehand.
es = ft.EntitySet(id = 'ES_PRA')
es = es.entity_from_dataframe(entity_id = 'Customer', dataframe = Customer,
                              index = 'cid')
es = es.entity_from_dataframe(entity_id = 'A',
                              dataframe = A,
                              make_index = True,
                              index = 'AID',
                              time_index = 'MonthlyReportPeriod')
es = es.entity_from_dataframe(entity_id = 'B',
                              dataframe = B,
                              make_index = True,
                              index = 'BID',
                              time_index = 'MonthlyReportPeriod')
#
r_Customer_A = ft.Relationship(es['Customer']['cid'],
                               es['A']['cid'])
es = es.add_relationship(r_Customer_A)
#
r_Customer_B = ft.Relationship(es['Customer']['cid'],
                               es['B']['cid'])
es = es.add_relationship(r_Customer_B)
#
# WeightTimeUntil is assumed to be defined earlier (definition not shown).
seed_features = [
    ft.Feature(es["A"]['MonthlyReportPeriod'], primitive=WeightTimeUntil),
    ft.Feature(es["B"]['MonthlyReportPeriod'], primitive=WeightTimeUntil)
]
features, feature_names = ft.dfs(entityset = es, target_entity = 'Customer',
                                 agg_primitives = ['sum', 'last'],
                                 seed_features = seed_features,
                                 trans_primitives = [MultiplyNumeric],
                                 max_depth = 2)
I would like to specify the aggregation primitives per column in tables A and B:
cid: -
MonthlyReportPeriod: -
(#order): sum, last, sum(WeightTimeUntil/MultiplyNumeric)
(#sales): sum, last, mean, ...
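To illustrate the kind of rule I have in mind, the per-column list can be written as a mapping and used to filter candidate feature names. This is only a sketch: the column names `orders` and `sales` are hypothetical stand-ins for (#order) and (#sales), the helper `is_wanted` is my own, and it only handles simple names of the form `PRIMITIVE(Entity.column)`, not nested ones. As far as I understand, `ft.dfs(..., features_only=True)` returns feature definitions without computing the matrix and each definition has `get_name()`, so a filtered list could then be passed to `ft.calculate_feature_matrix`; I have not verified that this solves the runtime problem.

```python
import re

# Hypothetical column names standing in for "#order" and "#sales".
WANTED = {
    "orders": {"SUM", "LAST"},
    "sales": {"SUM", "LAST", "MEAN"},
}

# Matches simple names such as "SUM(A.orders)"; nested features are not handled.
_NAME = re.compile(r"^(?P<prim>[A-Z_]+)\((?P<entity>\w+)\.(?P<column>\w+)\)$")

def is_wanted(feature_name: str, wanted=WANTED) -> bool:
    """True if the primitive is allowed for the column named in the feature."""
    m = _NAME.match(feature_name)
    if m is None:
        return False
    return m.group("prim") in wanted.get(m.group("column"), set())

# Example candidate names; in practice these would come from the feature definitions.
candidate_names = ["SUM(A.orders)", "MEAN(A.orders)", "MEAN(B.sales)", "LAST(B.cid)"]
selected = [name for name in candidate_names if is_wanted(name)]
```

Columns that do not appear in the mapping (cid, MonthlyReportPeriod) get no aggregation at all, which matches the "-" entries in the list above.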