Let's say I have 3 entities: parent1 <- child -> parent2. I used dfs() and got a feature I can't understand: `MEAN(child.parent2.MEAN(child.num_feature))`. From reading the documentation, I had thought of any_entity.MEAN features as "group by entity, then apply MEAN", but that interpretation doesn't seem to work here.

Sergey Skripko
1 Answer
Deep Feature Synthesis creates new features by "stacking" existing features. To understand this feature, let’s go through how this is calculated step-by-step.
- Calculate the feature `MEAN(child.num_feature)` and add it to `parent2`.
- Join that feature (defined on `parent2`) into `child`. This creates a new feature, `parent2.MEAN(child.num_feature)`, defined on `child`. Rows of `child` that have the same value for `parent2` will have the same value for this feature.
- Group `child` by `parent1` and take the mean of that feature. This creates `MEAN(child.parent2.MEAN(child.num_feature))`.
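The three steps above can be reproduced by hand. The following is a minimal sketch in plain Python, not Featuretools' actual implementation; the rows and the ids `"a"`, `"b"`, `"x"`, `"y"` are made up for illustration. It also computes the plain "group by `parent1`, then MEAN" feature to show that the stacked feature is a different quantity.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical child rows: (child_id, parent1_id, parent2_id, num_feature)
child = [
    (1, "a", "x", 10.0),
    (2, "a", "y", 20.0),
    (3, "b", "x", 30.0),
    (4, "b", "y", 40.0),
    (5, "b", "y", 60.0),
]

# Step 1: MEAN(child.num_feature), computed per parent2 row
values_by_parent2 = defaultdict(list)
for _, _, p2, num in child:
    values_by_parent2[p2].append(num)
parent2_mean = {p2: mean(vals) for p2, vals in values_by_parent2.items()}

# Step 2: join that feature back onto child, giving
# parent2.MEAN(child.num_feature) for every child row
child_with_feature = [(cid, p1, parent2_mean[p2]) for cid, p1, p2, _ in child]

# Step 3: group child by parent1 and take the mean of the joined feature
values_by_parent1 = defaultdict(list)
for _, p1, feature in child_with_feature:
    values_by_parent1[p1].append(feature)
stacked = {p1: mean(vals) for p1, vals in values_by_parent1.items()}

# For comparison: the plain MEAN(child.num_feature) grouped by parent1
plain_values = defaultdict(list)
for _, p1, _, num in child:
    plain_values[p1].append(num)
plain = {p1: mean(vals) for p1, vals in plain_values.items()}

print(parent2_mean)  # per-parent2 means
print(stacked)       # stacked feature per parent1
print(plain)         # direct mean per parent1, which differs from `stacked`
```

With this data, `parent1` row `"a"` gets 30.0 for the stacked feature but 15.0 for the plain mean, because the stacked feature averages values that were first aggregated over all children of each `parent2`, including children belonging to other `parent1` rows.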
To help clarify, let's go through a concrete example.

Imagine `parent1` was a table of customers, `child` was a table of transactions by your customers with the column `amount`, and `parent2` was a table of each unique product you sell.

The feature `MEAN(transactions.product.SUM(amount))` created for the customers entity could be interpreted as "what is the average total sales of the products this customer purchased", e.g. "does this customer buy products that have sold a lot?"
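To make that interpretation tangible, here is a small hand-rolled sketch of the same calculation in plain Python. The customer names, product names, and amounts are invented for illustration, and the code only mimics what the feature means; it does not call Featuretools.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical transactions: (customer_id, product_id, amount)
transactions = [
    ("alice", "widget", 5.0),
    ("alice", "gadget", 100.0),
    ("bob", "widget", 5.0),
    ("bob", "widget", 10.0),
    ("carol", "gadget", 200.0),
]

# SUM(amount) per product: total sales of each product across all customers
product_total = defaultdict(float)
for _, product, amount in transactions:
    product_total[product] += amount

# For each customer, average the product totals over that customer's transactions
totals_per_customer = defaultdict(list)
for customer, product, _ in transactions:
    totals_per_customer[customer].append(product_total[product])
avg_product_sales = {c: mean(v) for c, v in totals_per_customer.items()}

print(dict(product_total))
print(avg_product_sales)
```

Here bob only buys the low-selling widget, so his value is 20.0, while carol only buys the high-selling gadget and gets 300.0. Note that a product counts once per transaction, so a customer who buys the same product twice weights it twice in the mean.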

Max Kanter