Let's say I have three entities: parent1 <- child -> parent2. I ran dfs() and got a feature I can't understand: MEAN(child.parent2.MEAN(child.num_feature)). From reading the documentation, I had thought of any_entity.MEAN features as "group by entity, then apply MEAN", but that interpretation doesn't seem to work here.

Sergey Skripko

1 Answer

Deep Feature Synthesis creates new features by "stacking" existing features. To understand this feature, let's go through how it is calculated step by step.

  1. Calculate feature MEAN(child.num_feature) and add it to parent2.
  2. Join that feature (defined on parent2) into child. This creates a new feature parent2.MEAN(child.num_feature) defined on child. Rows of child that have the same value for parent2 will have the same value for this feature.
  3. Group child by parent1 and take the mean of that feature. This creates MEAN(child.parent2.MEAN(child.num_feature)), defined on parent1.
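
The three steps above can be sketched in plain pandas. This is only an illustration of what the feature computes, not how Featuretools implements it internally; the key columns parent1_id and parent2_id and the toy values are assumptions made up for the example.

```python
import pandas as pd

# Hypothetical toy data: each child row points at one parent1 and one parent2.
child = pd.DataFrame({
    "parent1_id": [1, 1, 2, 2],
    "parent2_id": ["a", "b", "a", "a"],
    "num_feature": [10.0, 20.0, 30.0, 50.0],
})

# Step 1: MEAN(child.num_feature), one value per parent2 row.
parent2_mean = child.groupby("parent2_id")["num_feature"].mean()

# Step 2: bring that value back onto child through the parent2 key.
# Child rows sharing a parent2 get the same value.
child["parent2.MEAN(child.num_feature)"] = child["parent2_id"].map(parent2_mean)

# Step 3: group child by parent1 and average the joined column.
stacked = child.groupby("parent1_id")["parent2.MEAN(child.num_feature)"].mean()
print(stacked)
```

So the "group by entity, then apply MEAN" reading still holds; it is simply applied twice, once per parent, with a join in between.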

To help clarify, let's go through a concrete example.

Imagine parent1 was a table of customers, child was a table of transactions by your customers with the column amount, and parent2 was a table of each unique product you sell.

The feature MEAN(transactions.product.SUM(amount)) created for the customers entity could be interpreted as "what is the average total sales of the products this customer purchased?" or, in other words, "does this customer buy products that have sold a lot?"
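
Here is a rough pandas sketch of that customer feature, again only to show the arithmetic rather than the library's actual code path. The column names customer_id and product_id, the helper column product_total_sales, and all of the values are made up for illustration.

```python
import pandas as pd

# Hypothetical transactions: who bought which product, and for how much.
transactions = pd.DataFrame({
    "customer_id": ["alice", "alice", "bob", "bob"],
    "product_id": ["pen", "laptop", "pen", "pen"],
    "amount": [2.0, 900.0, 3.0, 5.0],
})

# Total sales per product across all customers: the inner SUM(amount).
product_total = transactions.groupby("product_id")["amount"].sum()

# Attach each product's total to every transaction of that product.
transactions["product_total_sales"] = transactions["product_id"].map(product_total)

# Per customer, average the totals of the products they purchased: the outer MEAN.
customer_feature = transactions.groupby("customer_id")["product_total_sales"].mean()
print(customer_feature)
```

Note that the outer mean runs over a customer's transactions, so in this sketch a product bought twice by the same customer counts twice in the average.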

Max Kanter