I want to try featuretools, but I need a hint on how to use it with my dataset. My data is in a pandas dataframe, and it is a regression problem.

Here is an example of my dataset: (image not included)

What did I try:

import featuretools as ft
es = ft.EntitySet(id='train_X')
es = es.entity_from_dataframe(entity_id="train_X",
                              dataframe=X,
                              index="Index",
                              variable_types={
                                  "Market": ft.variable_types.Categorical,
                                  "Stock": ft.variable_types.Categorical,
                              })

feature_matrix_customers, features_defs = ft.dfs(entities=es,
                                                 target_entity="y")

And got an error:

KeyError: 'Entity 0 does not exist in train_X'
Vadim
  • Have you tried the getting started guide in the documentation? https://docs.featuretools.com/#minute-quick-start. Let us know if there's a specific place where you're hitting issues or are confused. – Max Kanter Apr 02 '18 at 12:58
  • @MaxKanter sure, I updated my question. – Vadim Apr 03 '18 at 08:11
  • The target entity should be the ID you provide above. Try setting it to `train_X`. – Max Kanter Apr 19 '18 at 14:45

2 Answers

The problem here is likely that you’re trying to use a pandas dataframe directly as the input rather than loading your data into an EntitySet. You should instead create an EntitySet and build features for that. You can also use EntitySet.normalize_entity(...) on that EntitySet to create additional entities to aid feature engineering.
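To make the idea of creating an additional entity more concrete, here is a plain-Python sketch of what normalizing a column amounts to. It is not the featuretools API, only an illustration of the concept: the unique values of one column become a parent table, and the original rows are grouped under them. The `Price` column and the sample rows are made up for the example; only `Index`, `Market`, and `Stock` appear in the question.

```python
from collections import defaultdict

# Hypothetical rows standing in for the question's dataframe X.
rows = [
    {"Index": 0, "Market": "NYSE", "Stock": "AAA", "Price": 10.0},
    {"Index": 1, "Market": "NYSE", "Stock": "BBB", "Price": 20.0},
    {"Index": 2, "Market": "LSE", "Stock": "CCC", "Price": 30.0},
]


def normalize(rows, key):
    """Group child rows under each unique value of `key` (the new parent entity)."""
    parent = defaultdict(list)
    for row in rows:
        parent[row[key]].append(row)
    return dict(parent)


markets = normalize(rows, "Market")

# With a parent entity available, per-parent aggregates become possible,
# for example the mean price of all rows belonging to each market.
mean_price = {
    market: sum(r["Price"] for r in children) / len(children)
    for market, children in markets.items()
}
```

Per-parent aggregates like this mean are, as far as I understand it, the kind of feature that only becomes available once the data has more than one related entity.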

As a note: you will probably want to look into using cutoff_times with this type of data, which lets you specify which data can and can’t be used when generating features.
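The idea behind a cutoff time can be sketched without any library: when computing a feature for a given moment, only rows observed up to that moment are used, so later information cannot leak into the feature. The snippet below is a conceptual illustration, not featuretools code. The `time` and `Price` fields, the sample values, and the helper `mean_price_as_of` are all assumptions, since the question's dataset is only shown as an image.

```python
from datetime import datetime

# Hypothetical timestamped observations (the "time" field is an assumption).
observations = [
    {"Stock": "AAA", "time": datetime(2018, 1, 1), "Price": 10.0},
    {"Stock": "AAA", "time": datetime(2018, 1, 2), "Price": 12.0},
    {"Stock": "AAA", "time": datetime(2018, 1, 3), "Price": 50.0},
]


def mean_price_as_of(observations, stock, cutoff):
    """Mean price of `stock` using only rows at or before `cutoff`."""
    prices = [
        o["Price"]
        for o in observations
        if o["Stock"] == stock and o["time"] <= cutoff
    ]
    return sum(prices) / len(prices) if prices else None


# The 2018-01-03 row lies after the cutoff, so it is ignored.
feature = mean_price_as_of(observations, "AAA", datetime(2018, 1, 2))
```

Whether a row exactly at the cutoff should count is a design choice; this sketch includes it.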

Max Kanter

Try this: pass the EntitySet as entityset and use the entity ID as the target.

feature_matrix_customers, features_defs = ft.dfs(entityset=es, target_entity="train_X")