
These are my data sets, which I'm trying to use with Featuretools:

data
   Unit Price    Customer Name  Product Category   Region      Profit  Quantity ordered new    Sales  Order ID
0        2.88  Janice Fletcher   Office Supplies  Central    1.320000                     2     5.90     88525
1        2.84    Bonnie Potter   Office Supplies     West    4.560000                     4    13.01     88522
2        6.68    Bonnie Potter   Office Supplies     West  -47.640000                     7    49.92     88523
3        5.68    Bonnie Potter   Office Supplies     West  -30.510000                     7    41.64     88523
4      205.99    Bonnie Potter        Technology     West  998.202300                     8  1446.67     88523

9426 rows × 8 columns

returns
   Order ID    Status
0        65  Returned
1       612  Returned
2       614  Returned
3       678  Returned
4       710  Returned

1634 rows × 2 columns

users
    Region  Manager
0  Central    Chris
1     East     Erin
2    South      Sam
3     West  William

entities = {
"data" : (data, "Order ID"),
"returns" : (returns, "Status"),
"users" : (users, "Manager")

}

relationships = [
('data', 'Order ID', 'returns', 'Order ID'),
('data', 'Region', 'users', 'Region')

]

combined_table, features_defs = ft.dfs(entities = entities,
                                  relationships = relationships,
                                  target_entity = "Unit Price")

combined_table

This is the error message I'm getting:

AssertionError: Index is not unique on dataframe (Entity data)

Can anyone tell me what I've done incorrectly?

scavesvor

1 Answer


The index values of each entity must be unique. In your data entity, the Order ID index is not unique: 88523, for example, appears in several rows.
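
A quick way to confirm this is to check the candidate index column with pandas before building the entity set. This is only a sketch: the frame below is rebuilt from a few of the rows shown in the question, not from the full 9426-row data set.

```python
import pandas as pd

# A few rows and columns copied from the question's `data` table.
data = pd.DataFrame({
    "Unit Price": [2.88, 2.84, 6.68, 5.68, 205.99],
    "Region": ["Central", "West", "West", "West", "West"],
    "Order ID": [88525, 88522, 88523, 88523, 88523],
})

# A column used as an entity index must not contain repeated values.
order_id_is_unique = data["Order ID"].is_unique
print(order_id_is_unique)

# List every row whose Order ID occurs more than once.
duplicates = data[data["Order ID"].duplicated(keep=False)]
print(duplicates)
```

The same check can be run on `returns["Status"]` and `users["Manager"]`, the other two columns declared as indices in the question.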

Furthermore:

target_entity = "Unit Price"

will not work because you must provide an entity (data, returns or users), not a column of a table/entity. Featuretools generates features for only one table/entity per run, not for all of them.
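
Putting both points together, a corrected setup might look like the sketch below. It rests on several assumptions: the older Featuretools interface used in the question (`entities`, `relationships`, `target_entity`), relationship tuples ordered parent first and child second, and a hypothetical surrogate key column named `row_id` that is not in the original data. The `returns` table is left out because the question does not show whether its Order ID values are unique. The `ft.dfs` call is kept as a comment so that the snippet runs with pandas alone.

```python
import pandas as pd

# Small stand-ins for the question's tables, built from the rows shown there.
data = pd.DataFrame({
    "Unit Price": [2.88, 2.84, 6.68, 5.68, 205.99],
    "Region": ["Central", "West", "West", "West", "West"],
    "Order ID": [88525, 88522, 88523, 88523, 88523],
})
users = pd.DataFrame({
    "Region": ["Central", "East", "South", "West"],
    "Manager": ["Chris", "Erin", "Sam", "William"],
})

# Order ID repeats, so add a surrogate key (hypothetical name) as the index.
data["row_id"] = range(len(data))

# Each entity is declared with a column whose values are unique.
entities = {
    "data": (data, "row_id"),
    "users": (users, "Region"),
}

# Assumed tuple order: (parent entity, parent column, child entity, child column).
relationships = [("users", "Region", "data", "Region")]

# Sanity check that mirrors the assertion Featuretools raised in the question.
for name, (frame, index_col) in entities.items():
    assert frame[index_col].is_unique, f"Index is not unique on entity {name}"

# With Featuretools installed, the call would then name an entity as the target:
# import featuretools as ft
# combined_table, features_defs = ft.dfs(entities=entities,
#                                        relationships=relationships,
#                                        target_entity="data")
```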

Wuuzzaa