How can I get a list of column names in featuretools?

With a pandas DataFrame I just type

dataframe.columns

which returns the column names.

However, I tried to do the same with an entity set and failed. Should I convert the entity set to a DataFrame?

Right now I convert the variables to strings and use regular expressions to extract each variable's name, but I believe there is a better way to do it.

Thank you,

bugfreerammohan

1 Answer

An entity set contains multiple dataframes, one for each entity.

In [1]: import featuretools as ft

In [2]: es = ft.demo.load_retail()

In [3]: es
Out[3]: 
Entityset: demo_retail_data
  Entities:
    order_products [Rows: 401604, Columns: 7]
    customers [Rows: 4372, Columns: 2]
    products [Rows: 3684, Columns: 3]
    orders [Rows: 22190, Columns: 5]
  Relationships:
    order_products.product_id -> products.product_id
    order_products.order_id -> orders.order_id
    orders.customer_name -> customers.customer_name

If you want the variables for the "orders" entity, run

In [4]: es["orders"].variables
Out[4]: 
[<Variable: order_id (dtype = index)>,
 <Variable: country (dtype = categorical)>,
 <Variable: cancelled (dtype = boolean)>,
 <Variable: first_order_products_time (dtype: datetime_time_index, format: None)>,
 <Variable: customer_name (dtype = id)>]
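
If you only need the names as strings rather than the Variable objects, a minimal sketch follows. It assumes the pre-1.0 featuretools API used in this answer, where each Variable carries its column name in `.id`; the helper name `variable_names` is made up for illustration and is not part of featuretools.

```python
def variable_names(entity):
    """Return the column names of an entity as a plain list of strings.

    Assumption: `entity` is a pre-1.0 featuretools Entity, whose
    `variables` attribute is a list of Variable objects that each
    expose the column name as `.id`.
    """
    return [variable.id for variable in entity.variables]


# Usage with the entity set loaded above (needs featuretools < 1.0):
# variable_names(es["orders"])
```

This avoids converting the variables to strings and parsing them with regular expressions.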

If you want to access the underlying dataframe itself, run

In [5]: es["orders"].df.head(5)
Out[5]: 
       order_id         country  cancelled first_order_products_time  customer_name
536365   536365  United Kingdom      False       2010-12-01 08:26:00   Andrea Brown
536366   536366  United Kingdom      False       2010-12-01 08:28:00   Andrea Brown
536367   536367  United Kingdom      False       2010-12-01 08:34:00  Krista Maddox
536368   536368  United Kingdom      False       2010-12-01 08:34:00  Krista Maddox
536369   536369  United Kingdom      False       2010-12-01 08:35:00  Krista Maddox
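
Because `.df` is an ordinary pandas DataFrame, the `dataframe.columns` idiom from the question should work on it directly. A small sketch, again assuming the pre-1.0 API where an entity exposes its data as `.df`; the helper name `dataframe_column_names` is hypothetical.

```python
def dataframe_column_names(entity):
    """Return the column labels of the entity's underlying DataFrame.

    Assumption: `entity.df` is the pandas DataFrame behind a pre-1.0
    featuretools Entity. `list(...)` turns the pandas Index returned
    by `.columns` into a plain Python list.
    """
    return list(entity.df.columns)


# Usage with the entity set loaded above (needs featuretools < 1.0):
# dataframe_column_names(es["orders"])
```

As far as I know, featuretools 1.0 and later replaced entities with dataframes, so there `es["orders"]` would already be the dataframe and `es["orders"].columns` would be the equivalent call.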
Max Kanter