Questions tagged [featuretools]

Featuretools is an open source Python library for automated feature engineering on tabular, relational datasets using a technique called Deep Feature Synthesis.

221 questions
1 vote, 1 answer

How to get trans_primitives of highest entity in featuretools?

In the classic mock customer dataset example in featuretools, how do I derive trans_primitives such as month, day, and year from the transaction_time attribute of the transactions entity? import featuretools as ft es =…
Milind Dalvi
1 vote, 1 answer

How to implement custom naming for multioutput primitives in FeatureTools

As of v0.12.0, FeatureTools allows you to assign custom names to multi-output primitives: https://github.com/alteryx/featuretools/pull/794. By default, when you define custom multi-output primitives, the column names for the generated…
1 vote, 1 answer

featuretools: how can I apply `time_since`, `time_since_first` primitives on integer type of time index?

When the time index is an integer (e.g. starting from 0 for each user), running dfs shows warnings: UnusedPrimitiveWarning: Some specified primitives were not used during DFS: agg_primitives: ['avg_time_between', 'time_since_first', 'time_since_last',…
user3595632
1 vote, 1 answer

featuretools: manual derivation of the features generated by dfs?

Code example: import featuretools as ft es = ft.demo.load_mock_customer(return_entityset=True) # Normalized one more time es = es.normalize_entity( new_entity_id="device", base_entity_id="sessions", index="device", ) feature_matrix,…
user3595632
1 vote, 1 answer

calculate time-windowed profiles with featuretools dfs

I am having trouble understanding the cutoff_dates concept. What I am really looking for is calculating different features over a time window of, say, 60 days back (without the current transaction). The cutoff_dates look like hard-coded dates…
user1450410
1 vote, 1 answer

Use FeatureTools to aggregate monthly data from daily

I'm trying to use FeatureTools to create a dataset for use in customer churn analysis. I have a raw dataset of orders that includes fields like customer_id, order_id, order_month, order_datetime, and order_cost. I'd like to create a dataset that returns…
kevin.w.johnson
1 vote, 1 answer

featuretools: why does dfs() do redundant calculation?

Below is the example code from the official docs: import featuretools as ft es = ft.demo.load_mock_customer(return_entityset=True) feature_matrix, feature_defs = ft.dfs( entityset=es, target_entity="customers", agg_primitives=["sum",…
user3595632
1 vote, 1 answer

How do I create features in featuretools for rows with the same id and a time index?

I have a DataFrame like this: data = {'Customer':['C1', 'C1', 'C1', 'C2', 'C2', 'C2', 'C3', 'C3', 'C3'], 'NumOfItems':[3, 2, 4, 5, 5, 6, 10, 6, 14], 'PurchaseTime':["2014-01-01", "2014-01-02", "2014-01-03","2014-01-01", "2014-01-02",…
1 vote, 1 answer

FeatureTools - How to Add 2 Columns Together?

I'm stuck. Using Featuretools, all I want to do is create a new column that sums two columns from my dataset, creating a "stacked" feature of sorts, and to do this for all columns in my dataset. My code looks like this: # Define the function def…
wildcat89
1 vote, 1 answer

Use FeatureTools on specific columns only

I am trying to use Featuretools to generate some new features using only some specified columns of the Titanic dataset. In my case I want to apply the transforms 'add_numeric' and 'multiply_numeric' to Age, Pclass and log10splitfare. I have followed the…
Leo Torres
1 vote, 1 answer

Featuretools: skip the target feature

When using Featuretools, is it possible to skip the target feature? For example, consider the iris dataset: data =…
user13411021
1 vote, 1 answer

Featuretools: Using features calculated in train data on new data

I was wondering how to use features developed at training time for prediction on new data. The dataset in question is the appointment cancellation dataset from Predict appointment no show, GitHub. Consider the feature locations.PERCENT_TRUE(no_show):…
1 vote, 1 answer

What is the proper way of using featuretools for single table data?

Assume that I have a dataset consisting of a single table, for instance the Titanic dataset on Kaggle. What is the proper way of using Featuretools to get the most benefit from it, given that Featuretools is designed especially for relational data? Now by…
1 vote, 1 answer

How to create new variables by multiple ids in featuretools?

I have a dataset that has one row per member and per transaction, and there are different stores the purchase could have come from ('brand_id'). I want to use featuretools to make output that would have one row per member, with an aggregate of…
1 vote, 1 answer

How to add multiple relationships to a new entity?

I am trying to add multiple relationships simultaneously to an entityset that I created. I use the following code: import featuretools as ft data = ft.demo.load_mock_customer() customers_df = data["customers"] sessions_df =…
figs_and_nuts