Questions tagged [featuretools]

Featuretools is a Python library for automated feature engineering on relational datasets using a technique called Deep Feature Synthesis.

Featuretools is an open-source Python library for automated feature engineering on tabular relational datasets.


221 questions
2
votes
1 answer

Custom Aggregation Primitives With Additional Arguments?

The transform primitive works fine with additional arguments. Here is an example def string_count(column, string=None): ''' ..note:: this is a naive implementation used for clarity ''' assert string is not None, "string to count…
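The function in the excerpt above is cut off, so the following is only a minimal sketch of the same idea in plain Python, not the answer's actual code. It assumes the column is an iterable of strings, counts non-overlapping occurrences, and raises `ValueError` instead of using the original `assert`; wiring it into Featuretools as a primitive is left out on purpose.

```python
def string_count(column, string=None):
    """Count non-overlapping occurrences of `string` in each value of `column`.

    Assumption: `column` is an iterable of str values (not a pandas Series).
    """
    if string is None:
        # The extra argument is required; fail early with a clear message.
        raise ValueError("string must be provided")
    return [value.count(string) for value in column]


# Hypothetical sample data, used only for illustration.
counts = string_count(["the cat", "the the", ""], string="the")
```

The extra argument is passed by keyword here, which mirrors the `string=None` default in the excerpt.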
2
votes
1 answer

How to use FeatureTools to generate new features by crossing features in a table?

Feature crossing is a common technique for finding nonlinear relationships in a dataset. How can I use Featuretools to generate new features by crossing features in a table?
Mark Lin
  • 55
  • 4
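To make "feature crossing" concrete, here is a minimal sketch in plain Python that multiplies every pair of numeric columns. It deliberately does not use Featuretools; the helper name `cross_features`, the column names, and the sample rows are all hypothetical. Featuretools does ship transform primitives such as `multiply_numeric` that may serve a similar purpose, but that is not shown here.

```python
from itertools import combinations


def cross_features(rows, columns):
    """Return copies of `rows` with one product feature per pair of `columns`.

    Assumption: every row is a dict holding numeric values for all `columns`.
    """
    crossed = []
    for row in rows:
        new_row = dict(row)  # keep the original features
        for a, b in combinations(columns, 2):
            new_row[f"{a}*{b}"] = row[a] * row[b]
        crossed.append(new_row)
    return crossed


# Hypothetical sample data, used only for illustration.
rows = [
    {"x": 2.0, "y": 3.0, "z": 0.5},
    {"x": 1.0, "y": 4.0, "z": 2.0},
]
crossed = cross_features(rows, ["x", "y", "z"])
```

Each output row keeps `x`, `y`, and `z` and gains the keys `x*y`, `x*z`, and `y*z`.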
2
votes
2 answers

featuretools: How to properly generate features for a regression task

I want to try featuretools, but I need a hint on how to use it with my dataset. I have data in a pandas dataframe, and it is a regression problem. Here is an example of my dataset: What I tried: import featuretools as ft es = ft.EntitySet(id =…
Vadim
  • 4,219
  • 1
  • 29
  • 44
2
votes
1 answer

Featuretools dfs runtime error

Working through the featuretools "predict_next_purchase" demo against my own data. I've created the entity set, and have also created a new pandas.dataframe comprised of the labels and times. I'm to the point of using ft.dfs for deep feature…
Nick Bernini
  • 121
  • 4
2
votes
1 answer

Generate labels for predictive model using featuretools

I'm currently working through the Featuretools demo (https://github.com/Featuretools/predict_next_purchase/blob/master/Tutorial.ipynb) using my own data. I've created an entity set, and am trying to first create the labels. The notebook…
Nick Bernini
  • 121
  • 4
2
votes
1 answer

Using dfs and calculate_feature_matrix?

You could use ft.dfs to get back feature definitions as input to ft.calculate_feature_matrix or you could just use ft.dfs to compute the feature matrix. Is there a recommended way of using ft.dfs and ft.calculate_feature_matrix for best practice?
Jeff Hernandez
  • 2,063
  • 16
  • 20
2
votes
1 answer

What kind of feature vectors does featuretools / DFS generate?

Are the feature vectors generated by featuretools/DFS dense or sparse or does it depend on something?
Henry Thornton
  • 4,381
  • 9
  • 36
  • 43
1
vote
1 answer

How to show every primitives in featuretools

I want to list every built-in primitive in Featuretools without skip("..."). I know I can use list_primitives() but I don't know how to show everything. import featuretools as ft print(ft.primitives.list_primitives()) #show list of primitives with…
iSdWiSoWt
  • 15
  • 5
1
vote
1 answer

Can we use Feature Engineering tools without any IDENTIFIER?

My target feature (frame strength) is not a unique value. I have train and test datasets. How can I approach using Featuretools? My dataset's features are temperature, hive size, some percentile values, some entropy, different Pixel, Frame size etc. I tried to…
HMI
  • 11
  • 1
1
vote
1 answer

Featuretools group by issue

I have a set of dataframes/entity set for rugby league/sports data: players, teams, venues, games, team_stats and player_stats players: player_id, player_name teams: team_id, team_name games: game_id, venue_id venues: venue_id,…
apapa2234
  • 11
  • 1
1
vote
1 answer

Error trying to apply app_types to logical_types of add_dataframe at once

app_types {'TARGET': Boolean, 'FLAG_MOBIL': Boolean, 'FLAG_EMP_PHONE': Boolean, 'FLAG_WORK_PHONE': Boolean, 'FLAG_CONT_MOBILE': Boolean, 'FLAG_PHONE': Boolean, 'FLAG_EMAIL': Boolean, 'REG_REGION_NOT_LIVE_REGION': Boolean, …
I guaranteed
  • 259
  • 1
  • 9
1
vote
1 answer

Use primitive_options on Featuretools to calc feature_matrix

I have a dataset with more than 30,000 rows like the picture below and need to generate some features with the featuretools library. import pandas as pd import featuretools as ft # Read in the full dataset df_data =…
Peter29
  • 21
  • 4
1
vote
1 answer

Enable/force featuretools to use 2 or more columns to group by

How can I enable or force featuretools to create groupby features that use 2 or more columns as the group-by keys? For example, given columns x, y, z, how do I set groupby_primitives_options or something similar to get the feature func(func(x) groupby y, z)?
1
vote
1 answer

Features Created by FeatureTools Build Inconsistent Models

I have an imbalanced dataset which has 200 million samples from class 0 and 8,000 samples from class 1. I followed two different approaches to build a model. Randomly sample a new dataset with a ratio of 1:4, meaning 32,000 from class 0 and 8,000 from…
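The 1:4 sampling described above can be sketched in plain Python as follows. This is an illustration under assumptions, not the asker's code: the helper name `downsample_majority`, the fixed seed, and the tiny stand-in lists are hypothetical, and the real data would not fit in a Python list this way.

```python
import random


def downsample_majority(majority, minority, ratio=4, seed=0):
    """Keep every minority row and `ratio` times as many majority rows.

    Assumption: both inputs are lists that fit in memory.
    """
    n_keep = min(len(majority), ratio * len(minority))
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    return rng.sample(majority, n_keep), list(minority)


# Tiny stand-ins for the question's classes (8,000 positives would give
# 4 * 8,000 = 32,000 sampled negatives).
class_0 = list(range(1000))
class_1 = list(range(8))
sampled_0, kept_1 = downsample_majority(class_0, class_1, ratio=4)
```

Sampling without replacement keeps the majority rows distinct, which matters if the sampled rows are later used to build features.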
1
vote
0 answers

Featuretools taking too long to build features without using CPU cores

I'm using featuretools Deep Feature Synthesis to build features for a dataset of 40k rows and 200 columns. I chose about 40 transform primitives, as you can see in the code below: feature_matrix, feature_defs = ft.dfs(entityset=es,…