Questions tagged [featuretools]

Featuretools is a Python library for automated feature engineering on relational datasets using a technique called Deep Feature Synthesis.

Featuretools is an open-source Python library for automated feature engineering on tabular, relational datasets.

221 questions
1
vote
1 answer

Manually define the "where clause" of a seed feature?

When using ft.dfs to get feature definitions, the where_primitives parameter filters values based on interesting variables of an entity. Is it possible to also manually define the "where clause" of a seed feature?
Jeff Hernandez
  • 2,063
  • 16
  • 20
1
vote
1 answer

Data updates with featuretools / DFS

The ML 2.0 and AI PM papers imply that data updates, whether to existing data or new data, happen dynamically (in real time). For example, the AI PM paper says, "Rather, we have demonstrated a complete system that works in the…
Henry Thornton
  • 4,381
  • 9
  • 36
  • 43
0
votes
1 answer

Featuretools failed to load plugin tsfresh from library featuretools_tsfresh_primitives.__init__

I'm trying to get featuretools and featuretools_tsfresh_primitives working in my Jupyter notebook environment. I installed both libraries using conda: conda install -c conda-forge featuretools conda install -c conda-forge…
J. Maria
  • 362
  • 3
  • 14
0
votes
1 answer

AttributeError: Cutoff time DataFrame must contain a column with either the same name as the target dataframe index or a column named "instance_id"

I'm learning how to use Featuretools with this tutorial and I've made it to a snippet which is right below this paragraph: from featuretools.tsfresh import CidCe import featuretools as ft fm, features = ft.dfs( entityset=es, …
J. Maria
  • 362
  • 3
  • 14
0
votes
1 answer

featuretools basic aggregation on time measures

I am using featuretools (1.1x version). I read the docs and also searched here, but I still struggle to find how to do simple things like SELECT MIN(datetime_field_1). I also checked list_primitives(); those related to time do not seem to be what I need. I can…
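For reference, the aggregation this question asks about can be written out in plain Python. This is a minimal sketch of what SELECT MIN(datetime_field_1) grouped by a parent key computes; it is not the Featuretools API, and the rows, the customer_id key, and the helper name min_datetime_per_group are all hypothetical.

```python
from datetime import datetime

# Hypothetical child rows: (customer_id, datetime_field_1).
rows = [
    (1, datetime(2023, 1, 5, 9, 30)),
    (1, datetime(2023, 1, 2, 14, 0)),
    (2, datetime(2023, 2, 1, 8, 15)),
]


def min_datetime_per_group(rows):
    """Return {key: earliest datetime seen for that key}."""
    earliest = {}
    for key, timestamp in rows:
        # Keep the smaller (earlier) timestamp for each key.
        if key not in earliest or timestamp < earliest[key]:
            earliest[key] = timestamp
    return earliest


earliest_by_customer = min_datetime_per_group(rows)
```

A result computed this way can serve as a baseline for checking whatever a datetime-aware aggregation primitive returns on the same data.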
0
votes
1 answer

How to implement Featuretools into my ML Process?

I am exploring the possibility of implementing Featuretools into my pipeline, to be able to create new features from my Df. Currently I am using a GridSearchCV, with a Pipeline embedded inside it. Since Featuretools is creating new features with…
0
votes
0 answers

Featuretools deep feature synthesis doesn't generate features

I'm using 3 datasets to create an EntitySet with featuretools, and deep feature synthesis to generate additional features: entity_set = ft.EntitySet("basketball_players") entity_set.add_dataframe(dataframe_name="player_data", …
Rikki Tikki Tavi
  • 3,089
  • 5
  • 43
  • 81
0
votes
1 answer

IndexError: Index contains null values when adding dataframe to featuretools EntitySet

I have my dataframe which I want to add to EntitySet: Unnamed: 0 Year name Pos Age Tm G GS \ 24672 24672 2017.0 Troy Williams SF 22.0 TOT 30.0 16.0 24675 24675 2017.0 Kyle Wiltjer …
Rikki Tikki Tavi
  • 3,089
  • 5
  • 43
  • 81
0
votes
1 answer

Create integer unique keys in 3 dataframes for rows with same names to generate automatic features using featuretools

I have three different data frames with basketball players' data. In all three dataframes there are basketball players' names. I want to join all three dataframes into one EntitySet to use automatic feature generation using featuretools. As I…
Rikki Tikki Tavi
  • 3,089
  • 5
  • 43
  • 81
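One way to approach the shared-key problem in the question above is to build a single name-to-integer mapping across all tables and then stamp each row with it. The sketch below uses plain lists of dicts rather than dataframes; the player names, the player_id field, and both helper functions are hypothetical, and it assumes identical spellings denote the same player.

```python
def build_name_keys(*tables, name_field="name"):
    """Assign one integer id per distinct name, in first-seen order."""
    name_to_id = {}
    for table in tables:
        for row in table:
            # setdefault only inserts when the name is new, so ids stay stable.
            name_to_id.setdefault(row[name_field], len(name_to_id))
    return name_to_id


def add_key(table, name_to_id, name_field="name", key_field="player_id"):
    """Return copies of the rows with the shared integer key attached."""
    return [{**row, key_field: name_to_id[row[name_field]]} for row in table]


# Hypothetical stand-ins for the three tables.
seasons = [{"name": "Player A", "year": 2017}, {"name": "Player B", "year": 2017}]
salaries = [{"name": "Player B", "salary": 100}, {"name": "Player C", "salary": 200}]
bios = [{"name": "Player A", "height": 200}]

name_to_id = build_name_keys(seasons, salaries, bios)
seasons_keyed = add_key(seasons, name_to_id)
salaries_keyed = add_key(salaries, name_to_id)
bios_keyed = add_key(bios, name_to_id)
```

Because every table draws its key from the same mapping, the same name always receives the same integer, which is what a relationship between tables requires.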
0
votes
2 answers

featuretools: got an error "AttributeError: 'DataFrame' object has no attribute 'ww'"

I am trying to use featuretools[spark] on a pyspark dataframe. My code is below: import featuretools as ft import pyspark.pandas as ps from woodwork.logical_types import Double, Integer ps.set_option("compute.default_index_type", "distributed") id =…
hailee
  • 9
  • 1
0
votes
1 answer

How to compare "raw" joins to the output of deep feature synthesis in Featuretools?

Is it possible to get the results someone would get from deep feature synthesis, but without any aggregations? I have some small datasets, and I want to be able to compare the "processed" outputs of deep feature synthesis with the "raw" joined…
jmatsen
  • 103
  • 3
  • 7
0
votes
1 answer

How do I flatten a featuretools entity set to get wide input format?

I have an entity set with relations defined. Is there a method to get a left joined version of all the data frames in entities as we already have relations? I can merge the dataframes outside using pandas but would like to leverage well defined…
0
votes
2 answers

How to apply featuretools to output of featuretools?

I want to create complex features like (a-b)/c or (a-b)/a. This can be achieved by running Featuretools multiple times, so that the first run creates features like a-b, a+b, or a/b and the next run creates more complex features. As I try to do…
Hyphen
  • 500
  • 1
  • 5
  • 15
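The two-pass idea described in the question above can be illustrated without any library. The sketch below is plain Python, not Featuretools: the first pass builds pairwise differences and the second pass divides each difference by each original column. The function name stack_features and the sample row are hypothetical.

```python
from itertools import permutations


def stack_features(row):
    """Build a - b style features, then divide each by every original column."""
    # First pass: all ordered pairwise differences.
    first = {f"{a} - {b}": row[a] - row[b] for a, b in permutations(row, 2)}

    # Second pass: stack a division on top of each first-pass feature.
    second = {}
    for name, value in first.items():
        for column, denominator in row.items():
            if denominator != 0:  # skip undefined ratios
                second[f"({name}) / {column}"] = value / denominator
    return first, second


row = {"a": 10.0, "b": 4.0, "c": 3.0}
first, second = stack_features(row)
```

The number of second-pass features grows quickly with the number of columns, so in practice the candidate set usually needs pruning.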
0
votes
1 answer

How to create features for multiple datetime columns in featuretools?

Sorry to put three questions in one issue. How to create features for multiple datetime columns? I have a dataframe with multiple datetime columns and hoped to create features like TimeSinceFirst and TimeSinceLast for all of them. But with only one of…
dehiker
  • 454
  • 1
  • 8
  • 21
0
votes
1 answer

When combining features and then aggregating them, featuretools returns some variables that don't make sense. How can this be avoided?

I've got a dataset that contains invoices, with a unique identifier, and customers with a unique identifier. Each customer can have 1 or more invoices. I set up the entity sets as follows: es = ft.EntitySet(id="data") es = es.add_dataframe( …
Sole Galli
  • 827
  • 6
  • 21