I'm trying to write a seed feature that produces reward if place == 1 else 0.

place and reward are both ft.variable_types.Numeric:

Entity: results
  Variables:
    id (dtype: index)
    place (dtype: numeric)
    reward (dtype: numeric)

I've tried the following alternatives with no luck:

Alternative 1

roi = (ft.Feature(es['results']['reward'])
       if (ft.Feature(es['results']['place']) == 1)
       else 0).rename('roi')

produces AssertionError: Column "roi" missing frome dataframe when generating the features.

Alternative 2

roi = ((ft.Feature(es['results']['place']) == 1) *
       ft.Feature(es['results']['reward'])).rename('roi')

produces AssertionError: Provided inputs don't match input type requirements when assigning the seed feature.

Alternative 2 should work since in Python:

>>> True * 3.14
3.14
>>> False * 3.14
0.0
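
Applied row by row, the same arithmetic gives the values I expect from the feature. A small plain-Python sketch with made-up sample values (the lists below are illustrative only, not my real data):

```python
# Made-up sample rows standing in for the `place` and `reward` columns.
places = [1, 2, 1, 3]
rewards = [10.0, 5.0, 2.5, 7.0]

# Boolean-times-number form, mirroring Alternative 2.
roi_product = [(p == 1) * r for p, r in zip(places, rewards)]

# Explicit conditional form: reward if place == 1 else 0.
roi_conditional = [r if p == 1 else 0.0 for p, r in zip(places, rewards)]

print(roi_product)                     # [10.0, 0.0, 2.5, 0.0]
print(roi_product == roi_conditional)  # True
```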

The full stack trace:

---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-211-94dd07d98076> in <module>()
     23 
     24
---> 25 roi = ((ft.Feature(es['results']['place']) == 1) * ft.Feature(es['results']['reward'])).rename('roi')

~/dev/venv/lib/python3.6/site-packages/featuretools/feature_base/feature_base.py in __mul__(self, other)
    287     def __mul__(self, other):
    288         """Multiply by other"""
--> 289         return self._handle_binary_comparision(other, primitives.MultiplyNumeric, primitives.MultiplyNumericScalar)
    290 
    291     def __rmul__(self, other):

~/dev/venv/lib/python3.6/site-packages/featuretools/feature_base/feature_base.py in _handle_binary_comparision(self, other, Primitive, PrimitiveScalar)
    230     def _handle_binary_comparision(self, other, Primitive, PrimitiveScalar):
    231         if isinstance(other, FeatureBase):
--> 232             return Feature([self, other], primitive=Primitive)
    233 
    234         return Feature([self], primitive=PrimitiveScalar(other))

~/dev/venv/lib/python3.6/site-packages/featuretools/feature_base/feature_base.py in __new__(self, base, entity, groupby, parent_entity, primitive, use_previous, where)
    755                                                primitive=primitive,
    756                                                groupby=groupby)
--> 757             return TransformFeature(base, primitive=primitive)
    758 
    759         raise Exception("Unrecognized feature initialization")

~/dev/venv/lib/python3.6/site-packages/featuretools/feature_base/feature_base.py in __init__(self, base_features, primitive, name)
    660                                                relationship_path=RelationshipPath([]),
    661                                                primitive=primitive,
--> 662                                                name=name)
    663 
    664     @classmethod

~/dev/venv/lib/python3.6/site-packages/featuretools/feature_base/feature_base.py in __init__(self, entity, base_features, relationship_path, primitive, name, names)
     56         self._names = names
     57 
---> 58         assert self._check_input_types(), ("Provided inputs don't match input "
     59                                            "type requirements")
     60 

AssertionError: Provided inputs don't match input type requirements
Timothy
  • Alternative 2 should work. Can you run `print(es["results"])` and share the output? – Max Kanter Sep 18 '19 at 16:49
  • Please also share the full stack trace. – Max Kanter Sep 18 '19 at 16:50
  • @MaxKanter I've included both the stack trace and the type information. – Timothy Sep 18 '19 at 23:27
  • The issue here is that the multiply primitive only supports Numeric inputs, while it should probably also allow Booleans. I've created an issue in Featuretools for us to fix this; it should be done by the next release. https://github.com/Featuretools/featuretools/issues/752 – Max Kanter Sep 22 '19 at 20:29

1 Answer

This should work on featuretools v0.11.0. Here is an example using a demo dataset. Both unit_price and total are numeric.

import featuretools as ft

es = ft.demo.load_retail(nrows=100)
es['order_products']
Entity: order_products
  Variables:
    ...
    unit_price (dtype: numeric)
    total (dtype: numeric)
    ...

First, create the seed feature.

unit_price = ft.Feature(es['order_products']['unit_price'])
total = ft.Feature(es['order_products']['total'])
seed = ((total == 1) * unit_price).rename('seed')

Then, calculate the feature matrix.

fm, fd = ft.dfs(target_entity='customers', entityset=es, seed_features=[seed])
fm.filter(regex='seed').columns.tolist()[:5]
['SUM(order_products.seed)',
 'STD(order_products.seed)',
 'MAX(order_products.seed)',
 'SKEW(order_products.seed)',
 'MIN(order_products.seed)']

In your case, this would be the seed feature.

place = ft.Feature(es['results']['place'])
reward = ft.Feature(es['results']['reward'])
roi = ((place == 1) * reward).rename('roi')
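
As for Alternative 1, a likely cause (my reading, not verified against the featuretools source) is that Python's `x if cond else y` is evaluated once, when the line runs, rather than per row. The stack trace shows that the result of `ft.Feature(...) == 1` is itself a feature object, and an ordinary Python object is truthy unless it defines otherwise, so the expression would simply pick the reward feature and rename it to `roi`. The sketch below illustrates the mechanism with a hypothetical `LazyFeature` class, which is a stand-in for illustration and not part of the featuretools API:

```python
class LazyFeature:
    """Hypothetical stand-in for a feature object; not the featuretools API."""

    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        # The comparison builds a new lazy object instead of returning a bool.
        return LazyFeature(f"{self.name} = {other}")


place = LazyFeature("place")
reward = LazyFeature("reward")

condition = place == 1

# `condition` defines neither __bool__ nor __len__, so it is always truthy:
# the conditional expression never looks at any row data.
chosen = reward if condition else 0

print(condition.name)    # place = 1
print(bool(condition))   # True
print(chosen is reward)  # True
```

This is why the multiplication form is the one to use here: it stays inside the feature's overloaded operators and is evaluated per row.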

Let me know if that helps.

Jeff Hernandez