0

Here is the code to reproduce the issue. The error goes away if I remove the "orders" entity.

import featuretools as ft
import pandas as pd
df = pd.DataFrame({'member_id': ['AAA', 'AAA',  'AAA', 'AAA', 'AAA',  'JJJ', 'JJJ', 'JJJ'],
                   'order_id': ['0001','0001','0001','0002','0002','1111','1111','1111'],
                   'order_datee': ['2011-01-01','2011-01-01','2011-01-01','2014-01-01','2014-01-01','2013-01-01','2013-01-01','2013-01-01'],
                   'member_join_datee': ['2011-01-01','2011-01-01','2011-01-01','2011-01-01','2011-01-01','2012-01-01','2012-01-01','2012-01-01'],
                   'goods_no':['id1','id2','id3','id4','id5','id6','id7','id8'],
                   'amount': [1,     2,      4,     8,      16,     32,   64,     128],
                   'order_amount': [7,     7,      7,     24,      24,     224,   224,     224],
                   'member_lv': [1,     1,      1,     1,      1,     2,   2,     2]})
df


es = ft.EntitySet(id="abc")
es.entity_from_dataframe("purchases",
                         dataframe = df,
                         index = "purchases_index",
                         time_index = 'order_datee',
                         variable_types = {'order_datee': ft.variable_types.Datetime,
                                           'member_join_datee': ft.variable_types.Datetime,
                                           'amount': ft.variable_types.Numeric,
                                           'order_amount': ft.variable_types.Numeric,
                                           'member_lv': ft.variable_types.Numeric,
                                           })

es.normalize_entity(new_entity_id='members',
                    base_entity_id='purchases',
                    index='member_id',
                    make_time_index = 'member_join_datee',
                    additional_variables=['member_join_datee','member_lv'])

es.normalize_entity(new_entity_id='orders',
                    base_entity_id='purchases',
                    index='order_id',
                    make_time_index = 'order_datee',
                    additional_variables=['order_datee','order_amount'])

fm,features = ft.dfs(entityset=es, target_entity='members')

Running this raises:

Traceback (most recent call last):
  File "/.../python3.6/site-packages/featuretools/entityset/entityset.py", line 1204, in _import_from_dataframe
    raise LookupError('Time index not found in dataframe')
bunbun
alan
  • Please read how to write a [Minimal, Verifiable, and Complete Example](https://stackoverflow.com/help/mcve) and [How to Ask](https://stackoverflow.com/help/how-to-ask). – 3ocene Dec 13 '18 at 02:15

2 Answers

2

The problem is the line additional_variables=['order_datee','order_amount']. This moves the order_datee column from the purchases entity to the orders entity, so purchases no longer has the column it uses as its time index. To copy the column to the orders entity without removing it from the purchases entity, use copy_variables instead. For example:

es.normalize_entity(new_entity_id='orders',
                    base_entity_id='purchases',
                    index='order_id',
                    make_time_index = 'order_datee',
                    copy_variables=["order_datee"],
                    additional_variables=['order_amount'])

After I make that change your code runs for me.
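
The difference between moving and copying a column can be pictured with a toy model in plain Python. This is only a sketch and not featuretools code: `normalize` is a hypothetical helper that tracks column names only, under the assumption (as described above) that `additional_variables` removes columns from the base entity while `copy_variables` leaves them in place.

```python
# Toy model (NOT featuretools code): track only column names to show
# why moving the time index column out of the base entity breaks it.

def normalize(base_columns, index, additional_variables=(), copy_variables=()):
    """Hypothetical helper: return (remaining base columns, new entity columns).

    Assumption: additional_variables are moved out of the base entity,
    copy_variables are duplicated and stay in the base entity.
    """
    moved = set(additional_variables)
    new_columns = [index] + list(additional_variables) + list(copy_variables)
    remaining = [c for c in base_columns if c not in moved]
    return remaining, new_columns


purchases = ["order_id", "order_datee", "order_amount", "amount"]
time_index = "order_datee"  # time index of the purchases entity

# Original call: order_datee is moved, so purchases loses its time index.
remaining_moved, orders_moved = normalize(
    purchases, "order_id",
    additional_variables=["order_datee", "order_amount"])
moved_ok = time_index in remaining_moved

# Fixed call: order_datee is copied, so both entities keep the column.
remaining_copied, orders_copied = normalize(
    purchases, "order_id",
    additional_variables=["order_amount"],
    copy_variables=["order_datee"])
copied_ok = time_index in remaining_copied

print("moved: ", remaining_moved, "time index present:", moved_ok)
print("copied:", remaining_copied, "time index present:", copied_ok)
```

In the moved case the base entity's column list no longer contains `order_datee`, which mirrors the "Time index not found in dataframe" error; in the copied case the column is present in both lists.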

Max Kanter
-1

The problem goes away after removing 'order_datee' from both make_time_index and additional_variables in the normalize_entity call for the 'orders' entity.

alan