I am trying to reproduce the featuretools tutorial (see link below). I am using the mock data provided with the package, which includes a customers table and a sessions table. Every customer has many sessions, and every session has a session_start timestamp. I compute the mean of the time_since_previous primitive applied to session_start in two ways: a) using featuretools and b) manually. The two results differ. Where am I going wrong?
a) Calculation using featuretools:
import featuretools as ft
es = ft.demo.load_mock_customer(return_entityset=True)
feature_matrix, features_defs = ft.dfs(
    entityset=es,
    target_entity='customers',
    agg_primitives=['mean'],
    trans_primitives=['time_since_previous'])
The MEAN(sessions.TIME_SINCE_PREVIOUS(session_start)) for customer 3 is 888.333333
b) Manual calculation:
time_since_previous(sessions[sessions.customer_id == 3].session_start).tolist()
[nan, 10075.0, 3900.0, 1625.0, 8710.0, 1170.0]
import statistics
statistics.mean([10075.0, 3900.0, 1625.0, 8710.0, 1170.0])
5096.0
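To make the comparison concrete, here is a minimal standard-library sketch on made-up timestamps (not the mock data). The rows and the function names are hypothetical and exist only for illustration. It contrasts two orders of operations: taking the gaps within one customer's sessions and then averaging (what my manual calculation does), versus taking the gaps over the whole sessions table and then averaging only that customer's rows. Whether featuretools uses the second order is an assumption I have not verified.

```python
from datetime import datetime
from statistics import mean

# Hypothetical toy rows (NOT the featuretools mock data):
# (customer_id, session_start), with the two customers interleaved in time.
sessions = [
    (1, datetime(2014, 1, 1, 0, 0, 0)),
    (2, datetime(2014, 1, 1, 0, 10, 0)),
    (1, datetime(2014, 1, 1, 0, 30, 0)),
    (2, datetime(2014, 1, 1, 1, 0, 0)),
    (1, datetime(2014, 1, 1, 1, 10, 0)),
]


def mean_gap_within_customer(rows, customer_id):
    """Filter to one customer first, then take gaps, then the mean."""
    own = sorted(ts for cid, ts in rows if cid == customer_id)
    # The first session has no predecessor, so it contributes no gap.
    gaps = [(b - a).total_seconds() for a, b in zip(own, own[1:])]
    return mean(gaps)


def mean_gap_across_table(rows, customer_id):
    """Take gaps over the whole table, then average this customer's rows."""
    ordered = sorted(rows, key=lambda row: row[1])
    gaps = []
    for previous, current in zip(ordered, ordered[1:]):
        # The gap is measured to the previous session of ANY customer.
        if current[0] == customer_id:
            gaps.append((current[1] - previous[1]).total_seconds())
    return mean(gaps)


within = mean_gap_within_customer(sessions, 1)
across = mean_gap_across_table(sessions, 1)
print(within, across)
```

On these toy rows the first function gives 2100.0 seconds for customer 1 and the second gives 900.0 seconds, so the order of filtering and differencing alone can produce a gap of the kind I see between 5096.0 and 888.333333.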
https://docs.featuretools.com/en/stable/automated_feature_engineering/primitives.html