I am trying to create a custom TransformPrimitive in Featuretools to calculate rolling statistics like the rolling sum or mean.
This article explains well how to go about such a task using Pandas. It shows how to get things running when the 'window' parameter is the number of observations used to calculate the statistic.
However, I want to pass a string instead, so that the window is an offset in days. Conceptually, the line below correctly calculates what I need.
transactions.groupby('ID').rolling(window='10D', on='TransactionDate')[['Quantity','AmountPaid']].sum()
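To make the expected behaviour concrete, here is a small self-contained sketch on made-up data (the values below are invented for illustration, not my real transactions). It uses a DatetimeIndex rather than the `on=` argument, which I would expect to give the same per-ID sums:

```python
import pandas as pd

# Toy data, invented for illustration only.
transactions = pd.DataFrame({
    "ID": [1, 1, 1, 2, 2],
    "TransactionDate": pd.to_datetime(
        ["2021-01-01", "2021-01-05", "2021-01-20", "2021-01-01", "2021-01-03"]
    ),
    "Quantity": [1, 2, 3, 4, 5],
    "AmountPaid": [10.0, 20.0, 30.0, 40.0, 50.0],
})

# Offset windows such as '10D' need a sorted datetime axis.
indexed = (
    transactions
    .sort_values("TransactionDate", kind="stable")
    .set_index("TransactionDate")
)

# Per-ID rolling sum over the preceding 10 days (current row included).
rolled = indexed.groupby("ID")[["Quantity", "AmountPaid"]].rolling("10D").sum()
print(rolled)
```

For ID 1, the row on 2021-01-05 sums the two transactions within 10 days, while the row on 2021-01-20 only contains itself.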
The TransformPrimitive looks as follows:
class RollingSum(TransformPrimitive):
    """Calculates the rolling sum.

    Description:
        Given a list of values, return the rolling sum.
    """
    name = "rolling_sum"
    input_types = [NaturalLanguage, NaturalLanguage]
    return_type = Numeric
    uses_full_entity = True
    description_template = "the rolling sum of {}"

    def __init__(self, window=None, on=None):
        self.window = window
        self.on = on

    def get_function(self):
        def rolling_sum(values):
            """method is passed a pandas series"""
            return values.rolling(window=self.window, on=self.on).sum()
        return rolling_sum
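For comparison, this is a plain pandas sketch of the computation I would like the inner function to perform if it were handed a datetime series and a numeric series. `rolling_sum_by_time` is a hypothetical helper name of my own, not part of Featuretools, and I am only assuming that a primitive could receive the two columns this way:

```python
import numpy as np
import pandas as pd


def rolling_sum_by_time(dates: pd.Series, values: pd.Series, window: str = "10D") -> pd.Series:
    """Hypothetical helper: time-based rolling sum of `values`, ordered by `dates`."""
    # Offset windows need a monotonic datetime index, so sort by date first.
    order = np.argsort(dates.to_numpy(), kind="stable")
    indexed = pd.Series(
        values.to_numpy()[order],
        index=pd.DatetimeIndex(dates.to_numpy()[order]),
    )
    rolled = indexed.rolling(window).sum()

    # Put the results back in the original row order so they align with the inputs.
    out = np.empty(len(values), dtype="float64")
    out[order] = rolled.to_numpy()
    return pd.Series(out, index=values.index)


# Usage on made-up, unsorted data.
dates = pd.Series(pd.to_datetime(["2021-01-20", "2021-01-01", "2021-01-05"]))
values = pd.Series([3, 1, 2])
result = rolling_sum_by_time(dates, values, window="10D")
print(result.tolist())
```

The sort and the reordering step are there because the rows may not arrive in date order, and the output has to line up with the input rows.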
I tried to pass the TransactionDate variable from the entityset:
features_defs = ft.dfs(
    entityset=es,
    max_depth=2,
    target_entity='CUSTOMER',
    agg_primitives=['sum'],
    groupby_trans_primitives=[
        RollingSum(window='10D', on=es['TRANSACTION']['TransactionDate'])
    ],
    cutoff_time=label_times,
    cutoff_time_in_index=False,
    include_cutoff_time=False,
    features_only=True
)
However, this did not work. I get an UnusedPrimitiveWarning:
Some specified primitives were not used during DFS: groupby_trans_primitives: ['rolling_sum'] This may be caused by a using a value of max_depth that is too small, not setting interesting values, or it may indicate no compatible variable types for the primitive were found in the data. warnings.warn(warning_msg, UnusedPrimitiveWarning)
Many thanks for your suggestions!