I'm trying to figure out how to implement a weighted cumulative sum primitive for Featuretools. The weighting should depend on time_since_last, like

cum_sum (amount) = sum_{i} exp( -a_{i} ) * amount_{i}

where i indexes rolling 6-month periods.
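As a minimal numeric sketch of the formula above, assume the exponent is a constant rate `a` times the number of elapsed six-month periods. The helper name `decayed_sum`, the sample amounts, and the rate are all hypothetical and only illustrate the arithmetic:

```python
import numpy as np

def decayed_sum(amounts, periods_ago, a):
    """Sum of amounts, each damped by exp(-a * periods_ago)."""
    amounts = np.asarray(amounts, dtype=float)
    periods_ago = np.asarray(periods_ago, dtype=float)
    return float(np.sum(np.exp(-a * periods_ago) * amounts))

# Hypothetical data: three equal amounts that are 0, 1 and 2 periods old.
a = np.log(2)  # assumed decay rate: the weight halves with every period
total = decayed_sum([100.0, 100.0, 100.0], [0, 1, 2], a)
print(total)  # weights are 1, 0.5 and 0.25
```

With this rate the three amounts contribute 100, 50 and 25 respectively.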


The original question is above. After some trial and error I came up with the following code for my purpose, using the data and the initial setup for the entities and relationships from here:

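    import math

    import featuretools as ft
    import numpy as np
    import pandas as pd
    from featuretools.primitives import MultiplyNumeric, make_trans_primitive
    from featuretools.variable_types import Datetime, Numeric

    def weight_time_until(array, time):
        # Time from each timestamp until the cutoff time (positive for past events).
        diff = time - pd.DatetimeIndex(array)
        # Number of whole half-year periods elapsed.
        s = np.floor(diff.days / 365 / 0.5)
        # Decay rate chosen so the weight drops to 0.1 after (aWidth - 1) periods.
        aWidth = 9
        a = math.log(0.1) / (-(aWidth - 1))

        w = np.exp(-a * s)

        return w

    WeightTimeUntil = make_trans_primitive(function=weight_time_until,
                                           input_types=[Datetime],
                                           return_type=Numeric,
                                           uses_calc_time=True,
                                           description="Calc weight using time until the cutoff time",
                                           name="weight_time_until")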


    features, feature_names = ft.dfs(entityset=es, target_entity='clients',
                                     agg_primitives=['sum'],
                                     trans_primitives=[WeightTimeUntil, MultiplyNumeric])

With the above I came close to the feature I want, but in the end I did not get it right, and I do not understand why. I got the feature

SUM(loans.WEIGHT_TIME_UNTIL(loan_start))

but not

SUM(loans.loan_amount * loans.WEIGHT_TIME_UNTIL(loan_start))

What did I miss here?


I tried further.

My guess was a type mismatch, but the types are the same. Anyway, I tried the following:

1) es["loans"].convert_variable_type("loan_amount", ft.variable_types.Numeric)

2) loans["loan_amount_"] = loans["loan_amount"]*1.0

For (1) as well as for (2) I get the more promising feature:

loan_amount_ * WEIGHT_TIME_UNTIL(loan_start)

and also

loan_amount * WEIGHT_TIME_UNTIL(loan_start)

but only when the target entity is loans instead of clients, which was not my intention.
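For comparison, the intended per-client value can be computed directly with pandas, outside of DFS. This is only a sketch under assumptions: the small `loans` frame and the `cutoff` timestamp are hypothetical stand-ins, and elapsed time is measured as cutoff minus `loan_start` so that older loans receive smaller weights.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the loans table; column names follow the question.
loans = pd.DataFrame({
    "client_id": [1, 1, 2],
    "loan_amount": [1000.0, 2000.0, 500.0],
    "loan_start": pd.to_datetime(["2018-01-01", "2017-01-01", "2018-01-01"]),
})
cutoff = pd.Timestamp("2018-01-01")  # assumed cutoff time

def decay_weight(dates, time, a_width=9):
    # Whole half-year periods elapsed between each date and the cutoff.
    diff = time - pd.DatetimeIndex(dates)
    s = np.floor(np.asarray(diff.days, dtype=float) / 365 / 0.5)
    # Rate chosen so the weight reaches 0.1 after (a_width - 1) periods.
    a = np.log(0.1) / (-(a_width - 1))
    return np.exp(-a * s)

# Weight each loan, then aggregate per client.
loans["weighted_amount"] = loans["loan_amount"] * decay_weight(loans["loan_start"], cutoff)
weighted_sum = loans.groupby("client_id")["weighted_amount"].sum()
print(weighted_sum)
```

A result like this can serve as a reference value when checking whatever feature DFS eventually produces for the clients entity.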

1 Answer

This primitive doesn't currently exist. However, you can create your own custom primitive to accomplish this calculation.

Here is an example that calculates a rolling sum. It can be adapted to compute a weighted sum by swapping in the appropriate pandas or Python method:

    from featuretools.primitives import TransformPrimitive
    from featuretools.variable_types import Numeric

    class RollingSum(TransformPrimitive):
        """Calculates the rolling sum.

        Description:
            Given a list of values, return the rolling sum.
        """

        name = "rolling_sum"
        input_types = [Numeric]
        return_type = Numeric
        uses_full_entity = True

        def __init__(self, window=1, min_periods=None):
            self.window = window
            self.min_periods = min_periods

        def get_function(self):
            def rolling_sum(values):
                """method is passed a pandas series"""
                return values.rolling(window=self.window, min_periods=self.min_periods).sum()

            return rolling_sum
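One way the rolling sum might be turned into a weighted one is sketched below in plain pandas. The function name `weighted_rolling_sum`, the exponential weights, and the rate `a` are assumptions for illustration, not part of Featuretools; the inner logic is what the function returned by `get_function` would need to do.

```python
import numpy as np
import pandas as pd

def weighted_rolling_sum(values, window, a):
    """Rolling sum in which a value k steps old is scaled by exp(-a * k)."""
    # Weights ordered oldest -> newest, so the newest value has weight 1.
    weights = np.exp(-a * np.arange(window - 1, -1, -1))

    def weighted(chunk):
        # Early windows are shorter than `window`; use only the newest weights.
        return float(np.dot(chunk, weights[-len(chunk):]))

    return values.rolling(window=window, min_periods=1).apply(weighted, raw=True)

values = pd.Series([1.0, 2.0, 3.0, 4.0])
result = weighted_rolling_sum(values, window=3, a=np.log(2))
print(result.tolist())
```

Here the weights depend on position within the window rather than on timestamps; weighting by actual elapsed time would need the datetime column as a second input.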
Max Kanter
  • Thank you for sharing the code. Actually, while I was getting more used to Featuretools I found that my question did not really describe my target feature. But anyway, is the class above what make_trans_primitive does? – Filip Floegel Jun 27 '19 at 11:37
  • Yep, it is like `make_trans_primitive`, but it is easier to define parameters like `window` and `min_periods` using the class approach. – Max Kanter Jun 27 '19 at 14:01