Could you help me? I am using featuretools with the iris dataset, which has 4 features: f1, f2, f3, f4. When I use ft.dfs, I have three questions.

  1. feature_matrix has too many features. 'divide_by_feature' and 'modulo_numeric' do not act on the original features independently. DFS first applies 'divide_by_feature', which produces 4 new features, and then applies 'modulo_numeric' to both the original features and the new ones. I want each of the two primitives to act only on the original features. How should I do that?
  2. I use transform primitives like trans_primitives = ['subtract_numeric_scalar', 'modulo_numeric']. subtract_numeric_scalar can take a value, but I don't know how to pass it.
  3. How can I use all transform primitives? The default is trans_primitives=None. For now I list them by hand, like trans_primitives = ['is_null','diff',...], but that is tedious.

Could you give me some advice? Thank you!


1 Answer
  1. You can use max_depth to control the complexity of the features. When max_depth=1, each primitive is applied only to the original features, so no primitive is stacked on the output of another.

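    import featuretools as ft

    # `es` is the EntitySet that holds the iris data in an entity named 'data'
    features = ft.dfs(
        entityset=es,
        target_entity='data',
        trans_primitives=['divide_by_feature', 'modulo_numeric'],
        features_only=True,  # return feature definitions without computing values
        max_depth=1,         # do not stack primitives on each other
    )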
    
    [<Feature: f1>,
    <Feature: f2>,
    <Feature: f3>,
    <Feature: f4>,
    <Feature: 1 / f3>,
    <Feature: 1 / f1>,
    <Feature: 1 / f2>,
    <Feature: 1 / f4>,
    <Feature: f1 % f2>,
    <Feature: f4 % f3>,
    <Feature: f4 % f2>,
    <Feature: f1 % f3>,
    <Feature: f2 % f4>,
    <Feature: f4 % f1>,
    <Feature: f3 % f2>,
    <Feature: f3 % f1>,
    <Feature: f2 % f1>,
    <Feature: f3 % f4>,
    <Feature: f2 % f3>,
    <Feature: f1 % f4>]
    
  2. You can create an instance of a primitive with the parameters. This is how you can pass a value to subtract_numeric_scalar.

    import featuretools as ft
    from featuretools.primitives import SubtractNumericScalar
    
    ft.dfs(
        ...
        # pass a primitive instance instead of its name to set the value
        trans_primitives=[SubtractNumericScalar(value=2)]
    )
    
  3. You can use all transform primitives by extracting the names from the primitive list.

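    import featuretools as ft

    # list_primitives() returns a DataFrame with one row per primitive
    primitives = ft.list_primitives()
    primitives = primitives.groupby('type')
    # keep only the transform primitives and collect their names
    transforms = primitives.get_group('transform')
    transforms = transforms.name.values.tolist()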
    
    ['less_than_scalar',
    'divide_numeric',
    'latitude',
    'add_numeric',
    'week',
    'greater_than_equal_to_scalar',
    'and',
    'multiply_numeric_scalar',
    'not',
    'second',
    'greater_than_scalar',
    'modulo_numeric_scalar',
    'scalar_subtract_numeric_feature',
    'diff',
    'day',
    'cum_min',
    'divide_by_feature',
    'less_than_equal_to',
    'time_since',
    'time_since_previous',
    'cum_count',
    'year',
    'is_null',
    'num_characters',
    'equal_scalar',
    'is_weekend',
    'less_than_equal_to_scalar',
    'longitude',
    'add_numeric_scalar',
    'month',
    'less_than',
    'or',
    'multiply_boolean',
    'percentile',
    'minute',
    'not_equal_scalar',
    'greater_than_equal_to',
    'modulo_by_feature',
    'multiply_numeric',
    'negate',
    'hour',
    'cum_max',
    'greater_than',
    'modulo_numeric',
    'subtract_numeric_scalar',
    'isin',
    'cum_mean',
    'divide_numeric_scalar',
    'num_words',
    'absolute',
    'cum_sum',
    'not_equal',
    'weekday',
    'equal',
    'haversine',
    'subtract_numeric']
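
As a sanity check on point 1, the 20 features listed for max_depth=1 can be counted by hand: 4 originals, 4 reciprocals from divide_by_feature, and 12 ordered pairs from modulo_numeric. The sketch below only enumerates names with the standard library. It is not how Featuretools generates features internally, and the variable names (`originals`, `reciprocals`, `modulos`, `depth_one`) are illustrative.

```python
from itertools import permutations

# The four original iris columns from the question.
originals = ["f1", "f2", "f3", "f4"]

# divide_by_feature: one reciprocal per original column.
reciprocals = [f"1 / {f}" for f in originals]

# modulo_numeric: a % b differs from b % a, so count ordered pairs
# of distinct columns.
modulos = [f"{a} % {b}" for a, b in permutations(originals, 2)]

# Everything available when primitives see only the original columns.
depth_one = originals + reciprocals + modulos
print(len(depth_one))  # 4 + 4 + 12 = 20
```

With a larger max_depth, the outputs of one primitive become candidate inputs for the other, which is why the feature count grows so quickly in the question.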
    

Let me know if this helps.

Jeff Hernandez