
I am trying to bring together the following tutorials:

  1. Creating decision tree by hand
  2. Custom layers via subclassing
  3. Composing Decision Forest and Neural Network models

The goal is to (1) create a custom tree, (2) embed it in a custom layer, and (3) combine it with other layers in a model.

The problem is that in step 1, building the tree with RandomForestBuilder serializes the model to disk, and loading it back yields an object of type keras.saving.saved_model.load.CoreModel.

However, the tutorial for step 3 embeds the tree layer via tfdf.keras.RandomForestModel.

Ideally, the custom layer would build the custom tree by calling RandomForestBuilder in its constructor. However, this is not straightforward, because the builder exports the model and it then has to be loaded back.

The code below raises an error about the structure of the input layer. If the input layer is omitted, it instead raises an error that there is no matching concrete function to call in the loaded SavedModel:

Step 1:

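import tensorflow as tf
import tensorflow_decision_forests as tfdf

# The builder writes the finished model to `path` when it is closed.
builder = tfdf.builder.RandomForestBuilder(
    path="/tmp/manual_model",
    objective=tfdf.py_tree.objective.RegressionObjective(label='tree_result')
)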

Tree = tfdf.py_tree.tree.Tree
SimpleColumnSpec = tfdf.py_tree.dataspec.SimpleColumnSpec
ColumnType = tfdf.py_tree.dataspec.ColumnType
RegressionValue = tfdf.py_tree.value.RegressionValue

NonLeafNode = tfdf.py_tree.node.NonLeafNode
LeafNode = tfdf.py_tree.node.LeafNode
NumericalHigherThanCondition = tfdf.py_tree.condition.NumericalHigherThanCondition
CategoricalIsInCondition = tfdf.py_tree.condition.CategoricalIsInCondition

tree = Tree(
    NonLeafNode(
        condition=CategoricalIsInCondition(
            feature=SimpleColumnSpec(name='feature_name', type=ColumnType.CATEGORICAL),
            mask=['class_1'],
            missing_evaluation=False
        ),
        pos_child = LeafNode(value=RegressionValue(value=0.5)),
        neg_child = LeafNode(value=RegressionValue(value=0.6))
    )
)

builder.add_tree(tree)
builder.close()  # Finalizes the model and exports it to /tmp/manual_model.
custom_tree = tf.keras.models.load_model("/tmp/manual_model")
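
For reference, the tree defined in step 1 encodes a single categorical split. A minimal plain-Python sketch of that decision logic may help when checking what the loaded model should predict. It does not use TFDF at all; `MASK`, `MISSING_EVALUATION` and `predict_tree` are hypothetical names chosen for illustration, and treating `None` as the missing value is an assumption of this sketch.

```python
# Plain-Python mirror of the hand-built tree; not part of the TFDF API.
MASK = {"class_1"}          # categories routed to the positive child
MISSING_EVALUATION = False  # condition result when the feature is missing


def predict_tree(feature_value):
    """Return the regression value of the leaf reached by one example."""
    if feature_value is None:
        # Assumption: None stands in for a missing feature value.
        is_positive = MISSING_EVALUATION
    else:
        is_positive = feature_value in MASK
    # pos_child holds 0.5, neg_child holds 0.6 (as in the Tree above).
    return 0.5 if is_positive else 0.6


predictions = [predict_tree(v) for v in ["class_1", "class_2", None]]
print(predictions)  # [0.5, 0.6, 0.6]
```

With the two examples from the question, `class_1` should map to 0.5 and `class_2` to 0.6.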

Step 2:

class CustomTree(tf.keras.layers.Layer):
  def __init__(self, custom_tree):
    super(CustomTree, self).__init__()
    self.custom_tree = custom_tree

  def call(self, inputs):
    return self.custom_tree(inputs)


input_layer = tf.keras.layers.Input(shape=(None,), name='feature_name', dtype=tf.string)
output_layer = CustomTree(custom_tree)(input_layer)

model = tf.keras.models.Model(input_layer, output_layer, name='SomeModel')

model.predict(tf.data.Dataset.from_tensor_slices(
    {'feature_name': ['class_1','class_2']}
).batch(1))
Konstantin

1 Answer


I found the following solution:

  1. Export the custom tree in Yggdrasil format:

     builder = tfdf.builder.RandomForestBuilder(
         model_format = tfdf.builder.ModelFormat.YGGDRASIL_DECISION_FOREST,
         path=self._temp_directory,
         objective = objective)
    
  2. After closing the builder, which saves the model, load the tree as a CoreModel:

     builder.close()
    
     inspector_lib = tfdf.component.inspector.inspector
     inspector = inspector_lib.make_inspector(self._temp_directory, file_prefix=None)
    
     custom_tree = tfdf.keras.CoreModel(
         task=objective.task,
         learner='MANUAL',
         ranking_group=None,
         temp_directory=self._temp_directory)
    
     custom_tree._set_from_yggdrasil_model(
         inspector,
         self._temp_directory,
         file_prefix=None, 
         input_model_signature_fn=builder._input_model_signature_fn)
    
  3. Create a subclass of tfdf.keras.CartModel with a custom fit function that assigns the custom tree structure to the model:

     class CustomDT(tfdf.keras.CartModel):
         def __init__(self, name=None, **kwargs):
             super(CustomDT, self).__init__(name=name, **kwargs)
    
         def fit(self):
             # create_custom_tree() is not shown here; it is assumed to run
             # steps 1 and 2 and return the resulting CoreModel.
             custom_tree = self.create_custom_tree()
             # No training happens: adopt the state of the prebuilt model.
             self._model = custom_tree._model
             self._is_trained.assign(True)
             self._semantics = custom_tree._semantics
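
The idea in step 3 is that `fit` performs no training and only adopts the state of a model that was already built. Below is a dependency-free sketch of that pattern. All three classes (`PrebuiltModel`, `BaseModel`, `AdoptingModel`) are hypothetical stand-ins, not TFDF classes, and the lambda is only a placeholder for the real tree.

```python
class PrebuiltModel:
    """Stand-in for the CoreModel loaded from the Yggdrasil export."""

    def __init__(self):
        # Placeholder for the real tree: one categorical split.
        self._model = lambda value: 0.5 if value == "class_1" else 0.6
        self._semantics = {"feature_name": "CATEGORICAL"}


class BaseModel:
    """Stand-in for a trainable model class such as tfdf.keras.CartModel."""

    def __init__(self):
        self._model = None
        self._semantics = None
        self._is_trained = False

    def predict(self, values):
        if not self._is_trained:
            raise RuntimeError("model is not trained")
        return [self._model(v) for v in values]


class AdoptingModel(BaseModel):
    def fit(self):
        # Copy the prebuilt model's internals instead of training.
        prebuilt = PrebuiltModel()
        self._model = prebuilt._model
        self._semantics = prebuilt._semantics
        self._is_trained = True
        return self


model = AdoptingModel().fit()
print(model.predict(["class_1", "class_2"]))  # [0.5, 0.6]
```

Because the subclass keeps the base class interface, it can be used wherever the base model is expected, which is what allows the real `CustomDT` to be composed with other layers.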
    
Konstantin