I am working on my Master's thesis in Multimodal Emotion Recognition; more precisely, I want to apply knowledge distillation from a multimodal teacher model to a unimodal student model. I am using the Keras Distiller() class (https://keras.io/examples/vision/knowledge_distillation/) and following the instructions at the end of that page. However, my teacher model takes as input the tensor resulting from the concatenation of text features (text_shape = (1038, 33, 600)) and audio features (audio_shape = (1038, 33, 600)), giving final_shape = (1038, 33, 1200), while my student model takes a tensor of shape (1038, 33, 100). My label shape is label_shape = (1038, 33, 3).
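For a reproducible setup, here are random placeholder arrays with the same shapes as my real features (the real data comes from my own preprocessing, so these are just stand-ins):

import numpy as np

# Random placeholders with the same shapes as my real features
x_text = np.random.rand(1038, 33, 600).astype("float32")
x_audio = np.random.rand(1038, 33, 600).astype("float32")

# Teacher input: text and audio concatenated on the feature axis -> (1038, 33, 1200)
x_teacher = np.concatenate([x_text, x_audio], axis=-1)

# Student input and labels
x_student = np.random.rand(1038, 33, 100).astype("float32")  # (1038, 33, 100)
y_train = np.random.rand(1038, 33, 3).astype("float32")      # (1038, 33, 3)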
As you can see, my two models take different inputs, while the Distiller() class is written to feed the same input to both models; that is what I am trying to change.
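For reference, the relevant part of the original train_step from the linked example looks roughly like this (paraphrased from memory, so it may differ slightly from the page):

def train_step(self, data):
    # Unpack data
    x, y = data

    # The same x is fed to both models
    teacher_predictions = self.teacher(x, training=False)
    with tf.GradientTape() as tape:
        student_predictions = self.student(x, training=True)
        # ... losses, gradient update and metrics as on the linked page ...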
The first thing I tried was to change the beginning of train_step in the Keras class from this:
def train_step(self, data):
    # Unpack data
    x, y = data
to this:
def train_step(self, data):
    # Unpack data
    x, y = data
    x_teacher = x[0]
    x_student = x[1]
and then, in the rest of the method, I replaced the variable x with x_teacher in the parts that use the teacher model and with x_student in the parts that use the student model, as in the sketch below.
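Here is a sketch of what my modified train_step looks like after those substitutions (only the forward passes change; losses, gradients and metrics stay as on the linked page, which I am again paraphrasing from memory):

def train_step(self, data):
    # Unpack data: x is expected to hold [x_teacher, x_student]
    x, y = data
    x_teacher = x[0]
    x_student = x[1]

    # Forward pass of the teacher on the multimodal input
    teacher_predictions = self.teacher(x_teacher, training=False)

    with tf.GradientTape() as tape:
        # Forward pass of the student on the unimodal input
        student_predictions = self.student(x_student, training=True)

        # Losses combined as in the example
        student_loss = self.student_loss_fn(y, student_predictions)
        distillation_loss = self.distillation_loss_fn(
            tf.nn.softmax(teacher_predictions / self.temperature, axis=1),
            tf.nn.softmax(student_predictions / self.temperature, axis=1),
        )
        loss = self.alpha * student_loss + (1 - self.alpha) * distillation_loss

    # Gradients and weight update for the student only
    trainable_vars = self.student.trainable_variables
    gradients = tape.gradient(loss, trainable_vars)
    self.optimizer.apply_gradients(zip(gradients, trainable_vars))
    # ... metric updates and returned results as in the example ...

Then I ran the following code to train my student model: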
text_distilled = Distiller(student=text_student, teacher=bimodal_teacher)
text_distilled.compile(
    optimizer=keras.optimizers.Adam(),
    metrics=[keras.metrics.SparseCategoricalAccuracy()],
    student_loss_fn=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    distillation_loss_fn=keras.losses.KLDivergence(),
    alpha=0.1,
    temperature=10,
)

both_inputs = [x_teacher, x_student]
text_distilled.fit(both_inputs, y_train, epochs=3)
and I got this error:
ValueError: Can not squeeze dim[2], expected a dimension of 1, got 3 for '{{node Squeeze}} = Squeeze[T=DT_FLOAT, squeeze_dims=[-1]](IteratorGetNext:2)' with input shapes: [?,33,3].
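Reading the error, I wonder whether this first failure is about my labels rather than about the two inputs: SparseCategoricalAccuracy and SparseCategoricalCrossentropy expect integer class labels (shape (1038, 33) or (1038, 33, 1)), while my y_train has shape (1038, 33, 3), which looks one-hot encoded. If that reading is right, this minimal snippet (my assumption, not verified against my real pipeline) should hit the same squeeze:

import numpy as np
import tensorflow as tf

m = tf.keras.metrics.SparseCategoricalAccuracy()
y_true = np.zeros((4, 33, 3), dtype="float32")  # same trailing shape as my labels
y_pred = np.zeros((4, 33, 3), dtype="float32")
# Should raise: Can not squeeze dim[2], expected a dimension of 1, got 3
m.update_state(y_true, y_pred)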
After that, I turned to ChatGPT, which told me to change the beginning of train_step to this:
def train_step(self, data):
    # Unpack data
    x_teacher, x_student, y = data
Then, I called the fit() function:
text_distilled = Distiller(student=text_student, teacher=bimodal_teacher)
text_distilled.compile(
    optimizer=keras.optimizers.Adam(),
    metrics=[keras.metrics.SparseCategoricalAccuracy()],
    student_loss_fn=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    distillation_loss_fn=keras.losses.KLDivergence(),
    alpha=0.1,
    temperature=10,
)

text_distilled.fit((x_teacher, x_student, y_train), epochs=3)
But I got this error:
ValueError: not enough values to unpack (expected 3, got 1)
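My current guess (unverified) is that fit() treats the whole tuple as x and, since no y is given, hands train_step a 1-tuple, so the three-way unpack fails. If that is right, I would need to keep the standard (x, y) structure and nest the two inputs inside x, something like:

def train_step(self, data):
    # Keep the standard (x, y) structure; both inputs travel inside x
    x, y = data
    x_teacher, x_student = x
    # ... rest of the method as in my sketch above ...

# called with x as a nested structure:
text_distilled.fit(x=(x_teacher, x_student), y=y_train, epochs=3)

But I am not sure this is the right approach.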
I don't know what else to try or how to solve these errors in either situation. Any help?