
I am doing my Master's thesis on Multimodal Emotion Recognition; more precisely, I want to apply knowledge distillation from a multimodal teacher model to a unimodal student model. I am using the Keras Distiller() class (https://keras.io/examples/vision/knowledge_distillation/) and following the instructions at the end of that page. However, my teacher model takes as input the tensor resulting from concatenating text features (text_shape = (1038, 33, 600)) with audio features (audio_shape = (1038, 33, 600)), giving final_shape = (1038, 33, 1200), while my student model takes as input a tensor of shape (1038, 33, 100), and my label shape is label_shape = (1038, 33, 3). As you can see, the inputs to my two models differ, but the Distiller() class is written assuming the same input for both models, and that is what I am trying to change.
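For reference, here is roughly how the shapes fit together (a minimal sketch with random dummy data; the names are placeholders, not my real feature extractors):

import numpy as np

n_samples, seq_len = 1038, 33
x_text = np.random.rand(n_samples, seq_len, 600).astype("float32")     # text features
x_audio = np.random.rand(n_samples, seq_len, 600).astype("float32")    # audio features
x_teacher = np.concatenate([x_text, x_audio], axis=-1)                 # (1038, 33, 1200)
x_student = np.random.rand(n_samples, seq_len, 100).astype("float32")  # (1038, 33, 100)
y_train = np.random.rand(n_samples, seq_len, 3).astype("float32")      # (1038, 33, 3)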

The first thing I tried to change in the Keras class was the beginning of def train_step, from this:

def train_step(self, data):
        # Unpack data
        x, y = data

to this:

def train_step(self, data):
        # Unpack data
        x, y = data
        x_teacher = x[0]
        x_student = x[1]

and then replace the variable x in the rest of the code with x_teacher in the parts concerning the teacher model and with x_student in the parts concerning the student model.
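For completeness, here is roughly what my whole modified train_step looks like (a sketch that follows the structure of Distiller.train_step from the linked Keras example; only the unpacking and the two forward passes changed):

import tensorflow as tf

def train_step(self, data):
        # Unpack data: x holds both inputs, y holds the labels
        x, y = data
        x_teacher = x[0]
        x_student = x[1]

        # Forward pass of the teacher (inference only)
        teacher_predictions = self.teacher(x_teacher, training=False)

        with tf.GradientTape() as tape:
            # Forward pass of the student
            student_predictions = self.student(x_student, training=True)

            # Hard-label loss and soft-label (distillation) loss
            student_loss = self.student_loss_fn(y, student_predictions)
            distillation_loss = self.distillation_loss_fn(
                tf.nn.softmax(teacher_predictions / self.temperature, axis=-1),
                tf.nn.softmax(student_predictions / self.temperature, axis=-1),
            )
            loss = self.alpha * student_loss + (1 - self.alpha) * distillation_loss

        # Backpropagate through the student only
        trainable_vars = self.student.trainable_variables
        gradients = tape.gradient(loss, trainable_vars)
        self.optimizer.apply_gradients(zip(gradients, trainable_vars))

        # Update metrics and report both losses
        self.compiled_metrics.update_state(y, student_predictions)
        results = {m.name: m.result() for m in self.metrics}
        results.update({"student_loss": student_loss, "distillation_loss": distillation_loss})
        return results

Then I ran the following code to train my student model: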

text_distilled = Distiller(student=text_student, teacher=bimodal_teacher)
text_distilled.compile(
    optimizer=keras.optimizers.Adam(),
    metrics=[keras.metrics.SparseCategoricalAccuracy()],
    student_loss_fn=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    distillation_loss_fn=keras.losses.KLDivergence(),
    alpha=0.1,
    temperature=10,
)

both_inputs = [x_teacher, x_student]

text_distilled.fit(both_inputs, y_train, epochs=3)

and I got this error:

ValueError: Can not squeeze dim[2], expected a dimension of 1, got 3 for '{{node Squeeze}} = Squeeze[T=DT_FLOAT, squeeze_dims=[-1]](IteratorGetNext:2)' with input shapes: [?,33,3].
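From the shapes in the message, I suspect the squeeze comes from the sparse loss/metric: SparseCategoricalCrossentropy and SparseCategoricalAccuracy expect integer class indices of shape (1038, 33) (or with a trailing dimension of 1), while my labels of shape (1038, 33, 3) look one-hot encoded. A minimal sketch of the two label conventions (assuming my labels really are one-hot):

import numpy as np
import tensorflow as tf

logits = tf.random.normal((4, 33, 3))  # dummy model outputs

# One-hot labels, shape (4, 33, 3): use the non-sparse loss
y_onehot = np.eye(3, dtype="float32")[np.random.randint(0, 3, size=(4, 33))]
print(tf.keras.losses.CategoricalCrossentropy(from_logits=True)(y_onehot, logits))

# Integer labels, shape (4, 33): what the sparse loss expects
y_sparse = np.argmax(y_onehot, axis=-1)
print(tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)(y_sparse, logits))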

After that, I resorted to ChatGPT, which told me to change the beginning of def train_step to this:

def train_step(self, data):
        # Unpack data
        x_teacher, x_student, y = data

Then I compiled the distiller again and called the fit() function:

text_distilled = Distiller(student=text_student, teacher=bimodal_teacher)
text_distilled.compile(
    optimizer=keras.optimizers.Adam(),
    metrics=[keras.metrics.SparseCategoricalAccuracy()],
    student_loss_fn=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    distillation_loss_fn=keras.losses.KLDivergence(),
    alpha=0.1,
    temperature=10,
)

text_distilled.fit((x_teacher, x_student, y_train), epochs=3)

But I got this error:

ValueError: not enough values to unpack (expected 3, got 1)
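In case it helps diagnose this: as far as I understand, fit() only hands train_step data packed as (x,), (x, y), or (x, y, sample_weight), so a three-element tuple passed as the single positional x argument arrives as one nested value, not three, hence the unpack failure. A sketch of the packing I believe was intended (keeping the x, y = data unpacking from my first attempt):

# Both model inputs go into x as a nested structure; labels go into y.
# Inside train_step this arrives as x, y = data, with x = (x_teacher, x_student).
text_distilled.fit(x=(x_teacher, x_student), y=y_train, epochs=3)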

I don't know what else to try, nor how to solve the errors in either situation. Any help?
