Federated Learning - Why does the global model's loss go to 10k+?

Question

I am currently working on a personalized federated learning project. Everything has been going well: on a binary classification task I am getting an average local model accuracy of around 95% and a global model accuracy of around 93%. Recall and precision are also both good. The exact details of the model can be found in this paper.

However, the loss tells a different story: the local models sit at around 0.05 to 0.1, but the global model's loss goes as high as 10k. Each time the global model aggregates the weights from the local models, its loss climbs higher. Why is that?
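The aggregation code is not shown here. Purely as a point of reference, this is a minimal sketch of a FedAvg-style weighted average, assuming each client's weights come from Keras's `model.get_weights()` as a list of NumPy arrays. The names `fedavg`, `client_weights`, and `client_sizes` are hypothetical and not taken from the project.

```python
import numpy as np


def fedavg(client_weights, client_sizes):
    """Average per-layer weights across clients, weighted by sample count.

    client_weights: list (one entry per client) of lists of NumPy arrays,
                    e.g. what Keras `model.get_weights()` returns.
    client_sizes:   number of training samples held by each client.
    Both names are illustrative assumptions, not part of the original project.
    """
    if len(client_weights) != len(client_sizes):
        raise ValueError("need exactly one size per client")
    total = float(sum(client_sizes))
    if total <= 0:
        raise ValueError("total number of samples must be positive")

    averaged = []
    for layer_idx in range(len(client_weights[0])):
        # Accumulate in float64 to limit rounding error across many clients.
        acc = np.zeros_like(client_weights[0][layer_idx], dtype=np.float64)
        for weights, size in zip(client_weights, client_sizes):
            # Coefficients size / total sum to 1 over all clients.
            acc += (size / total) * weights[layer_idx]
        averaged.append(acc)
    return averaged


# Two toy "clients", each with a single one-layer weight list.
demo = fedavg(
    [[np.array([1.0, 2.0])], [np.array([3.0, 4.0])]],
    [1, 3],
)
```

Because the coefficients sum to 1, the aggregated weights stay on the same scale as the local ones. With Keras, the result could then be loaded into the global model through `set_weights`.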

This is my current model architecture:

import tensorflow as tf
from tensorflow.keras import Sequential, layers

def createModel(shape=(100, 100, 3)):
  # Fix the seed so weight initialization is reproducible across clients.
  tf.random.set_seed(3333)
  model = Sequential()
  # Three convolutional blocks for feature extraction.
  model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=shape))
  model.add(layers.MaxPooling2D((2, 2)))
  model.add(layers.Conv2D(64, (3, 3), activation='relu'))
  model.add(layers.MaxPooling2D((2, 2)))
  model.add(layers.Conv2D(64, (3, 3), activation='relu'))
  # Classifier head: a single sigmoid unit for the binary label.
  model.add(layers.Flatten())
  model.add(layers.Dense(64, activation='relu'))
  model.add(layers.Dropout(0.3))
  model.add(layers.Dense(1, activation='sigmoid'))

  model.compile(optimizer='adam',
                loss='binary_crossentropy',
                metrics=[tf.keras.metrics.Recall(),
                         tf.keras.metrics.Precision(),
                         # Remaining AUC arguments are left at their defaults.
                         tf.keras.metrics.AUC(num_thresholds=200,
                                              curve='ROC',
                                              summation_method='interpolation'),
                         'accuracy'])
  return model

Dataset details: Set of around 20,000 pictures of people either wearing masks or without masks.

Is this normal in a personalized federated learning environment? If not, any suggestions on where to look? Could it possibly be the model?

I have spent around 5 hours trying different hyperparameters, and they all gave the same results.
