
I have been facing the following error while iterating through a range of neuron counts and activation functions. The error is encountered only in the case of PReLU; none of the other activation functions produce the same effect. I can neither deduce any logical reason behind this behaviour nor find any related question on online forums.

If I write this algorithm with only PReLU as the choice of activation function and don't iterate through the rest of the activation functions list, I am able to run the entire code without any problems. When looping, the first iteration runs smoothly, but the error appears on the second iteration of the inner loop, irrespective of the range of neuron counts I define. The loop works fine for ELU, LeakyReLU and ReLU; only PReLU fails, and it works well only if I run the algorithm fresh with PReLU alone.

Please help me point out what I am missing here.

The code was written for a personal project where I am trying to figure out the optimum number of neurons and the best activation function to maximize model accuracy (using only Dense layers) on the MNIST handwritten-digit dataset. It is run after splitting off a dev set from the test set, normalizing the x data and reshaping the y data.
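For context, the variables used below (x_train_model, y_train_cat, x_dev_model, y_dev_cat) come from preprocessing along these lines (a simplified sketch; the split size is a placeholder):

from tensorflow.keras.datasets import mnist
from tensorflow.keras.utils import to_categorical

(x_train, y_train), (x_test, y_test) = mnist.load_data()

# normalize pixel values to [0, 1]
x_train_model = x_train / 255.0
x_test = x_test / 255.0

# carve a dev (validation) split out of the test set (5000 is a placeholder size)
x_dev_model, y_dev = x_test[:5000], y_test[:5000]

# one-hot encode the labels for categorical_crossentropy
y_train_cat = to_categorical(y_train)
y_dev_cat = to_categorical(y_dev)

The main loop is then: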

# imports (tensorflow.keras)
from tensorflow.keras.layers import Input, Flatten, Dense, BatchNormalization, ReLU, LeakyReLU, PReLU, ELU
from tensorflow.keras.models import Model

act_func= ['relu', 'leakyrelu', 'elu', 'prelu']

train_acc_list= []
val_acc_list= []

for i in range(len(act_func)):
  print()
  print(f'{act_func[i]}:'.upper())

  if act_func[i]== 'relu':
    act_layer= ReLU()

  elif act_func[i]== 'leakyrelu':
    act_layer= LeakyReLU()

  elif act_func[i]== 'prelu':
    act_layer= PReLU()

  else:
    act_layer= ELU()

  for j in range(2,10):
    input_layer= Input(shape= x_train_model.shape[1:])
    x= Flatten()(input_layer)

    x= Dense(units= 2**j)(x)
    x= BatchNormalization()(x)
    x= act_layer(x)
    output_layer= Dense(units= len(set(list(y_train))), activation= 'softmax')(x)

    model= Model(input_layer, output_layer)
    model.compile(optimizer= 'adam', loss= 'categorical_crossentropy', metrics= ['accuracy'])

    model.fit(x_train_model, y_train_cat, epochs= 5, verbose= 0, shuffle= True,
              validation_data=(x_dev_model, y_dev_cat), initial_epoch= 0)

    train_acc= model.history.history['accuracy'][-1]
    val_acc= model.history.history['val_accuracy'][-1]
    train_loss= model.history.history['loss'][-1]
    val_loss= model.history.history['val_loss'][-1]

    print(f'  -- With {2**j} neurons:')
    print(f'    -- Training Accuracy: {train_acc}; Validation Accuracy: {val_acc}')
    print(f'    -- Training Loss: {train_loss}; Validation Loss: {val_loss}')
    print()

Following is the output I receive:

RELU:
  -- With 4 neurons:
    -- Training Accuracy: 0.8163666725158691; Validation Accuracy: 0.859000027179718
    -- Training Loss: 0.602790892124176; Validation Loss: 0.4827583134174347

  -- With 8 neurons:
    -- Training Accuracy: 0.9039833545684814; Validation Accuracy: 0.9287999868392944
    -- Training Loss: 0.32007038593292236; Validation Loss: 0.24358506500720978

  -- With 16 neurons:
    -- Training Accuracy: 0.9382333159446716; Validation Accuracy: 0.953000009059906
    -- Training Loss: 0.2060956060886383; Validation Loss: 0.1526220738887787

  -- With 32 neurons:
    -- Training Accuracy: 0.960183322429657; Validation Accuracy: 0.9685999751091003
    -- Training Loss: 0.1313430815935135; Validation Loss: 0.10673953592777252

  -- With 64 neurons:
    -- Training Accuracy: 0.9721500277519226; Validation Accuracy: 0.978600025177002
    -- Training Loss: 0.0909196212887764; Validation Loss: 0.0732741728425026

  -- With 128 neurons:
    -- Training Accuracy: 0.9785000085830688; Validation Accuracy: 0.9810000061988831
    -- Training Loss: 0.06933506578207016; Validation Loss: 0.06716640293598175

  -- With 256 neurons:
    -- Training Accuracy: 0.9824666380882263; Validation Accuracy: 0.9815999865531921
    -- Training Loss: 0.05626334995031357; Validation Loss: 0.07091742753982544

  -- With 512 neurons:
    -- Training Accuracy: 0.9833499789237976; Validation Accuracy: 0.9810000061988831
    -- Training Loss: 0.052122920751571655; Validation Loss: 0.06138516217470169


LEAKYRELU:
  -- With 4 neurons:
    -- Training Accuracy: 0.8113499879837036; Validation Accuracy: 0.8500000238418579
    -- Training Loss: 0.6125895977020264; Validation Loss: 0.5022389888763428

  -- With 8 neurons:
    -- Training Accuracy: 0.9067999720573425; Validation Accuracy: 0.9309999942779541
    -- Training Loss: 0.3234350383281708; Validation Loss: 0.24852009117603302

  -- With 16 neurons:
    -- Training Accuracy: 0.9354000091552734; Validation Accuracy: 0.9491999745368958
    -- Training Loss: 0.22288773953914642; Validation Loss: 0.1732458770275116

  -- With 32 neurons:
    -- Training Accuracy: 0.9517499804496765; Validation Accuracy: 0.9584000110626221
    -- Training Loss: 0.16254572570323944; Validation Loss: 0.1416129320859909

  -- With 64 neurons:
    -- Training Accuracy: 0.9621000289916992; Validation Accuracy: 0.9696000218391418
    -- Training Loss: 0.12464425712823868; Validation Loss: 0.10034885257482529

  -- With 128 neurons:
    -- Training Accuracy: 0.9683499932289124; Validation Accuracy: 0.9710000157356262
    -- Training Loss: 0.10318134725093842; Validation Loss: 0.09446839243173599

  -- With 256 neurons:
    -- Training Accuracy: 0.9714999794960022; Validation Accuracy: 0.9729999899864197
    -- Training Loss: 0.0917435809969902; Validation Loss: 0.09299325197935104

  -- With 512 neurons:
    -- Training Accuracy: 0.9722333550453186; Validation Accuracy: 0.9728000164031982
    -- Training Loss: 0.08596694469451904; Validation Loss: 0.08849668502807617


ELU:
  -- With 4 neurons:
    -- Training Accuracy: 0.8322833180427551; Validation Accuracy: 0.8740000128746033
    -- Training Loss: 0.5627966523170471; Validation Loss: 0.45966026186943054

  -- With 8 neurons:
    -- Training Accuracy: 0.9043333530426025; Validation Accuracy: 0.9229999780654907
    -- Training Loss: 0.32320892810821533; Validation Loss: 0.26275262236595154

  -- With 16 neurons:
    -- Training Accuracy: 0.9359166622161865; Validation Accuracy: 0.9517999887466431
    -- Training Loss: 0.2132144421339035; Validation Loss: 0.15689446032047272

  -- With 32 neurons:
    -- Training Accuracy: 0.9559000134468079; Validation Accuracy: 0.9679999947547913
    -- Training Loss: 0.14700110256671906; Validation Loss: 0.10875140130519867

  -- With 64 neurons:
    -- Training Accuracy: 0.9681500196456909; Validation Accuracy: 0.977400004863739
    -- Training Loss: 0.10520979762077332; Validation Loss: 0.08250507712364197

  -- With 128 neurons:
    -- Training Accuracy: 0.972266674041748; Validation Accuracy: 0.9733999967575073
    -- Training Loss: 0.0878886952996254; Validation Loss: 0.08187192678451538

  -- With 256 neurons:
    -- Training Accuracy: 0.9743833541870117; Validation Accuracy: 0.9787999987602234
    -- Training Loss: 0.08186693489551544; Validation Loss: 0.07007770985364914

  -- With 512 neurons:
    -- Training Accuracy: 0.9734333157539368; Validation Accuracy: 0.9735999703407288
    -- Training Loss: 0.08449249714612961; Validation Loss: 0.08925070613622665

PRELU:
  -- With 4 neurons:
    -- Training Accuracy: 0.8422999978065491; Validation Accuracy: 0.876800000667572
    -- Training Loss: 0.5239758491516113; Validation Loss: 0.43654322624206543

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-33-6c882528d2f3> in <module>()
----> 1 get_ipython().run_cell_magic('time', '', "\nact_func= ['relu', 'leakyrelu', 
'elu', 'prelu']\n\ntrain_acc_list= []\nval_acc_list= []\n\nfor i in 
range(len(act_func)):\n  print()\n  print(f'{act_func[i]}:'.upper())\n\n  if 
act_func[i]== 'relu':\n    act_layer= ReLU()\n\n  elif act_func[i]== 'leakyrelu':\n    
act_layer= LeakyReLU()\n\n  elif act_func[i]== 'prelu':\n    act_layer= PReLU()\n\n  
else:\n    act_layer= ELU()\n\n  for j in range(2,10):\n    input_layer= Input(shape= 
x_train_model.shape[1:])\n    x= Flatten()(input_layer)\n\n    x= Dense(units= 2**j) 
(x)\n    x= BatchNormalization()(x)\n    x= act_layer(x)\n    output_layer= Dense(units= 
len(set(list(y_train))), activation= 'softmax')(x)\n\n    model= Model(input_layer, 
output_layer)\n    model.compile(optimizer= 'adam', loss= 'categorical_crossentropy', 
metrics= ['accuracy'])\n\n    model.fit(x_train_model, y_train_cat, epochs= 5, verbose= 
0, shuffle= True, validation_data=(x_dev_model, y_dev_cat), initial_epoch= 0)\n\n    
train_acc= model.history.history['accuracy'][-1]\n    val_acc= 
model.history.history['val_accuracy'][-1]\n    train_loss= model.history.history['loss'] 
[-1] \n    val_loss= model.history.history['val_loss'][-1]\n\n    print(f'  -- With 
{2**j} neurons:')\n    print(f'    -- Training Accuracy: {train_acc}; Validation 
Accuracy: {val_acc}')\n    print(f'    -- Training Lo...

4 frames
<decorator-gen-53> in time(self, line, cell, local_ns)

<timed exec> in <module>()

/usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/ops.py in 
_create_c_op(graph, node_def, inputs, control_inputs, op_def)
   2011   except errors.InvalidArgumentError as e:
   2012     # Convert to ValueError for backwards compatibility.
-> 2013     raise ValueError(e.message)
   2014 
   2015   return c_op

ValueError: Exception encountered when calling layer "p_re_lu_15" (type PReLU).

Dimensions must be equal, but are 4 and 8 for '{{node p_re_lu_15/mul}} = Mul[T=DT_FLOAT] 
(p_re_lu_15/Neg, p_re_lu_15/Relu_1)' with input shapes: [4], [?,8].

Call arguments received:
  • inputs=tf.Tensor(shape=(None, 8), dtype=float32)

1 Answer


This exception happens with PReLU because, unlike the other activation functions, it has a trainable weight: the multiplier (alpha) applied to the negative part of the input, which is learned during training. That alpha weight is created the first time the layer is built, with one value per unit of the incoming Dense layer (4 units in the first inner-loop iteration). When the same PReLU instance is reused for the next model, whose Dense layer has 8 units, the stored alpha of shape (4,) can no longer be multiplied with the (None, 8) input, which is exactly the "Dimensions must be equal, but are 4 and 8" error. ReLU, LeakyReLU and ELU have no weights, so reusing a single instance across models causes no problem. For this reason, PReLU needs to be instantiated freshly for every model that is built. Based on this, the following slight change, which stores the layer class and instantiates it inside the inner loop, fixes the algorithm:

act_func= ['prelu','relu', 'leakyrelu', 'elu']

train_acc_list= []
val_acc_list= []

for i in range(len(act_func)):
  print()
  print(f'{act_func[i]}:'.upper())

  if act_func[i]== 'relu':
    act_layer= ReLU

  elif act_func[i]== 'leakyrelu':
    act_layer= LeakyReLU

  elif act_func[i]== 'prelu':
    act_layer= PReLU

  else:
    act_layer= ELU

  for j in range(5,10):
    input_layer= Input(shape= x_train_model.shape[1:])
    x= Flatten()(input_layer)

    x= Dense(units= 2**j)(x)
    x= BatchNormalization()(x)
    x= act_layer()(x)
    output_layer= Dense(units= len(set(list(y_train))), activation= 'softmax')(x)

    model= Model(input_layer, output_layer)
    model.compile(optimizer= 'adam', loss= 'categorical_crossentropy', metrics= ['accuracy'])

    model.fit(x_train_model, y_train_cat, epochs= 5, verbose= 0, shuffle= True,
              validation_data=(x_dev_model, y_dev_cat), initial_epoch= 0)

    train_acc= model.history.history['accuracy'][-1]
    val_acc= model.history.history['val_accuracy'][-1]
    train_loss= model.history.history['loss'][-1] 
    val_loss= model.history.history['val_loss'][-1]

    print(f'  -- With {2**j} neurons:')
    print(f'    -- Training Accuracy: {train_acc}; Validation Accuracy: {val_acc}')
    print(f'    -- Training Loss: {train_loss}; Validation Loss: {val_loss}')
    print()

  print('*'*100)
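For completeness, here is a minimal standalone sketch (toy layer widths, not the MNIST model) that reproduces the failure caused by reusing one PReLU instance and shows the fix of creating a fresh instance per model:

# minimal reproduction sketch: one PReLU instance reused across two models
from tensorflow.keras.layers import Input, Dense, PReLU
from tensorflow.keras.models import Model

shared_prelu = PReLU()                   # single instance, reused (the bug)

inp1 = Input(shape=(16,))
out1 = shared_prelu(Dense(4)(inp1))      # builds alpha with shape (4,)
model1 = Model(inp1, out1)

inp2 = Input(shape=(16,))
try:
    out2 = shared_prelu(Dense(8)(inp2))  # ValueError: Dimensions must be equal (4 vs 8)
except ValueError as e:
    print(e)

# fix: instantiate a fresh PReLU for every model
out2 = PReLU()(Dense(8)(inp2))
model2 = Model(inp2, out2)

ReLU, LeakyReLU and ELU instances could be reused without error because they carry no weights, but instantiating every activation inside the loop, as in the corrected code above, keeps all branches uniform.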