
So I am training an EfficientNet-B0 model in Keras. The goal is to implement this model on an ASIC chip and do the inference in hardware. The EfficientNet model has a Squeeze-and-Excitation (SE) block with learnable parameters that tell us about the importance of each channel. Since the plan is to do the training in the Python model and the inference on the hardware, I want to extract these channel-wise importance weights. The idea is that on the ASIC I will just multiply each channel by the corresponding learned weight.

For example, for block 2a I have this in the model summary:

 block2a_se_squeeze (GlobalAveragePooling2D)  (None, 96)           0     ['block2a_activation[0][0]']
 block2a_se_reshape (Reshape)                 (None, 1, 1, 96)     0     ['block2a_se_squeeze[0][0]']
 block2a_se_reduce (Conv2D)                   (None, 1, 1, 4)      388   ['block2a_se_reshape[0][0]']
 block2a_se_expand (Conv2D)                   (None, 1, 1, 96)     480   ['block2a_se_reduce[0][0]']
 block2a_se_excite (Multiply)                 (None, 56, 56, 96)   0     ['block2a_activation[0][0]',
                                                                          'block2a_se_expand[0][0]']

My question is how to get those weights. I tried block2a_se_excite.get_weights() but I get an empty list, and block2a_se_expand.get_weights() just gives me the filter kernels and biases. Thank you.
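For reference, a minimal sketch of what I tried, assuming the stock ImageNet-pretrained tf.keras.applications.EfficientNetB0 (its layer names match the summary above):

```python
import tensorflow as tf

# load the stock Keras EfficientNet-B0; layer names match the summary above
model = tf.keras.applications.EfficientNetB0(weights="imagenet")

# the Multiply layer has no parameters of its own, so this is an empty list
print(model.get_layer("block2a_se_excite").get_weights())  # []

# the 1x1 convolutions only expose their kernels and biases
kernel, bias = model.get_layer("block2a_se_expand").get_weights()
print(kernel.shape, bias.shape)  # (1, 1, 4, 96) (96,)
```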

Albert Einstein
  • The SE output is input-dependent; there are no fixed "importance weights". – xdurch0 Jul 21 '23 at 04:46
  • Ah okay, thanks, I thought these were learnable parameters like the other weights. So actually my biggest priority is the surface area of the hardware, which means the fewer blocks I have, the better. Do you think it's worth it to not have SE blocks at all when I implement the model on an ASIC chip to do the inference? I mean, when the training is done with SE blocks and the inference is done without them, does it greatly impact the performance of the model? – Wassim Chaabani Jul 21 '23 at 12:29
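To illustrate the point in the first comment: in the stock tf.keras.applications.EfficientNetB0 implementation, the se_expand convolution already applies the sigmoid, so its output is the per-channel scale vector for whatever input is currently flowing through the network. A minimal sketch (assuming an ImageNet-pretrained model and the layer names from the summary above) that taps that layer and shows the scales change from input to input:

```python
import numpy as np
import tensorflow as tf

model = tf.keras.applications.EfficientNetB0(weights="imagenet")

# se_expand has activation="sigmoid" in the Keras implementation, so its
# output is the per-channel scale vector for the current input
scale_tap = tf.keras.Model(
    inputs=model.input,
    outputs=model.get_layer("block2a_se_expand").output,
)

# two different inputs produce two different scale vectors
x1 = np.random.uniform(0, 255, (1, 224, 224, 3)).astype("float32")
x2 = np.random.uniform(0, 255, (1, 224, 224, 3)).astype("float32")
s1 = scale_tap.predict(x1)  # shape (1, 1, 1, 96), values in (0, 1)
s2 = scale_tap.predict(x2)
print(np.abs(s1 - s2).max())  # > 0: the scales are input-dependent
```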

0 Answers