How to estimate the number of neurons (units) in the BERT model? Note this is different from the number of model parameters.

- Does this answer your question? [Check the total number of parameters in a PyTorch model](https://stackoverflow.com/questions/49201236/check-the-total-number-of-parameters-in-a-pytorch-model) – cronoik Mar 26 '23 at 18:27
- Clearly not. The number of parameters basically corresponds to the neuron (unit) weights. For example, a simple linear layer with three input units and two output units (consequently five units or neurons) has 3×2 weight parameters, not counting the biases. – Celso França Mar 26 '23 at 19:48
- Work backwards from the no. of parameters, e.g. https://www.beren.io/2022-08-06-The-scale-of-the-brain-vs-machine-learning/. Roughly 1000 parameters : 1 neuron. – alvas Mar 29 '23 at 05:16
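The parameter-counting arithmetic in the comments above can be sketched in plain Python, with no framework needed; the layer sizes are just the example from the comment, not anything BERT-specific:

```python
# Parameters of a fully connected (linear) layer:
# one weight per (input, output) pair, plus one bias per output unit.
def linear_layer_params(n_in: int, n_out: int, bias: bool = True) -> int:
    return n_in * n_out + (n_out if bias else 0)

# The example from the comment: 3 input units, 2 output units.
print(linear_layer_params(3, 2, bias=False))  # 6 weights
print(linear_layer_params(3, 2))              # 8 parameters including biases
```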
1 Answer
Depending on which field you come from, the definition of "neurons" might differ.
In general, people in computer science conflate num_neurons with num_parameters, but this might not be the case if one is interested in a more neurological/biological perspective.
Q: Why do computer scientists care about the no. of parameters, not neurons?
Because the parameter count determines the computational cost of the model in terms of FLOPs; see https://www.lesswrong.com/posts/jJApGWG95495pYM7C/how-to-measure-flop-s-for-neural-networks-empirically
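The link between parameters and FLOPs can be sketched with a common rule of thumb (an assumption on my part, not a claim from the linked post, which measures FLOPs empirically): a forward pass costs roughly 2 FLOPs per parameter per input token, one multiply and one add per weight.

```python
# Rough rule of thumb: ~2 FLOPs per parameter per token for a forward pass
# (one multiply-accumulate per weight). This is a naive estimate only.
def forward_flops(n_params: int, n_tokens: int) -> int:
    return 2 * n_params * n_tokens

# Using the BERT-base-cased parameter count computed further down (108,310,272):
print(forward_flops(108_310_272, 512))  # ~1.1e11 FLOPs for a 512-token input
```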
Q: How many neurons is one parameter, or vice versa?
We can only estimate; naively, we can treat 1 neuron ≈ 1000 parameters.
Reference: https://www.beren.io/2022-08-06-The-scale-of-the-brain-vs-machine-learning/
Q: How many neurons does BERT have?
It depends on which flavor of BERT you are referring to.
Using the snippet from [Check the total number of parameters in a PyTorch model](https://stackoverflow.com/questions/49201236/check-the-total-number-of-parameters-in-a-pytorch-model):

```python
from transformers import AutoModel

# Load BERT-base (cased) and count every parameter tensor's elements.
model = AutoModel.from_pretrained("bert-base-cased")
sum(p.numel() for p in model.parameters())
```

[out]:

```
108310272
```
So, working backwards with the 1000:1 ratio, 108,310,272 parameters → roughly 108,000 neurons, i.e. about 0.1M.
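That back-of-the-envelope step can be written out explicitly; the 1000:1 ratio is the naive assumption from the beren.io post linked above, not a measured quantity:

```python
# Naive neuron estimate: assume ~1000 parameters per "neuron"
# (ratio taken from the beren.io post, purely an order-of-magnitude guess).
PARAMS_PER_NEURON = 1000

def estimated_neurons(n_params: int) -> int:
    return n_params // PARAMS_PER_NEURON

print(estimated_neurons(108_310_272))  # 108310 -> roughly 0.1M neurons
```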
