How to estimate the number of neurons (units) in the BERT model? Note this is different from the number of model parameters.

alvas
Celso França
  • Does this answer your question? [Check the total number of parameters in a PyTorch model](https://stackoverflow.com/questions/49201236/check-the-total-number-of-parameters-in-a-pytorch-model) – cronoik Mar 26 '23 at 18:27
  • Clearly not. The number of parameters basically corresponds to the weights of the neurons (units). For example, a simple linear layer with three input units and two output units (five units, or neurons, in total) has 3x2 = 6 parameters, not counting the biases. – Celso França Mar 26 '23 at 19:48
  • Work backwards from the number of parameters, e.g. https://www.beren.io/2022-08-06-The-scale-of-the-brain-vs-machine-learning/. Roughly 1000 parameters : 1 neuron. – alvas Mar 29 '23 at 05:16

1 Answer

Depending on which field you come from, the definition of "neurons" might differ.

In general, people in computer science conflate the two and treat num_neurons = num_parameters. But this might not hold if one is interested in a more neurological/biological perspective.

Q: Why do computer scientists care about the number of parameters rather than neurons?

Because the parameter count largely determines the compute cost of the model in terms of FLOPs, see https://www.lesswrong.com/posts/jJApGWG95495pYM7C/how-to-measure-flop-s-for-neural-networks-empirically

Q: How many parameters correspond to one neuron, or vice versa?

That can only be estimated. Naively, we can treat 1 neuron as roughly 1000 parameters.

Reference: https://www.beren.io/2022-08-06-The-scale-of-the-brain-vs-machine-learning/
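The conversion itself is just a division. Here is a minimal sketch, assuming the naive 1000:1 ratio above. The function name `estimate_neurons` and its `params_per_neuron` argument are made up for illustration and do not come from any library:

```python
def estimate_neurons(num_parameters: int, params_per_neuron: int = 1000) -> int:
    """Rough neuron-equivalent count under an assumed parameters-per-neuron ratio."""
    if params_per_neuron <= 0:
        raise ValueError("params_per_neuron must be positive")
    # The ratio is only a rule of thumb, so rounding to a whole number is enough.
    return round(num_parameters / params_per_neuron)


# A model with one million parameters maps to about one thousand neurons.
print(estimate_neurons(1_000_000))  # 1000
```

Changing `params_per_neuron` shows how sensitive the estimate is to the assumed ratio.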

Q: How many neurons does BERT have?

Depends on which flavor of BERT you are referring to.

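Using the snippet from [Check the total number of parameters in a PyTorch model](https://stackoverflow.com/questions/49201236/check-the-total-number-of-parameters-in-a-pytorch-model):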

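from transformers import AutoModel

# Load the pretrained model (weights are downloaded on first use).
model = AutoModel.from_pretrained("bert-base-cased")
# Total number of parameters: sum the element counts of all parameter tensors.
sum(p.numel() for p in model.parameters())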

[out]:

108310272

So working backwards with the 1000:1 ratio, 108,310,272 parameters -> about 108,000 (roughly 0.1M) neurons.
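If you would rather count units directly, in the sense used in the question's comments (one unit per output feature of a linear layer), the following pure-Python sketch may help. It is an illustration under stated assumptions, not an official counting method: the hyperparameters are taken from the published bert-base-cased configuration rather than from this post, and the "unit" definition ignores embeddings, layer norms, and attention scores. The helper names are hypothetical.

```python
# Assumed bert-base-cased hyperparameters (from its published config, not from this post).
HIDDEN = 768          # hidden size
INTERMEDIATE = 3072   # feed-forward inner size
NUM_LAYERS = 12       # encoder layers
VOCAB = 28996         # cased vocabulary size
MAX_POSITIONS = 512   # position embeddings
TYPE_VOCAB = 2        # token type (segment) embeddings


def linear(in_features, out_features):
    """Return (parameters including bias, output units) of one linear layer."""
    return in_features * out_features + out_features, out_features


def layer_norm(size):
    """A layer norm has a scale and a shift vector."""
    return 2 * size


def count_bert_base():
    """Return (total parameters, linear-layer output units) for the assumed config."""
    # Embedding tables plus the embedding layer norm: parameters, but no linear units.
    params = (VOCAB + MAX_POSITIONS + TYPE_VOCAB) * HIDDEN + layer_norm(HIDDEN)
    units = 0

    # Per encoder layer: query, key, value, attention output, then the two feed-forward layers.
    encoder_linears = [(HIDDEN, HIDDEN)] * 4 + [
        (HIDDEN, INTERMEDIATE),
        (INTERMEDIATE, HIDDEN),
    ]
    for _ in range(NUM_LAYERS):
        for in_features, out_features in encoder_linears:
            p, u = linear(in_features, out_features)
            params += p
            units += u
        params += 2 * layer_norm(HIDDEN)  # one after attention, one after feed-forward

    # Pooler: one more linear layer on top of the encoder.
    p, u = linear(HIDDEN, HIDDEN)
    return params + p, units + u


total_params, total_units = count_bert_base()
print(total_params)  # 108310272
print(total_units)   # 83712
```

The parameter total reproduces the figure reported above, which is a useful sanity check on the assumed configuration. The unit count under this particular definition comes out at 83,712, the same order of magnitude as the 1000:1 estimate, though a different definition of "unit" would give a different number.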

alvas