How many neurons (units) are there in the BERT model?

Question

How can I estimate the number of neurons (units) in the BERT model? Note that this is different from the number of model parameters.

Does this answer your question? [Check the total number of parameters in a PyTorch model](https://stackoverflow.com/questions/49201236/check-the-total-number-of-parameters-in-a-pytorch-model) — cronoik, Mar 26 '23 at 18:27
Clearly not. The number of parameters basically corresponds to the weights of the neurons (units). For example, in a simple linear layer with three input units and two output units (consequently five units or neurons), we have 3x2 = 6 parameters without counting the biases. — Celso França, Mar 26 '23 at 19:48
Work backwards from the no. of parameters, e.g. https://www.beren.io/2022-08-06-The-scale-of-the-brain-vs-machine-learning/. Roughly 1000 parameters: 1 neuron. — alvas, Mar 29 '23 at 05:16

score 0 · Answer 1 · answered Mar 29 '23 at 05:25

Depending on which field you come from, the definition of "neuron" might differ.

In general, people in computer science conflate the two, treating num_neurons = num_parameters. But this might not hold if one is interested in a more neurological/biological perspective.

Q: Why do computer scientists care about the no. of parameters and not neurons?

Because the parameter count largely determines the compute cost of the model, measured in FLOPs; see https://www.lesswrong.com/posts/jJApGWG95495pYM7C/how-to-measure-flop-s-for-neural-networks-empirically
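To make the distinction concrete, here is a minimal sketch that counts units and parameters separately for a single fully connected layer with three inputs and two outputs. The helper `linear_layer_counts` is hypothetical (it is not part of PyTorch or transformers), and what counts as a "unit" is a convention assumed here for illustration.

```python
def linear_layer_counts(in_features: int, out_features: int, bias: bool = True) -> dict:
    """Count units and parameters of one fully connected layer.

    Hypothetical helper for illustration only. Assumption: each input
    feature and each output feature is counted as one unit.
    """
    weights = in_features * out_features   # one weight per input-output pair
    biases = out_features if bias else 0   # one bias per output unit
    return {
        "input_units": in_features,
        "output_units": out_features,
        "weights": weights,
        "biases": biases,
        "parameters": weights + biases,
    }


counts = linear_layer_counts(3, 2)
print(counts["input_units"] + counts["output_units"], "units")
print(counts["weights"], "weights,", counts["biases"], "biases")
```

Even in this tiny layer the unit count and the parameter count differ, and the gap grows quickly with layer width because the weights scale with the product of the two sizes.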

Q: How many parameters correspond to one neuron, or vice versa?

We can only estimate this. Naively, we can treat 1 neuron as roughly 1000 parameters.

Reference: https://www.beren.io/2022-08-06-The-scale-of-the-brain-vs-machine-learning/

Q: How many neurons does BERT have?

Depends on which flavor of BERT you are referring to.

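Using the snippet from [Check the total number of parameters in a PyTorch model](https://stackoverflow.com/questions/49201236/check-the-total-number-of-parameters-in-a-pytorch-model):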

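from transformers import AutoModel
# Load the pretrained checkpoint (downloaded on first use).
model = AutoModel.from_pretrained("bert-base-cased")
# Sum the element counts of all parameter tensors.
sum(p.numel() for p in model.parameters())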

[out]:

108310272

So working backwards with the 1000:1 ratio, 108,310,272 parameters -> roughly 108,000 (about 0.1M) neurons.
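The back-of-the-envelope conversion above can be sketched as a small function. The name `estimate_neurons` and its default ratio are assumptions for illustration; the 1000:1 figure is the naive estimate from the reference above, not a measured property of BERT.

```python
def estimate_neurons(num_parameters: int, params_per_neuron: int = 1000) -> int:
    """Naive neuron estimate from a parameter count.

    Hypothetical helper: the default of 1000 parameters per neuron is the
    rough ratio used in this answer, so treat the result as an order of
    magnitude only.
    """
    if params_per_neuron <= 0:
        raise ValueError("params_per_neuron must be positive")
    return round(num_parameters / params_per_neuron)


# Parameter count reported above for bert-base-cased.
bert_base_cased_params = 108_310_272
print(estimate_neurons(bert_base_cased_params))  # 108310
```

Changing `params_per_neuron` shows how sensitive the estimate is to the assumed ratio, which is the main source of uncertainty here.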
