
I am trying to generate some bell-shaped data (a normal distribution). There are mathematical formulas for that, but I am hoping to emulate it with natural, everyday events that happen in real life.

For example, say there are 50 students, each with a 70% chance of answering any given question correctly on a 100-question multiple-choice exam. What score does each student get? I have this code in JavaScript:

const students = Array.from({ length: 50 });

students.forEach((s, i, arr) => {
  let score = 0;
  // 100 questions, each answered correctly with probability 0.7
  for (let q = 0; q < 100; q++) {
    if (Math.random() >= 0.3) score++;
  }
  arr[i] = score; // i is the student's index
});

console.log(students);

But the result doesn't look like a normal distribution. For example, I got:

[
  69, 70, 67, 64, 71, 72, 77, 70, 71, 64, 74,
  74, 73, 80, 69, 68, 67, 72, 69, 70, 61, 72,
  72, 75, 63, 68, 71, 69, 76, 70, 69, 69, 67,
  63, 65, 80, 70, 62, 68, 63, 73, 69, 64, 79,
  79, 72, 72, 70, 70, 66
]

There is no student who got a score of 12 or 20, and no student who got 88, 90, or 95 (the students who would get an A grade). Is there a way to emulate a real-life event to generate normally distributed data?

Stefanie Gauss

1 Answer


Two issues:

  • 50 students may be too small a sample to produce such a pattern; 10000 students will give a better view.
  • You can visualise the statistics better by counting the number of students that have a given score. That gives a count per potential score (0..100).

With both changes, the bell curve becomes visible:

let students = Array.from({ length: 10000 });
let studentsWithScore = Array(101).fill(0); 

students.forEach(() => {
  let score = 0;
  for (let i = 0; i < 100; i++) {
    if (Math.random() >= 0.3) score++;
  }
  studentsWithScore[score]++;
});

console.log(studentsWithScore);
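
To see why scores such as 12 or 95 never show up, it may help to compare the simulated spread with what the binomial model predicts: with n = 100 questions and success probability p = 0.7, the mean is np = 70 and the standard deviation is sqrt(np(1 - p)) ≈ 4.58. The following is a minimal sketch and not part of the original answer; `simulateScores` and `summarize` are hypothetical helper names, and the `rng` parameter is an assumption added so the random source can be swapped.

```javascript
// Simulate `students` exams of `questions` questions each, where every
// answer is independently correct with probability `p`.
// `rng` is any function returning a number in [0, 1); defaults to Math.random.
function simulateScores(students, questions, p, rng = Math.random) {
  return Array.from({ length: students }, () => {
    let score = 0;
    for (let q = 0; q < questions; q++) {
      if (rng() < p) score++;
    }
    return score;
  });
}

// Sample mean and (population) standard deviation of an array of numbers.
function summarize(values) {
  const mean = values.reduce((sum, v) => sum + v, 0) / values.length;
  const variance =
    values.reduce((sum, v) => sum + (v - mean) ** 2, 0) / values.length;
  return { mean, sd: Math.sqrt(variance) };
}

const n = 100;
const p = 0.7;
const scores = simulateScores(10000, n, p);
const observed = summarize(scores);

// Binomial theory: mean = n * p, sd = sqrt(n * p * (1 - p)).
const expected = { mean: n * p, sd: Math.sqrt(n * p * (1 - p)) };

console.log("observed:", observed);
console.log("expected:", expected);
```

A score of 95 lies more than five standard deviations above the mean of 70, so under this model it essentially never appears, even among 10000 students.
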
trincot
  • still... there is no student that got a score of 12 or 20, or 90 or 95, while in a class of 50 students, I definitely see such cases – Stefanie Gauss Apr 04 '21 at 12:00
  • Yes, but in your class there are students that only have a 50% probability of passing a test, while others have 80% to pass a test. That is not the case here: here we assume that *every* student has an equal *expected* success rate of 70% on every test. This is not realistic. – trincot Apr 04 '21 at 12:07
  • Yes, I was thinking about that as well. Then how can we emulate the "real case"? I used a "caliber", which is how good a student is, and which is just `Math.random()`: if the student's caliber is 0.90, his chance of getting an answer right is 90%. But this caliber distribution is quite even... so I think it won't give the desired result. The key question is how to emulate natural events to get a normal distribution. – Stefanie Gauss Apr 04 '21 at 13:06
  • I am experimenting: if we take the range of numbers from 60 to 80 from the array in the original question, let that be the caliber of a student, and then generate multiple-choice questions with a difficulty of 60 to 80... students with a caliber of 60 would get most questions wrong, while students with a caliber of 80 would get most questions right... something like that. It is like emulating all the brain cells to arrive at an IQ, and then assuming each multiple-choice question requires an IQ of a certain level to answer correctly. – Stefanie Gauss Apr 04 '21 at 13:09
  • I think I found the answer. By [Central limit theorem](https://en.wikipedia.org/wiki/Central_limit_theorem), the original result is a normal distribution. I just need to stretch it out, so that a score of 60 means 0, and a score of 80 means 100, etc. (scale them out linearly). See https://math.stackexchange.com/questions/4089051/are-the-scores-of-an-exam-of-many-multiple-choice-questions-a-normal-distributio – Stefanie Gauss Apr 04 '21 at 14:06
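
The two ideas raised in the comments above can be sketched side by side: giving each student an individual success probability (the "caliber") instead of a fixed 70%, and linearly stretching a score range such as 60..80 onto 0..100. This is only an illustration under stated assumptions, not a method confirmed by the answer: `simulateWithCaliber` and `rescale` are hypothetical names, the caliber is drawn uniformly with `Math.random()` as described in the comment, and clamping stretched scores into 0..100 is an added assumption.

```javascript
// Each student gets an individual success probability ("caliber").
// Assumption: caliber is uniform on [0, 1), as in the Math.random() idea.
function simulateWithCaliber(students, questions, rng = Math.random) {
  return Array.from({ length: students }, () => {
    const caliber = rng();
    let score = 0;
    for (let q = 0; q < questions; q++) {
      if (rng() < caliber) score++;
    }
    return score;
  });
}

// Linearly map the range [lo, hi] onto [0, 100].
// Assumption: values falling outside [lo, hi] are clamped to 0 or 100.
function rescale(score, lo, hi) {
  const stretched = ((score - lo) / (hi - lo)) * 100;
  return Math.min(100, Math.max(0, stretched));
}

// Fixed 70% chance per question, then stretched from 60..80 to 0..100.
const fixed = Array.from({ length: 50 }, () => {
  let score = 0;
  for (let q = 0; q < 100; q++) {
    if (Math.random() < 0.7) score++;
  }
  return rescale(score, 60, 80);
});

console.log("stretched:", fixed);
console.log("per-student caliber:", simulateWithCaliber(50, 100));
```

With a uniform caliber the scores spread roughly evenly across 0..100 rather than forming a bell, which matches the concern in the comment. The stretched version keeps the bell shape of the original scores and only changes its scale.
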