I am trying to plot the PDF and CDF for a biased die roll, using 10^4 samples and the Central Limit Theorem (CLT).

The die is biased (unfair): even sides are twice as likely as odd sides. Here is diefaces = [1,3,2,4,6,2]. What can I use in MATLAB to find the probability of an odd number in this case with the CLT, where Sn = X1 + X2 + ... + Xn and n = 40?

Here is what I have tried so far. The part I am struggling with is passing the number of samples (10^4 in this case) together with n = 40. I appreciate any help.

clear
% Roll the dice "numberOfRolls" times
numberOfRolls = 10000; % Number of times you roll the dice.
% Biased die with even sides twice as likely as a odd number
diefaces = [1,3,2,4,6,2];
n = 1; % Number of dice.
maxFaceValue = 6;
% Pick a random number from diefaces along with the number of rolls which
% is not working :(
x = diefaces(randperm(numel(diefaces),1),10000)


    S1 = cumsum(x)
    hist1= histogram(S1(1,:),'Normalization','pdf','EdgeColor', 'blue', 'FaceColor',  'blue')
SecretAgentMan
ESLearner
  • This should get you on the right track: https://stackoverflow.com/questions/13914066/generate-random-number-with-given-probability-matlab – Cris Luengo Dec 13 '19 at 02:32
  • Is `Sn` the total number of pips (dots) on `n` rolls of a 6-sided die? Is it the number of odd numbers in `n` rolls? Can you define Sn? Also, Sn = S1 + S2 + ... + Sn seems odd. Did you mean Sn = X1 + X2 + ... + Xn? Can you then define X_i (i = 1, 2,..., n)? With clear definitions for these random variables, this question becomes very answerable. – SecretAgentMan Dec 13 '19 at 03:50
  • @SecretAgentMan. (Thanks again) Just updated the question. I did mean Sn = X1 + X2 + .. + Xn. Xi representing a toss of an unfair 6-sided die with even sides twice as likely as odd sides. Where Xi can take values of n = 1, 2, 3, 4, 5, 10, 20, and 40. – ESLearner Dec 13 '19 at 04:04
  • I get the 6 sides of the custom die and the definition of Sn. You say Xi represents the toss of the die. Is Xi the outcome (1,2,2,3,4,6)? I don't see how Xi can be 1,2,3,4,5,10,20,and 40. Can you clarify? Please [edit] the question and explicitly define all random variables (answer these questions). I think we'll be able to proceed from there. – SecretAgentMan Dec 13 '19 at 05:31
  • Let X1, X2, ... be a sequence of iid RVs with finite mean mu and finite variance sigma^2, and let Sn be the sum of the first n random variables in the sequence: Sn = X1 + X2 + ... + Xn. a) Let Xi be a uniform continuous random variable taking values in the interval (0, 3). Consider n = 1, 2, 3, 4, 5, 10, 20, 40. b) Repeat part (a) with Xi representing a toss of an unfair 6-sided die with even sides twice as likely as odd sides. Part (b) is what I am solving for. – ESLearner Dec 13 '19 at 05:43
  • @SecretAgentMan - Posted the actual question. – ESLearner Dec 13 '19 at 05:44
  • `diefaces = [1,3,2,4,6,2]`? Two 2 and no 5? – Cris Luengo Dec 13 '19 at 13:46
  • I read it as the die has faces `[1 2 3 4 5 6]` with probability `[1/9 2/9 1/9 2/9 1/9 2/9]`. Though the approach I've taken is general enough to customize both the die faces and their probabilities. – SecretAgentMan Dec 13 '19 at 14:15

2 Answers

Define your die and obtain a valid probability mass function (PMF) for your custom die. You can verify the PMF by ensuring sum(Prob) equals 1. Note that a fair die is obtained by setting RelChance to [1 1 1 1 1 1].

The die face probabilities below are [1/9 2/9 1/9 2/9 1/9 2/9].

Die = [1 2 3 4 5 6];
RelChance = [1 2 1 2 1 2];            % Relative Chance
Prob = RelChance./sum(RelChance);     % probability mass function for die

You can use datasample() to simulate the outcome of rolling the die (requires the Statistics toolbox). If the toolbox is not available, this is easy enough to hard-code in several ways.

The code below draws NumRolls samples from Die, where Prob(ii) is the probability of drawing Die(ii).

% MATLAB R2019a
NumRolls = 13;                        % Number of rolls 
Rolls = datasample(Die,NumRolls,'Weights',Prob);
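
If the Statistics toolbox is not available, one way to hard-code the sampling is inverse-CDF sampling with rand(). The following is only a sketch of that idea, not part of the original answer: the names edges, u, and idx are illustrative, and the implicit expansion in the comparison assumes MATLAB R2016b or later.

```matlab
% Toolbox-free alternative to datasample(): inverse-CDF sampling.
Die = [1 2 3 4 5 6];
RelChance = [1 2 1 2 1 2];            % relative chance of each face
Prob = RelChance./sum(RelChance);     % PMF, sums to 1

NumRolls = 10000;
edges = cumsum(Prob);                 % CDF evaluated at each face
u = rand(1,NumRolls);                 % uniform draws on (0,1)

% Face index = 1 + number of CDF values strictly below each draw.
% (5-by-1 column compared against 1-by-NumRolls row via implicit expansion.)
idx = 1 + sum(u > edges(1:end-1).', 1);
Rolls = Die(idx);                     % 1-by-NumRolls vector of outcomes
```

Each face ii is selected when u falls in an interval of width Prob(ii), so the resulting Rolls should follow the same PMF as the datasample() call above.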

Now use this to accomplish the stated goal. Similar to this post, create an array X whose first row holds realizations of X1, whose second row holds realizations of X2, and so on. This doesn't have to be fancy.

And again, use cumsum() to get the cumulative sum along the columns. This means the first row is realizations from S1=X1, second row is realizations from S2=X1+X2, and the 40th row is empirical samples from S40 = X1 + X2 + ... + X40.

n_max = 40;
NumRolls = 10000;
X = zeros(n_max,NumRolls);
for n = 1:n_max
    X(n,:) = datasample(Die,NumRolls,'Weights',Prob);
end
Sn = cumsum(X);

How to Plot? At this point, since our variable names match the process, the remaining steps (plotting) are identical to this post but with a few minor modifications. Since this is discrete (not continuous) data, I generated the plot below using the 'Normalization','probability' option for histogram(). References to a probability density function (PDF) have been relabeled probability mass function (PMF) accordingly.

Images showing the discrete analog to the Central Limit Theorem's convergence.
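
The plotting step described above could look roughly like the sketch below. It is not the code that produced the original figure: the choice of nList, the 2-by-2 subplot layout, and the overlaid normal curve are assumptions made for illustration. The values mu and sigma2 are the mean and variance of a single roll computed from the PMF, so the CLT approximation for Sn has mean n*mu and variance n*sigma2.

```matlab
% Sketch: empirical PMF of S_n for several n, with the CLT normal approximation.
Die = [1 2 3 4 5 6];
Prob = [1 2 1 2 1 2]/9;               % PMF of the biased die
n_max = 40;
NumRolls = 10000;

% Rows of X are realizations of X_1..X_40; rows of Sn are S_1..S_40.
X = reshape(datasample(Die,n_max*NumRolls,'Weights',Prob),n_max,NumRolls);
Sn = cumsum(X);

mu = sum(Die.*Prob);                  % E[X_i]   = 11/3
sigma2 = sum(Die.^2.*Prob) - mu^2;    % Var[X_i] = 26/9

nList = [1 2 5 40];                   % illustrative subset of n values
figure
for k = 1:numel(nList)
    n = nList(k);
    subplot(2,2,k), hold on, box on
    % Unit-width bins centered on the integers, so bar height = P(S_n = s).
    histogram(Sn(n,:),'BinMethod','integers','Normalization','probability')
    s = linspace(n*min(Die),n*max(Die),400);
    % Normal density with mean n*mu and variance n*sigma2; the bin width
    % is 1, so the density is directly comparable to the bar heights.
    plot(s,exp(-(s-n*mu).^2/(2*n*sigma2))/sqrt(2*pi*n*sigma2),'r-','LineWidth',1.5)
    title(sprintf('n = %d',n)), xlabel('S_n'), ylabel('Probability')
end
```

Replacing 'probability' with 'cdf' in the histogram() call gives the empirical CDF instead; the normal density overlay would then no longer apply and should be dropped or replaced with a normal CDF.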


Continuous version of the Central Limit Theorem is posted here.

SecretAgentMan

You can generate the result of 10000 unfair die rolls with the probability for even sides twice as large as the odd sides like this.

First, let's define the odd and even sides of the die

odd = [1 3 5];
even = [2 4 6];

Draw 10000 uniformly distributed numbers.

r = rand(10000,1);

Allocate a result variable.

rolls = zeros(10000,1);

If you now split the random numbers at 1/3, you get odd and even numbers with probabilities in a ratio of 1:2. Since (I assume) the probabilities are uniform within the odd and within the even numbers (that is, the probability of getting a 3 is the same as that of getting a 1, and so on), use uniform random integers to assign the respective values.

Use logical indexing

rolls(r>1/3) = even(randi(3,sum(r>1/3),1));
rolls(r<=1/3) = odd(randi(3,sum(r<=1/3),1));

Plot the result

histogram(rolls)
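
As a quick sanity check on the approach above, the empirical frequencies can be compared with the intended probabilities. This is only a sketch: the names N, isEven, pOdd, counts, and freq are illustrative and not from the original answer.

```matlab
% Sketch: verify that the split at 1/3 gives P(odd) near 1/3.
odd = [1 3 5];
even = [2 4 6];
N = 10000;
r = rand(N,1);
rolls = zeros(N,1);

isEven = r > 1/3;                     % true with probability 2/3
rolls(isEven)  = even(randi(3,sum(isEven),1));
rolls(~isEven) = odd(randi(3,sum(~isEven),1));

pOdd = mean(mod(rolls,2) == 1);       % empirical P(odd), expected near 1/3
counts = accumarray(rolls,1,[6 1]).'; % occurrences of faces 1..6
freq = counts/N;                      % compare with [1 2 1 2 1 2]/9
```

With 10000 rolls, pOdd should land close to 1/3 and freq close to [1 2 1 2 1 2]/9, up to sampling noise.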

For the generation of the PDF and CDF with the CLT, use the answer to your previous question, but use a distribution generated like above.

Patrick Happel