10

I want to use the Gaussian function in Python to generate some numbers within a specific range, given the mean and variance.

So let's say I have a range between 0 and 10,

and I want my mean to be 3 and my variance to be 4:

mean = 3, variance = 4

How can I do that?

Lily
  • 816
  • 7
  • 18
  • 38
  • What have you tried so far? Have you looked at the [random](http://docs.python.org/2/library/random.html#random.gauss) module? – Henry Keiter May 09 '13 at 21:54

6 Answers

19

Use random.gauss. From the docs:

random.gauss(mu, sigma)
    Gaussian distribution. mu is the mean, and sigma is the standard deviation. This is slightly
    faster than the normalvariate() function defined below.

You can clamp the results of this, but the output would then no longer be a Gaussian distribution. I don't think you can satisfy all the constraints simultaneously. Note that sigma is the standard deviation, so a variance of 4 means sigma = 2. If you want to clamp the values to the range [0, 10], you could get your numbers like this:

num = min(10, max(0, random.gauss(3, 2)))

But then the resulting distribution of numbers won't be truly Gaussian. In this case, it seems you can't have your cake and eat it, too.
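To make the effect of clamping concrete, here is a minimal sketch (the helper name `clamped_gauss`, the seed, and the sample size are my own choices for illustration). It counts how many draws end up sitting exactly on each boundary:

```python
import random

def clamped_gauss(mu, sigma, low, high):
    """Draw from random.gauss and clamp the result into [low, high]."""
    return min(high, max(low, random.gauss(mu, sigma)))

random.seed(0)  # fixed seed (an arbitrary choice) so the sketch is repeatable
samples = [clamped_gauss(3, 2, 0, 10) for _ in range(100_000)]

# Every draw that fell below 0 has been moved to exactly 0,
# and every draw above 10 has been moved to exactly 10.
at_lower = sum(1 for x in samples if x == 0) / len(samples)
at_upper = sum(1 for x in samples if x == 10) / len(samples)
print(f"fraction at 0: {at_lower:.3f}, fraction at 10: {at_upper:.4f}")
```

With mean 3 and sigma 2, the lower bound is only 1.5 standard deviations below the mean, so several percent of the samples pile up at exactly 0. That spike is why the clamped output is not Gaussian.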

Dan Lecocq
  • 3,383
  • 25
  • 22
  • So is there no way of specifying the range? I have this dataset and I want to sample it, so I need to make sure the numbers are within the range. – Lily May 09 '13 at 21:57
  • 6
    But what I'm saying is that the Gaussian distribution is fully determined by the mean and variance. Thus the added constraint of being between 0 and 10 would change that distribution. If you want to _clamp_ the results to a range, you could, but then it wouldn't be a Gaussian distribution. – Dan Lecocq May 09 '13 at 21:59
6

There's probably a better way to do this, but this is the function I ended up creating to solve this problem:

import random

def trunc_gauss(mu, sigma, bottom, top):
    a = random.gauss(mu, sigma)
    while not (bottom <= a <= top):
        a = random.gauss(mu, sigma)
    return a

If we break it down line by line:

import random

This allows us to use functions from the random library, which includes a Gaussian random number generator (random.gauss).

def trunc_gauss(mu, sigma, bottom, top):

The function arguments allow us to specify the mean (mu) and standard deviation (sigma), as well as the bottom and top of our desired range.

a = random.gauss(mu, sigma)

Inside the function, we generate an initial random number according to a Gaussian distribution.

while not (bottom <= a <= top):
    a = random.gauss(mu, sigma)

Next, the while loop checks if the number is within our specified range, and generates a new random number as long as the current number is outside our range.

return a

As soon as the number is inside our range, the while loop stops running and the function returns the number.

This should give a better approximation of a Gaussian distribution than clamping, since we don't artificially inflate the counts at the top and bottom boundaries of our range by rounding outliers up or down.

I'm quite new to Python, so there are most probably simpler ways, but this worked for me.

JimmyLamothe
  • 61
  • 2
  • 2
  • This looks like it does what you intend, but if the gauss call is producing random numbers that are *supposed* to be outside of your range, and you are omitting them, then it seems to me that you will get a narrower sigma than what you requested. An interesting exercise would be to consider what happens if the range were narrow compared to the sigma... or if the mean were a little outside the range... Still, for a quick sample where the range limits are far from the mean relative to sigma and you just want to be sure you have valid numbers for an external use, this probably will work fine. – Hghowe Jul 16 '21 at 01:56
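The point raised in the comment can be checked empirically. The following is a rough sketch (seed and sample size are arbitrary choices of mine) that draws many values with the rejection loop and compares the sample mean and standard deviation with the requested mu = 3 and sigma = 2:

```python
import random
import statistics

def trunc_gauss(mu, sigma, bottom, top):
    """Rejection sampling: redraw until the value lands in [bottom, top]."""
    a = random.gauss(mu, sigma)
    while not (bottom <= a <= top):
        a = random.gauss(mu, sigma)
    return a

random.seed(0)  # arbitrary seed, only for repeatability
draws = [trunc_gauss(3, 2, 0, 10) for _ in range(100_000)]

sample_mean = statistics.mean(draws)
sample_std = statistics.stdev(draws)
print(f"mean = {sample_mean:.2f}, std = {sample_std:.2f}")
```

Because far more of the discarded values lie below 0 than above 10, the sample mean comes out a little above 3 and the standard deviation a little below 2. The values follow a truncated normal distribution, so mu and sigma describe the underlying Gaussian and not the moments of what you actually get.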
3

I was working on some numerical analytical computation and I ran into this python tutorial site - http://www.python-course.eu/weighted_choice_and_sample.php

Here is what I offer as a solution, in case anyone is too busy to visit the site. I don't know how many Gaussian values you need, so I'll use n = 100; you gave mu as 3 and the variance as 4, which makes sigma = 2. Here's the code:

from random import gauss
n = 100
values = []
frequencies = {}
while len(values) < n:
    value = gauss(3, 2)
    if 0 < value < 10:
        frequencies[int(value)] = frequencies.get(int(value), 0) + 1
        values.append(value)
print(values)

I hope this helps. You can get the plot as well. It's all in the tutorials.

Vadim Kotov
  • 8,084
  • 8
  • 48
  • 62
Danjoe
  • 55
  • 1
  • 5
1

If you have a small range of integers, you can create a list with a gaussian distribution of the numbers within that range and then make a random choice from it.
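One way to sketch this idea, as a variation and not necessarily what the answer had in mind: instead of physically repeating numbers in a list, give each integer a weight proportional to the Gaussian density and let `random.choices` pick according to those weights. The helper `gaussian_weights` is a hypothetical name introduced here for illustration.

```python
import math
import random

def gaussian_weights(values, mu, sigma):
    """Unnormalised Gaussian density evaluated at each candidate value."""
    return [math.exp(-((v - mu) ** 2) / (2 * sigma ** 2)) for v in values]

values = list(range(0, 11))  # the integers 0..10
weights = gaussian_weights(values, mu=3, sigma=2)

random.seed(0)  # arbitrary seed, only for repeatability
# random.choices normalises the weights internally
picks = random.choices(values, weights=weights, k=20)
print(picks)
```

This only produces integers, and like any restriction to a range it yields a discretised, truncated distribution and not a true Gaussian.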

Mark Ransom
  • 299,747
  • 42
  • 398
  • 622
0

For 150 values, you can use minimal code:

import numpy as np
s = np.random.normal(3, 2, 150)           # mean = 3, standard deviation = 2 (variance = 4)
print(s)

The result is a random sample from a normal distribution. We can check its shape by plotting it:

import seaborn as sns
import matplotlib.pyplot as plt

AA1_plot = sns.histplot(s, kde=True)  # histplot replaces the deprecated distplot
plt.show()
Community
  • 1
  • 1
Wojciech Moszczyński
  • 2,893
  • 21
  • 27
0
import math
from random import uniform

import numpy as np
from scipy.special import erf, erfinv

def trunc_gauss(mu, sigma, xmin=np.nan, xmax=np.nan):
    """Truncated Gaussian distribution.

    mu is the mean and sigma is the standard deviation of the
    untruncated distribution; xmin and xmax are optional bounds.
    """
    # The normal CDF is (1 + erf((x - mu) / (sigma * sqrt(2)))) / 2.
    # erf ranges over (-1, 1), so a missing bound maps to -1 or 1.
    if np.isnan(xmin):
        zmin = -1
    else:
        zmin = erf((xmin - mu) / (sigma * math.sqrt(2)))
    if np.isnan(xmax):
        zmax = 1
    else:
        zmax = erf((xmax - mu) / (sigma * math.sqrt(2)))

    y = uniform(zmin, zmax)
    z = math.sqrt(2) * erfinv(y)

    # This will not come up often, but erfinv(y) is infinite when y
    # reaches -1 or 1 (its largest finite value is about 5.805), so redraw.
    while math.isinf(z):
        z = math.sqrt(2) * erfinv(uniform(zmin, zmax))
    return mu + z * sigma
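
The inverse-CDF idea behind this function can also be sketched with only the standard library, using `statistics.NormalDist` (Python 3.8+). The function name `trunc_gauss_inv_cdf` is hypothetical, and the sketch assumes both bounds are finite and not so far into the tails that the CDF rounds to exactly 0 or 1, since `inv_cdf` rejects those values.

```python
import random
from statistics import NormalDist

def trunc_gauss_inv_cdf(mu, sigma, xmin, xmax):
    """Sample N(mu, sigma) restricted to [xmin, xmax] by inverting the CDF."""
    dist = NormalDist(mu, sigma)
    p_low, p_high = dist.cdf(xmin), dist.cdf(xmax)
    # Uniform draw over the probability mass that lies inside the range
    p = random.uniform(p_low, p_high)
    # Clamp to guard against floating-point rounding at the edges
    return min(xmax, max(xmin, dist.inv_cdf(p)))

random.seed(0)  # arbitrary seed, only for repeatability
draws = [trunc_gauss_inv_cdf(3, 2, 0, 10) for _ in range(10_000)]
print(min(draws), max(draws))
```

Unlike the rejection loop, this takes exactly one uniform draw per sample no matter how narrow the range is, which matters when most of the Gaussian's mass lies outside the bounds.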
Timus
  • 10,974
  • 5
  • 14
  • 28