The mean and standard deviation after sampling aren't the same as those of the input data I provided

Question

I have a log-normal mean and standard deviation. After converting them to the parameters mu and sigma of the underlying normal distribution, I sampled from the log-normal distribution. However, when I take the mean and standard deviation of the sampled data, I don't get the values I plugged in at first. This only happens when the log-normal mean is much smaller than the log-normal standard deviation; otherwise it works. How do I prevent this from happening and recover the input parameters I plugged in?

import numpy as np
import scipy.stats as stats
from statistics import mean

m = 1.46578E-07  # log-normal mean
siglog = 1.51  # log-normal standard deviation
sigma = np.sqrt(np.log(1 + (siglog / m)**2))  # normal std
mu = np.log(m) - sigma**2 / 2  # normal mean
x = np.random.lognormal(mu, sigma, 1000000)
print(mean(x), np.std(x))
[out]: 3.867912662470812e-08 1.0677187655685002e-05

You are squaring (siglog / m) = 10301682. The formulas are correct otherwise. — dipetkov, Apr 24 '22 at 19:52
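To make the size of that squared ratio concrete, here is a minimal sketch that recomputes the underlying normal parameters from the values in the question. The helper name `lognormal_params` is hypothetical, and `np.log1p` is used in place of `np.log(1 + ...)` purely as a stylistic choice.

```python
import numpy as np

def lognormal_params(m, s):
    """Convert a log-normal mean m and standard deviation s into the
    mean (mu) and standard deviation (sigma) of the underlying normal."""
    ratio = s / m  # about 1.03e7 for the values in the question
    sigma = np.sqrt(np.log1p(ratio**2))
    mu = np.log(m) - sigma**2 / 2
    return mu, sigma

mu, sigma = lognormal_params(1.46578e-07, 1.51)
print(mu, sigma)
```

With these inputs sigma comes out well above 5, which is a very heavy-tailed log-normal.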
@dipetkov Any idea how to work around this? I want to get the same log-normal mean and standard deviation I plugged in at first. — codebreaker12, Apr 24 '22 at 23:07
You can try the [inverse sampling method](https://en.wikipedia.org/wiki/Inverse_transform_sampling) using the inverse of the cumulative distribution function; that should be the *ppf* function of [scipy.stats.lognorm](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.lognorm.html). — dipetkov, Apr 24 '22 at 23:21
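The inverse sampling suggestion could be sketched as follows, assuming the SciPy convention that `lognorm` takes the shape `s = sigma` and `scale = exp(mu)`. The seed and sample size are arbitrary choices for illustration.

```python
import numpy as np
from scipy import stats

m, s = 1.46578e-07, 1.51  # log-normal mean and std from the question
sigma = np.sqrt(np.log1p((s / m) ** 2))
mu = np.log(m) - sigma**2 / 2

rng = np.random.default_rng(0)  # arbitrary seed
u = rng.uniform(size=100_000)   # uniform draws on [0, 1)

# Inverse transform sampling: push the uniforms through the inverse CDF (ppf)
x = stats.lognorm.ppf(u, s=sigma, scale=np.exp(mu))

print(np.median(x), np.exp(mu))  # sample median vs. theoretical median
```

This draws from the same distribution as `np.random.lognormal(mu, sigma, ...)`, so on its own it is not expected to change how the sample mean and standard deviation behave.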
You might still run into numerical issues, at which point you should reconsider how plausible it is to sample from the lognormal with mean 1.46578E-07. — dipetkov, Apr 24 '22 at 23:21
@dipetkov Thanks for the suggestion. However implausible the mean I'm working with is, I can only work with it as is, since I got it from a database that I'm sure is solid. — codebreaker12, Apr 25 '22 at 07:36
You can also try to sample from the normal distribution and then exponentiate. That actually seems easier, as there are probably more algorithms for sampling from the normal. — dipetkov, Apr 25 '22 at 08:17
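A minimal sketch of the sample-then-exponentiate approach, using NumPy's `default_rng` generator with an arbitrary seed. It also prints the statistics on the log scale, where they can be compared directly with mu and sigma.

```python
import numpy as np

m, s = 1.46578e-07, 1.51  # log-normal mean and std from the question
sigma = np.sqrt(np.log1p((s / m) ** 2))
mu = np.log(m) - sigma**2 / 2

rng = np.random.default_rng(0)  # arbitrary seed
z = rng.normal(mu, sigma, size=1_000_000)  # draws from the underlying normal
x = np.exp(z)                              # log-normal draws

print(z.mean(), z.std())  # log scale: compare with mu and sigma
print(x.mean(), x.std())  # original scale: compare with m and s
```

Checking `z` separates the two questions: whether the conversion to mu and sigma is right, and whether the sample statistics of `x` are stable at this sample size.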
I tried that but it gives really bad results, worse than not using it. — codebreaker12, Apr 25 '22 at 10:05
When I plot the density of the lognormal distribution you want to sample from, it's pretty clear it's an edge case. Unless we both got the math wrong. So the "standard" ways to sample from a distribution won't work. Have you considered asking this question on Cross Validated? — dipetkov, Apr 25 '22 at 10:17
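One way to quantify the edge case is to ask how much of the theoretical mean comes from values too rare to show up in a sample of the size used in the question. The sketch below relies on the standard partial-expectation result for a log-normal: the share of the mean carried by values above the quantile with standard-normal score z is Φ(sigma - z). Treating "rarer than one in n draws" as the cutoff is an illustrative assumption, not something from the thread.

```python
import numpy as np
from scipy import stats

m, s = 1.46578e-07, 1.51  # log-normal mean and std from the question
sigma = np.sqrt(np.log1p((s / m) ** 2))
mu = np.log(m) - sigma**2 / 2

n = 1_000_000            # sample size used in the question
median = np.exp(mu)      # theoretical median of the log-normal

# Standard-normal score that is exceeded on average once per n draws
z_top = stats.norm.isf(1 / n)

# Share of the theoretical mean carried by values above that score
tail_share = stats.norm.cdf(sigma - z_top)

print(median, z_top, tail_share)
```

If most of the mean sits in that tail, the sample mean and standard deviation will usually fall well short of the inputs even when mu and sigma are correct.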

