Fitting a lognormal distribution to negative values with scipy

Question

I have a 40-year time series of ocean surge levels to which I'm trying to fit a lognormal distribution using scipy.stats. As far as I know (and have read), a lognormal distribution cannot take negative values by definition. The scipy implementation uses a generalized version with three parameters (shape, location and scale), which allows the distribution to be shifted and scaled and so makes it possible to fit negative values. Can the result still be considered a lognormal distribution?

In my example plot, the surge data (grey histogram) has around half of its values below 0, and the computed lognorm fit is actually very good (orange line; shape = 0.27, loc = -0.57, scale = 0.56). However, if I try to use a lognorm with the mu / sigma parameterization (i.e. mu = log(scale), sigma = shape, and loc fixed at 0; see also Wikipedia), it returns an error because of the negative values.
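For reference, the two fits described above can be sketched roughly as follows. The surge series itself is not shown, so this snippet draws a synthetic sample from the reported parameters; that sample and all variable names are assumptions made for illustration only.

```python
import numpy as np
from scipy import stats

# Synthetic stand-in for the surge series (assumption: the real data is not
# available, so we sample from the parameters reported for the fit).
rng = np.random.default_rng(0)
surge = stats.lognorm.rvs(0.27, loc=-0.57, scale=0.56, size=5000, random_state=rng)

# Three-parameter fit: shape, loc and scale are all estimated from the data.
shape_hat, loc_hat, scale_hat = stats.lognorm.fit(surge)
print("3-parameter fit:", shape_hat, loc_hat, scale_hat)

# The mu / sigma form corresponds to loc fixed at 0. Any value <= 0 then lies
# outside the support, so its log-density is -inf.
has_nonpositive = bool(np.any(surge <= 0))
logpdf_at_min = stats.lognorm.logpdf(surge.min(), 0.27, loc=0, scale=0.56)
print("non-positive values present:", has_nonpositive)
print("log-density of smallest value with loc=0:", logpdf_at_min)

# Attempting the fit with loc pinned at 0; the question reports an error here.
# The exact exception type may depend on the SciPy version, hence the broad catch.
try:
    print("loc=0 fit:", stats.lognorm.fit(surge, floc=0))
except Exception as exc:
    print("loc=0 fit failed:", type(exc).__name__, exc)
```

The `floc=0` keyword is how `scipy.stats` pins the location during fitting; leaving it out lets the optimizer move `loc` below the smallest observation, which is what makes the three-parameter fit possible.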

What I don't really understand is whether a 'shifted' three-parameter lognorm still qualifies as a lognormal distribution. I would prefer to use the standard parameterization, but for many time series this will not be possible, and the fit obtained is generally worse.
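One way to read the three-parameter form: if X follows `lognorm(shape, loc, scale)`, then X - loc follows the ordinary two-parameter lognormal with mu = log(scale) and sigma = shape, so log(X - loc) is normally distributed. This variant is usually referred to as the shifted (or three-parameter) lognormal. The sketch below checks that relationship numerically on synthetic draws, using the parameter values quoted above; the variable names are illustrative and not part of the original post.

```python
import numpy as np
from scipy import stats

shape, loc, scale = 0.27, -0.57, 0.56  # values reported for the fit

# Synthetic draws from the three-parameter distribution (illustration only).
rng = np.random.default_rng(42)
x = stats.lognorm.rvs(shape, loc=loc, scale=scale, size=100_000, random_state=rng)

# Undo the shift, then take logs: the result should look normal(mu, sigma).
y = np.log(x - loc)
mu_hat, sigma_hat = y.mean(), y.std(ddof=1)

mu, sigma = np.log(scale), shape
print("mu:    estimated", mu_hat, "expected", mu)
print("sigma: estimated", sigma_hat, "expected", sigma)
```

Under this reading, the mu / sigma parameterization still applies, but to the shifted variable X - loc rather than to X itself.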

What is a "surge level in the ocean"? What does a negative value mean? Why are you trying to fit a log-normal distribution to this data instead of, say, the skew normal distribution ([`skewnorm`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.skewnorm.html)), the Weibull distribution ([`weibull_max`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.weibull_max.html)), or some other distribution? (Sorry, all I have are questions, not answers.) - Warren Weckesser, Nov 17 '22 at 02:56
@WarrenWeckesser In this case the surge is essentially the residual water level once you take away the tide. You have a time series of water levels, you remove the tidal fluctuation, and the remaining difference relative to mean sea level is the surge (mostly due to wind). We normally associate surges with high water levels, but if the wind is blowing offshore you can also get a negative surge. I want to use a lognorm because I also use it to fit other data, so if possible I'd like to keep that consistency. But I'll look into the distributions you mentioned, thanks! - jchristiaanse, Nov 17 '22 at 06:47

0 Answers