I want to fit some data to a Pareto distribution using the scipy.stats
library. I am not sure if the issue might be numerical, so just to be safe; I have values measured for the dependent variable (let's call them 'pushes') for the independent variable ('minutes') starting at a few thousand minutes and every ten minutes thereafter (with the exception of a few points that were removed during data cleaning).
e.g.
2780.0 362.0
2800.0 376.0
2810.0 393.0 ...
The best info I can find says something like
from scipy.stats import pareto
result = pareto.fit(data)
and I have no idea how this data is to be formatted in this case. I've tried the following but all result in errors.
result = pareto.fit(zip(minutes, pushes))
result = pareto.fit(pushes)
The error is usually
Warning: invalid value encountered in double_scalars
would greatly appreciate some guidance, thank you.