I also need to account for the skewness and kurtosis of the distribution, and both must be reflected in the simulated values.

My empirical values are past stock returns (non-standard normal distribution).

Is there an existing package that will do this for me? All the packages I have found online take only the first two moments.

horchler
Akshay Sakariya
  • By definition if it's skewed it isn't normal, if it's normal it isn't skewed. So which is it? Also, have you considered just bootstrapping your empirical data? – pjs Jun 10 '16 at 19:28
  • It's skewed. I can find out the four moments perfectly from my sample but I'm having difficulty finding a function that'll take the four moments as arguments and give me an array of simulated values. Thanks! – Akshay Sakariya Jun 10 '16 at 20:06

1 Answer

What you're describing is using the method of moments to define a distribution. Such methods have generally fallen out of favor in statistics. However, you can check out pearsonrnd, which may work fine depending on your data.
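As a rough illustration of the pearsonrnd route, here is a minimal sketch. It assumes the Statistics and Machine Learning Toolbox is available; the log-normal data in y is only a placeholder for the real returns, and the variable names are made up for this example.

```matlab
% Sketch: draw variates from a Pearson-system distribution that matches
% the first four sample moments of the data.
rng(1);                      % fixed seed, only to make the example repeatable
y = lognrnd(0,0.25,500,1);   % placeholder data; substitute your own returns

mu   = mean(y);
sig  = std(y);
skew = skewness(y);
kurt = kurtosis(y);          % not excess kurtosis: a normal distribution gives 3

% pearsonrnd requires kurt > skew^2 + 1; type reports which Pearson type was used
[r,type] = pearsonrnd(mu,sig,skew,kurt,1e4,1);   % 10000-by-1 vector
```

Note that both kurtosis and pearsonrnd use the convention in which a normal distribution has kurtosis 3, so no adjustment for excess kurtosis is needed between the two calls.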

Instead, I'd suggest finding the empirical CDF of the data directly with ecdf and combining it with inverse transform sampling to generate random variates. Here's a basic function that does that:

function r=empiricalrnd(y,varargin)
%EMPIRICALRND  Random values from an empirical distribution
%   EMPIRICALRND(Y) returns a single random value from the distribution described by
%   the data in the vector Y.
%   
%   EMPIRICALRND(Y,M), EMPIRICALRND(Y,M,N), EMPIRICALRND(Y,[M,N]), etc. return random arrays.

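% Empirical CDF: f holds the cumulative probabilities, x the corresponding data values
[f,x] = ecdf(y);
% Uniform random numbers on (0,1) with the requested dimensions
r = rand(varargin{:});
% Inverse transform sampling: map the uniform values through the interpolated inverse CDF
r = interp1(f,x,r,'linear','extrap');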

You can play with the options for interp1 if you like. And here's a quick test:
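To give an idea of what changing the interp1 method does, here is a small sketch comparing three choices. It assumes a MATLAB release recent enough to support the 'next' method, and the variable names (u, r_linear, r_step, r_pchip) are just for illustration.

```matlab
% Sketch: the interpolation method controls how the inverse CDF is
% filled in between the observed data points.
rng(1);
y = lognrnd(1,1,100,1);               % placeholder data
[f,x] = ecdf(y);
u = rand(1e3,1);                      % uniform draws to push through the inverse CDF

r_linear = interp1(f,x,u,'linear');   % straight lines between data points
r_step   = interp1(f,x,u,'next');     % step-function inverse: only observed values are returned
r_pchip  = interp1(f,x,u,'pchip');    % shape-preserving piecewise cubic

% Check whether every value in r_step is one of the original observations
tf = all(ismember(r_step,y));
```

With 'next', the result should amount to resampling the data with replacement, which is essentially the bootstrap suggested in the comments. The 'linear' and 'pchip' methods can also return values that lie between observations.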

% Generate demo data for 100 samples of log-normal distribution
mu = 1;
sig = 1;
m = 1e2;
rng(1); % Set seed to make repeatable
y = lognrnd(mu,sig,m,1);

% Generate 1000 random variates from data
n = 1e3;
r = empiricalrnd(y,n,1);

% Plot CDFs
ecdf(y);
hold on;
ecdf(r);
x = 0:0.1:50;
plot(x,logncdf(x,mu,sig),'g');
legend('Empirical CDF','CDF of sampled data','Actual CDF');
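
Since the question is about skewness and kurtosis, it may also be worth comparing the sample moments directly. This is only a sketch: it repeats the inverse sampling step inline so that it runs on its own, and the helper name moments is made up for this example.

```matlab
% Sketch: compare the first four sample moments of the data and of the
% values simulated from its empirical CDF.
rng(1);
y = lognrnd(1,1,100,1);                  % placeholder data, as in the test above
[f,x] = ecdf(y);
r = interp1(f,x,rand(1e3,1),'linear');   % inverse transform sampling

% Hypothetical helper: mean, standard deviation, skewness, kurtosis
moments = @(v) [mean(v) std(v) skewness(v) kurtosis(v)];

m = [moments(y); moments(r)];            % row 1: data, row 2: simulated values
disp(m);
```

The two rows will not agree exactly, because the simulated values are a finite random sample, but they should move closer together as the number of simulated values grows.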
horchler