1

I have a dataset of 3-hourly precipitation amounts for the month of January over the period 1977-1983 (see attachment). I want to generate precipitation data for the period 1984-1990 based on these data. I was wondering whether it is possible to build a custom probability density function from the 1977-1983 precipitation amounts and then draw random numbers (precipitation data) from it for the desired period (1984-1990).

Is this possible in MATLAB, and could someone help me do so?

Thanks in advance!

Click to see an example of the data

EBH
Yoni Verhaegen
  • If the data from 1977-1983 are Gaussian, you can calculate the mean and sample standard deviation and then use `data = normrnd(mu, sigma, m, n)` to get an m x n array of normally distributed random values with mean mu and standard deviation sigma. – sn8wman Dec 19 '17 at 21:22
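A minimal sketch of the Gaussian approach from the comment above, written with the built-in `randn` so it does not need `normrnd` (which belongs to the Statistics and Machine Learning Toolbox). The `precip` vector is a made-up stand-in for the real 1977-1983 data, and the array size (248 three-hourly values per January, 7 years) is an assumption about the desired output. The last line shows why a Gaussian is a doubtful fit here: it produces negative amounts.

```matlab
% Hypothetical stand-in for the observed 3-hourly amounts; replace with real data.
precip = [zeros(1,60), 0.2, 0.5, 0.5, 1.1, 1.8, 2.4, 3.0, 4.7, 6.3, 9.5];

mu    = mean(precip);   % sample mean
sigma = std(precip);    % sample standard deviation

m = 31 * 8;             % assumed: 8 three-hourly values per day, 31 days
n = 7;                  % assumed: one column per year, 1984-1990

% Same result as normrnd(mu, sigma, m, n), using only base functions.
data = mu + sigma * randn(m, n);

% Fraction of generated values that are negative (physically impossible).
fracNegative = mean(data(:) < 0);
```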

3 Answers

1

A histogram will give you an estimate of the PDF: divide the bin counts by the total number of samples to get the probability of each bin (divide by the bin width as well if you want a density). From there you can estimate the CDF by integrating, which for a histogram means taking the cumulative sum over the bins. Finally, you can choose a uniformly distributed random number between 0 and 1 and estimate the argument of the CDF that would yield that number. That is, if y is the random number you choose, then you want to find x such that CDF(x) = y. The value of x will be a random number with the desired PDF.
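The steps above could be sketched roughly as follows, using only base functions. The `precip` vector is a made-up stand-in for the real data, and the bin count of 20 is an arbitrary assumption. Drawing uniformly inside the selected bin is one possible choice, not part of the recipe above; note that it spreads the dry (zero) observations over the whole first bin.

```matlab
% Hypothetical stand-in for the observed 3-hourly amounts; replace with real data.
precip = [zeros(1,60), 0.2, 0.5, 0.5, 1.1, 1.8, 2.4, 3.0, 4.7, 6.3, 9.5];

nBins    = 20;                                  % assumed number of bins
edges    = linspace(min(precip), max(precip), nBins + 1);
binWidth = edges(2) - edges(1);

% Bin index of every observation (the maximum is clamped into the last bin).
idx    = min(floor((precip - edges(1)) / binWidth) + 1, nBins);
counts = accumarray(idx(:), 1, [nBins 1])';

pmf = counts / numel(precip);   % probability of each bin
F   = cumsum(pmf);              % estimated CDF at the right edge of each bin
F(end) = 1;                     % guard against round-off

N = 1000;                       % number of values to generate
u = rand(N, 1);                 % uniform random numbers in (0,1)
samples = zeros(N, 1);
for k = 1:N
    b = find(F >= u(k), 1);                      % first bin with CDF >= u
    samples(k) = edges(b) + binWidth * rand;     % uniform draw inside that bin
end
```

With more bins the generated values follow the observed amounts more closely, at the cost of a noisier PDF estimate.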

AnonSubmitter85
0

If you have the Statistics and Machine Learning Toolbox, you can estimate the PDF of the data by fitting a kernel distribution:

Percip_pd = fitdist(Percip,'Kernel');

Then use it for generating N random numbers from the same distribution:

y = random(Percip_pd,N,1);
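For readers without the toolbox, drawing from a Gaussian kernel density estimate can be sketched as a smoothed bootstrap: pick an observation at random and add Gaussian noise whose standard deviation is the bandwidth. This is only an illustration of the idea, not a reproduction of what `fitdist` does internally. The `precip` vector is a made-up stand-in, the bandwidth uses the normal-reference rule of thumb (an assumption), and clipping negative values to zero is a simplification chosen here because precipitation cannot be negative.

```matlab
% Hypothetical stand-in for the observed 3-hourly amounts; replace with real data.
precip = [zeros(1,60), 0.2, 0.5, 0.5, 1.1, 1.8, 2.4, 3.0, 4.7, 6.3, 9.5];
precip = precip(:);                      % work with a column vector

n = numel(precip);
h = 1.06 * std(precip) * n^(-1/5);       % assumed rule-of-thumb bandwidth

N    = 1000;                             % number of values to generate
pick = randi(n, N, 1);                   % random observation for each draw
y    = precip(pick) + h * randn(N, 1);   % add Gaussian kernel noise

y = max(y, 0);                           % assumed fix: clip negative amounts to zero
```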
EBH
0

Quoting @AnonSubmitter85:

"estimate the CDF by integrating. Finally, you can choose a uniformly distributed random number between 0 and 1 and estimate the argument of the CDF that would yield that number. That is, if y is the random number you choose, then you want to find x such that CDF(x) = y. The value of x will be a random number with the desired PDF."

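%random sampling
 N = 10; %number of resamples

 x = linspace(-4, 4, 100); %values at which the pdf is known
 p = exp(-x.^2/2); %your pdf evaluated at x (example: standard normal shape)
 p = p/sum(p); %normalize so the probabilities sum to 1
 s = cumsum(p); %its cumulative distribution
 s(end) = 1; %guard against round-off

 r = rand(N,1); %random numbers between 0 and 1
 indices = zeros(N,1);
 for ii = 1:N
   indices(ii) = find(s >= r(ii), 1); %first cdf value not below the random number
 end
 resamples = x(indices) %the resamples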
shamalaia