
I have a non-stationary (periodicity + trend) one-dimensional time series (Ts) that contains NaN values. I want to generate 10000 pseudo-random values of Ts based on its probability distribution. DATA_LINK distribution of normalized data. In case there is any issue downloading it, I have pasted part of the data here.

NaN=np.nan; Ts=np.array([384.540,378.233,376.858,378.497,NaN,NaN,NaN,NaN,NaN,NaN,NaN,390.409,386.174,382.2768,382.082,383.721,NaN,NaN,NaN,NaN,NaN,NaN,NaN,391.841,389.513,382.835,381.387,384.404,NaN,NaN,NaN,NaN,NaN,NaN,NaN,393.871,391.176,385.041,385.270,385.570,NaN,NaN,NaN,NaN,NaN,NaN,NaN,398.377,395.187,390.173,387.628,388.129,NaN,NaN,NaN,NaN,NaN,NaN,NaN,395.886,390.830,389.398,391.617,NaN,NaN,NaN,NaN,NaN,NaN,399.943,390.400,391.019,393.635,NaN,NaN,NaN,NaN,NaN,NaN,403.128,399.594,394.948,394.561,395.420,NaN,NaN,NaN,NaN,NaN,NaN,NaN,405.345,403.449,398.429,395.195,397.791,NaN]);

This is what I have tried:

rr = (Ts - np.nanmean(Ts)) / np.nanstd(Ts)  # normalization

mu, sigma = np.nanmean(rr), np.nanstd(rr)  # mean and standard deviation

q = np.random.uniform(mu, sigma, rr.shape[0])  # uniform distribution considering

I want to know how to create pseudo-random values with the same dimension as Ts (which contains both NaN and non-NaN values) using Monte Carlo.

Thanks in advance.

Shubho
    Welcome to [Stack Overflow.](https://stackoverflow.com/ "Stack Overflow") This is not a code-writing or tutoring service. We help solve specific, technical problems, not open-ended requests for code or advice. Please edit your question to show what you have tried so far, and what specific problem you need help with. See the [How To Ask a Good Question](https://stackoverflow.com/help/how-to-ask "How To Ask a Good Question") page for details on how to best help us help you. – itprorh66 Nov 07 '22 at 13:34
  • @itprorh66 I have edited the question to show what I have done so far. Looking forward to answers. – Shubho Nov 08 '22 at 10:28
  • It is not possible for us to see your data without login credentials. This site requires that you type your data into the site so that it can be copied and pasted into test environments. Also, please limit your query to a single question. – itprorh66 Nov 08 '22 at 13:27
  • @itprorh66 Thanks for your reply. Sorry you weren't able to download the data. I have now pasted part of the data and trimmed the post to a single question. – Shubho Nov 08 '22 at 14:45
  • It is still unclear to me what your question is! Do you want to know how to create an array of random numbers with a defined distribution? ( A programming problem) Do you want to know how to determine the probability distribution of a series of numbers?(A statistics problem) Do you want to know how to employ a Monte Carlo method? (a simulation problem) – itprorh66 Nov 08 '22 at 18:07
  • @itprorh66 Sorry if it's still unclear. I want to know how to use Monte Carlo when we don't know the actual distribution of the data, as with my non-stationary data Ts. Basically, I want to know how to generate pseudo-random values for this type of non-stationary data (which contains a lot of NaNs). – Shubho Nov 08 '22 at 18:32
  • You are asking multiple questions which is against the rules for this site. Edit your questions down to multiple single questions and post them to appropriate sites. – itprorh66 Nov 08 '22 at 20:16
  • @itprorh66 It's just one simple question: how to create pseudo-random sample data for Ts. – Shubho Nov 09 '22 at 10:06

1 Answer

I am still not sure I fully understand what you are trying to do, but if you want to create a set of random data with a distribution similar to that of the Ts array, this may be a solution:

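import numpy as np

def createData(arr: np.ndarray) -> np.ndarray:
    # Keep only the observed (non-NaN) values
    vals = arr[~np.isnan(arr)]
    # Draw len(arr) samples from a normal fit to the observed mean and std
    rslt = np.random.normal(np.mean(vals), np.std(vals), len(arr))
    # Pick as many random indices as there are NaNs (drawn with replacement)
    nans = np.random.choice(len(arr), np.count_nonzero(np.isnan(arr)))
    rslt[nans] = np.nan
    return rslt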

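This produces normally distributed data points with randomly chosen positions set to np.nan. Because np.random.choice samples with replacement by default, the same index can be drawn more than once, so the result may contain fewer NaNs than the original array.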

Given your sample data, you will get a new array of the form:

array([391.52548673, 385.7549112 , 387.69378264, 384.81241066,
                nan, 387.25417899, 395.67521542,          nan,
       383.10427362, 385.35759395, 395.2064689 , 390.99123783,
       393.85425828, 396.3634796 ,          nan, 399.10358563,
       400.21286847, 405.18748279,          nan,          nan,
                nan, 378.57877152, 385.35011411, 384.26531498,
       402.96559037,          nan,          nan, 385.07668198,
       393.63731872,          nan, 389.27679814,          nan,
                nan,          nan, 386.76116921, 399.12329659,
                nan,          nan, 397.2075096 ,          nan,
                nan,          nan, 390.39172827,          nan,
                nan,          nan,          nan,          nan,
       386.7316295 , 376.60542753, 400.31706119,          nan,
       398.54294406,          nan, 381.02411026, 392.23370125,
                nan, 384.15577098,          nan,          nan,
                nan, 388.85805677, 399.42860135, 385.17369209,
       406.96473509,          nan, 397.89035509,          nan,
       394.95093528,          nan, 415.34340577, 380.26836289,
                nan, 386.90350994, 390.79316328,          nan,
       397.40019258,          nan,          nan, 395.47333884,
       389.50048944, 393.43239451,          nan,          nan,
                nan,          nan, 384.61917053, 395.55390706,
                nan,          nan,          nan,          nan,
       381.37982898,          nan,          nan,          nan,
                nan])
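If the normal assumption is not appropriate (the question says the actual distribution is unknown), one alternative is to resample the observed values directly, i.e. draw from the empirical distribution. The following is only a minimal sketch under that assumption: the function name `resample_like`, its `size` and `seed` parameters, and the short `ts` sample are hypothetical and chosen for illustration. It treats the observations as independent draws, so it does not reproduce the trend or periodicity of the series.

```python
import numpy as np

def resample_like(ts, size=None, seed=None):
    """Draw pseudo-random values from the empirical distribution of ts.

    Hypothetical helper: observed values are resampled with replacement,
    so any trend or periodicity in ts is ignored.
    """
    rng = np.random.default_rng(seed)
    ts = np.asarray(ts, dtype=float)
    mask = np.isnan(ts)
    observed = ts[~mask]
    if size is None:
        # Same shape as ts, with NaNs kept at their original positions
        out = np.full(ts.shape, np.nan)
        out[~mask] = rng.choice(observed, size=observed.size, replace=True)
        return out
    # Otherwise return `size` draws with no NaNs
    return rng.choice(observed, size=size, replace=True)

# Small illustrative sample taken from the start of the posted data
ts = np.array([384.540, 378.233, 376.858, 378.497, np.nan, np.nan, 390.409, 386.174])
same_shape = resample_like(ts, seed=0)        # NaNs where ts has NaNs
draws = resample_like(ts, size=10000, seed=0)  # 10000 values, no NaNs
```

With `size=None` the result has the same length and NaN positions as the input; passing `size=10000` gives the 10000 values asked for in the question.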
 
itprorh66