
As an example, I created a dataset with one column containing the values 1 through 10. I used the following SAS code:

proc surveyselect
data=sample
out=sample2
method=urs
n = 10
seed=1 outhits;
run;

This resulted in the following sample: 5, 6, 7, 8, 9, 9, 9, 10, 10, 10. I then ran the following code in R and got a different sample:

set.seed(1)
test <- 1:10
sample(test,size = 10, replace = TRUE)

3 4 6 10 3 9 10 7 7 1
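
A note for anyone rerunning the R snippet: the output shown matches R's older default for sample(). From R 3.6.0 onward the default sampling method changed, so the same seed gives a different draw. The sketch below assumes R >= 3.6.0 and requests the older "Rounding" method explicitly. It only makes R agree with older R and does nothing to match SAS.

```r
# Assumption: R >= 3.6.0, where set.seed() accepts sample.kind.
# "Rounding" is the pre-3.6.0 sampling method; R warns that it is
# non-uniform, so the warning is suppressed here on purpose.
suppressWarnings(set.seed(1, sample.kind = "Rounding"))

test <- 1:10
draw <- sample(test, size = 10, replace = TRUE)
print(draw)

# Restore the current default sampling method afterwards.
RNGkind(sample.kind = "Rejection")
```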

For context, I am trying to replicate in R a bootstrapping analysis that was originally done in SAS, so it is important that we get the same samples. Is there an easy way to generate the same sample in R?

Mohini
  • I'm not sure if I understand your question but it sounds like you want to reliably be able to generate the same samples? It seems like you can do that already. Just reset the seed to whatever you sampled from before. – svenhalvorson Jan 17 '20 at 18:37
  • Yes I want R and SAS to generate the same samples. As you can see, I'm using seed = 1 in both the R and SAS code, but they result in different samples. – Mohini Jan 17 '20 at 18:45
  • Oh... that's probably going to be pretty hard to do. I suspect that the ways they generate random numbers are VERY different, and crossing that over will not be easy. – svenhalvorson Jan 17 '20 at 19:09
  • I tried SURVEYSELECT's RANUNI option, but that did not produce the same results either. Does R's SAMPLE have option(s) to select the random number generator? – data _null_ Jan 17 '20 at 19:42
  • There's some info [here](https://stackoverflow.com/questions/30763582/replicating-random-normal-generated-in-sas-rancor-in-r-based-on-the-same-seed). But it's very unlikely you'll be able to generate the same random numbers between different programs like that. You don't just need to match seeds, you need to match algorithms. It would be better to use one program to generate the random numbers and then read those same numbers into both R and SAS. Or implement your own random number generator in both languages. – MrFlick Jan 17 '20 at 20:08
  • Thanks! I agree that it's likely that the two programs use different algorithms. It is probably easiest to standardize how the numbers are generated across the two programs, or generate them in one and copy over to the other. – Mohini Jan 17 '20 at 22:55
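
The "implement your own random number generator in both languages" idea from the comments could look roughly like the sketch below. It uses the Park-Miller "minimal standard" multiplicative congruential generator as a stand-in. This is an assumption chosen for portability, not the algorithm that PROC SURVEYSELECT or R's sample() uses. The helper names lcg_stream and portable_sample are hypothetical. Every intermediate product stays below 2^53, so the arithmetic is exact in double precision, and the same recurrence could in principle be reproduced in a SAS DATA step with the mod() function.

```r
# Hypothetical helper: returns n_draws uniform values in (0, 1) from the
# Park-Miller generator, state <- (16807 * state) mod (2^31 - 1).
lcg_stream <- function(seed, n_draws) {
  a <- 16807          # multiplier
  m <- 2147483647     # modulus, 2^31 - 1
  state <- seed       # seed must be an integer in 1..(m - 1)
  u <- numeric(n_draws)
  for (i in seq_len(n_draws)) {
    state <- (a * state) %% m   # exact in doubles: a * state < 2^53
    u[i] <- state / m
  }
  u
}

# Hypothetical helper: sample with replacement by mapping each uniform
# value to an index in 1..length(values).
portable_sample <- function(values, size, seed) {
  u <- lcg_stream(seed, size)
  idx <- floor(u * length(values)) + 1
  values[idx]
}

boot <- portable_sample(1:10, size = 10, seed = 1)
print(boot)
```

The alternative suggested in the comments, drawing the indices once in one program and reading them into the other from a file, avoids reimplementing anything and may be the simpler route for a one-off replication.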
