At first, I thought I understood what the rt() function in R generates: I thought it generates random t-values from the specified t-distribution.

For example, I interpreted the call tdist <- rt(10000,19) as generating 10,000 t-values from a t-distribution that is based on n=20 (df=19), with mean=0 and standard deviation=1.

Is that the case, or does it generate average scores (means) that are to be found under the specified t-distribution?

If the latter is the case, how can I draw an n=20 sample with mean=0 and sd=1 from a t-distribution, 10,000 times?

Thank you in advance!

  • Why do you think it is not generating 10K samples of the t-distribution? Any reason to question that? – Severin Pappadeux Jul 23 '20 at 16:54
  • And it would be nice to state your system (Windows, Linux), R version etc. Maybe there is some installation specific problem – Severin Pappadeux Jul 23 '20 at 16:55
  • Simple, because I am a beginner and am not sure entirely about the output of these functions. I use OS X 10.15.6 with RStudio 1.2.5042 and R 3.6.3. – Stanciu Adrian Jul 24 '20 at 10:53
  • I am a beginner and am not entirely sure about the output of these functions. I use OS X 10.15.6 with RStudio 1.2.5042 and R 3.6.3. ```rt()``` generates values from a t-distribution, but what exactly? I assume these are values and not average scores. I still don't understand the arguments of the function: ```n``` is the number of observations to be drawn and ```df``` is the number of degrees of freedom associated with a t-distribution. So, say, if I want to draw ```n=20``` observations then I'd specify ```df=19```? Or did I misunderstand this? Thanks a lot for the help. – Stanciu Adrian Jul 24 '20 at 10:59
  • No, if you want 20 observations you set `n` to 20 and that is it. If you want 1000 samples, set `n` to 1000. `df` is a parameter of the t-distribution, and it is NOT related in any way to `n`. – Severin Pappadeux Jul 24 '20 at 15:02

1 Answer

Yes, rt(n, df) generates n random samples from a t-distribution with df degrees of freedom.

> with mean=0 and standard deviation=1.

True about the distribution mean (though the sample mean will differ slightly), but the standard deviation of a t-distribution is never exactly 1. For df > 2 it is a bit larger, equal to sqrt(df/(df-2)), which is sqrt(19/17)=1.057 in your case.
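As a quick check of that formula, here is a minimal sketch that tabulates the theoretical standard deviation for a few `df` values. The helper name `t_sd` is hypothetical (it is not a base R function), and the formula only applies for df > 2.

```r
# Theoretical standard deviation of a t-distribution, defined only for df > 2.
# `t_sd` is a hypothetical helper name used for illustration, not part of R.
t_sd <- function(df) {
  stopifnot(all(df > 2))
  sqrt(df / (df - 2))
}

# A few degrees-of-freedom values, including a non-integer one
dfs <- c(3, 5, 13.5, 19, 100, 1000)
data.frame(df = dfs, sd = round(t_sd(dfs), 3))
```

The values shrink toward 1 as `df` grows, which is the sense in which the t-distribution approaches the standard normal.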

Let's run some code (MS R Open 3.5.3, Win10 x64)

q <- rt(10000, 19)
mean(q)

prints

-0.008859063

sd(q)

prints

1.049836

and

hist(q)

plots

[Image: histogram of q]
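On the second part of the question (drawing an n=20 sample 10,000 times), the following is only a sketch under one assumption: that the samples are meant to come from a normal population with mean=0 and sd=1. Under that assumption, the one-sample t-statistic of each sample follows a t-distribution with df = n - 1 = 19, which is where the 19 in the question comes from. The variable names and the seed are arbitrary.

```r
set.seed(1)          # arbitrary seed, only for reproducibility
n_rep <- 10000       # number of simulated samples
n     <- 20          # observations per sample

# Each column is one sample of n = 20 draws from N(0, 1)
x <- matrix(rnorm(n * n_rep, mean = 0, sd = 1), nrow = n)

# One-sample t-statistic for each column: mean / (sd / sqrt(n))
t_stat <- colMeans(x) / (apply(x, 2, sd) / sqrt(n))

# Direct draws from the t-distribution with df = n - 1
t_direct <- rt(n_rep, df = n - 1)

# Both empirical standard deviations should be close to sqrt(19/17)
c(sd_simulated = sd(t_stat),
  sd_direct    = sd(t_direct),
  sd_theory    = sqrt((n - 1) / (n - 3)))
```

In `rt(n_rep, df = n - 1)` the first argument only says how many values to draw; the sample size of 20 enters solely through `df`.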

Severin Pappadeux
  • thank you. Though, I'm confused about this part still: say if I want to draw n=20 observations then I'd specify df=19? – Stanciu Adrian Jul 24 '20 at 14:19
  • @StanciuAdrian What does the degrees-of-freedom parameter have to do with the number of samples from the distribution? You could draw as many samples as you want. – Severin Pappadeux Jul 24 '20 at 14:53
  • @SeverinPapadeux sorry it took me a while to reply. If I want to run ```rt()``` by providing only the element ```n``` (no of observations) I get an error message. Also the rt() function returns values and not samples. – Stanciu Adrian Jul 28 '20 at 17:41
  • @StanciuAdrian You cannot just ask for a number of samples; you have to give the distribution function(s) (sampling, PDF, etc.) all the parameters needed to set the distribution up. For the t-distribution, that is the degrees-of-freedom parameter. – Severin Pappadeux Jul 28 '20 at 17:47
  • @StanciuAdrian I think you have some mental block when calling `rt(n, df)`, thinking that the `n` and `df` parameters are somehow related, that `n` should be equal to `df+1`. They are NOT related to each other at all, they are completely independent: `n` is the number of samples you need, `df` is the distribution parameter. That is it. And yes, `rt(n,df)` returns a vector of sampled values. – Severin Pappadeux Jul 28 '20 at 17:52
  • I think so too, the mental block I mean:) Now it is clear as day light, they are not related. Thanks a lot for your patience! – Stanciu Adrian Jul 28 '20 at 18:03
  • @StanciuAdrian You're welcome. Note that this **R** function can also sample from a (generalized) t-distribution with a non-integer `df` parameter. This is easy to check with a large number of samples (say, 100K), e.g. for df=13.5 I got a sampled sd of 1.084 while the distribution sd=sqrt(13.5/11.5)=1.083 – Severin Pappadeux Jul 28 '20 at 18:25
  • @SeverinPappadeux, What happens if I use a large df value? Is it like a normal distribution in this case? – jeza Feb 03 '21 at 11:10
  • @jeza Yes, as df grows toward infinity, the t-distribution approaches the Gaussian N(0, 1). Basically, the difference between t and normal is the use of the sample variance (in the case of t) vs the population variance (in the case of normal). As df->infinity, the sample variance goes to the population variance and t becomes the same as normal – Severin Pappadeux Feb 03 '21 at 14:31
  • @SeverinPappadeux, thanks, any idea of this "https://stackoverflow.com/questions/65992658/optimising-nested-for-loops-in-r?noredirect=1#comment116705896_65992658" – jeza Feb 03 '21 at 18:19