Integrating the PDF of a t-distribution in R via integrate()

Question

I am trying to replicate the t-test with mathematical formulas and it somehow wont work with may current attempt below:

# PDF t-distribution:
t_distr = function(x,df){
  t_1 = gamma((df+1)/2)/(sqrt(df*pi)*gamma(df/2))
  t_2 = (1 + (seq^2/df))^(-(df+1)/2)
  t_distr = t_1*t_2
  return(t_distr)
} # End of function

# Initialize Sequence: 
seq = seq(-4,4,by = .1)

# Integrating up to t-value of 1 (lower tail t-test):
integrate(t_distr, df = length(seq)-1, lower = -Inf, upper = 1 )

# For df = length(seq)-2 (e.g. for a lin. reg):
integrate(t_distr, df = length(seq)-2, lower = -Inf, upper = 1 )

The above code will result in the following error message:

Error in integrate(t_distr, df = length(seq) - 1, lower = -Inf, upper = 1) : 
  evaluation of function gave a result of wrong length

Any suggestions how to make it possible to integrate the pdf of a t-distribution using the integrate function? It also does not work with a "df" of just n...

Use t.test function. This has nothing to do with the t distribution unless you want the pvalue — Onyambu, Mar 27 '23 at 06:53
You have `dt` for density, `pt` for probability and `qt` for quantile. You could easily use hose. Or do you want to implement your own? — Onyambu, Mar 27 '23 at 07:03
I know that these functions exist, but I want to recreate it via using the actual formulas. And what I want is the p-value that is why I am trying to integrate.... I did the same already for the z-test.... Now I want to do it via the PDF of a t-distribution as well. Apart from that: a t-test does not result in a t-distribution but a t-value on a PDF of a t-distribution. However, thanks anyway.... — Steffen Schwerdtfeger, Mar 27 '23 at 07:29

score 1 · Accepted Answer · answered Mar 27 '23 at 07:38

A couple of issues:

Your t_distr function uses seq which is not defined (actually it is defined to be function, later you overwrite it with a vector, avoid that). You want your t_distr (the density of the t distribution) to take one argument x:
```
t_distr <- function(x, df){
  t_1 <- gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
  t_2 <- (1 + (x ^ 2 / df)) ^ (-(df + 1) / 2)
  t_1 * t_2
}
```
(you can check that your t density is correctly defined by all.equal(t_distr(2.123, 12), dt(2.123, 12)).
For integrate (to move from the desnity to the cdf) you need to provide your function and limits. The function will then be integrated in its first argument:
```
integrate(t_distr, df = 12, lower = -Inf, upper = 1)
```
this will integrate your density from -Inf to 1 with 12 degrees of freedom. Again we can check whether we got the correct results by comparing that to the built-in cdf of the t-distribution:
```
all.equal(integrate(t_distr, df = 12, lower = -Inf, upper = 1)$value, pt(1, 12))
```
The degrees of freedom have actually nothing to do with the integration itself and should be seen as an additional parameter. Maybe that's were the confusion came from.

Thanks so much! "seq" within the function was the problem indeed!! In a former version I used seq instead of x. — Steffen Schwerdtfeger, Mar 27 '23 at 07:47
`R` lexical scoping is a bless and a curse. As `seq` is not defined in the body of `t_distr` it will be looked up in the global environment and all the way up to the base namespace (where the function is defined). However, as you overwrote `seq` by a vector this error is even more difficult to spot, because now we use at least some numerics instead of a function. Bototm line: avoid at all costs to re-use pre-defined names (`c`, `data` are classics). — thothal, Mar 27 '23 at 07:52

Integrating the PDF of a t-distribution in R via integrate()

1 Answers1