Plotting the area under the curve of various distributions in R

Question

Suppose I'm trying to find the area above a certain value for a Student's t distribution. For example, I calculate my t test statistic to be t = 1.78 with 23 degrees of freedom. I know how to get the area under the curve above t = 1.78 with the pt() function. How can I get a plot of the Student's t distribution with 23 degrees of freedom with the area under the curve above 1.78 shaded in? That is, I want the curve for pt(1.78, 23, lower.tail = FALSE) plotted with the appropriate area shaded. Is there a way to do this?

score 3 · Accepted Answer · answered Nov 15 '18 at 16:54

ggplot version:

library(ggplot2)

ggplot(data.frame(x = c(-4, 4)), aes(x)) +
  stat_function(fun = dt, args = list(df = 23)) +   # full density curve
  stat_function(fun = dt, args = list(df = 23),     # shaded upper tail
                xlim = c(1.78, 4),
                geom = "area")
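Since only `fun` and `args` change between distributions, the same two layers can be wrapped in a small function. The following is a minimal sketch, not part of the original answer: `shade_upper_tail` and its parameter names are made up for illustration, and the red fill is an arbitrary choice.

```r
library(ggplot2)

# Hypothetical helper (name and arguments are illustrative only):
# draws a density curve and shades the area to the right of `cutoff`.
shade_upper_tail <- function(dens, cutoff, args = list(), limits = c(-4, 4)) {
  ggplot(data.frame(x = limits), aes(x)) +
    stat_function(fun = dens, args = args) +            # full density curve
    stat_function(fun = dens, args = args,              # shaded upper tail
                  xlim = c(cutoff, limits[2]),
                  geom = "area", fill = "red", alpha = 0.5) +
    labs(y = "Density")
}

p_t    <- shade_upper_tail(dt, 1.78, args = list(df = 23))  # Student's t, 23 df
p_norm <- shade_upper_tail(dnorm, 1.78)                     # standard normal
```

Printing `p_t` or `p_norm` draws the plot; other densities such as `dchisq` should work the same way as long as `limits` suits their support.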

Jrakru56

I really like this solution, as it works better for me when switching the function from Student's t to normal to other distributions -- thanks! – BLP92 Nov 15 '18 at 17:20

Milan Valášek · Answer 2 · 2018-11-15T16:48:18.197

1

This should work:

x_coord <- seq(-5, 5, length.out = 200) # x-coordinates
plot(x_coord, dt(x_coord, 23), type = "l",
     xlab = expression(italic(t)), ylab = "Density", bty = "l") # plot PDF
polygon(c(1.78, seq(1.78, 5, by = .3), 5, 5), # polygon for area under curve
        c(0, dt(c(seq(1.78, 5, by = .3), 5), 23), 0),
        col = "red", border = NA)

Regarding arguments to polygon():

Your first and last points should be [1.78, 0] and [5, 0] (use 5 only if the plot extends to 5); these define the bottom edge of the red polygon.
The 2nd and penultimate points are [1.78, dt(1.78, 23)] and [5, dt(5, 23)]; these define the end points of the upper edge.
The points in between are just the X and Y coordinates of an arbitrary number of points along the curve, [x, dt(x, 23)]; the more points, the smoother the polygon.
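To see how the number of points affects the polygon, one can compare its area with the exact tail probability from `pt()`. This is only a sketch: `polygon_area` is a hypothetical helper, it uses evenly spaced points rather than the `by = .3` grid above, and it ignores the small part of the tail beyond 5.

```r
cutoff <- 1.78
df <- 23

# Exact upper-tail area, i.e. the quantity from the question.
exact <- pt(cutoff, df, lower.tail = FALSE)

# Hypothetical helper: area of the polygon built from n evenly spaced
# points on the curve between `cutoff` and `upper` (trapezoid rule).
polygon_area <- function(n, upper = 5) {
  x <- seq(cutoff, upper, length.out = n)
  y <- dt(x, df)
  sum(diff(x) * (head(y, -1) + tail(y, -1)) / 2)
}

coarse <- polygon_area(12)    # spacing comparable to by = .3
fine   <- polygon_area(200)   # much denser grid
c(exact = exact, coarse = coarse, fine = fine)
```

The denser polygon should track the exact area closely, while the coarse one slightly overstates it because straight segments sit above the curve where the density is convex.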

Hope this helps

edited Nov 15 '18 at 16:48

answered Nov 15 '18 at 16:43

Milan Valášek

This is exactly what I was looking for, thank you! If I want to adapt this for different distributions, then I just need to change dt to the appropriate function for that distribution -- correct? For example, for a normal curve I could use the same arguments but change dt(x_coord, 23) to dnorm(x_coord), and similarly change c(0, dt(c(seq(1.78, 5, by = .3), 5), 23), 0) to c(0, dnorm(c(seq(1.78, 5, by = .3), 5)), 0), and that should accomplish the same? – BLP92 Nov 15 '18 at 16:49
You are correct. If you like the solution, feel free to accept it. ;) – Milan Valášek Nov 15 '18 at 16:50
Milan, When I make those changes it just plots an empty normal curve instead of filling in the area. Should I be adjusting anything else? I'm using the same test statistic score of 1.78 for simplicity so I would have assumed no. – BLP92 Nov 15 '18 at 17:05
Hi, I like Jrakru's `ggplot` solution, but just to follow up on your question: it's difficult to diagnose without seeing your code. The code above works for me just fine with `dnorm()` instead of `dt()`. – Milan Valášek Nov 16 '18 at 10:10
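For reference, a complete `dnorm()` version of the base-graphics approach might look like the sketch below. The variable names `z`, `x_shade`, `poly_x` and `poly_y` are introduced here for illustration and do not come from the thread; building both coordinate vectors from the same `x_shade` is one way to keep their lengths in step, since `polygon()` needs x and y of equal length.

```r
z <- 1.78                                   # test statistic from the question
x_coord <- seq(-5, 5, length.out = 200)     # x-coordinates for the curve

# Draw the standard normal density first; polygon() adds to an existing plot.
plot(x_coord, dnorm(x_coord), type = "l",
     xlab = expression(italic(z)), ylab = "Density", bty = "l")

# Points along the curve from z to 5, closed off along the x-axis.
x_shade <- seq(z, 5, length.out = 100)
poly_x <- c(z, x_shade, 5)
poly_y <- c(0, dnorm(x_shade), 0)

polygon(poly_x, poly_y, col = "red", border = NA)
```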
