I have blood concentration versus time data for 100 subjects. I am interested in plotting the 5%, 50% and 95% quantile concentration vs. time curves. While I can determine the quantiles over the entire concentration range, I cannot figure out how to stratify the concentration quantiles by time in R. Any help would be appreciated.

a<-quantile(conc~time, 0.05) 

does not work.

Andre Silva
tcek

3 Answers

Assuming a dataframe, df, with columns df$subject, df$time, and df$conc, then

q <- sapply(c(low=0.05,med=0.50,high=0.95),
              function(x){by(df$conc,df$time,quantile,x)})

generates a matrix, q, with columns low, med, and high containing the 5, 50, and 95% quantiles, one row for each time. Full code below.

# generate some moderately realistic data
# concentration declines exponentially over time
# time constant (k) is different for each subject and distributed as N[50,10]
# noise is distributed as N[1, 0.2]: a baseline offset of 1 plus measurement error
set.seed(1)   # make the simulated data reproducible
time    <- 1:1000
df      <- data.frame(subject=rep(1:100, each=1000),time=rep(time,100))
k       <- rnorm(100,50,10)   # time constant is different for each subject
df$conc <- 5*exp(-df$time/k[df$subject])+rnorm(nrow(df),1,0.2)

# generates a matrix with columns low, med, and high 
q <- sapply(c(low=0.05,med=0.50,high=0.95),
            function(x){by(df$conc,df$time,quantile,x)})
# prepend time and convert to dataframe
q <- data.frame(time,q)
# plot the results
library(reshape2)
library(ggplot2)
gg <- melt(q, id.vars="time", variable.name="quantile", value.name="conc")
ggplot(gg) + 
  geom_line(aes(x=time, y=conc, color=quantile))+
  scale_color_discrete(labels=c("5%","50%","95%"))
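
Since the question tried a formula (conc~time), it may help to see that base R's aggregate() does accept one. The following is only a sketch on made-up data: the data frame toy and the result names agg and q_toy are hypothetical and not part of the answer above.

```r
# Hypothetical toy data: 20 subjects, each measured at 5 time points
set.seed(1)
toy <- data.frame(subject = rep(1:20, each = 5), time = rep(1:5, times = 20))
toy$conc <- 5 * exp(-toy$time / 2) + rnorm(nrow(toy), 1, 0.2)

# aggregate() understands the formula; probs is passed on to quantile()
agg <- aggregate(conc ~ time, data = toy, FUN = quantile,
                 probs = c(0.05, 0.50, 0.95))

# agg$conc is a matrix column; spread it into ordinary columns
q_toy <- data.frame(time = agg$time, agg$conc, check.names = FALSE)
names(q_toy) <- c("time", "low", "med", "high")
print(q_toy)
```

The resulting q_toy has one row per time and the same low/med/high layout as q, so it could be melted and plotted the same way.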

jlhoward

Some sample data would help to confirm this, but the following should work:

a<-by(conc,time,quantile,0.05)

If conc and time are both columns of a data frame (call it frame1):

a<-by(frame1$conc,frame1$time,quantile,probs=c(0.05,0.5))
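
With several probs, by() returns a list-like object rather than a plain matrix. If a matrix with one row per time is more convenient for plotting, one possible sketch uses split() and sapply() instead. The toy frame1 below is made-up data, and the names probs and qmat are hypothetical.

```r
# Hypothetical toy data standing in for frame1: 4 time points, 25 values each
set.seed(1)
frame1 <- data.frame(time = rep(1:4, each = 25))
frame1$conc <- 5 * exp(-frame1$time / 2) + rnorm(nrow(frame1), 1, 0.2)

probs <- c(0.05, 0.50, 0.95)
# split() groups conc by time; sapply() gives one column per time point,
# so t() turns the result into one row per time point
qmat <- t(sapply(split(frame1$conc, frame1$time), quantile, probs = probs))
print(qmat)   # columns are named "5%", "50%", "95%"
```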
crogg01

Here is another approach, using data.table. I'm not sure whether this is what you are looking for, but one option is to bin the time variable into 3 categories (or however many you need) with cut() and then calculate the quantiles for each group.

Define your function

qt <- function(x) quantile(x, probs = c(0.05, 0.5, 0.95))

Create Data

library(data.table)
DT <- data.table(time = sample(1:100, 100), blood_con = sample(500:1000, 100))
DT$cut_time <- cut(DT$time, right = FALSE, breaks = c(0, 30, 60, 10e5), 
                   labels = c("LOW", "MEDIUM", "HIGH"))

head(DT)
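
To see how cut() assigns values at the bin edges when right = FALSE, here is a small base R sketch. The vector times is made up for illustration, and Inf is used as the upper break in place of 10e5, which is an assumption that should behave the same for these data.

```r
# Hypothetical time values chosen to sit on and around the bin edges
times <- c(0, 29, 30, 59, 60, 100)

# right = FALSE makes the intervals [0,30), [30,60), [60,Inf)
bins <- cut(times, right = FALSE, breaks = c(0, 30, 60, Inf),
            labels = c("LOW", "MEDIUM", "HIGH"))

print(data.frame(times, bins))
# 0 and 29 fall in LOW, 30 and 59 in MEDIUM, 60 and 100 in HIGH
```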

Apply the qt function to blood_con, grouped by cut_time

Q <- DT[, list(blood_con = qt(blood_con)), by = cut_time]
Q$quantile_label <- as.factor(c("5%", "50%", "95%"))

Plot

library(ggplot2)
ggplot(Q, aes(x = cut_time, y = blood_con, label = quantile_label, color = quantile_label)) + 
  geom_point(size = 4) +
  geom_text(hjust = 1.5)
marbel