Aim: I want to calculate the percentage of observations (CONC) that lie outside the 90% CI, across all subjects in the data.

My data frame contains the following columns:

df <-
ID  TIME CONC   CI90low     CI90hi
1   4   9.38    0.870240934 133.6468179
1   5   37.5    0.936887451 140.4165014
1   6   50.9    1.804344597 16.7551025
1   8   53.5    55.34913078 146.1486235
1   10  64.8    8.433188849 126.9535201
1   12  47.8    15.48328251 94.23716498
1   24  19.4    2.457364534 34.00074335
1   36  5.54    1.107788098 22.38902995
1   48  2.52    0.456572767 14.28822964
2   1   7.23    0.309733729 52.68946657
2   1.5 27.1    0.705395145 100.630645
2   10  51.1    9.78008354  134.8669611
2   12  37.1    5.500102861 94.25775578

I thought of one possible way to accomplish this but I am not sure how to code it in R.

My idea is to add a new column in the data frame then:

1) For each subject at each time point (ID, TIME), check whether the concentration (CONC) lies between the lower and upper limits of the 90% CI provided. If yes, add a value of 0 to the new column; if no, add a value of 1. I tried ifelse but wasn't able to nail it down.

2) Count the number of ones in the column. Then:

    % of observations outside the 90% CI = (total number of ones / nrow(df)) * 100

I would appreciate your help in coding this. Perhaps, you may have another way of doing it.

Amer

2 Answers

A simple approach:

mean(with(df, CONC < CI90low | CONC > CI90hi)) * 100
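
To try the one-liner on the data posted in the question, here is a minimal sketch. It assumes the table is read in with `read.table` (the question does not say how `df` was created), and the names `outside` and `pct_out` are introduced here only for illustration.

```r
# Rebuild the data frame from the rows posted in the question
df <- read.table(header = TRUE, text = "
ID TIME CONC CI90low CI90hi
1 4   9.38 0.870240934 133.6468179
1 5   37.5 0.936887451 140.4165014
1 6   50.9 1.804344597 16.7551025
1 8   53.5 55.34913078 146.1486235
1 10  64.8 8.433188849 126.9535201
1 12  47.8 15.48328251 94.23716498
1 24  19.4 2.457364534 34.00074335
1 36  5.54 1.107788098 22.38902995
1 48  2.52 0.456572767 14.28822964
2 1   7.23 0.309733729 52.68946657
2 1.5 27.1 0.705395145 100.630645
2 10  51.1 9.78008354  134.8669611
2 12  37.1 5.500102861 94.25775578
")

# TRUE where CONC is below the lower limit or above the upper limit
outside <- with(df, CONC < CI90low | CONC > CI90hi)

# The mean of a logical vector is the proportion of TRUE values
pct_out <- mean(outside) * 100

sum(outside)        # 2 observations fall outside (TIME 6 and TIME 8 for ID 1)
round(pct_out, 2)   # 15.38, i.e. 2 of 13 rows
```

The result is a single percentage over all rows, regardless of subject.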
Sven Hohenstein
  • Thanks Sven. I have modified the data frame so that it has values outside the 90% CI. It looks like the second solution you provided returns 100 if the CONC fell outside the 90% CI. I would rather calculate the percentage of observations, as a whole across all subjects, that lie outside the 90% CI. – Amer Jan 03 '15 at 12:27

I found an answer myself:

1) add an indicator column with values of 0 (if within) or 1 (if outside the 90CI) using:

  df$IND <- ifelse(df$CONC < df$CI90low|df$CONC > df$CI90hi,1,0) 

2) calculate the percentage outside by:

 Percent_Out <- sum(df$IND)/length(df$IND)*100

Note that sum(df$IND) gives the total number of observations outside the 90% CI, because all other rows have a value of 0.
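
As a quick check of the two steps, here is a small sketch on a made-up three-row data frame (`toy` and its values are hypothetical, not taken from the question). It shows that dividing the sum by the number of rows gives the same result as taking the mean, and that `length()` applied to a whole data frame counts columns rather than rows, so the denominator should be `nrow(toy)` or the length of a single column.

```r
# Hypothetical data, for illustration only
toy <- data.frame(
  CONC    = c(5, 20, 1),
  CI90low = c(2, 3, 4),
  CI90hi  = c(10, 12, 9)
)

# Logical indicator: TRUE when CONC is outside the interval
toy$IND <- toy$CONC < toy$CI90low | toy$CONC > toy$CI90hi

# Two equivalent ways to get the percentage outside
pct_sum  <- sum(toy$IND) / nrow(toy) * 100
pct_mean <- mean(toy$IND) * 100

length(toy)      # 4: the number of columns (CONC, CI90low, CI90hi, IND)
nrow(toy)        # 3: the number of observations
length(toy$IND)  # 3: the length of one column also counts rows
```

If CONC or the limits could contain missing values, the comparison would yield NA for those rows, and `sum()` or `mean()` would need `na.rm = TRUE`; whether that applies depends on the real data.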

Amer
  • You could simplify 1) to `df$IND <- df$CONC < df$CI90low | df$CONC > df$CI90hi` resulting in a logical vector of TRUE/FALSE which you can then use to compute the sum. – talat Jan 03 '15 at 15:55
  • In addition to @docendo discumus' suggestion, `Percent_Out <- mean(df$IND)*100` – Khashaa Jan 03 '15 at 16:07