I have several columns (in a data frame) and boxplot them with this command:

boxplot(allm, las=2)

This works, but I would like each label to also show how many non-missing observations there are in that column (that is, the numeric values, since the data also contain NAs).

The data frame used to draw the boxplot looks like this:

NE001710    NE001360    NE001398    NE001380    NE001707
-0.12        -0.61       -0.61        -0.02       0.13
-0.58        -0.43       -0.24        -0.27      -0.47
  NA          0.19       -0.37        -0.14      -0.53
  NA         -0.13       -0.27        -0.38       0.05
  NA          NA          0.32        -0.34       0.01

The desired boxplot labels would be something like NE001710(2), NE001360(4), ... NE001707(5).

gagolews
user3091668

2 Answers

Try using the sprintf function (or paste, or any other function that lets you build the set of labels before calling boxplot).

Example data:

data <- data.frame(v1=runif(5), v2=runif(5), v3=runif(5))
data[1,1] <- NA
data[1,2] <- NA
data[2,1] <- NA
data
##          v1         v2        v3
## 1        NA         NA 0.3031038
## 2        NA 0.99395272 0.9481445
## 3 0.4596111 0.17398552 0.6135870
## 4 0.9175369 0.02094728 0.7256759
## 5 0.1932377 0.71577514 0.8811639

Generate the labels with sprintf, combining each column name and its number of non-NA values into one string:

(nam <- sprintf("%s (%d)",
                   colnames(data),
                   apply(data, 2, function(d) sum(!is.na(d)))
))
## "v1 (3)" "v2 (4)" "v3 (5)"

Draw:

boxplot(data, names=nam)
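
Applied to the data from the question, the same idea might look like the sketch below. It assumes the data frame is called allm, as in the question, and rebuilds only the five sample rows shown there; n_obs and labels are illustrative names, and colSums with paste0 is swapped in here as one possible alternative to apply with sprintf.

```r
# Sample data copied from the question (only the five rows shown there)
allm <- data.frame(
  NE001710 = c(-0.12, -0.58,    NA,    NA,    NA),
  NE001360 = c(-0.61, -0.43,  0.19, -0.13,    NA),
  NE001398 = c(-0.61, -0.24, -0.37, -0.27,  0.32),
  NE001380 = c(-0.02, -0.27, -0.14, -0.38, -0.34),
  NE001707 = c( 0.13, -0.47, -0.53,  0.05,  0.01)
)

# Count the non-missing values in each column
n_obs <- colSums(!is.na(allm))

# Build labels of the form "name(count)" without touching the column names
labels <- paste0(names(allm), "(", n_obs, ")")
labels
## "NE001710(2)" "NE001360(4)" "NE001398(5)" "NE001380(5)" "NE001707(5)"

# las=2 keeps the vertical axis labels used in the question
boxplot(allm, names = labels, las = 2)
```

Because the labels are passed through names=, the column names of allm stay unchanged for any later processing.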
gagolews

You can change the names of your columns using the code below, then call boxplot(allm, las=2) as before:

# Append the count of non-missing values to each column name
for (i in seq_along(allm)) {
  colnames(allm)[i] <- paste0(colnames(allm)[i], "(", sum(!is.na(allm[, i])), ")")
}
DJack