7

I have an xts object in the following format:

                   a        b     c        d       e        f   ......
2011-01-03         11.40    NA    23.12    0.23    123.11   NA  ......
2011-01-04         11.49    NA    23.15    1.11    111.11   NA  ......
2011-01-05         NA       NA    23.11    1.23    142.32   NA  ......
2011-01-06         11.64    NA    39.01    NA      124.21   NA  ......
2011-01-07         13.84    NA    12.12    1.53    152.12   NA  ......

Is there a function I can apply to generate a new xts or data.frame missing the columns containing only NA?

The positions of the all-NA columns aren't fixed, so I can't simply remove those columns by name or position.

lab_notes

4 Answers

5

Suppose DF is your data.frame:

 DF [, -which(sapply(DF, function(x) sum(is.na(x)))==nrow(DF))]
               a     c    d      e
2011-01-03 11.40 23.12 0.23 123.11
2011-01-04 11.49 23.15 1.11 111.11
2011-01-05    NA 23.11 1.23 142.32
2011-01-06 11.64 39.01   NA 124.21
2011-01-07 13.84 12.12 1.53 152.12
Jilber Urbina
  • A more robust solution would be: `DF[, sapply(DF, function(x) sum(is.na(x)))!=nrow(DF)]` because it would work even if there are no columns with all missing values (see my answer). – Joshua Ulrich Oct 27 '12 at 12:27
5

@Jilber's solution works, but it returns an empty result if no column is entirely NA. For example:

# sample data
library(xts)
data(sample_matrix)
x <- as.xts(sample_matrix)

# Jilber's solution, when no columns have all missing values
DF <- as.data.frame(x)
DF[, -which(sapply(DF, function(x) sum(is.na(x)))==nrow(DF))]
# data frame with 0 columns and 180 rows
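The empty result comes from how negative indexing treats an empty index: `which()` returns `integer(0)` when nothing matches, and `-integer(0)` is still `integer(0)`, which selects no columns. A minimal sketch in base R, using a toy data frame whose names (`df`, `all_na`, `idx`) are made up for illustration:

```r
# Toy data frame with no all-NA column (hypothetical example data)
df <- data.frame(a = c(1, NA, 3), b = c(4, 5, 6))

# Flag the columns whose NA count equals the number of rows
all_na <- sapply(df, function(col) sum(is.na(col))) == nrow(df)
idx <- which(all_na)   # empty integer vector: no column is entirely NA

# Negating an empty index leaves it empty, so zero columns are selected
neg_cols <- ncol(df[, -idx])

# A logical mask has one entry per column, so every column is kept
mask_cols <- ncol(df[, !all_na, drop = FALSE])
```

Here `neg_cols` is 0 and `mask_cols` is 2, which is why indexing with a logical vector is the safer choice.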

Here's a solution that works whether or not there are columns that have all missing values:

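# keep every column that has at least one non-NA value
y <- x[, apply(!is.na(x), 2, any)]   # no all-NA columns yet: all columns are kept
x$High <- NA
x$Close <- NA
z <- x[, apply(!is.na(x), 2, any)]   # High and Close are dropped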
Joshua Ulrich
2

Try this:

dataframe[, !apply(is.na(dataframe), 2, all), drop = FALSE]
Fridiculous
2

This seems simpler:

DF[, colSums(is.na(DF)) < nrow(DF)]
Jacob Amos
Ali
  • It should be either `DF[, !colSums(!is.na(DF)) < nrow(DF)]` or `DF[, !colSums(is.na(DF)) > 0]` to work... – sdittmar Oct 13 '18 at 19:25
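For what it's worth, the two variants in this comment appear to answer a different question. Because `!` binds more loosely than `<` and `>` in R, both keep only the columns with no NA at all, whereas the question asks to drop the columns that contain only NA. A rough sketch of the difference, rebuilding the six visible columns of the question's data as a data.frame (the names `na_count`, `only_na_removed` and `any_na_removed` are made up here):

```r
# The six columns shown in the question, as a plain data.frame
DF <- data.frame(
  a = c(11.40, 11.49, NA, 11.64, 13.84),
  b = NA_real_,
  c = c(23.12, 23.15, 23.11, 39.01, 12.12),
  d = c(0.23, 1.11, 1.23, NA, 1.53),
  e = c(123.11, 111.11, 142.32, 124.21, 152.12),
  f = NA_real_,
  row.names = c("2011-01-03", "2011-01-04", "2011-01-05",
                "2011-01-06", "2011-01-07")
)

# Number of NAs in each column
na_count <- colSums(is.na(DF))

# Drop only the columns that are entirely NA (what the question asks for)
only_na_removed <- DF[, na_count < nrow(DF), drop = FALSE]

# Drop every column that contains at least one NA (what the comment's variants do)
any_na_removed <- DF[, na_count == 0, drop = FALSE]

names(only_na_removed)  # "a" "c" "d" "e"
names(any_na_removed)   # "c" "e"
```

The `drop = FALSE` argument is there so the result stays a data.frame even when only one column survives.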