24

I am trying to subset a data frame by taking the integer values of 2 columns of my data frame:

Subs1<-subset(DATA,DATA[,2][!is.na(DATA[,2])] & DATA[,3][!is.na(DATA[,3])])

but it gives me an error: longer object length is not a multiple of shorter object length.

How can I construct a subset which is composed of NON NA values of column 2 AND column 3?

Thanks a lot!

EnginO
  • I'd try `DATA[complete.cases(DATA[, 2:3]), ]` - all rows except those with NA in column 2 or column 3. – lukeA Feb 13 '15 at 09:16
  • I have never used this statement, luke. Could you provide the whole line in R syntax? – EnginO Feb 13 '15 at 09:20
  • This is very basic and there's plenty of information on the web: https://www.google.com/search?q=r+subsetting. Also look at `?complete.cases`. – lukeA Feb 13 '15 at 09:25

3 Answers

31

Try this:

Subs1<-subset(DATA, (!is.na(DATA[,2])) & (!is.na(DATA[,3])))

The second argument of subset is a logical vector of length nrow(DATA), indicating for each row whether to keep it.
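To illustrate why the original call fails while this one works, here is a minimal sketch on a made-up data frame (the column names `id`, `x`, `y` and all values are assumptions for illustration only). Indexing each column by its own `!is.na(...)` drops the NAs first, so the two vectors can end up with different lengths and no longer line up with the rows.

```r
# Hypothetical toy data; names and values are illustrative only
DATA <- data.frame(id = 1:4,
                   x  = c(1, NA, 3, 4),
                   y  = c(NA, NA, 7, 8))

# Original approach: NAs are dropped before combining, so lengths differ
len2 <- length(DATA[, 2][!is.na(DATA[, 2])])  # non-NA values in column 2
len3 <- length(DATA[, 3][!is.na(DATA[, 3])])  # non-NA values in column 3

# Corrected approach: one TRUE/FALSE per row of DATA
keep <- !is.na(DATA[, 2]) & !is.na(DATA[, 3])

Subs1 <- subset(DATA, keep)
Subs1
```

Here `keep` has exactly `nrow(DATA)` elements, which is what `subset` needs to decide row by row.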

cogitovita
11

The na.omit function can be an answer to your question:

 Subs1 <- na.omit(DATA[2:3])

[https://stat.ethz.ch/R-manual/R-patched/library/stats/html/na.fail.html]
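One thing to be aware of: `na.omit(DATA[2:3])` returns only columns 2 and 3. If the other columns are also wanted, one option is to index the full data frame with `complete.cases`, as suggested in the comments. A small sketch on hypothetical data (column names and values are made up for illustration):

```r
# Hypothetical toy data; names and values are illustrative only
DATA <- data.frame(id = 1:4,
                   x  = c(1, NA, 3, 4),
                   y  = c(NA, NA, 7, 8))

# Keeps complete rows, but only of columns 2 and 3
only_two_cols <- na.omit(DATA[2:3])
names(only_two_cols)

# Keeps the same rows, with every column of DATA
all_cols <- DATA[complete.cases(DATA[, 2:3]), ]
names(all_cols)
```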

Berecht
3

Here is an example. a, b and c are three vectors, of which a and b each have a missing value. Once they are created, I use cbind to bind them into one matrix, which is then converted to a data frame.

The result is a data frame in which 2 of the 3 columns have a missing value, so we need to keep only the rows with complete cases. DATA[complete.cases(DATA), ] keeps only the rows that have no missing values in any column. The subset object holds these complete rows.

  a <- c(1,NA,2)
  b <- c(NA,1,2)
  c <- c(1,2,3)
  DATA <- as.data.frame(cbind(a,b,c))
  subset <-  DATA[complete.cases(DATA), ]
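Note that `complete.cases(DATA)` checks every column. Since the question asks only about columns 2 and 3, a possible variation is to pass just those columns to `complete.cases`. The sketch below reuses the same vectors; the result names `all_complete` and `cols_2_3` are introduced here for illustration.

```r
a <- c(1, NA, 2)
b <- c(NA, 1, 2)
c <- c(1, 2, 3)
DATA <- data.frame(a, b, c)

# Rows with no NA in any column (a, b and c)
all_complete <- DATA[complete.cases(DATA), ]

# Rows with no NA in columns 2 and 3 only (b and c); NA in a is allowed
cols_2_3 <- DATA[complete.cases(DATA[, 2:3]), ]

nrow(all_complete)
nrow(cols_2_3)
```

With this data, the second version also keeps the row where only `a` is missing.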