
I'm working in R with data imported from a csv file and I'm trying to take a rowSum of a subset of my data. The data can either be 0, 1, or blank. I'd like to take a sum of all the 1s across all these rows (and ideally find a count of how many non-blank columns there are in each row, but that's my next problem). I am trying the following code:

df1 <- read.csv("/Users/ardyn/test.csv", header = T, na.strings = "")

rowSums(df1[,36:135])

Which gives me the following error:

"Error in rowSums(df1[, 36:135]) : 'x' must be numeric".

When I check, the columns I'm trying to sum across are factors with 3 levels (".","0","1").

How do I import the data or change my rowSums command so that when I take a sum across a subset of variables it just counts the 1s?

AkselA
Ardyn

2 Answers


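rowSums() can only handle numeric values and NA. Try setting na.strings="." when reading the file, so that the dots are read as NA and the columns come in as numeric rather than as factors.
If that doesn't work, we can substitute every . with NA after reading the csv.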

df1 <- read.csv("/Users/ardyn/test.csv", header = TRUE, 
  na.strings = ".", stringsAsFactors=FALSE)

rowSums(df1[,36:135], na.rm=TRUE)
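
To try this out without the original file, here is a minimal sketch on a made-up three-row CSV (the names `csv_text`, `toy`, `ones_per_row` and `answered_per_row` are hypothetical, and columns 2:4 stand in for 36:135). It also covers the follow-up in the question, assuming "non-blank" means "not NA after import": counting the non-missing cells per row.

```r
# Toy CSV standing in for the real file; "." marks a blank cell
csv_text <- "id,q1,q2,q3
a,1,0,.
b,.,1,1
c,0,.,."

toy <- read.csv(text = csv_text, header = TRUE,
                na.strings = ".", stringsAsFactors = FALSE)

# Columns 2:4 play the role of 36:135 in the question.
# With only 0/1/NA present, the row sum equals the number of 1s.
ones_per_row <- rowSums(toy[, 2:4], na.rm = TRUE)

# Number of non-blank (non-NA) cells in each row
answered_per_row <- rowSums(!is.na(toy[, 2:4]))

print(ones_per_row)      # 1 2 0
print(answered_per_row)  # 2 2 1
```

Dividing `ones_per_row` by `answered_per_row` would then give the share of 1s among the answered columns, if that is what you are after.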

Example of changing . to NA after the fact:

dtf <- as.data.frame(matrix(sample(c(".", "0", "1"), 20, replace=TRUE), 4))

sapply(dtf, function(x) as.numeric(gsub("\\.", "NA", x)))

#      V1 V2 V3 V4 V5
# [1,]  1  0  0  0  1
# [2,]  1  1  0  0  0
# [3,]  1  1 NA  1 NA
# [4,] NA NA  1  0  0
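
If the columns have already been read in as factors, another way to convert them (a sketch on toy data, not the only option) is to go through as.character() first. Calling as.numeric() directly on a factor returns the internal level codes instead of the labels, which is an easy mistake to make here. The data frame below is invented for illustration.

```r
# Toy factor columns like the ones described in the question
dtf2 <- data.frame(V1 = factor(c("1", ".", "0")),
                   V2 = factor(c(".", "1", "1")))

# Pitfall: this gives the level codes, not the values 1, NA, 0
codes <- as.numeric(dtf2$V1)

# Convert label -> character -> numeric; "." cannot be parsed and becomes NA.
# suppressWarnings() hides the expected "NAs introduced by coercion" warning.
dtf2[] <- lapply(dtf2, function(x) suppressWarnings(as.numeric(as.character(x))))

# dtf2 is still a data frame, now with numeric columns
sums <- rowSums(dtf2, na.rm = TRUE)
print(sums)  # 1 1 1
```

Assigning into `dtf2[]` keeps the result a data frame, whereas sapply() returns a matrix.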
AkselA
  • The modification to the read and rowSums commands worked perfectly. I didn't need to substitute the . with NA. Thank you so much! – Ardyn Dec 07 '17 at 18:16

I am not sure the previous answer covers the part where you want to sum over only the 1s, so maybe you can do this instead:

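df1 <- read.csv("/Users/ardyn/test.csv", header = TRUE, na.strings = ".", stringsAsFactors = FALSE)

# Return 1 for a cell equal to 1, and 0 for anything else.
# The is.na() check is needed because "." is read in as NA,
# and if (NA == 1) would stop with an error.
myfun <- function(x) {
  if (!is.na(x) && x == 1) {
    return(1)
  } else {
    return(0)
  }
}

# Apply cell by cell to the columns of interest only, then sum each row
rowSums(apply(df1[, 36:135], c(1, 2), myfun))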

I think it should stop throwing the 'x' must be numeric error.
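
A shorter, vectorised way to count only the 1s, sketched here on made-up data (the data frame `m` and the name `ones` are just for illustration), is to compare the whole block with 1 and sum the resulting TRUE/FALSE values. This assumes the columns are already numeric, for example after reading with na.strings = ".".

```r
# Toy numeric data with missing values
m <- data.frame(a = c(1, 0, NA),
                b = c(1, NA, NA),
                c = c(0, 1, 1))

# m == 1 is a logical matrix; TRUE counts as 1, FALSE as 0,
# and na.rm = TRUE skips the NA comparisons
ones <- rowSums(m == 1, na.rm = TRUE)
print(ones)  # 2 1 1
```

On the real data the same idea would be rowSums(df1[, 36:135] == 1, na.rm = TRUE), which avoids calling a function once per cell.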

Gompu