You may recognise this from Kaggle. I have 40 columns named Soil_Type1 through Soil_Type40. Each has the value 0 if that soil type is absent or 1 if it is present. Only one soil type can be present per row.

I want to create a new column that takes the value S1 if Soil_Type1 = 1, S2 if Soil_Type2 = 1, and so on. I can do it by brute force, i.e. one column at a time, as below. Is there any way of looping this?

train_raw[,16:53 := lapply(.SD, as.character), .SDcols =16:53 ]

train_raw[,Soil_Type := "" ]
train_raw[Soil_Type1 == 1, Soil_Type := "S1"]
train_raw[Soil_Type2 == 1, Soil_Type := "S2"]
train_raw[Soil_Type3 == 1, Soil_Type := "S3"]
train_raw[Soil_Type4 == 1, Soil_Type := "S4"]

EDIT:

Sorry, is this what you mean by a reproducible example?

train_raw <- data.table(Soil_Type = "", 
                        Soil_Type1 = c(0,0,0,1), 
                        Soil_Type2 = c(0,0,1,0), 
                        Soil_Type3 = c(1,1,0,0))

train_raw[,Soil_Type := "" ]
Conor
  • It's hard to tell without a reproducible example, but something like this should work: `train_raw[, Soil_Type := "" ] ; indx <- which(names(train_raw) == "Soil_Type") ; cols <- paste0("Soil_Type", 1:4) ; for(j in 1:length(cols)) set(train_raw, i = train_raw[[cols[j]]] == 1L, j = indx, value = paste0("S", j))`, and you can adjust `cols` however you like. – David Arenburg Oct 15 '15 at 11:28
  • Thanks for your reply, and sorry for not having a reproducible example. I tried your code, but in the for loop I get an error: Error in set(train_raw, i = train_raw[[cols[j]]] == 1L, j = indx, value = paste0("S", : i is type 'logical'. Must be integer, or numeric is coerced with warning. If i is a logical subset, simply wrap with which(), and take the which() outside the loop if possible for efficiency. – Conor Oct 15 '15 at 12:41
  • Simply change `train_raw[[cols[j]]] == 1L` to `which(train_raw[[cols[j]]] == 1L)` within `set`. You also only have 3 columns here instead of 4, so change `cols` to `cols <- paste0("Soil_Type", 1:3)` – David Arenburg Oct 15 '15 at 12:46
  • That worked, thank you very much. – Conor Oct 15 '15 at 13:08

1 Answer

Thanks to David Arenburg for answering.

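# Start with an empty label column, then fill it one indicator column at a time.
train_raw[, Soil_Type := ""]
indx <- which(names(train_raw) == "Soil_Type")
cols <- paste0("Soil_Type", 1:4)  # 1:3 for the example above, 1:40 for the full data
for (j in seq_along(cols)) {
  # set() needs integer row indices, hence which() around the logical test
  set(train_raw, i = which(train_raw[[cols[j]]] == 1L),
      j = indx, value = paste0("S", j))
}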
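
As a possible alternative to the loop above, here is a minimal sketch that does the same thing without looping, using base R's `max.col()` to find the position of the 1 in each row. This is not the method from the comments; it assumes every row has exactly one indicator set to 1, as the question states (a row of all zeros would be labelled with the first column). The name `soil_labels` is made up for illustration, and the data is the three-column example from the question.

```r
library(data.table)

# Reproducible example from the question (three indicator columns only)
train_raw <- data.table(Soil_Type1 = c(0, 0, 0, 1),
                        Soil_Type2 = c(0, 0, 1, 0),
                        Soil_Type3 = c(1, 1, 0, 0))

cols <- paste0("Soil_Type", 1:3)  # use 1:40 for the full data set

# "Soil_Type7" -> "S7"; soil_labels is a hypothetical helper name
soil_labels <- sub("Soil_Type", "S", cols, fixed = TRUE)

# max.col() returns, per row, the index of the column holding the maximum,
# i.e. the column that contains the 1 (assuming exactly one 1 per row)
train_raw[, Soil_Type := soil_labels[max.col(.SD, ties.method = "first")],
          .SDcols = cols]

print(train_raw)
```

Deriving the labels from the column names, rather than from the loop counter, keeps the result correct if `cols` is ever a subset or is reordered.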
Conor