
I have many repetitive lines of code that I'm trying to simplify, but my attempts so far have resulted in errors. Below is a small section of the code:

SS_data$Cope1 <- as.numeric(SS_data$Cope1)
SS_data$Cope2 <- as.numeric(SS_data$Cope2)
SS_data$Cope3 <- as.numeric(SS_data$Cope3)
SS_data$Cope4 <- as.numeric(SS_data$Cope4)
SS_data$Cope5 <- as.numeric(SS_data$Cope5)
SS_data$Cope6 <- as.numeric(SS_data$Cope6)
SS_data$Cope7 <- as.numeric(SS_data$Cope7)
SS_data$Cope8 <- as.numeric(SS_data$Cope8)
SS_data$Cope9 <- as.numeric(SS_data$Cope9)
SS_data$Cope10 <- as.numeric(SS_data$Cope10)
SS_data$Cope11 <- as.numeric(SS_data$Cope11)
SS_data$Cope12 <- as.numeric(SS_data$Cope12)
SS_data$Cope13 <- as.numeric(SS_data$Cope13)
SS_data$Cope14 <- as.numeric(SS_data$Cope14)
SS_data$Cope15 <- as.numeric(SS_data$Cope15)
SS_data$Cope16 <- as.numeric(SS_data$Cope16)
SS_data$Cope17 <- as.numeric(SS_data$Cope17)
SS_data$Cope18 <- as.numeric(SS_data$Cope18)
SS_data$Cope19 <- as.numeric(SS_data$Cope19)
SS_data$Cope20 <- as.numeric(SS_data$Cope20)

I'm also trying to simplify the code below. I end up repeating the same recoding for each variable, and I'm wondering if there is a way to simplify this as well.

WHOQOL16[WHOQOL16 == "Very dissatisfied"] <- 1
WHOQOL16[WHOQOL16 == "Dissatisfied"] <- 2
WHOQOL16[WHOQOL16 == "Neither satisfied nor dissatisfied"] <- 3
WHOQOL16[WHOQOL16 == "Satisfied"] <- 4
WHOQOL16[WHOQOL16 == "Very satisfied"] <- 5
              
WHOQOL17[WHOQOL17 == "Very dissatisfied"] <- 1
WHOQOL17[WHOQOL17 == "Dissatisfied"] <- 2
WHOQOL17[WHOQOL17 == "Neither satisfied nor dissatisfied"] <- 3
WHOQOL17[WHOQOL17 == "Satisfied"] <- 4
WHOQOL17[WHOQOL17 == "Very satisfied"] <- 5
              
WHOQOL18[WHOQOL18 == "Very dissatisfied"] <- 1
WHOQOL18[WHOQOL18 == "Dissatisfied"] <- 2
WHOQOL18[WHOQOL18 == "Neither satisfied nor dissatisfied"] <- 3
WHOQOL18[WHOQOL18 == "Satisfied"] <- 4
WHOQOL18[WHOQOL18 == "Very satisfied"] <- 5
              
WHOQOL19[WHOQOL19 == "Very dissatisfied"] <- 1
WHOQOL19[WHOQOL19 == "Dissatisfied"] <- 2
WHOQOL19[WHOQOL19 == "Neither satisfied nor dissatisfied"] <- 3
WHOQOL19[WHOQOL19 == "Satisfied"] <- 4
WHOQOL19[WHOQOL19 == "Very satisfied"] <- 5
Jarhed
  • Welcome to Stack Overflow. Can you please edit your post to include your data? The following code will generate a snippet with 10 random records that you can paste into your original post: dput(dplyr::sample_n(YourDatasetsNameGoesHere, 10)). To use it, you may need to install dplyr first with install.packages("dplyr"). – itsMeInMiami Sep 12 '20 at 13:41
  • In `dplyr`, you can use `SS_data %>% mutate(across(starts_with('Cope'), as.numeric))` to convert all columns whose names start with `'Cope'` to numeric. For the second part, are `WHOQOL16`, `WHOQOL17`, etc. separate vectors in your global environment? – Ronak Shah Sep 12 '20 at 13:44
  • Thank you. As for WHOQOL16, WHOQOL17, not necessarily. Those are separate columns. – Jarhed Sep 12 '20 at 14:09
  • I tried SS_data %>% mutate(across(starts_with('Cope'), as.numeric)) and, unfortunately, the columns still remain character. I have dplyr installed and loaded. The code runs without any error, so I'm not quite sure how to troubleshoot it, but when I check the structure afterwards, the columns are still character. – Jarhed Sep 12 '20 at 14:17

2 Answers


Questions posted to the r tag on SO should include reproducible data, but I have done it for you this time in the Note at the end.

The following uses only base R.

First make a copy of DF in DF2 in case you want to run the code again from scratch since the code will overwrite DF2.

Next, convert columns 1 and 2 to numeric and convert X, Y and Z in columns 3 and 4 to 1, 2 and 3. Any non-numeric entry in columns 1 or 2, and any entry other than X, Y or Z in columns 3 or 4, becomes NA. (Alternatively, for the second line of code, the dplyr package has a recode function, and the car package has a different recode function with the same purpose.)

The column numbers are obvious in this example but if they are not in your data use expressions like grep("Cope", names(DF)) to get them.

DF2 <- DF
DF2[1:2] <- lapply(DF2[1:2], as.numeric)
DF2[3:4] <- lapply(DF2[3:4], match, c("X", "Y", "Z"))

giving the following where the warning is just to let you know that it encountered a value that could not be converted to numeric so it converted it to NA.

> DF2
Warning message:
In lapply(DF2[1:2], as.numeric) : NAs introduced by coercion
   A  B  C  D
1  1 11  1  1
2 NA 12  2 NA
3  3 13 NA  3
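
If it is unclear which entries caused the coercion warning, one option is to compare missingness before and after conversion. The following is only a sketch using the first two columns of the example data from the Note; the helper object name `bad` is made up for illustration.

```r
# Same first two columns as DF in the Note, rebuilt here so the snippet runs on its own.
DF <- data.frame(A = c("1", "x", "3"), B = c("11", "12", "13"),
                 stringsAsFactors = FALSE)

# For each column, keep the entries that were not NA originally
# but become NA when converted to numeric.
bad <- lapply(DF, function(x) {
  num <- suppressWarnings(as.numeric(x))
  x[is.na(num) & !is.na(x)]
})

bad$A  # "x"
bad$B  # character(0)
```

Any value listed in `bad` is one that `as.numeric` could not parse, so it is worth inspecting before accepting the NA.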

Note

DF <- data.frame(A = c("1", "x", "3"), B = c("11", "12", "13"),
  C = c("X", "Y", "a"), D = c("X", NA, "Z"))
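
Applied to column names like those in the question, the same two steps might look as follows. This is a sketch under assumptions: the `SS_data` below is a made-up stand-in (the real data was not posted), and `lev` is a hypothetical name for the vector of response labels.

```r
# Hypothetical stand-in for the asker's data.
SS_data <- data.frame(
  Cope1    = c("1", "2", "3"),
  Cope2    = c("4", "5", "6"),
  WHOQOL16 = c("Very dissatisfied", "Satisfied", "Very satisfied"),
  WHOQOL17 = c("Dissatisfied", "Neither satisfied nor dissatisfied", NA),
  stringsAsFactors = FALSE
)

# Response labels in scale order: a label's position is its numeric code.
lev <- c("Very dissatisfied", "Dissatisfied",
         "Neither satisfied nor dissatisfied", "Satisfied", "Very satisfied")

# Locate the columns by name pattern instead of by position.
cope_cols   <- grep("^Cope", names(SS_data))
whoqol_cols <- grep("^WHOQOL", names(SS_data))

SS_data[cope_cols]   <- lapply(SS_data[cope_cols], as.numeric)
SS_data[whoqol_cols] <- lapply(SS_data[whoqol_cols], match, lev)

str(SS_data)
```

Note that `match` returns integer codes, and that if the columns are factors rather than character, `as.numeric` returns the internal level codes, so converting with `as.character` first would be needed in that case.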
G. Grothendieck

In dplyr, you can use the across function to apply the same function to multiple columns.

We change columns that start with "Cope" to numeric and recode the columns which start with "WHOQOL".

library(dplyr)

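SS_data_new <- SS_data %>%
  mutate(
    # convert every column whose name starts with "Cope" to numeric
    across(starts_with('Cope'), as.numeric),
    # map the satisfaction labels in the "WHOQOL" columns to 1-5
    across(starts_with('WHOQOL'),
           ~recode(., "Very dissatisfied" = 1,
                      "Dissatisfied" = 2,
                      "Neither satisfied nor dissatisfied" = 3,
                      "Satisfied" = 4,
                      "Very satisfied" = 5)))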
SS_data_new
#  Cope1 Cope2 WHOQOL
#1     1     4      1
#2     2     5      1
#3     3     6      4
str(SS_data_new)
#'data.frame':   3 obs. of  3 variables:
# $ Cope1 : num  1 2 3
# $ Cope2 : num  4 5 6
# $ WHOQOL: num  1 1 4

data

SS_data <- data.frame(Cope1 = c('1', '2', '3'), Cope2 = c('4', '5', '6'), 
           WHOQOL = c("Very dissatisfied", "Very dissatisfied", "Satisfied"))
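
One thing worth checking when the columns appear unchanged is whether the result of the pipeline was stored: `mutate` returns a new data frame and does not modify its input in place. A minimal sketch of the difference, reusing the example data above and assuming dplyr 1.0.0 or later is installed:

```r
library(dplyr)

# across() is only available from dplyr 1.0.0 onwards.
stopifnot(packageVersion("dplyr") >= "1.0.0")

SS_data <- data.frame(Cope1 = c("1", "2", "3"), Cope2 = c("4", "5", "6"),
                      WHOQOL = c("Very dissatisfied", "Very dissatisfied", "Satisfied"),
                      stringsAsFactors = FALSE)

# Without assignment the converted data frame is only printed;
# SS_data itself is left untouched.
SS_data %>% mutate(across(starts_with("Cope"), as.numeric))
class(SS_data$Cope1)  # "character"

# Assigning the result (here back to the same name) keeps the change.
SS_data <- SS_data %>% mutate(across(starts_with("Cope"), as.numeric))
class(SS_data$Cope1)  # "numeric"
```

Whether to overwrite `SS_data` or store the result under a new name such as `SS_data_new` is a matter of preference; what matters is running `str()` on the object that received the assignment.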
Ronak Shah
  • I used the coding as you have shown. The code ran, but when I checked the structure, the attribute is still character and not numeric. And the WHOQOL columns were not recoded and remained the same. Any thoughts on why this is? – Jarhed Sep 12 '20 at 14:25
  • You need to assign the result back to a new variable or to the same one. I have assigned it to `SS_data_new`, so check the structure of `SS_data_new` with `str(SS_data_new)`. – Ronak Shah Sep 12 '20 at 14:29
  • Thank you. I did do that initially and it still shows as character. (The code seems very straightforward and not quite sure why it's not working as intended.) – Jarhed Sep 12 '20 at 14:40
  • Do you have the same column names as shown here in your real data? – Ronak Shah Sep 12 '20 at 15:19
  • Yes, I do. The column names are the same. – Jarhed Sep 12 '20 at 15:45
  • Does it return you any error/warning message? Note that `across` is a new function introduced in `dplyr` 1.0.0 – Ronak Shah Sep 12 '20 at 15:46
  • No error at all. The code still runs. Just that when I check the structure after running the code, it still shows as character. – Jarhed Sep 12 '20 at 15:55
  • I just created a small example and it seems to work for me; see the updated answer. Maybe you need to share your data in a similar way so that we can reproduce the problem on our end. – Ronak Shah Sep 12 '20 at 16:00
  • Thank you very much. I'll check it out. – Jarhed Sep 12 '20 at 16:13
  • @Jarhed Were you able to figure out the issue? Did it work for you? – Ronak Shah Sep 13 '20 at 03:11