
I'm making the leap from SPSS to R and I was wondering how you deal with system missings...

For example, if I wanted to rewrite the following SPSS code into R:

RECODE income (1 THRU 6 = copy) (else = SYSMIS) INTO income2

I am able to write the following recode:

income_2018$income2 <- dplyr::recode(income_2018$income, '1' = 1L, '2' = 2L, '3' = 3L,
                                     '4' = 4L, '5' = 5L, '6' = 6L)

How do I deal with system missings (the 'else' statement in the SPSS code)?

Thanks!

Johnny

2 Answers


You can add the .default argument which will recode all values not explicitly named:

dplyr::recode(income_2018$income, '1' = 1L, '2' = 2L, '3' = 3L,
              '4' = 4L, '5' = 5L, '6' = 6L, .default = NA_integer_)
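
To see the effect of .default on its own, here is a minimal sketch on a small made-up vector. The values in `income` are hypothetical and are not taken from the question's data; it assumes dplyr is installed.

```r
library(dplyr)

# Hypothetical sample values: 7 and 99 fall outside the 1-6 range,
# and the last element is already missing.
income <- c(1, 3, 6, 7, 99, NA)

income2 <- recode(
  income,
  `1` = 1L, `2` = 2L, `3` = 3L, `4` = 4L, `5` = 5L, `6` = 6L,
  .default = NA_integer_  # every value not listed above becomes NA
)

print(income2)
```

The values 7 and 99 should come back as NA, which is the role SYSMIS plays in the SPSS ELSE clause, while the existing NA stays NA.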
Ritchie Sacramento

If you convert the values to integer (or numeric), anything that is not a number becomes NA automatically.

income_2018$income <- as.integer(income_2018$income)
#Or to change it to numeric
#income_2018$income <- as.numeric(income_2018$income)

R issues a warning when it converts non-numeric values, and turns them into NA.

x <- c('1', '2', '4', '6', 'a')
as.integer(x)
#[1]  1  2  4  6 NA
#Warning message:
#NAs introduced by coercion


As @H 1 commented, this converts every value to its numeric equivalent, including numbers outside the range of interest. If only the values between 1 and 6 should be kept, set everything else to NA:

income_2018$income[income_2018$income > 6 | income_2018$income < 1] <- NA
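
The line above overwrites income in place. If the original column should stay intact, as with INTO income2 in the SPSS code, one possible base R sketch writes the result to a new column with ifelse() instead. The data frame below is a made-up stand-in for income_2018, and `in_range` is a helper name introduced only for this illustration.

```r
# Hypothetical data frame standing in for income_2018.
income_2018 <- data.frame(income = c(0, 1, 2.5, 6, 7, NA))

# TRUE only for non-missing values between 1 and 6 (inclusive).
in_range <- !is.na(income_2018$income) &
  income_2018$income >= 1 &
  income_2018$income <= 6

# Copy in-range values into a new column; everything else becomes NA.
income_2018$income2 <- ifelse(in_range, income_2018$income, NA)

print(income_2018)
```

A range comparison is used here rather than matching against 1:6 so that a non-integer value such as 2.5 is also copied, which is how a 1 THRU 6 range is usually read.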
Ronak Shah