I have a dataframe df_3 in which I want to mutate multiple columns starting with Team_, replacing the 0s in those columns with NA. Code that I have used successfully before now gives me the following error:

Error in `mutate()`:
ℹ In argument: `across(starts_with("Team_"), ~na_if(., "0"))`.
Caused by error in `across()`:
! Can't compute column `Team_Num_1`.
Caused by error in `na_if()`:
! Can't convert `y` <character> to match type of `x` <double>.
Backtrace:
  1. df_3 %>% mutate(across(starts_with("Team_"), ~na_if(., "0")))
 10. dplyr::na_if(Team_Num_1, "0")

Any idea why that is or how I can solve it? I did not change anything in the original df, and the code used to run without errors, so I am not sure what has changed.

Reproducible data:

structure(list(Team_1 = c("0", "werg", "sdf"), Team_Desc_1 = c("wer", 
"wtrb", "wergt"), Team_URL_1 = c("ewrg", "werg", "asd"), Team_Ver_1 = c("25", 
"2523", "342"), Team_Num_1 = c(0, 23, 12), Team_Value_1 = c("aed", 
"jfsa", "vsf"), Name_1 = c("etwbv", "werg", "sdfg"), Txt_1 = c("abc", 
"bfh", "fse"), Head_1 = c("abc1", "bfh", "fse"), Team_2 = c("werh", 
"wtt", "qwe"), Team_Desc_2 = c("sdfg", "wer", "sdfgv"), Team_URL_2 = c("qwe", 
"gvre", "vrw"), Team_Ver_2 = c("4123", "5133", "4126"), Team_Num_2 = c(3, 
0, 123), Team_Value_2 = c("aewed", "jfsbwa", "vsbf"), Name_2 = c("qwreg", 
"gvr", "wref"), Txt_2 = c("rege", "wer", "vwr"), Head_2 = c("rege1", 
"wer", "vwr")), row.names = c(NA, -3L), class = c("tbl_df", "tbl", 
"data.frame"))
zephryl
Soph2010
  • Your Team data isn't numeric. It won't match a `0` but it will match `"0"` so use `na_if(., "0")` instead – MrFlick Feb 15 '23 at 15:57
  • Thanks @MrFlick. Makes total sense. However, I have one column that does have numeric data, so if I use "0", that column doesn't work. I edited the dput() to replicate the error – Soph2010 Feb 15 '23 at 16:00
  • Then use additional conditions: `across(starts_with("Team_") & where(is.numeric), ...)` for the numeric one and vice versa for the non-numeric. Or type-correct your data upstream (probably better). – Gregor Thomas Feb 15 '23 at 16:03

2 Answers

According to the changelog for dplyr 1.1.0, na_if() now uses the vctrs package, which is stricter about type stability:

na_if() (#6329) now casts y to the type of x before comparison, which makes it clearer that this function is type and size stable on x.
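
The stricter behaviour can be checked on a plain vector. This is a minimal sketch, assuming dplyr 1.1.0 or later is installed; the names x, ok and msg are only illustrative:

```r
library(dplyr)

x <- c(0, 23, 12)  # a double vector, like Team_Num_1 in the question

# y has the same type as x, so the 0 is replaced with NA
ok <- na_if(x, 0)

# y is character while x is double: under dplyr >= 1.1.0 the cast of y
# to the type of x fails, so we capture the error message instead
msg <- tryCatch(
  na_if(x, "0"),
  error = function(e) conditionMessage(e)
)

print(ok)
print(msg)
```

The same mismatch occurs in the other direction, i.e. comparing a character column against the number 0.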

So for character columns, use na_if(x, "0"):

library(dplyr)

dat %>%
  mutate(across(starts_with("Team_"), ~ na_if(.x, "0")))
# # A tibble: 3 × 18
#   Team_1 Team_Desc_1 Team_UR…¹ Team_…² Team_…³ Team_…⁴ Name_1 Txt_1 Head_1 Team_2
#   <chr>  <chr>       <chr>     <chr>   <chr>   <chr>   <chr>  <chr> <chr>  <chr> 
# 1 NA     wer         ewrg      25      aed     aed     etwbv  abc   abc1   werh  
# 2 werg   wtrb        werg      2523    jfsa    jfsa    werg   bfh   bfh    wtt   
# 3 sdf    wergt       asd       342     vsf     vsf     sdfg   fse   fse    qwe   
# # … with 8 more variables: Team_Desc_2 <chr>, Team_URL_2 <chr>,
# #   Team_Ver_2 <chr>, Team_Num_2 <chr>, Team_Value_2 <chr>, Name_2 <chr>,
# #   Txt_2 <chr>, Head_2 <chr>, and abbreviated variable names ¹​Team_URL_1,
# #   ²​Team_Ver_1, ³​Team_Num_1, ⁴​Team_Value_1

If you have a mix of character and numeric columns, you could do:

dat2 <- tibble(
  Team_1 = c("0", "werg", "sdf"), 
  Team_Desc_1 = c(0, 3, 4), 
  Name_1 = c("etwbv", "werg", "sdfg")
)

dat2 %>% 
  mutate(
    across(starts_with("Team_") & where(is.character), ~ na_if(.x, "0")),
    across(starts_with("Team_") & where(is.numeric), ~ na_if(.x, 0)),
  )
# # A tibble: 3 × 3
#   Team_1 Team_Desc_1 Name_1
#   <chr>        <dbl> <chr> 
# 1 NA              NA etwbv 
# 2 werg             3 werg  
# 3 sdf              4 sdfg 
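
If you would rather keep a single across() call, one option is to pick the comparison value inside a small helper. This is only a sketch: zero_to_na is a hypothetical function written for this example (it is not part of dplyr), and dat3 is made-up data modelled on the question:

```r
library(dplyr)

# Hypothetical helper, not a dplyr function: chooses a "zero" that matches
# the column's own type so that na_if() never has to cast across types
zero_to_na <- function(x) {
  if (is.character(x)) {
    na_if(x, "0")
  } else if (is.numeric(x)) {
    na_if(x, 0)
  } else {
    x  # leave other column types untouched
  }
}

dat3 <- tibble(
  Team_1 = c("0", "werg", "sdf"),
  Team_Num_1 = c(0, 23, 12),
  Name_1 = c("0", "werg", "sdfg")
)

out <- dat3 %>%
  mutate(across(starts_with("Team_"), zero_to_na))

print(out)
```

Name_1 is not selected by starts_with("Team_"), so its "0" is left as it is.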
zephryl

I had the same issue and for the sake of simplicity opted for the following, which should work regardless of data classes of different columns.

all_data[all_data %in% c(-Inf, Inf)] <- NA
nick