
The key issue is that I need it to be dynamic, as I may not know which columns are empty (especially if new ones are added).

Here is my current attempt, based on code from here:

singleAction <- tibble::tibble(
  datePublished = as.Date("2022/01/01"),
  title = "",
  summary = "   ",
  createdby = "bob",
  customer = "james",
  contactAnalyst = "jack",
  type = "fighting",
  status = "draft",
  draftLink = "            ",
  finalLink = " ",
  invalidated = FALSE,
  number = 4
)

# Idea 1, in two steps
singleAction  <- singleAction %>%
      mutate_if(is.character, map(stringr::str_remove_all(., " "))) %>%
      mutate_if(is.character, mutate(across(.fns = ~replace(., . ==  "" , NA))))

# Idea 2
singleAction <- mutate_if(is.character, replace(., grepl("^\\s*$", .) == TRUE, NA))

print(singleAction)

This code works for a single value but not for a whole table or column:

replaceEmpty <- function(x){
  if(nchar(x) == 0 || stringr::str_remove_all(x, " ") == ""){
    return(NA)
  }
  return(x)
}
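
The helper above is scalar-only because `if` expects a single TRUE/FALSE condition, so it cannot take a whole column as written. A minimal vectorised sketch in base R, assuming a regex test is an acceptable stand-in for str_remove_all; the name replace_empty_vec and the sample data frame are hypothetical and not part of the original post:

```r
# Vectorised sketch of replaceEmpty(); the name replace_empty_vec is hypothetical.
replace_empty_vec <- function(x) {
  # Only character vectors can hold blank strings; return anything else untouched.
  if (!is.character(x)) {
    return(x)
  }
  # "^\\s*$" matches empty strings and strings made up only of whitespace.
  # grepl() returns FALSE for NA input, so existing NAs are left alone.
  x[grepl("^\\s*$", x)] <- NA_character_
  x
}

# Made-up data for illustration only.
df <- data.frame(
  title = c("", "report"),
  summary = c("   ", "text"),
  number = c(4, 5),
  stringsAsFactors = FALSE
)

# Apply the helper to every column; df[] keeps the data frame structure.
df[] <- lapply(df, replace_empty_vec)
print(df)
```

Because the type check lives inside the helper, newly added columns are handled without listing them by name.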

In this case I am converting to NA so that the database (T-SQL / MS SQL Server) stores NULL in these fields.

Aaron C

1 Answer


str_squish from the stringr package removes whitespace at the start and end, and replaces all internal whitespace with a single space. This makes it easy to test for an empty string and convert to NA:

library(dplyr)
library(stringr)

singleAction |>
  mutate(across(where(is.character), ~ ifelse(str_squish(.) == "", NA, .)))

Output

  datePublished title summary createdby customer contactAnalyst type     status
  <date>        <lgl> <lgl>   <chr>     <chr>    <chr>          <chr>    <chr> 
1 2022-01-01    NA    NA      bob       james    jack           fighting draft 
# ℹ 4 more variables: draftLink <lgl>, finalLink <lgl>, invalidated <lgl>,
LMc
  • Thanks, that was quick. This also works: singleAction <- singleAction %>% mutate_if(is.character, ~ ifelse(stringr::str_squish(.) == "", NA, .)) – Aaron C Aug 01 '23 at 21:33
  • You're welcome! Yes, that will work, but just an FYI: as of dplyr 1.0.0 the scoped verbs (e.g., _if, _at, etc.) were superseded in favor of `across`. – LMc Aug 01 '23 at 21:35
  • Ah, so it is. Cool. For future reference, another alternative: singleAction <- singleAction %>% mutate(across(where(is.character), ~ replace(., stringr::str_squish(.) == "", NA))). The ifelse is more flexible, which may or may not be good. – Aaron C Aug 01 '23 at 21:42
  • @AaronC Yes! That will work too. – LMc Aug 01 '23 at 21:44
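
One practical difference between the `ifelse` and `replace` variants discussed in the comments is the resulting column type, which is why title and summary show as <lgl> in the output above. A small base-R sketch on a made-up vector (the variable names are illustrative only); whether the type matters when writing to the database depends on the driver, which is not covered here:

```r
# Made-up vector in which every element is blank, like the title column above.
x <- c("", "   ")
blank <- grepl("^\\s*$", x)

# ifelse() builds its result from the logical test, so when only the
# plain NA branch is used the result stays logical.
via_ifelse <- ifelse(blank, NA, x)

# replace() assigns into a copy of x, so the character type is kept.
via_replace <- replace(x, blank, NA)

# A typed NA keeps ifelse() character as well.
via_typed <- ifelse(blank, NA_character_, x)

print(class(via_ifelse))   # "logical"
print(class(via_replace))  # "character"
print(class(via_typed))    # "character"
```

If a character column is preferred even when every value is blank, `replace` or a typed `NA_character_` looks like the safer choice.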