
I'm in the process of creating a generic function in my package. The goal is to find columns that are percent columns, and then to use parse_number on them if they are character columns. I haven't been able to figure out a solution using mutate_at and ifelse. I've pasted a reprex below.

library(tidyverse)


df <- tibble::tribble(
  ~name, ~pass_percent, ~attendance_percent, ~grade,
  "Jon",         "90%",                0.85,    "B",
  "Jim",        "100%",                   1,    "A"
  )

percent_names <- df %>% select(ends_with("percent")) %>% names()


# Errors because attendance_percent is already numeric

if (percent_names %>% length() > 0) {
    df <-
      df %>%
      dplyr::mutate_at(percent_names, readr::parse_number)
  }
#> Error in parse_vector(x, col_number(), na = na, locale = locale, trim_ws = trim_ws): is.character(x) is not TRUE
Jazzmatazz

2 Answers


Your attendance_percent variable is numeric, not character, and parse_number only accepts character vectors. So a solution would be:

# Parse only character input; return anything else unchanged
edited_parse_number <- function(x, ...) {
  if (is.character(x)) {
    readr::parse_number(x, ...)
  } else {
    x
  }
}


df %>%
  dplyr::mutate_at(vars(percent_names), edited_parse_number)

#  name  pass_percent attendance_percent grade
#  <chr>        <dbl>              <dbl> <chr>
#1 Jon             90               0.85 B    
#2 Jim            100               1    A   

Or, if you don't want the extra function, select only the character columns up front:

percent_names <- df %>% 
  select(ends_with("percent")) %>% 
  select_if(is.character) %>% 
  names()
percent_names
# [1] "pass_percent"


df %>%
  dplyr::mutate_at(vars(percent_names), parse_number)
#   name  pass_percent attendance_percent grade
#   <chr>        <dbl>              <dbl> <chr>
# 1 Jon             90               0.85 B    
# 2 Jim            100               1    A    
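
On dplyr 1.0.0 or later, where mutate_at is superseded, the same "character percent columns only" selection can be sketched with across() and where(). This is not part of the original answer; it assumes dplyr >= 1.0.0 and readr are installed, and df_parsed is only an illustrative name. Because the selection happens inside across(), the separate percent_names step and the length check should not be needed.

```r
library(dplyr)
library(readr)

df <- tibble::tribble(
  ~name, ~pass_percent, ~attendance_percent, ~grade,
  "Jon",         "90%",                0.85,    "B",
  "Jim",        "100%",                   1,    "A"
)

# Select columns whose name ends in "percent" AND that are character,
# then parse only those; numeric percent columns are left untouched.
df_parsed <- df %>%
  mutate(across(ends_with("percent") & where(is.character), parse_number))

df_parsed
```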
user63230

Alternatively, without having to create a function, you can just add an ifelse statement into mutate_at such as:

if (percent_names %>% length() > 0) {
  df <-
    df %>% rowwise() %>%
    dplyr::mutate_at(vars(percent_names), ~ifelse(is.character(.), 
                                                  parse_number(.),
                                                  .))
}

Source: local data frame [2 x 4]
Groups: <by row>

# A tibble: 2 x 4
  name  pass_percent attendance_percent grade
  <chr>        <dbl>              <dbl> <chr>
1 Jon             90               0.85 B    
2 Jim            100               1    A    
dc37
  • this is neater than my answer but your final `.` is changing your numbers in your output? – user63230 Mar 11 '20 at 15:51
  • Ok, I corrected it by adding `rowwise` first. But not sure about the logic behind that. – dc37 Mar 11 '20 at 15:55
  • Thanks, but what does rowwise do? – Jazzmatazz Mar 11 '20 at 16:03
  • You're welcome ;). `rowwise` evaluates the data frame row by row (https://dplyr.tidyverse.org/reference/rowwise.html). I'm not sure why it is required here, but it seems that the `ifelse` statement evaluates only the first row of each column in `percent_names` and then applies the TRUE or FALSE result to that first row. – dc37 Mar 11 '20 at 16:06
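
A short sketch of what the comments above are circling around, offered as an illustration rather than as anything the answerers wrote. ifelse() returns a result shaped like its test, and is.character() on a whole column is a single TRUE or FALSE, so without rowwise only the first element comes back and is recycled down the column. A plain if/else decides once per column and returns the full vector, so rowwise is not needed. The names from_ifelse, from_if and df_fixed are hypothetical.

```r
library(dplyr)
library(readr)

x <- c("90%", "100%")

# The test has length 1, so ifelse() returns a length-1 result
from_ifelse <- ifelse(is.character(x), parse_number(x), x)
length(from_ifelse)  # 1

# if/else checks the type once and returns the whole parsed vector
from_if <- if (is.character(x)) parse_number(x) else x
length(from_if)  # 2

df <- tibble::tribble(
  ~name, ~pass_percent, ~attendance_percent, ~grade,
  "Jon",         "90%",                0.85,    "B",
  "Jim",        "100%",                   1,    "A"
)

# Same per-column decision inside mutate_at, without rowwise()
df_fixed <- df %>%
  mutate_at(vars(ends_with("percent")),
            ~ if (is.character(.x)) parse_number(.x) else .x)

df_fixed
```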