0

I have a database saved in excel, and when I bring it into R there are many columns that should be numeric, but they get listed as characters. I know that in read_excel I can specify each column format using the col_types = "numeric", but I have > 500 columns, so this gets a bit tedious.

Any suggestions on how to do this either when importing with read_excel, or after with dplyr or something similar?

I can do this 1 by 1 using a function that I wrote but it still requires writing out each column name

convert_column <- function(data, col_name) {
  new_col_name <- paste0(col_name)
  data %>% mutate(!!new_col_name := as.numeric(!!sym(col_name)))
}

  convert_column("gFat_OVX") %>%
  convert_column("gLean_OVX")%>%
  convert_column("pFat_OVX") %>%
  convert_column("pLean_OVX")

I would ideally like to say "if a column contains the text "Fat" or "Lean" in the header, then convert to numeric", but I'm open to suggestions.

select(df, contains("Fat" | "Lean"))

I'm not sure how to make an example that allows people to test this out, given that we're starting with an excel sheet here.

Erin Giles
  • 73
  • 6
  • Try `df %>% mutate_if(is.character, as.numeric)`, this changes all columns to numeric if it encounters any character col – AlSub Mar 21 '21 at 22:22
  • This is super-close -- it will change everything to numeric (or if I use "select" first, then I lose the other character columns I want) Can I somehow combine the mutate_if with something else? ie, ```df %>% mutate_if(select(contains ("Fat")) & is.character, as.numeric) ```--although I tried this and it doesn't work – Erin Giles Mar 21 '21 at 22:32

1 Answers1

0

dplyr::mutate and across may be a solution after reading in the data.

Something like this, where df1 is your data frame from read_excel:

library(dplyr)

df1 <- df1 %>% 
  mutate(across(contains(c("Fat", "Lean")), ~as.numeric(.x)))
neilfws
  • 32,751
  • 5
  • 50
  • 63