How to avoid scaling dummy variables in a data frame in R?

Question

I want to standardise all my variables before applying machine learning methods. However, to my understanding, dummy variables should never be standardised. After running the following code, R standardised all my variables, even the binary ones. How can I avoid this?

#standardize all non-categorical variables to have mean zero and a standard deviation of one

df_standardized <- df %>% mutate(across(where(is.numeric), scale))

I checked my data types and they are "int", not numeric. Thank you in advance for your help.

An `int` type is an integer and is considered a numeric type. Did you just want to scale the numeric, non-integer values? Try `is.double` rather than `is.numeric` — MrFlick, Nov 08 '22 at 18:41
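To illustrate the comment above, here is a minimal base R sketch of how the type predicates treat integer and double vectors. The variable names `int_dummy` and `dbl_value` are made up for this example.

```r
# Illustrative vectors (names are hypothetical, not from the question).
int_dummy <- c(0L, 1L, 1L)      # integer-coded dummy variable
dbl_value <- c(1.5, 2.5, 3.5)   # continuous variable stored as double

is.numeric(int_dummy)   # TRUE: integers count as numeric
is.double(int_dummy)    # FALSE
is.numeric(dbl_value)   # TRUE
is.double(dbl_value)    # TRUE

class(int_dummy)        # "integer"
class(dbl_value)        # "numeric"
```

This is why `where(is.numeric)` selects the "int" columns too, while `where(is.double)` would skip them.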

akrun · Accepted Answer · 2022-11-08T18:42:32.097


`scale` returns a matrix, so we convert it to a vector with either `as.numeric` or `as.vector`. In addition, use `inherits` to modify only the columns whose class is "numeric" (doubles), which leaves integer columns untouched.

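library(dplyr)

# Scale only columns of class "numeric" (doubles); integer columns are skipped.
# as.numeric() drops the matrix structure that scale() returns.
out <- df %>%
  mutate(across(where(~ inherits(.x, "numeric")),
                ~ as.numeric(scale(.x))))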
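As a small base R check of the two points made in this answer, the sketch below shows that `scale` returns a one-column matrix with extra attributes, and that `inherits(x, "numeric")` is true for doubles but not for integers. The vector `x` is an arbitrary example.

```r
x <- c(2, 4, 6)                 # arbitrary example values: mean 4, sd 2

scaled <- scale(x)
is.matrix(scaled)               # TRUE: a 3 x 1 matrix
names(attributes(scaled))       # "dim", "scaled:center", "scaled:scale"

flat <- as.numeric(scaled)      # drops dim and the scaling attributes
is.matrix(flat)                 # FALSE
flat                            # -1 0 1

inherits(x, "numeric")          # TRUE: class of a double vector is "numeric"
inherits(1L, "numeric")         # FALSE: class of an integer vector is "integer"
```

Without the conversion, each scaled column in the data frame would be stored as a one-column matrix rather than a plain vector.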

data

data(iris)
df <- iris
df$intCol <- 1L
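The type-based filter only protects dummy variables that are stored as integers. If some dummies are stored as doubles (0 and 1), one possible alternative is to select columns by their values instead. The following is a sketch in base R under that assumption; `is_binary` is a hypothetical helper and the `dummy` column is added here purely for illustration.

```r
# Hypothetical helper: TRUE when a numeric column holds only 0/1 (NA allowed).
is_binary <- function(x) is.numeric(x) && all(x %in% c(0, 1, NA))

df <- iris
df$intCol <- 1L                                  # integer column, as in the answer's data
df$dummy  <- as.numeric(df$Species == "setosa")  # illustrative double-coded dummy

# Columns to scale: numeric (integer or double) but not binary.
to_scale <- vapply(df, function(x) is.numeric(x) && !is_binary(x), logical(1))

out <- df
out[to_scale] <- lapply(df[to_scale], function(x) as.numeric(scale(x)))

names(df)[to_scale]   # the four iris measurement columns
```

Here `intCol` and `dummy` are left unchanged, and the factor `Species` is skipped because it is not numeric. Note that this check would also skip a genuinely continuous column that happens to contain only 0 and 1.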

edited Nov 08 '22 at 18:42

answered Nov 08 '22 at 18:37

