
Let's take mtcars as an example and create a new variable:

mtcars$name <-  rownames(mtcars)
mtcars[,] <- lapply(mtcars, factor)
mtcars[,] <- lapply(mtcars, as.numeric)

Now the names are converted into numerics, which I definitely don't want:

> mtcars
                    mpg cyl disp hp drat wt qsec vs am gear carb name
Mazda RX4            16   2   13 11   16  9    6  1  2    2    4   18
Mazda RX4 Wag        16   2   13 11   16 12   10  1  2    2    4   19
Datsun 710           19   1    6  6   15  7   22  2  2    2    1    5
Hornet 4 Drive       17   2   16 11    5 16   24  2  1    1    1   13
Hornet Sportabout    13   3   23 15    6 18   10  1  1    1    2   14
Valiant              12   2   15  9    1 19   29  2  1    1    1   31
Duster 360            3   3   23 20    7 21    5  1  1    1    4    7
Merc 240D            20   1   12  2   11 15   27  2  1    2    2   21

How can I convert factors back into their original types (character, logical, numeric, ...)?

Andre Elrico
  • which you can find in `help("factor")` under "Warning". – Roland Oct 05 '16 at 12:59
  • R FAQ 7.10: https://cran.r-project.org/doc/FAQ/R-FAQ.html#How-do-I-convert-factors-to-numeric_003f – Ben Bolker Oct 05 '16 at 13:05
  • that's exactly what the duplicate question addresses. Identify which columns are factors, then only apply the conversion to those columns. – Ben Bolker Oct 05 '16 at 13:24
  • then they shouldn't have a problem. Since *all* columns are now factors, they want them all converted back to numeric. I'm not clear on the question ... ??? – Ben Bolker Oct 05 '16 at 13:27
  • I think it's not possible to do what I want. I thought factor levels were something added on top of a type (like num, char ...). But factors are their own type. How can R ever know, when converting away from a factor, what was originally a num and what was a char? – Andre Elrico Oct 05 '16 at 13:56
  • I think you still haven't clarified your question sufficiently. It's true that if I start with `dd <- data.frame(x1=c("5","6","7","8"),x2=5:8)`, then do `dd2 <- as.data.frame(lapply(dd,factor))`, there's no way to get back from `dd2` to `dd`. On the other hand, if I start from `dd <- data.frame(x1=c("a","b","c","d"),x2=5:8)`, I *can* probably figure it out, because I can establish that `x1` can't successfully be converted back to numeric (`as.numeric(as.character(dd2$x1))` will give `NA` values and a warning). ... – Ben Bolker Oct 05 '16 at 14:30
  • If what I described in the previous comment is what you want, though, you need to give a **reproducible example that illustrates your point** - trying to read your mind is frustrating ... – Ben Bolker Oct 05 '16 at 14:31
  • I would suggest that you take a shot at editing your answer below (which isn't an answer ...) into your question -- then we can re-open the question and try to answer it ... – Ben Bolker Oct 05 '16 at 14:35

2 Answers

df <- data.frame(x = factor(1:10),
                 y = factor(1:10))

str(df)

# Go through character first so the level labels, not the internal integer codes, become the numbers
df[] <- lapply(df, function(x) as.numeric(as.character(x)))

str(df)

Result:

'data.frame':   10 obs. of  2 variables:
$ x: num  1 2 3 4 5 6 7 8 9 10
$ y: num  1 2 3 4 5 6 7 8 9 10
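
To see why the detour through `as.character()` matters, here is a small sketch on a made-up factor `f` (not part of the original answer). Calling `as.numeric()` directly on a factor returns the internal integer codes, which is what happened to `mtcars` in the question. The last line uses the form given in R FAQ 7.10.

```r
# Hypothetical toy factor; levels are sorted as strings: "10", "20", "5"
f <- factor(c("10", "20", "20", "5"))

# Direct conversion returns the integer codes of the levels
codes <- as.numeric(f)
print(codes)       # 1 2 2 3

# Converting via character recovers the labels as numbers
values <- as.numeric(as.character(f))
print(values)      # 10 20 20 5

# Variant from R FAQ 7.10: convert the levels once, then index by code
values_faq <- as.numeric(levels(f))[as.integer(f)]
print(values_faq)  # 10 20 20 5
```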
Wietze314

It is possible that type.convert would suit your needs. It coerces its input to the most basic data type that can represent it. Thus, it would turn a character column that contains numbers that can be represented as integer into an integer column.

mtcars$name <-  rownames(mtcars)
str(mtcars)
# 'data.frame': 32 obs. of  12 variables:
# $ mpg : num  21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
# $ cyl : num  6 6 4 6 8 6 8 4 4 6 ...
# $ disp: num  160 160 108 258 360 ...
# $ hp  : num  110 110 93 110 175 105 245 62 95 123 ...
# $ drat: num  3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
# $ wt  : num  2.62 2.88 2.32 3.21 3.44 ...
# $ qsec: num  16.5 17 18.6 19.4 17 ...
# $ vs  : num  0 0 1 1 0 1 0 1 1 1 ...
# $ am  : num  1 1 1 0 0 0 0 0 0 0 ...
# $ gear: num  4 4 4 3 3 3 3 4 4 4 ...
# $ carb: num  4 4 1 1 2 1 4 2 2 4 ...
# $ name: chr  "Mazda RX4" "Mazda RX4 Wag" "Datsun 710" "Hornet 4 Drive" ...

mtcars[,] <- lapply(mtcars, factor)
str(mtcars)
# 'data.frame': 32 obs. of  12 variables:
# $ mpg : Factor w/ 25 levels "10.4","13.3",..: 16 16 19 17 13 12 3 20 19 14 ...
# $ cyl : Factor w/ 3 levels "4","6","8": 2 2 1 2 3 2 3 1 1 2 ...
# $ disp: Factor w/ 27 levels "71.1","75.7",..: 13 13 6 16 23 15 23 12 10 14 ...
# $ hp  : Factor w/ 22 levels "52","62","65",..: 11 11 6 11 15 9 20 2 7 13 ...
# $ drat: Factor w/ 22 levels "2.76","2.93",..: 16 16 15 5 6 1 7 11 17 17 ...
# $ wt  : Factor w/ 29 levels "1.513","1.615",..: 9 12 7 16 18 19 21 15 13 18 ...
# $ qsec: Factor w/ 30 levels "14.5","14.6",..: 6 10 22 24 10 29 5 27 30 19 ...
# $ vs  : Factor w/ 2 levels "0","1": 1 1 2 2 1 2 1 2 2 2 ...
# $ am  : Factor w/ 2 levels "0","1": 2 2 2 1 1 1 1 1 1 1 ...
# $ gear: Factor w/ 3 levels "3","4","5": 2 2 2 1 1 1 1 2 2 2 ...
# $ carb: Factor w/ 6 levels "1","2","3","4",..: 4 4 1 1 2 1 4 2 2 4 ...
# $ name: Factor w/ 32 levels "AMC Javelin",..: 18 19 5 13 14 31 7 21 20 22 ...


mtcars[,] <- lapply(mtcars, function(x) type.convert(as.character(x), as.is = TRUE))
str(mtcars)
#'data.frame':  32 obs. of  12 variables:
#$ mpg : num  21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
#$ cyl : int  6 6 4 6 8 6 8 4 4 6 ...
#$ disp: num  160 160 108 258 360 ...
#$ hp  : int  110 110 93 110 175 105 245 62 95 123 ...
#$ drat: num  3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
#$ wt  : num  2.62 2.88 2.32 3.21 3.44 ...
#$ qsec: num  16.5 17 18.6 19.4 17 ...
#$ vs  : int  0 0 1 1 0 1 0 1 1 1 ...
#$ am  : int  1 1 1 0 0 0 0 0 0 0 ...
#$ gear: int  4 4 4 3 3 3 3 4 4 4 ...
#$ carb: int  4 4 1 1 2 1 4 2 2 4 ...
#$ name: chr  "Mazda RX4" "Mazda RX4 Wag" "Datsun 710" "Hornet 4 Drive" ...

If you don't store the original column classes before you turn the columns into factors, there is no way to restore this information completely. However, that shouldn't be necessary anyway.
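
If the original classes do matter, one option is to record them before converting to factors. The following is a minimal sketch under that assumption. The data frame `dd`, the vector `orig_classes`, and the helper `restore_column()` are hypothetical names introduced for illustration, and the helper only handles the four basic classes shown.

```r
# Hypothetical toy data with mixed column types
dd <- data.frame(num = c(1.5, 2.5, 3.5),
                 int = 1:3,
                 chr = c("a", "b", "c"),
                 lgl = c(TRUE, FALSE, TRUE),
                 stringsAsFactors = FALSE)

# Record the (first) class of every column before converting
orig_classes <- vapply(dd, function(x) class(x)[1], character(1))

# Turn every column into a factor, as in the question
dd_fac <- dd
dd_fac[] <- lapply(dd_fac, factor)

# Hypothetical helper: go through character, then coerce to the stored class
restore_column <- function(x, cls) {
  chr <- as.character(x)
  switch(cls,
         numeric   = as.numeric(chr),
         integer   = as.integer(chr),
         logical   = as.logical(chr),
         character = chr,
         stop("unsupported class: ", cls))
}

# Restore each column using its recorded class
restored <- dd_fac
restored[] <- Map(restore_column, dd_fac, orig_classes)

str(restored)
```

Unlike `type.convert`, this keeps a whole-number column such as `num = c(1, 2, 3)` as `num` rather than turning it into `int`, because the target class comes from the stored information instead of being guessed from the values.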

Roland