
I want to convert factors into numeric. I know this is a FAQ, and I have already tried as.numeric(levels(f))[f] and as.numeric(as.character(f)). But these work on a single factor, and I want to convert all columns of the df (more than 1000, all of type factor) into numeric. How can I do this?

David Arenburg
N.Varela
  • Maybe an example of your data? Also I guess you import that with read.table. If that's the case try stringsAsFactors=F – Matias Andina Nov 12 '15 at 13:55
  • Why do you have columns that are of class factor and should be numeric? That's the question you should be investigating. – Roland Nov 12 '15 at 14:03
  • Just to clarify -- do you want to convert the **value** of the factor to numeric, or the **level**? In other words, if my levels were "10", "1000", and "100000" -- what should these become? Do the values have other symbols in them ("$100,000") that are preventing `read.table` from recognizing them as numeric? – C8H10N4O2 Nov 12 '15 at 14:08
  • Sorry for the late answer: I want the values, not the levels. I read the data using read.csv(). I also have columns that really are factors (the first 10 columns); the ones to convert are columns 11:n. Is there a way to tell read.csv to read these columns as numeric? – N.Varela Nov 13 '15 at 14:39
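On the last comment's question about doing this at read time: a minimal sketch, assuming the file has a known number of leading factor columns followed by columns that contain only numbers. The temporary file, its contents, and `n_factor` are made up for illustration. If a column declared as "numeric" contains non-numeric text, read.csv will stop with an error instead of converting it.

```r
# Hypothetical file: 2 factor columns followed by numeric columns
# (the question has 10 factor columns; 2 keeps the sketch short)
path <- tempfile(fileext = ".csv")
writeLines(c("id,region,Antigua and Barbuda,Chile",
             "a,north,1.5,10",
             "b,south,2.5,20"), path)

n_factor <- 2

# Read only the header line (as data) to count the columns
header <- read.csv(path, nrows = 1, header = FALSE, stringsAsFactors = FALSE)
n_cols <- ncol(header)

# Declare the type of every column up front; check.names = FALSE keeps
# names such as "Antigua and Barbuda" unchanged
dat <- read.csv(
  path,
  colClasses  = c(rep("factor", n_factor), rep("numeric", n_cols - n_factor)),
  check.names = FALSE
)
str(dat)
```

With more than 1000 columns, counting them from the header avoids hard-coding the length of `colClasses`.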

2 Answers

-1

We can try

 yourdat[] <- lapply(yourdat, function(x) if(is.factor(x)) as.numeric(levels(x))[x]
                              else x)
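A small sketch of what the snippet above does, on made-up data (the data frame `toy` and its columns are hypothetical). It contrasts the level codes that as.numeric() returns for a factor with the values recovered through the levels, and restricts the conversion to a chosen subset of columns, as one might for columns 11:n.

```r
# Hypothetical toy data: one genuine factor and two numeric-looking factors
toy <- data.frame(
  group = factor(c("a", "b", "a")),
  x     = factor(c(1, 132, 27)),
  y     = factor(c(0.5, 10, 2.25))
)

# as.numeric() applied directly to a factor returns the internal level codes
codes <- as.numeric(toy$x)

# Indexing the converted levels by the factor returns the values themselves
vals <- as.numeric(levels(toy$x))[toy$x]

# Convert only the chosen columns and leave the genuine factor untouched
num_cols <- c("x", "y")
toy[num_cols] <- lapply(toy[num_cols], function(col) as.numeric(levels(col))[col])
str(toy)
```

Here `codes` holds the positions of each value among the sorted levels, while `vals` holds 1, 132, 27.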
akrun
  • Why `as.numeric(levels(x))[x]` and not just `as.numeric(x)` since we know x is a factor ? – C8H10N4O2 Nov 12 '15 at 14:05
  • @C8H10N4O2 Try `v1 <- factor(c(1, 132, 27)); as.numeric(v1)` – akrun Nov 12 '15 at 15:44
  • right -- I'm not sure which of these conversions the OP actually wants. I guess we'll find out if he accepts your answer. – C8H10N4O2 Nov 12 '15 at 18:54
  • @C8H10N4O2 I just noticed that all the columns are `factors`. In that case the `if/else` condition is not needed. – akrun Nov 12 '15 at 18:55
  • `as.data.frame(lapply(yourdat, function(x) as.numeric(levels(x))[x]))` works fine, thanks @akrun – N.Varela Nov 13 '15 at 17:40
  • @N.Varela Yes, it works well. I thought in case you missed any numeric columns, then the `if/else` could be used. – akrun Nov 13 '15 at 17:42
  • @akrun However, the column names now contain "." instead of " ". I use country names (e.g. Antigua and Barbuda) and I need them to be the same after the conversion. Is there a way to exclude the colnames from this function? – N.Varela Nov 13 '15 at 17:51
  • @N.Varela I haven't tested this as there was no example data. If you get `.` as column names, then change the column name after the conversion. i.e. `nm1 <- names(yourdat); setNames(as.data.frame(lapply(yourdat, function(x) as.numeric(levels(x))[x])), nm1)` – akrun Nov 13 '15 at 17:55
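On the column-name issue raised in the comments: a minimal sketch, on made-up data, of one way to keep names containing spaces. Assigning the converted columns back into `yourdat[]` replaces them in place, so the names are never rebuilt. The example columns and values are hypothetical.

```r
# Hypothetical data with a column name that contains spaces
yourdat <- data.frame(
  `Antigua and Barbuda` = factor(c("1.5", "20")),
  Chile                 = factor(c("3", "4")),
  check.names = FALSE
)

# Assigning into yourdat[] keeps the existing data frame and its names
yourdat[] <- lapply(yourdat, function(x) as.numeric(levels(x))[x])

names(yourdat)
str(yourdat)
```

This is the form used in the answer's original snippet, which is why it does not need the setNames() step.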
-1

Try this:

for(name in names(df)){
    if(is.factor(df[[name]])){
        # Convert via character so the result holds the factor's values,
        # not its internal integer level codes
        df[[name]] <- as.numeric(as.character(df[[name]]))
    }
}

Do not use class(df[[name]]) <- "numeric" or as.numeric(df[[name]]) here: applied directly to a factor, both return the internal integer level codes rather than the values.

Example:

mtcars_copy <- data.frame(mpg = mtcars$mpg, cyl = mtcars$cyl, disp = as.factor(mtcars$disp))

str(mtcars_copy)
## 'data.frame':    32 obs. of  3 variables:
## $ mpg : num  21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
## $ cyl : num  6 6 4 6 8 6 8 4 4 6 ...
## $ disp: Factor w/ 27 levels "71.1","75.7",..: 13 13 6 16 23 15 23 12 10 14 ...

for(name in names(mtcars_copy)){
    if(is.factor(mtcars_copy[[name]])){
        mtcars_copy[[name]] <- as.numeric(as.character(mtcars_copy[[name]]))
    }
}

class(mtcars_copy$disp)
## [1] "numeric"

str(mtcars_copy)
## 'data.frame':    32 obs. of  3 variables:
## $ mpg : num  21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
## $ cyl : num  6 6 4 6 8 6 8 4 4 6 ...
## $ disp: num  160 160 108 258 360 ...

In R, class() reports "numeric" for double vectors and "integer" for integer vectors. Both count as numeric (is.numeric() is TRUE for each), and R converts between them as needed in arithmetic.

narendra-choudhary
  • Sorry, I can't get it to work. I replaced df with the name of my data frame, but the result is NULL. – N.Varela Nov 13 '15 at 17:43
  • @N.Varela I've added an example. See if this solution works for you now. You don't have to convert the data frame into anything. `names(df)` is a character vector of the column names of data frame `df`. Run `?names` in the R console for more details. – narendra-choudhary Nov 14 '15 at 01:43