
I want to change the character "F" to "X" in a dataframe. Please see below.

df <- data.frame(N=c(1,2,3,4,5,6),CAT=c('A','B','C','D','E','F'))
df

Result:
      N CAT
    1 1   A
    2 2   B
    3 3   C
    4 4   D
    5 5   E
    6 6   F

I ran this code, and it doesn't work:

    df$CAT[df$CAT == 'F'] <- 'X'

Error in `$<-.data.frame`(`*tmp*`, code, value = character(0)) : 
  replacement has 0 rows, data has 6

This code seems to work on other data I've imported via csv. Is there a reason why it doesn't work with this specific dataframe I've created? Any help much appreciated.

zx8754
H.Cheung
  • It is working for me, though. Please check whether there are leading/trailing spaces in 'CAT'. – akrun Aug 05 '20 at 20:35
  • That works for me as well. Are you sure the error occurs when you copy/paste the code you've shared? – MrFlick Aug 05 '20 at 20:36
  • @akrun - there are times when I'm running code and it isn't doing what it should. This arose because I saw your previous solution to a problem. The asker was happy, but when I tried to run it myself, it didn't work. Do I need to re-install R? Can R become corrupted or outdated? I've been loading a lot of packages recently; could that affect R? – H.Cheung Aug 05 '20 at 20:58
  • I don't save the session when I close it. If you are saving the session's global environment, leftover objects could pollute the environment. – akrun Aug 05 '20 at 20:59

2 Answers


This is the proverbial stringsAsFactors = FALSE issue. For those reading this after R 4.0.0, it is no longer a problem, because the default changed to stringsAsFactors = FALSE. For many years before 2020, though, users struggled to remember that data.frame() (and as.data.frame(), for that matter) converted all strings to factors by default.

What happens then is that you are trying to assign a value that is not one of the factor's existing levels, and R does not add new levels that way. If you did not intend to create a factor, you can just modify your data frame creation code:

df <- data.frame(N=c(1,2,3,4,5,6),
                 CAT=c('A','B','C','D','E','F'),
                 stringsAsFactors = FALSE)
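
To make the difference concrete, here is a minimal sketch that builds both variants and runs the assignment from the question on each. The names df_fct and df_chr are made up for illustration, and stringsAsFactors is set explicitly so the result should not depend on the R version. The suppressWarnings() call is only there to keep the sketch quiet; without it, R warns about an invalid factor level.

```r
# Illustrative names (df_fct, df_chr); stringsAsFactors is set explicitly
# so the behaviour is the same before and after R 4.0.0.
df_fct <- data.frame(N = c(1,2,3,4,5,6),
                     CAT = c('A','B','C','D','E','F'),
                     stringsAsFactors = TRUE)
df_chr <- data.frame(N = c(1,2,3,4,5,6),
                     CAT = c('A','B','C','D','E','F'),
                     stringsAsFactors = FALSE)

# Check what kind of column you actually have
is.factor(df_fct$CAT)
is.character(df_chr$CAT)

# Factor column: 'X' is not an existing level, so the value becomes NA
suppressWarnings(df_fct$CAT[df_fct$CAT == 'F'] <- 'X')

# Character column: the same assignment simply replaces the string
df_chr$CAT[df_chr$CAT == 'F'] <- 'X'

df_fct$CAT
df_chr$CAT
```

Running str(df) on your own data frame is a quick way to see which of the two cases you are in.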

If, however, you did want a factor, here is how you can recode one of its levels:

df <- data.frame(N=c(1,2,3,4,5,6),
                 CAT=c('A','B','C','D','E','F'),
                 stringsAsFactors = TRUE)
df
str(df)
#> 'data.frame':    6 obs. of  2 variables:
#> $ N  : num  1 2 3 4 5 6
#> $ CAT: Factor w/ 6 levels "A","B","C","D",..: 1 2 3 4 5 6

levels(df$CAT)[levels(df$CAT)=="F"] <- "X"

df

#>   N CAT
#> 1 1   A
#> 2 2   B
#> 3 3   C
#> 4 4   D
#> 5 5   E
#> 6 6   X
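
If the data frame already exists with a factor column and you do not need the factor, another option is to convert the column to character first. This is a sketch of that alternative, not part of the approach above; it assumes losing the factor structure is acceptable for your use case.

```r
df <- data.frame(N = c(1,2,3,4,5,6),
                 CAT = c('A','B','C','D','E','F'),
                 stringsAsFactors = TRUE)

# Convert the factor column to plain character strings
df$CAT <- as.character(df$CAT)

# The assignment from the question now works as expected
df$CAT[df$CAT == 'F'] <- 'X'

df$CAT
```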
dmi3kno
  • It worked. Thanks @dmi3kno. I remember this being an issue which put me off using R in the past. I will download the latest R. – H.Cheung Aug 05 '20 at 21:52

You could use the recode() function from dplyr:

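library(dplyr)  # provides %>%, mutate() and recode()

df <- data.frame(N=c(1,2,3,4,5,6),CAT=c('A','B','C','D','E','F'))

# Replace 'F' with 'X' in CAT; all other values are left unchanged
df <- df %>% 
  mutate(CAT = recode(CAT, 'F'= 'X'))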

df
Susan Switzer