-3

Here are the sample data.

df1 <- data.frame(y = 1:5, x = c("s", "m", "l", "s", "m"))

df2 <- data.frame(y = 1:4, x = c("s", "l", "s", "l"))

I'd like df2$x to have the same three levels as df1$x.

I tried

df2$x <- factor(df1$x)

and got this error:

Error in `$<-.data.frame`(`*tmp*`, "x", value = c(3L, 2L, 1L, 3L, 2L)) : 
  replacement has 5 rows, data has 4

I also tried

levels(df2$x) <- factor(df1$x)

but that changes the values in df2:

df2
  y x
1 1 m
2 2 s
3 3 m
4 4 s

How can I do it?

microbe
  • The first one gave this error. `Error in `$<-.data.frame`(`*tmp*`, "x", value = c(3L, 2L, 1L, 3L, 2L)) : replacement has 5 rows, data has 4`. The second one changed the labels. – microbe Aug 21 '12 at 18:51

2 Answers

1

I am not sure I understand your goal correctly.

df1 <- data.frame(y = 1:5, x = factor(c("s", "m", "l", "s", "m")))
df2 <- data.frame(y = 1:4, x = factor(c("s", "l", "s", "l")))
df2$x
#[1] s l s l
#Levels: l s
levels(df2$x) <- unique(c(levels(df2$x), levels(df1$x)))
df2$x
#[1] s l s l
#Levels: l s m
Roland
  • Unfortunately, the order of the levels changed and now differs from df1. Here is the output: `> str(df1) 'data.frame': 5 obs. of 2 variables: $ y: int 1 2 3 4 5 $ x: Factor w/ 3 levels "l","m","s": 3 2 1 3 2 > str(df2) 'data.frame': 4 obs. of 2 variables: $ y: int 1 2 3 4 $ x: Factor w/ 3 levels "l","s","m": 2 1 2 1` – microbe Aug 21 '12 at 18:57
  • So? It seems to me that @Roland's solution also works for that. What it does is to set all possible factor levels from both df1$x and df2$x as the levels of df2$x. For completeness' sake you could also add `levels(df1$x)<-levels(df2$x)` at the end of his code to make both factors the same. – ROLO Aug 21 '12 at 19:18
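The ordering point raised in these comments can be checked with a minimal sketch, assuming the goal is for df2$x to carry df1$x's levels in df1's order. The names `appended`, `rebuilt`, and `relabeled` are illustrative only. `factor()` is called explicitly because `data.frame()` has not converted strings to factors by default since R 4.0. Note also that `levels(f) <- ...` relabels levels by position, so it renames values rather than reordering levels.

```r
df1 <- data.frame(y = 1:5, x = factor(c("s", "m", "l", "s", "m")))
df2 <- data.frame(y = 1:4, x = factor(c("s", "l", "s", "l")))

# Appending the missing levels keeps df2's own levels first: l, s, then m
appended <- df2$x
levels(appended) <- unique(c(levels(appended), levels(df1$x)))

# Rebuilding the factor against df1's levels adopts df1's order: l, m, s
rebuilt <- factor(df2$x, levels = levels(df1$x))

# Assigning levels by position to df1$x swaps the labels "m" and "s"
relabeled <- df1$x
levels(relabeled) <- levels(appended)

levels(appended)
levels(rebuilt)
as.character(relabeled)
```

Both `appended` and `rebuilt` keep the original values s, l, s, l; only the level order differs. `relabeled` no longer holds df1's original values, which is why positional assignment is best avoided when the two level vectors are ordered differently.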
0
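# factor() is explicit: since R 4.0, data.frame() no longer converts strings to factors
df1 <- data.frame(y = 1:5, x = factor(c("s", "m", "l", "s", "m")))

# Build df2$x against df1's levels so both factors share the same levels and order
df2 <- data.frame(y = 1:4, x = factor(c("s", "l", "s", "l"), levels = levels(df1$x)))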

str(df2)
'data.frame':   4 obs. of  2 variables:
 $ y: int  1 2 3 4
 $ x: Factor w/ 3 levels "l","m","s": 3 1 3 1
IRTFM
  • This does not work if not all levels of df2 are already in df1. E.g. `df1 <- data.frame(y = 1:5, x = c("a", "m", "l", "a", "m"))` would give problems. @Roland's solution does not have this issue. – ROLO Aug 21 '12 at 22:04
  • That's true, but that was not the problem posed. Furthermore, it's rather easy to fix with `levels = unique(c(levels(df1$x), x))` – IRTFM Aug 21 '12 at 22:47
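The fix mentioned in the last comment could look roughly like this. It is a minimal sketch, assuming df2's values may include some that are absent from df1; `new_x`, `strict`, and `all_levels` are hypothetical names introduced for illustration.

```r
# df1 as in the counterexample from the comments ("a" instead of "s")
df1 <- data.frame(y = 1:5, x = factor(c("a", "m", "l", "a", "m")))
new_x <- c("s", "l", "s", "l")  # hypothetical df2 values; "s" is not in df1$x

# Using only df1's levels silently turns the unknown value "s" into NA
strict <- factor(new_x, levels = levels(df1$x))

# Union of levels: df1's levels first, then any values found only in new_x
all_levels <- unique(c(levels(df1$x), new_x))
df2 <- data.frame(y = 1:4, x = factor(new_x, levels = all_levels))

levels(df2$x)
anyNA(strict)
anyNA(df2$x)
```

Putting df1's levels first keeps their order intact, and any extra values are appended after them.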