I want to separate variables according to a "lead" variable, x3 in the following case:
set.seed(2)
df = data.frame(x1 = sample(4), x2 = sample(4), x3 = sample(letters[1:2], size = 4, replace = TRUE))
df
#   x1 x2 x3
# 1  1  4  a
# 2  3  3  b
# 3  2  1  b
# 4  4  2  a
# Desired output
# x3 x1.a x2.a x1.b x2.b
#  a    1    4   NA   NA
#  b   NA   NA    3    3
#  b   NA   NA    2    1
#  a    4    2   NA   NA
I somehow sense that this could be achieved with reshape2::dcast(), but I could only get it to work for two variables in total:
reshape2::dcast(df[,2:3], seq_along(x3) ~ x3, value.var = "x2")[, -1]
#    a  b
# 1  2 NA
# 2 NA  1
# 3 NA  3
# 4  4 NA
But maybe this is just a total abuse of dcast. Is there an elegant solution to this problem, without splitting and merging df?
EDIT: Some people mentioned that doing this is a horrible idea and that I probably should not do such a thing. Let me elaborate on when this can make sense.
Imagine x3 is a switch for a specific algorithm. In this case, a and b are the options. Furthermore, x1 and x2 are parameters both algorithms can take. Unfortunately, the two algorithms behave very differently on the same parameter settings for x1 and x2, so it makes sense to handle them as distinct features to take their uncorrelatedness into account.
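To make the intended layout concrete, here is a rough base R sketch of the "one set of parameter columns per switch value" idea. It is only an illustration of the target shape, not necessarily the elegant solution asked for. The data frame is typed out by hand to match the example above (so it does not depend on the RNG), and the names lead, params, levs, blocks and wide are hypothetical helpers introduced just for this sketch.

# Same toy data as in the example, written out explicitly
df <- data.frame(x1 = c(1, 3, 2, 4),
                 x2 = c(4, 3, 1, 2),
                 x3 = c("a", "b", "b", "a"))

lead   <- "x3"                                  # the switch column
params <- setdiff(names(df), lead)              # parameter columns: x1, x2
levs   <- sort(unique(as.character(df[[lead]])))  # switch values: "a", "b"

# For each switch value, copy the parameter columns and blank out
# (set to NA) every row that belongs to a different switch value
blocks <- lapply(levs, function(lev) {
  block <- df[params]
  block[as.character(df[[lead]]) != lev, ] <- NA
  names(block) <- paste(params, lev, sep = ".")
  block
})

# Put the switch column first, followed by one block per switch value
wide <- cbind(df[lead], do.call(cbind, blocks))
wide
#   x3 x1.a x2.a x1.b x2.b
# 1  a    1    4   NA   NA
# 2  b   NA   NA    3    3
# 3  b   NA   NA    2    1
# 4  a    4    2   NA   NA

This avoids an explicit split and merge, but it still loops over the switch values, so a reshape-based one-liner would be preferable if one exists.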