I have a dataframe in R, here there is an example
asdf <- data.frame(id = c(2345, 7323, 2345, 4533),
place = c("Home", "Home", "Office", "Office"),
sex = c("Male", "Male", "Male", "Female"),
consumed = c(1000, 800, 1000, 500))
As you can see there is one id duplicated, because he has two locations, Home and Office. I want to convert every character variable to a dummy variable, and obtain just one id, without duplicated id's. I am sure that the only duplicated values can be the "place" variable.
When i apply dummyVars from caret, i can't do this, and for me this behavior does not make sense, for example, when I apply the following
dummy <- dummyVars( ~ ., data = asdf, fullRank = FALSE, levelsOnly = TRUE)
predict(dummy, asdf)
I get the following dataframe, with duplicated id's
result <- data.frame(id = c(2345, 7323, 2345, 4533),
placeHome = c(1, 1, 0, 0),
placeOffice = c(0, 0, 1, 1),
sexFemale = c(0, 0, 0, 1),
sexMale = c(1, 1, 1, 0),
consumed = c(1000, 800, 1000, 500))
but I want this
sexy_result <- data.frame(id = c(2345, 7323, 4533),
placeHome = c(1, 1, 0),
placeOffice = c(1, 0, 1),
sexFemale = c(0, 0, 1),
sexMale = c(1, 1, 0),
consumed = c(1000, 800, 500))