
In a classification model, if I want to keep a categorical variable (e.g., Gender), I need to create a dummy variable first.

My question is: should this new dummy variable (e.g., 1 = male, 2 = female) be created as a numeric vector? I tried keeping the dummy variable as a factor (e.g., "1", "2"), but then feature scaling the dataset did not work.

So if I keep those dummy variables as numeric vectors and then build the model, will that have any negative effect on the model? I am concerned because 1 for male and 2 for female are not actually numeric values; they are just category labels.

Please help me. This question has been bothering me for two days. BTW, I use R for machine learning.
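
To make the scaling problem concrete, here is a minimal sketch. The data frame and its column names (`Gender`, `Age`, `EstimatedSalary`) are made up for illustration and are not my real dataset. `scale()` expects numeric input, so one possible workaround is to leave the factor untouched and scale only the numeric columns:

```r
# Hypothetical data; the column names and values are assumptions for illustration
dataset <- data.frame(
  Gender = factor(c("Male", "Female", "Female", "Male")),
  Age = c(19, 35, 26, 27),
  EstimatedSalary = c(19000, 20000, 43000, 57000)
)

# Find the numeric columns; the factor column Gender is skipped
num_cols <- sapply(dataset, is.numeric)

# Center and scale each numeric column, leaving Gender as a factor
dataset[num_cols] <- lapply(dataset[num_cols], function(x) as.numeric(scale(x)))

str(dataset)
```

With this approach `Gender` stays a factor, so it is never treated as the numbers 1 and 2.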

Rui Barradas
  • All you should have to do is to have a factor variable and R will automatically create the dummies for you. Which exact procedure from which package are you using? You will need to include some code to get a real answer. – Elin Jan 27 '18 at 15:17
  • I will apply several classification models to compare the results. For example, for logistic regression I will fit the model as model <- glm(formula = Purchased ~ ., family = binomial, data = training_set). Do I need to add any other parameters? Thanks – Ehtasham Billah Mymun Jan 27 '18 at 15:32
  • Can you please add that to your question. – Elin Jan 27 '18 at 15:40
  • See this question https://stackoverflow.com/questions/11952706/generate-a-dummy-variable – Elin Jan 27 '18 at 15:44
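
The point made in the comments, that R builds the dummies itself when the variable is a factor, can be checked with a small sketch. It reuses the `glm()` call from the comment above, but `training_set` here is simulated data with assumed columns (`Gender`, `Age`, `Purchased`), not the asker's dataset:

```r
# Simulated stand-in for training_set; all values are made up
set.seed(1)
training_set <- data.frame(
  Gender = factor(rep(c("Male", "Female"), 20)),
  Age = round(runif(40, 18, 60)),
  Purchased = rbinom(40, 1, 0.5)
)

# Same call as in the comment; Gender is passed as a factor
model <- glm(formula = Purchased ~ ., family = binomial, data = training_set)

# The design matrix glm() built: Gender appears as a single 0/1 column,
# "GenderMale", with "Female" as the reference level
head(model.matrix(model))
names(coef(model))
```

Under R's default treatment contrasts, a two-level factor becomes one 0/1 indicator column, so there should be no need to hand-code it as 1 and 2.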

0 Answers