Dummy or indicator variables are used to include categorical or qualitative variables in a regression model.
Questions tagged [dummy-variable]
868 questions
2
votes
4 answers
R: Generate a dummy variable based on the existence of one column's value in another column
I have a data frame like this:
A B
2012,2013,2014 2011
2012,2013,2014 2012
2012,2013,2014 2013
2012,2013,2014 2014
2012,2013,2014 2015
I wanted to create a dummy variable, which indicates whether the…

Ian Wang
- 23
- 4
2
votes
4 answers
Create categorical variable from mutually exclusive dummy variables
How can I create a categorical variable from mutually exclusive dummy variables (taking values 0/1)?
Basically I am looking for the exact opposite of this solution:…

ECII
- 10,297
- 18
- 80
- 121
2
votes
0 answers
How can I split a column with dummy_cols
I am using fastDummies in R and trying to use the split argument. I am not getting it to split properly. Here's what I am trying.
library(fastDummies)
ID <- seq(1:4)
pets <- c("dog", "cat;dog;mouse", "dog;mouse", "cat")
df <- data.frame("ID" = ID,…

Kevin M
- 481
- 6
- 20
2
votes
0 answers
Why is the default of Pandas get_dummies()'s parameter drop_first = False?
Pandas' function pandas.get_dummies() returns for a categorical variable with k levels either
k dummy variables (= one-hot encoding) if drop_first = False
k - 1 dummy variables (= dummy encoding) if drop_first = True
Both contain the same…

00schneider
- 698
- 9
- 21
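As a rough sketch of the k versus k - 1 distinction raised in the `get_dummies()` question above, assuming pandas is installed and using a made-up three-level column (the `color` values are hypothetical, not taken from the question):

```python
import pandas as pd

# Hypothetical categorical column with k = 3 levels.
color = pd.Series(["red", "green", "blue", "green"], name="color")

# Default (drop_first=False): one column per level, i.e. k columns.
one_hot = pd.get_dummies(color)

# drop_first=True: the first level in sorted order ("blue") is dropped,
# leaving k - 1 columns; a row of all zeros then stands for "blue".
dummy = pd.get_dummies(color, drop_first=True)

print(list(one_hot.columns))
print(list(dummy.columns))
```

The k - 1 form avoids perfectly collinear columns when the model also has an intercept, which is the usual reason to prefer it in regression.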
2
votes
2 answers
dummy variable columns based on strings from other columns
I have a database with patient ID numbers and the treatment they received. I would like to have a dummy column for every different INDIVIDUAL treatment (i.e., did the patient receive treatment A, B, C, D).
This is way simplified because I have over…

VivG
- 57
- 6
2
votes
1 answer
Generate dummy variables from all possible combinations of variables
I’ve 5 conditions that can be present (=1) or not (=0):
set.seed(101)
df <- data.frame(
alfa = sample(c(0, 1), 30, replace = TRUE),
beta = sample(c(0, 1), 30, replace = TRUE),
gamma = sample(c(0, 1), 30, replace = TRUE),
delta = sample(c(0, 1), 30,…

Borexino
- 802
- 8
- 26
2
votes
2 answers
Convert 1 column data into multi hot encoding
As an example to the problem, suppose we have a dataframe:
Name Class
0 Aci FB
1 Dan TWT
2 Ann GRS
3 Aci GRS
4 Dan FB
The resulting dataframe would be
df
Name FB TWT GRS
0 Aci 1 0 1
0 Dan 1 1 0
0 Ann …

Codevan
- 538
- 3
- 20
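One possible way to get the multi-hot layout described in the question above, sketched with `pandas.crosstab` on the five rows shown in the excerpt. This is only an illustration under the assumption that a 0/1 indicator per name is wanted; it is not necessarily the approach taken in the answers.

```python
import pandas as pd

# Rows copied from the excerpt above.
df = pd.DataFrame({
    "Name": ["Aci", "Dan", "Ann", "Aci", "Dan"],
    "Class": ["FB", "TWT", "GRS", "GRS", "FB"],
})

# Count each (Name, Class) pair, then cap the counts at 1 so that
# repeated pairs still produce a 0/1 indicator.
multi_hot = pd.crosstab(df["Name"], df["Class"]).clip(upper=1)

print(multi_hot)
```

Note that `crosstab` sorts both the row labels and the column labels, so the ordering differs from the one sketched in the question.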
2
votes
2 answers
R multi-hot encoding among multiple columns
My data is in the shape
Event Id Var1 Var2 Var3
1 a x w y
2 a z y w
3 b x y q
and I need to create multi-hot encoded vectors for each row in the table, considering all the values appearing in Var1, Var2 and…

M_Stones
- 23
- 3
2
votes
1 answer
Pandas DataFrame: How to convert numeric columns into pairwise categorical data?
Given a pandas DataFrame, how does one convert several numeric columns (where x≠1 denotes the value exists, x=0 denotes it doesn't) into a pairwise categorical DataFrame? I know it is similar to one-hot decoding but the columns are not exactly one…

Codevan
- 538
- 3
- 20
2
votes
2 answers
How to rename a dataframe column whose name is a digit?
I have a dataframe column that contains 10 different digits. Through pd.get_dummies I got 10 new columns whose names are numbers. I then tried to rename these number-named columns with df = df.rename(columns={'0':'topic0'}), but it failed. How can I…

Helix Herry
- 327
- 1
- 4
- 14
2
votes
2 answers
Dummy coding in pandas with custom value weights
I have data in the shape as follows:
pd.DataFrame({'id': [1,2,3], 'item': ['item_a', 'item_a', 'item_b'],
'score': [1,-1,1]})
id item score
1 item_a 1
2 item_a -1
3 item_b 1
I want to get dummy codes for the…

Daniel
- 363
- 3
- 11
2
votes
2 answers
pandas - multiple 'yes/no' dummy variables
I have a data frame with multiple categorical variables that I need to convert into dummy variables. Gender and region (4 types) are easy with pd.get_dummies. However, I have several variables that are yes/no after that. What can I do so that the…

immaprogrammingnoob
- 167
- 2
- 13
2
votes
1 answer
Creating dummy variables (n-1) categories
I found similar entries but not exactly what I want. For a two-category variable (e.g., gender(1,2)), I need to create a dummy variable, with 0s being male and 1s being female.
Here is what my data look like and what I did.
data <-…

amisos55
- 1,913
- 1
- 10
- 21
2
votes
1 answer
ValueError: Columns must be same length as key
I have a problem running the code below.
data is my dataframe. X is the list of columns for train data. And L is a list of categorical features with numeric values.
I want to one hot encode my categorical features. So I do as follows. But a…

Minila S
- 21
- 1
- 1
- 5
2
votes
3 answers
Creating Dummy Variables from String Column
I have a pandas dataframe (N = 1485) that looks like this:
ID Intervention
1 Blood Draw, Flushed, Locked
1 Blood Draw, Port De-Accessed, Heparin-Locked, Tubing Changed
1 Blood Draw, Flushed
2 Blood…

G. Nguyen
- 151
- 3
- 14
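For the string-column question above, a minimal sketch using `Series.str.get_dummies`, built only from the three complete rows visible in the excerpt. The `", "` separator is an assumption read off those rows, and the real data may need extra cleaning.

```python
import pandas as pd

# The three complete rows visible in the excerpt; the fourth is truncated.
df = pd.DataFrame({
    "ID": [1, 1, 1],
    "Intervention": [
        "Blood Draw, Flushed, Locked",
        "Blood Draw, Port De-Accessed, Heparin-Locked, Tubing Changed",
        "Blood Draw, Flushed",
    ],
})

# Split each string on ", " (assumed separator) and build one 0/1
# column per distinct intervention label.
dummies = df["Intervention"].str.get_dummies(sep=", ")

# Keep the ID next to the indicator columns.
result = df[["ID"]].join(dummies)
print(result)
```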