Questions tagged [dummy-variable]

Dummy or indicator variables are used to include categorical or qualitative variables in a regression model.

868 questions
2
votes
4 answers

R: Generate a dummy variable based on the existence of one column's value in another column

I have a data frame like this:

    A               B
    2012,2013,2014  2011
    2012,2013,2014  2012
    2012,2013,2014  2013
    2012,2013,2014  2014
    2012,2013,2014  2015

I wanted to create a dummy variable, which indicates whether the…
Ian Wang
2
votes
4 answers

Create categorical variable from mutually exclusive dummy variables

How can I create a categorical variable from mutually exclusive dummy variables (taking values 0/1)? Basically I am looking for the exact opposite of this solution:…
ECII
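A minimal sketch of the idea in pandas rather than R (a different language from the question, so this is an illustration and not the asker's solution). It assumes exactly one dummy per row equals 1; the column names below are hypothetical.

```python
import pandas as pd

# Hypothetical mutually exclusive 0/1 dummies: exactly one 1 per row.
dummies = pd.DataFrame({
    "red":   [1, 0, 0, 1],
    "green": [0, 1, 0, 0],
    "blue":  [0, 0, 1, 0],
})

# idxmax(axis=1) returns, for each row, the label of the column holding
# the row maximum, which is the single 1 under the assumption above.
category = dummies.idxmax(axis=1).astype("category")
print(category.tolist())
```

If a row could be all zeros, `idxmax` would still return the first column label, so that case would need separate handling.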
2
votes
0 answers

How can I split a column with dummy_cols

I am using fastDummies in R and trying to use the split argument. I am not getting it to split properly. Here's what I am trying. library(fastDummies) ID <- seq(1:4) pets <- c("dog", "cat;dog;mouse", "dog;mouse", "cat") df <- data.frame("ID" = ID,…
Kevin M
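For comparison only, the same kind of split can be sketched in pandas with `Series.str.get_dummies`. This does not use fastDummies or explain its `split` argument. The `ID` and `pets` values come from the excerpt; the column name `pets` in the frame is an assumption, since the original `data.frame` call is cut off.

```python
import pandas as pd

# Values taken from the question excerpt; the "pets" column name is assumed.
df = pd.DataFrame({
    "ID": [1, 2, 3, 4],
    "pets": ["dog", "cat;dog;mouse", "dog;mouse", "cat"],
})

# str.get_dummies splits each string on `sep` and returns one 0/1 column
# per distinct token found across all rows.
pet_dummies = df["pets"].str.get_dummies(sep=";")
result = pd.concat([df[["ID"]], pet_dummies], axis=1)
print(result)
```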
2
votes
0 answers

Why is the default of Pandas get_dummies()'s parameter drop_first = False?

Pandas' function pandas.get_dummies() returns, for a categorical variable with k levels, either k dummy variables (= one-hot encoding) if drop_first = False, or k - 1 dummy variables (= dummy encoding) if drop_first = True. Both contain the same…
00schneider
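The two encodings can be compared directly. This is a small sketch with a made-up three-level variable, not an answer to why the default was chosen.

```python
import pandas as pd

# Hypothetical categorical variable with k = 3 levels.
s = pd.Series(["a", "b", "c", "a"], name="level")

# drop_first=False (the default): one column per level, k columns.
one_hot = pd.get_dummies(s)
# drop_first=True: k - 1 columns; the first level ("a") becomes the baseline.
dummy = pd.get_dummies(s, drop_first=True)

print(one_hot.shape[1], dummy.shape[1])
```

With `drop_first=True`, a row whose remaining dummies are all zero encodes the dropped level.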
2
votes
2 answers

dummy variable columns based on strings from other columns

I have a database with patient ID numbers and the treatment each patient received. I would like to have a dummy column for every different INDIVIDUAL treatment (i.e., did the patient receive treatment A, B, C, D). This is way simplified because I have over…
VivG
2
votes
1 answer

Generate dummy variables from all possible combinations of variables

I’ve 5 conditions that can be present (=1) or not (=0): set.seed(101) df <- data.frame( alfa = sample(c(0, 1), 30, replace = TRUE), beta = sample(c(0, 1), 30, replace = TRUE), gamma = sample(c(0, 1), 30, replace = TRUE), delta = sample(c(0, 1), 30,…
Borexino
2
votes
2 answers

Convert 1 column data into multi hot encoding

As an example of the problem, suppose we have a dataframe:

      Name Class
    0  Aci    FB
    1  Dan   TWT
    2  Ann   GRS
    3  Aci   GRS
    4  Dan    FB

The resulting dataframe df would be:

      Name  FB  TWT  GRS
    0  Aci   1    0    1
    0  Dan   1    1    0
    0  Ann   …
Codevan
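One possible sketch of this conversion uses `pd.crosstab` on the Name/Class pairs from the excerpt. Note that the rows and columns come out sorted alphabetically, which differs from the ordering shown in the question.

```python
import pandas as pd

# Data from the question excerpt.
df = pd.DataFrame({
    "Name":  ["Aci", "Dan", "Ann", "Aci", "Dan"],
    "Class": ["FB", "TWT", "GRS", "GRS", "FB"],
})

# crosstab counts each (Name, Class) pair; clip(upper=1) turns the counts
# into 0/1 flags in case a pair occurs more than once.
multi_hot = pd.crosstab(df["Name"], df["Class"]).clip(upper=1)
print(multi_hot)
```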
2
votes
2 answers

R multi-hot encoding among multiple columns

My data is in the shape Event Id Var1 Var2 Var3 1 a x w y 2 a z y w 3 b x y q and I need to create multi-hot encoded vectors for each row in the table, considering all the values appearing in Var1, Var2 and…
M_Stones
2
votes
1 answer

Pandas DataFrame: How to convert numeric columns into pairwise categorical data?

Given a pandas dataFrame, how does one convert several numeric columns (where x≠1 denotes the value exists, x=0 denotes it doesn't) into pairwise categorical dataframe? I know it is similar to one-hot decoding but the columns are not exactly one…
Codevan
2
votes
2 answers

How to rename a dataframe column whose name looks like a digit?

I have a dataframe column that contains 10 different digits. Through pd.get_dummies I've got 10 new columns whose names are numbers. Then I want to rename these number-named columns with df = df.rename(columns={'0':'topic0'}), but it failed. How can I…
Helix Herry
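A likely cause, assuming the dummies were built from a column of integers: the resulting column labels are integers too, so a mapping keyed by the string `'0'` matches nothing and `rename` silently leaves the labels alone. The sketch below uses made-up data to show both the non-matching and a matching rename.

```python
import pandas as pd

# Hypothetical column of integer topic labels.
topics = pd.Series([0, 1, 2, 1], name="topic")

dummies = pd.get_dummies(topics)

# The labels are the integers 0, 1, 2, so the string key '0' matches nothing
# and the columns are returned unchanged (no error is raised by default).
unchanged = dummies.rename(columns={"0": "topic0"})

# Mapping every label through a function renames all columns at once.
renamed = dummies.rename(columns=lambda c: f"topic{c}")

print(list(unchanged.columns), list(renamed.columns))
```

Keying the mapping by the integer itself, as in `{0: 'topic0'}`, would also match.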
2
votes
2 answers

Dummy coding in pandas with custom value weights

I have data in the shape as follows:

    pd.DataFrame({'id': [1,2,3], 'item': ['item_a', 'item_a', 'item_b'], 'score': [1,-1,1]})

    id  item    score
    1   item_a   1
    2   item_a  -1
    3   item_b   1

I want to get dummy codes for the…
Daniel
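The desired output is cut off in the excerpt, so the following is only one reading of the request: build ordinary 0/1 dummies for `item`, then multiply each row by its `score` so that the indicator carries the weight.

```python
import pandas as pd

# Data from the question excerpt.
df = pd.DataFrame({
    "id": [1, 2, 3],
    "item": ["item_a", "item_a", "item_b"],
    "score": [1, -1, 1],
})

# Integer 0/1 indicators for each item, scaled row by row by the score.
weighted = pd.get_dummies(df["item"], dtype=int).mul(df["score"], axis=0)
# Put the id back as the first column for readability.
weighted.insert(0, "id", df["id"])
print(weighted)
```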
2
votes
2 answers

pandas - multiple 'yes/no' dummy variables

I have a data frame with multiple categorical variables that I need to convert into dummy variables. Gender and region (4 types) are easy with pd.get_dummies. However, I have several variables that are yes/no after that. What can I do so that the…
2
votes
1 answer

Creating dummy variables (n-1) categories

I found similar entries but not exactly what I want. For a two-category variable (e.g., gender (1, 2)), I need to create a dummy variable, with 0 for male and 1 for female. Here is what my data look like and what I did. data <-…
amisos55
2
votes
1 answer

ValueError: Columns must be same length as key

I have a problem running the code below. data is my dataframe. X is the list of columns for train data. And L is a list of categorical features with numeric values. I want to one hot encode my categorical features. So I do as follows. But a…
Minila S
2
votes
3 answers

Creating Dummy Variables from String Column

I have a pandas dataframe (N = 1485) that looks like this: ID Intervention 1 Blood Draw, Flushed, Locked 1 Blood Draw, Port De-Accessed, Heparin-Locked, Tubing Changed 1 Blood Draw, Flushed 2 Blood…