Dummy or indicator variables are used to include categorical or qualitative variables in a regression model.
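As a minimal sketch of what that looks like in practice, the snippet below builds indicator columns with pandas. The `region` and `sales` columns and their values are made up for illustration; dropping the first category is one common choice, so that category acts as the reference level.

```python
import pandas as pd

# Hypothetical data: one categorical predictor and a numeric response.
df = pd.DataFrame({
    "region": ["north", "south", "west", "south"],
    "sales": [10.0, 12.5, 9.0, 11.0],
})

# One indicator column per category; drop_first=True omits the first
# category ("north") so it serves as the reference level in a regression.
dummies = pd.get_dummies(df["region"], prefix="region", drop_first=True, dtype=int)

# Combine the response with the indicator columns.
design = pd.concat([df[["sales"]], dummies], axis=1)
print(design)
```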
Questions tagged [dummy-variable]
868 questions
2
votes
2 answers
Create dummy variables that are dependent on IDs following an ordered sequence
Here is my input:
structure(list(date = c(1990, 1991, 1992, 1990, 1991, 1992, 1990,
1991, 1992, 1990, 1991, 1992, 1990, 1991, 1992, 1990, 1991, 1992,
1990, 1991, 1992, 1990, 1991, 1992, 1990, 1991, 1992, 1990, 1991,
1992, 1990, 1991, 1992, 1990,…

ZZ Top
- 93
- 5
2
votes
2 answers
Create dummy column and input value from other column
I have data containing a list of topics (topics 1-5, with 0 meaning no topic is assigned) and their values. I want to create a new column for each topic and fill the column with the value. Here's what the table looks like...
reviewId topic value
…
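One possible approach, sketched in pandas: the rows below are hypothetical because the original table is truncated, and the `topic_1` ... `topic_5` column names are an assumption, not something the question specifies.

```python
import pandas as pd

# Hypothetical rows; the original table is truncated in the excerpt.
df = pd.DataFrame({
    "reviewId": [1, 2, 3, 4],
    "topic": [1, 3, 0, 5],
    "value": [0.7, 0.2, 0.0, 0.9],
})

# Topics 1-5 each get a column; topic 0 means "no topic" and gets none.
for t in range(1, 6):
    # Keep `value` where the row belongs to topic t, otherwise fill with 0.
    df[f"topic_{t}"] = df["value"].where(df["topic"] == t, 0.0)

print(df)
```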

Dewani
- 137
- 6
2
votes
1 answer
How to apply dummy coding (pd.get_dummies()) only to categories whose share within a nominal variable is at least 40% in Python pandas?
I have a DataFrame like the one below:
COL1 | COL2 | COL3 | ... | COLn
-----|------|------|------|----
111 | A | Y | ... | ...
222 | A | Y | ... | ...
333 | B | Z | ... | ...
444 | C | Z | ... | ...
555 | D | P | ... |…
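A rough sketch of one way to do this, using only the columns visible in the excerpt (the elided `COLn` columns are left out). The helper name `dummies_for_frequent` and the `COL_category` naming scheme are assumptions for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "COL1": [111, 222, 333, 444, 555],
    "COL2": ["A", "A", "B", "C", "D"],
    "COL3": ["Y", "Y", "Z", "Z", "P"],
})

def dummies_for_frequent(frame, columns, min_share=0.4):
    """Add an indicator column for each category whose share is >= min_share."""
    out = frame.copy()
    for col in columns:
        # Relative frequency of every category in this column.
        shares = out[col].value_counts(normalize=True)
        for cat in shares[shares >= min_share].index:
            out[f"{col}_{cat}"] = (out[col] == cat).astype(int)
    return out

result = dummies_for_frequent(df, ["COL2", "COL3"])
print(result)
```

With this sample, `A` covers 40% of `COL2`, and `Y` and `Z` each cover 40% of `COL3`, so only those three categories get indicator columns.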

dingaro
- 2,156
- 9
- 29
2
votes
1 answer
How to add empty/dummy row with continuous datetime index in pandas?
This is my dataframe
consumption hour
start_time
2022-09-30 14:00:00+02:00 199.0 14.0
2022-09-30 15:00:00+02:00 173.0 15.0
2022-09-30 16:00:00+02:00 173.0 16.0
2022-09-30…
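A minimal sketch, assuming the index is meant to be hourly and that missing hours should appear as rows with NaN consumption. The two-row frame below is hypothetical, with the 15:00 row removed on purpose.

```python
import pandas as pd

# Hypothetical frame with the 15:00 row missing.
idx = pd.DatetimeIndex(
    ["2022-09-30 14:00:00+02:00", "2022-09-30 16:00:00+02:00"],
    name="start_time",
)
df = pd.DataFrame({"consumption": [199.0, 173.0], "hour": [14.0, 16.0]}, index=idx)

# Reindex to a continuous hourly index; missing hours become NaN rows.
filled = df.asfreq(pd.offsets.Hour())

# Recompute the hour column from the index so the new rows are populated too.
filled["hour"] = filled.index.hour
print(filled)
```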

Naeem
- 45
- 4
2
votes
3 answers
Underscore variable with walrus operator in Python
In Python, the variable name _ (underscore) is often used for throwaway variables (variables that will never be used, hence do not need a proper name).
With the walrus operator, :=, I see the need for a variable that is rather short-lived (used in…

DustByte
- 651
- 6
- 16
2
votes
3 answers
Removing all binary variables from the data
I have data as follows:
df <- data.frame(A=c(1,2,3), B=c(1,0,1), C=c(0.1, 0.011, 0.3), D=c(0, 0.5, 1))
A B C D
1 1 1 0.100 0.0
2 2 0 0.011 0.5
3 3 1 0.300 1.0
How can I remove all binary variables (= B) from this data.frame?
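The question is about R; for comparison, here is a pandas analogue of the same idea, not an R answer. It assumes "binary" means a column whose values are all 0 or 1.

```python
import pandas as pd

df = pd.DataFrame({
    "A": [1, 2, 3],
    "B": [1, 0, 1],
    "C": [0.1, 0.011, 0.3],
    "D": [0, 0.5, 1],
})

# A column counts as binary here if every value is 0 or 1.
is_binary = df.isin([0, 1]).all()

# Keep only the columns that are not binary.
non_binary = df.loc[:, ~is_binary]
print(non_binary)
```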

Tom
- 2,173
- 1
- 17
- 44
2
votes
1 answer
Pandas: pivot comma delimited column into multiple columns
I have the following Pandas DataFrame:
import pandas as pd
import numpy as np
df = pd.DataFrame({'id': [1, 2, 3, 4], 'type': ['a,b,c,d', 'b,d', 'c,e', np.nan]})
I need to split the type column based on the comma delimiter and pivot the values…
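The desired output is cut off in the excerpt, so this is only a sketch under the assumption that the goal is one 0/1 indicator column per distinct value. `Series.str.get_dummies` splits on a separator and handles the NaN row by returning all zeros.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'id': [1, 2, 3, 4], 'type': ['a,b,c,d', 'b,d', 'c,e', np.nan]})

# Split each string on "," and build one indicator column per distinct value.
indicators = df['type'].str.get_dummies(sep=',')

# Attach the indicators to the id column.
result = pd.concat([df[['id']], indicators], axis=1)
print(result)
```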

Hui
- 97
- 7
2
votes
2 answers
How to use an existing dummy variable to create a new one that takes the value 1 for certain lead observations within a group
I have a dataset like the one below:
dat <- data.frame (id = c(1,1,1,1,1,2,2,2,2,2),
year = c(2015, 2016, 2017,2018, 2019, 2015, 2016, 2017, 2018, 2019),
sp=c(1,0,0,0,0,0,1,0,0,0))
dat
id year sp
1 1 2015 …

Teo
- 33
- 4
2
votes
3 answers
Separate rows to make dummy rows
Consider this dataframe:
dat <- structure(list(col1 = c(1, 2, 0), col2 = c(0, 3, 2), col3 = c(1, 2, 3)), class = "data.frame", row.names = c(NA, -3L))
col1 col2 col3
1 1 0 1
2 2 3 2
3 0 2 3
How can one dummify rows?…

Maël
- 45,206
- 3
- 29
- 67
2
votes
3 answers
How to specify which column to remove in get_dummies in pandas
I have a DataFrame column with 3 values: Bart, Peg, and Human. I need to one-hot encode them such that Bart and Peg stay as columns and Human is represented as 0 0.
Xi | Architecture
0 | Bart
1 | Bart
2 | Peg
3 | Human
4 | Human
5 | Peg
..
.
I…
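A small sketch of one way to get that encoding: build all three indicator columns, then drop the `Human` column explicitly. The six rows mirror the visible part of the table; anything beyond them is not reproduced.

```python
import pandas as pd

df = pd.DataFrame({"Architecture": ["Bart", "Bart", "Peg", "Human", "Human", "Peg"]})

# Encode every category, then drop the one that should act as the baseline.
encoded = pd.get_dummies(df["Architecture"], dtype=int).drop(columns="Human")
print(encoded)
```

Unlike `drop_first=True`, which always removes the first category in sorted order (here `Bart`), dropping by name lets you choose the baseline.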

Kiera.K
- 317
- 1
- 13
2
votes
1 answer
How can I create a dummy variable based on text analysis and time sequence of events?
Coworkers | Date
A | 2011-01-01
D | 2011-01-02
B;;D | 2011-01-03
E;;F | 2011-01-04
D | 2012-11-05
D;;G | 2012-11-06
A | 2012-11-09
Hello, I am trying to create a dummy variable based on text analysis (e.g., grepl).
The unit of analysis is a…

Juno Oh
- 23
- 3
2
votes
3 answers
How do I create two new variables out of one variable, and attach dummy values to it in R?
I am completely new to any kind of coding, never mind R in particular, so my days of googling have not been very helpful. I would really appreciate any kind of help/insights!
I would like to know how to get two new variables out of the original…

Bommby
- 35
- 4
2
votes
1 answer
Variable Importance Dummy Variables R
How can I determine variable importance (vip package in R) for categorical predictors when they have been one-hot encoded? It seems impossible for R to do this when the model is built on the dummy variables rather than the original categorical…

mapleleaf
- 758
- 3
- 8
- 14
2
votes
0 answers
One Hot Encoding: Avoiding dummy variable trap and process unseen data with scikit learn
I'm building a model, pretty much similar to the well-known House Price Prediction. I got to the point that I need to encode my nominal categorical variables by using scikit-learn's OneHotEncoder. The so-called "Dummy Variable Trap" is clear to me…

Buggy
- 43
- 5
2
votes
2 answers
Is there a way to display the reference category in a regression output in R?
I am estimating a regression model with some factor/categorical variables and some numerical ones. Is it possible to display the reference category for each factor/categorical variable in the summary of the regression model?
Ideally this would…

SKupek
- 63
- 6