Questions tagged [dummy-variable]

Dummy or indicator variables are used to include categorical or qualitative variables in a regression model.

868 questions
2
votes
3 answers

Decide which category to drop in pandas get_dummies()

Let's say I have the following df: data = [{'c1':a, 'c2':x}, {'c1':b,'c2':y}, {'c1':c,'c2':z}] df = pd.DataFrame(data) Output: c1 c2 0 a x 1 b y 2 c z Now I want to use pd.get_dummies() to one hot encode the two…
TiTo
  • 833
  • 2
  • 7
  • 28
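One possible way to control which category is dropped, sketched under the assumption that the values in the example are strings and that the reference levels (`b` for `c1`, `z` for `c2`) are picked by hand. The `reference` mapping is a hypothetical helper, not part of the original question. Instead of `drop_first=True`, which always drops the first level, all dummies are built first and the chosen columns are removed by name.

```python
import pandas as pd

# Data from the question, with the values assumed to be strings
data = [{"c1": "a", "c2": "x"}, {"c1": "b", "c2": "y"}, {"c1": "c", "c2": "z"}]
df = pd.DataFrame(data)

# Hypothetical choice of the reference (dropped) level for each column
reference = {"c1": "b", "c2": "z"}

# Build every dummy column, then drop the chosen reference levels by name.
# get_dummies names columns "<column>_<level>" by default.
dummies = pd.get_dummies(df, columns=["c1", "c2"], dtype=int)
to_drop = [f"{col}_{level}" for col, level in reference.items()]
dummies = dummies.drop(columns=to_drop)

print(dummies)
```

The dropped level becomes the baseline against which the remaining indicators are interpreted in a regression.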
2
votes
2 answers

Using model.matrix() to create dummy variables

The following code doesn't work. Could someone offer help? dataframe1<-data.frame(x1 = c(1:5) , x2 = 1 , x3 = 0) dataframe1 model.matrix(~x1 - 1 , dataframe1)
Mathilda Fang
  • 353
  • 1
  • 13
2
votes
2 answers

Can't figure out how to remove column name from dummy variable heading

I wrote this code and used library('fastDummies'): New_Data <- dummy_cols(New_Curve_Data, select_columns = 'CountyName') I just want the actual county name that is Banks to be displayed and not CountyName_Banks etc. There are like 100 dummy…
Eena29
  • 23
  • 2
2
votes
1 answer

How can I modify this code chunk to recode all ~400 variables in my dataframe and not just one?

I wrote this in R to dummy code a character variable that consists of either "0" or a character response (like "sh6sej"), so that all character responses go to 1 and all 0's go to 0. It works fine for the one variable. However, is there a way to…
2
votes
2 answers

Create Dummy Columns for values in Single Pandas Column and Group into single row

I am trying to take a pandas dataframe and perform a pivot-like operation on a single column. I want to take multiple rows (grouped by some identification columns) and convert that single column into dummy indicator variables. I know of…
Coldchain9
  • 1,373
  • 11
  • 31
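A minimal sketch of the kind of reshaping described, assuming a single identification column. The question's actual data is not shown in the excerpt, so the column names `id` and `value` and the sample rows are hypothetical. The idea is to dummy-encode the column and then collapse each group to one row by taking the maximum of every indicator.

```python
import pandas as pd

# Hypothetical input: several rows per id, one categorical column
df = pd.DataFrame(
    {"id": [1, 1, 2, 2, 2], "value": ["a", "b", "a", "c", "c"]}
)

# One indicator column per level, then one row per id:
# max() yields 1 if the level appears at least once in the group.
indicators = (
    pd.get_dummies(df, columns=["value"], dtype=int)
    .groupby("id")
    .max()
)

print(indicators)
```

With several identification columns, the same pattern should work by passing a list of column names to `groupby`.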
2
votes
2 answers

Efficiently reshaping a non-standard dummy-coded matrix or table in R

I have a data frame that has several hundred thousand rows and 6 columns. Each column contains IDs (there are about 500 unique IDs in total). I would like to convert this data frame into a large table/matrix with each unique ID having its own…
donkeyshark
  • 111
  • 1
  • 8
2
votes
2 answers

Lasso Regression coefficients to find a linear model

I am fitting linear models in R. My factors include birth rates, death rates, infant mortality rates, life expectancies, and region. region has 7 levels, using numeric codes to represent each region: East Asia & Pacific South Asia Europe &…
user14622762
2
votes
1 answer

How to create dummy variable based on the value of two columns in R?

The question title might not completely reflect my problem, and that's perhaps the reason why I cannot come up with a solution for my problem. I have read similar questions (e.g., Assign a value to column based on condition across rows or R:…
2
votes
1 answer

Creating a dummy variable and data wrangling

I have a dataframe that looks like this: I need to create a new dataframe in which the student names are the index, the course number is the columns and the values are 0 or 1, depending on whether or not the student took that course. I have tried…
Sofia
  • 31
  • 2
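The target layout described here (students as the index, course numbers as columns, 0/1 values) can be sketched with `pd.crosstab`. The original dataframe is not shown in the excerpt, so the column names `student` and `course` and the sample rows below are assumptions.

```python
import pandas as pd

# Hypothetical long-format input: one row per (student, course) enrollment
df = pd.DataFrame(
    {
        "student": ["Ann", "Ann", "Bob", "Cy"],
        "course": [101, 202, 101, 303],
    }
)

# crosstab counts occurrences of each (student, course) pair;
# clip(upper=1) turns counts into 0/1 indicators in case of duplicate rows.
taken = pd.crosstab(df["student"], df["course"]).clip(upper=1)

print(taken)
```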
2
votes
1 answer

GNUPLOT - problems with "set dummy"

Hey, I want to plot some functions but have problems with my dummy variables. Every time I want to plot my second graph, gnuplot says that dummy M or x is not defined, but I don't know why. It's not necessary to have different dummy variables, but it…
iPhil
  • 33
  • 3
2
votes
2 answers

Change structure of DF to dummy

I am looking for a way to change the structure of a DF so I can use beta regression afterwards. The df looks like this at the moment: rating playerID 0.6 a1 NA b2 0.9 a4 NA b5 0 a3 NA …
K-tan
  • 79
  • 8
2
votes
1 answer

R Quantreg: Singularity with categorical survey data

For my Bachelor's thesis I am trying to apply a linear median regression model to constant sum data from a survey (see formula from A. Blass (2008)). It is an attempt to recreate the probability elicitation approach proposed by A. Blass et al (2008)…
2
votes
3 answers

Create numerically encoded dummy variables efficiently in R?

How can we transform data of the form df <- structure(list(customer_number = c(3, 3, 1, 1, 3), item = c("milkshake","burger", "apple", "burger", "water") ), row.names = c(NA, -5L), class…
stevec
  • 41,291
  • 27
  • 223
  • 311
2
votes
2 answers

creating dummy quantile variable from continuous variable

Here is the data I am working with: x <- getURL("https://raw.githubusercontent.com/dothemathonthatone/maps/master/testmain.csv") data <- read.csv(text = x) I want to make a dummy variable for the top, middle, and lower third of the values in…
Collective Action
  • 7,607
  • 15
  • 45
  • 60
2
votes
1 answer

Relevel factor and glm with effect coding

I'm having trouble understanding effect coding with glm. As an example: data('mpg') mpg$trans = as.factor(mpg$trans) levels(mpg$trans) [1] "auto(av)" "auto(l3)" "auto(l4)" "auto(l5)" "auto(l6)" "auto(s4)" "auto(s5)" "auto(s6)" …
W. Mooi
  • 119
  • 1
  • 3
  • 11