Questions tagged [dummy-variable]

Dummy or indicator variables are used to include categorical or qualitative variables in a regression model.
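
As a rough illustration of the description above, here is a minimal sketch of dummy coding in plain Python. The helper name `dummy_code`, the `drop_first` parameter, and the sample data are hypothetical, chosen only for this example. One level is dropped as the reference category, which is the usual way to avoid perfect collinearity with an intercept in a regression model.

```python
def dummy_code(values, drop_first=True):
    """Return (column_names, rows) of 0/1 indicators for a categorical list.

    Hypothetical helper for illustration; not taken from any library.
    """
    # Distinct categories in a stable (sorted) order.
    levels = sorted(set(values))
    # Optionally drop the first level so it acts as the reference category.
    kept = levels[1:] if drop_first else levels
    # One row per observation, one 0/1 column per kept level.
    rows = [[1 if v == level else 0 for level in kept] for v in values]
    return kept, rows


colors = ["red", "green", "blue", "green"]
names, rows = dummy_code(colors)
print(names)
for row in rows:
    print(row)
```

Here `blue` sorts first and becomes the reference category, so an observation with all indicators equal to 0 is `blue`.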

868 questions
0
votes
0 answers

How do I assign different categories the same dummy variable in R?

I am working with a data set that looks like this: ClusterID URL Text_Body 0 www.text.com texttexttexttexttext..... 1 www.text1.com texttexttexttexttext..... 2 www.text2.com …
Vindication09
0
votes
3 answers

SAS - Create Dummy Variables for All Variables

I have a dataset with X number of categorical variables for a given record. I would like to somehow turn this dataset into a new dataset with dummy variables, but I want to have one command / macro that will take the dataset and make the dummy…
Nate Thompson
0
votes
1 answer

Recoding race with 4 categories to 3 categories and creating 2 dummies in R

I am working with a variable for race that takes on the following values: 1 Black, 2 Hispanic, 3 Mixed Race (Non-Hispanic), 4 Non-Black / Non-Hispanic. I want to combine 3 and 4 into the base category and keep Black and Hispanic. I tried to…
bree
0
votes
1 answer

How to create dummy variables for glmnet (LASSO) with model.matrix?

I have a dataset: SalesPrice SqFeet Beds Baths AirCond Garage Pool Year Quality Style Lot Highway 1: 360.0 3.032 4 4 1 2 0 1972 2 1 22.221 0 2: 340.0 2.058 4 2 1 2 0…
Mat_nekras
0
votes
1 answer

SPSS version 23, MIXED module: maximum dummy variables?

I am using the MIXED routine, repeated measures. I have 10 dummy variables (0/1) and 8 scaled variables for fixed effects. The results keep showing that one of the dummy variables is redundant. I played around moving the order in which the dummy and…
0
votes
1 answer

Filter categories in data frame before generating dummy columns for them

I have a dataset with categorical values in some columns (one row may contain multiple categories, separated by ,). Example: user hashtags 0 u1 a,b 1 u2 a,c 2 u3 c I want to make dummy columns for these categories. I'm also…
martindzejky
0
votes
1 answer

When to use dummy variables in classification problems?

I am working on a binary classification problem in which I predict whether a customer will subscribe to a campaign (for the airline industry). My data set is at the customer and campaign-name level, and there are 43 variables under consideration. There are certain…
0
votes
1 answer

pandas get_dummies syntax error

I have a dataset that is 30k in size. It has a column titled "Native Country", and I want to create a new variable for every unique value in that column (the algorithm I am using can only handle numeric values, so I need to convert text to binary…
Jim
0
votes
1 answer

Converting comma separated list to dummy variables

I have a table as follows: yel <- data.table(id=c(1,2,3)) yel$names[1] <- "\"parking space\", \"dining\", \"3bh\"" yel$names[2] <- "\"parking\" , \"outdoor\"" yel$names[3] <- "\"Hello!\",\"dining room\",\"3bh\"" yel id …
Manish Ranjan
0
votes
0 answers

Dummy variables in Neural Networks

I am attempting to create a neural network, but I am having trouble finding documentation on how to handle categorical data. I have a variable in my dataset which is categorical with 11 levels. I think I need to convert this variable, however,…
John Meighan
0
votes
1 answer

R - DiD (Difference in Differences) Model

I'm trying to set up a DiD model with R. I have a baseline phase and a treatment group. I'm trying to account for baseline and age influence in the model, so I created two dummy variables. young <- Shower_data$Age %in% c("20-29", "30-39") old <-…
user6555550
0
votes
0 answers

Convert column dummies into row dummies in R

I'm trying to convert several dummy columns into a single dummy. What I have now: HS2002 <- c(30343, 30344, 30345, 30346, 30349) Original.rate <- c(10, 12, 10, 13, 2) 2004 <- c(0, 0, 0, 1, 0) 2005 <- c(0, 1, 0, 1, 0) 2006 <- c(1, 1, 0, 1, 0) format1…
JJH
0
votes
1 answer

Tableau: creating bins for variable bound between 0 and 1

I have just started using Tableau and I have run into a problem. I want to create a histogram of the percentage of loans that have not been paid back. I created a variable called 'Delinquent num' coding the loans that have not been paid back as 1…
user3426752
0
votes
1 answer

pandas get_dummies on high cardinality variables using one hot encoding creates too many new features

I have several high-cardinality variables in a dataset and want to convert them into dummies. All of them have more than 500 levels. When I used pandas get_dummies, the matrix got so large that my program crashed. pd.get_dummies(data, sparse=True,…
Felicia.H
0
votes
0 answers

Dummy/Binary Category Variable creation in Data Frame

I originally tried to extract genres from the Kaggle IMDB data set: https://www.kaggle.com/param1/d/deepmatrix/imdb-5000-movie-dataset/the-money-makers The raw data for genres comes in a format like Action_Adventure_Comedy etc. From this I used…
Stu Richards