Dummy or indicator variables are used to include categorical or qualitative variables in a regression model.
Questions tagged [dummy-variable]
868 questions
0
votes
1 answer
dummyVars producing NA values in output
I have used the dummyVars function from the caret package before to make dummy variables out of characters/factors that also contained missing values (NA), and it worked successfully.
This time, however, the output I get includes NA values. The default is that it…

Sourav Sarkar
0
votes
0 answers
1 hot encoding training and test data separately in R
I need to add about 100 extra columns to a data.frame based on the length of a previous data.frame.
For example, I have two data.frames, Xtrain and Xtest. Xtrain has 1000 columns, but Xtest has only 900 columns. This difference is due to 1-hot encoding…

Sonu Mishra
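One common way to handle the column mismatch described in this question is to encode both sets and then align the test columns to the training columns. The question concerns R; the sketch below uses Python with pandas instead, and the toy data and the column name `color` are made up for illustration.

```python
import pandas as pd

# Hypothetical toy data; the column name "color" is illustrative only.
train = pd.DataFrame({"color": ["red", "green", "blue", "red"]})
test = pd.DataFrame({"color": ["red", "green"]})  # "blue" never appears here

# Encode each set separately, as in the question.
X_train = pd.get_dummies(train, columns=["color"])
X_test = pd.get_dummies(test, columns=["color"])

# Align the test columns to the training columns: dummies missing from the
# test set are added as all-zero columns, and any column that was not seen
# in training is dropped.
X_test = X_test.reindex(columns=X_train.columns, fill_value=0)
```

After the `reindex` call both frames have the same columns in the same order, so a model fitted on `X_train` can be applied to `X_test`.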
0
votes
1 answer
Caret RFE to deal with dummy variables that are levels of the same categorical variable
I have a classification problem and one of the predictors is a categorical variable X with four levels A,B,C,D that was transformed to three dummy variables A,B,C. I was trying to use Recursive Feature Elimination (RFE) in the caret package to…

ybeybe
0
votes
1 answer
How to apply linear regression of sklearn for some string variable
I am going to predict the box office of a movie using logistic regression.
I have some training data including the actors and directors. This is my data:
Director1|Actor1|300 million
Director2|Actor2|500 million
I am going to encode the directors and…

KengoTokukawa
0
votes
0 answers
Multiply 3-d matrix by a 2-d matrix with dummy variables
I have a 3D matrix, X, of size AxBxC and a 2D matrix, Y, of size CxD. I want to do a matrix multiply and end up with a 3d matrix, R, of size AxBxD:
A = 30, B = 70, C = 300, D = 100.
The 3-d matrix is a dummy variable which takes the value:
1 - in…

Tomas
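Assuming the goal is R[a, b, d] = sum over c of X[a, b, c] * Y[c, d], the product can be sketched in Python with NumPy. The question excerpt does not say which language it uses, and the random data here is invented purely to give the arrays the stated shapes.

```python
import numpy as np

A, B, C, D = 30, 70, 300, 100  # sizes taken from the question

rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(A, B, C))  # 0/1 dummy values (made-up data)
Y = rng.standard_normal((C, D))

# Contract over the shared axis C: R[a, b, d] = sum_c X[a, b, c] * Y[c, d]
R = np.einsum("abc,cd->abd", X, Y)

# Equivalent form: matrix multiplication broadcasts Y over the leading axis of X.
R_alt = X @ Y
```

Both forms return an array of shape (A, B, D); `einsum` makes the contracted axis explicit, while `@` is shorter.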
0
votes
2 answers
Dummy variable as slope shifter without intercept
This is my first time asking here.
I am having trouble generating only the slope dummy variables (without an intercept dummy).
However, if I multiply the dummy variable by the independent variable as shown below,
both slope dummy and intercept dummy results are…

yjkim
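A slope-only dummy can be sketched by putting the product of the dummy and the regressor in the design matrix while leaving the dummy itself out. The software used in the question is not visible in the excerpt, so this is a NumPy illustration on simulated data; the variable names and the coefficients 1.0, 2.0 and 0.5 are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x = rng.uniform(0, 10, n)          # continuous regressor (simulated)
d = rng.integers(0, 2, n)          # 0/1 dummy (simulated)
y = 1.0 + 2.0 * x + 0.5 * d * x + rng.normal(0, 0.1, n)

# Design matrix: common intercept, common slope, slope shift (d * x).
# The dummy d is deliberately left out on its own, so there is no intercept shift.
design = np.column_stack([np.ones(n), x, d * x])

# Ordinary least squares fit.
coef, *_ = np.linalg.lstsq(design, y, rcond=None)
intercept, slope, slope_shift = coef
```

If the question is about R (an assumption), the same specification would be written `y ~ x + x:d`, whereas `y ~ x*d` expands to include the intercept shift `d` as well.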
0
votes
1 answer
Estimate SE for all factor levels with zero-inflated model
I have a fairly complicated ZINB model. I have tried to replicate the basic structure of what I'm trying to do:
MyDat<-cbind.data.frame(fac1 = rep(c("A","B","C","D"),10),
fac2=c(rep("X",20),rep("Y",20)),
offset=c(runif(20,…

CAS
0
votes
1 answer
Create a dummy if any of a number of conditions is met
I would like to create a dummy if an action happens in a capital city, and my dataset contains 34 countries. Also, sometimes the word appears within a larger string (e.g. "Berlin, Germany, DE").
Let's say the column looks as…

Spl4t
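A dummy that fires when any of several names occurs inside a longer string can be sketched with a single regular expression. The question's language is not shown in the excerpt; this example uses Python with pandas, and the column name `place` and the shortened capital list are hypothetical.

```python
import re
import pandas as pd

# Hypothetical data and a shortened capital list (the question has 34 countries).
df = pd.DataFrame({"place": ["Berlin, Germany, DE", "Munich, Germany, DE",
                             "Paris", "Lyon, France"]})
capitals = ["Berlin", "Paris", "Rome"]

# One alternation over all capitals; \b keeps "Rome" from matching inside "Romeo".
pattern = r"\b(?:" + "|".join(map(re.escape, capitals)) + r")\b"

# 1 if any capital appears anywhere in the string, else 0.
df["capital"] = df["place"].str.contains(pattern, regex=True).astype(int)
```

Building the pattern from a list keeps the 34 conditions in one place instead of chaining 34 separate comparisons.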
0
votes
2 answers
One Hot Encoding of complex variables
I have a dataset where all my data is categorical and I would like to use one hot encoding for further analysis.
Main issues I would like to resolve:
Some cells contain a lot of text (an example will follow).
Some numerical values need to…

Boro Dega
0
votes
1 answer
Rapidminer dummy coding mismatch
I'm trying to use a neural network by training it on trainData and then testing on testData, as anyone would do. However, the data requires dummy coding of some nominal features to numerical. When I do that, it trains the neural network but fails…

Dimebag
0
votes
3 answers
Create factor variables for year integers in R
I have a panel data set like below, but the actual data set has several thousand observations. I want to create 14 factors as a new column "Year_dum" for years 1984-1998 (15 years). I searched for creating dummy variables in R, but could not find a…

Doo
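Fifteen years give 14 dummies once one year is held out as the reference level. The question concerns R; the sketch below shows the same idea in Python with pandas, on a made-up four-row panel whose `id` column is hypothetical.

```python
import pandas as pd

# Made-up panel rows; only "Year" and "Year_dum" are names from the question.
df = pd.DataFrame({"id": [1, 1, 2, 2], "Year": [1984, 1985, 1984, 1998]})

# Declare all 15 levels explicitly so that years absent from this sample
# still get a column.
year_type = pd.CategoricalDtype(categories=list(range(1984, 1999)))
df["Year_dum"] = df["Year"].astype(year_type)

# drop_first=True omits the first level (1984) as the reference category,
# leaving 14 indicator columns: Year_1985 ... Year_1998.
year_dummies = pd.get_dummies(df["Year_dum"], prefix="Year", drop_first=True)
```

Here `Year_dum` is a single categorical column, while `year_dummies` holds the expanded 0/1 columns a regression routine would use.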
0
votes
2 answers
Iterate through and overwrite specific values in a pandas dataframe
I have a large dataframe collating a bunch of basketball data (screenshot below). Every column to the right of Opp Lineup is a dummy variable indicating if that player (indicated in the column name) is in the current lineup (the last part of the…

John
0
votes
1 answer
How to create dummy variables within a loop in python?
So I have a dataframe with a bunch of features, some of which I want to make into a dummy variable, some of which I want to leave alone, and I wanted to create a lazy/faster way to do this rather than just typing:
dum_A =…

pakkunrob
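Encoding only some columns does not necessarily need an explicit loop: `pandas.get_dummies` accepts a `columns` argument. A minimal sketch, with invented column names `A`, `B` and `C`:

```python
import pandas as pd

# Invented frame: "A" and "B" should become dummies, "C" should be left alone.
df = pd.DataFrame({"A": ["x", "y", "x"],
                   "B": ["u", "u", "v"],
                   "C": [1.0, 2.0, 3.0]})
to_encode = ["A", "B"]

# Only the listed columns are expanded; every other column passes through unchanged.
encoded = pd.get_dummies(df, columns=to_encode)
```

The untouched columns come first in the result, followed by the dummy columns named `<column>_<level>`.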
0
votes
3 answers
Subsetting by multiple aggregate conditions in dplyr
I was hoping someone knew of an easy/efficient way in dplyr to define an indicator variable that takes the value 1 if, on Date X, an IP address was present >50 times. The data is two columns, one of IP addresses and the other associated…

ew23
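The question asks for a dplyr solution; as a language-neutral illustration of the logic (count rows per date and IP, then threshold), here is a sketch in Python with pandas. The column names `ip`, `date` and `flag` and the data are assumptions.

```python
import pandas as pd

# Made-up log: one IP appears 60 times on a date, another appears 3 times.
df = pd.DataFrame({
    "ip": ["1.1.1.1"] * 60 + ["2.2.2.2"] * 3,
    "date": ["2016-01-01"] * 63,
})

# Number of rows in each (date, ip) group, broadcast back onto every row.
counts = df.groupby(["date", "ip"])["ip"].transform("size")

# Indicator: 1 if that IP was present more than 50 times on that date.
df["flag"] = (counts > 50).astype(int)
```

Using `transform` keeps the original row count, so the indicator sits alongside the raw records rather than in a separate summary table.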
0
votes
1 answer
Recode variable to a dummy variable using weekdays
I have a variable that lists each day from 1-7, starting from Monday. I want to change this to weekday vs. weekend, with a 0-1 respectively to create a dummy variable. I know how to do one, but I can't figure out how to include 6 AND 7 in…

Greg
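Matching several values at once is a set-membership test rather than a single equality. The question's software is not shown in the excerpt; this sketch uses Python with pandas and assumes the coding 1 = Monday through 7 = Sunday, with a hypothetical column name `day`.

```python
import pandas as pd

# Assumed coding: 1 = Monday ... 7 = Sunday.
df = pd.DataFrame({"day": [1, 2, 3, 4, 5, 6, 7]})

# 1 for weekend days (6 or 7), 0 for weekdays.
df["weekend"] = df["day"].isin([6, 7]).astype(int)
```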