Questions tagged [multicollinearity]
62 questions
0
votes
0 answers
linear regression: Would these variables be considered multicollinear?
My linear model would be Score ~ Age + Collection1 + Collection3
I transformed the Collection column into dummy variables and left out the Collection5 column to avoid the dummy variable trap.
For Collection 1, 3, and 5, I am sampling the same people…

Michael Zhao
- 31
- 4
0
votes
0 answers
Python VIF returns infinity values for dummy variables
So in the stroke prediction dataset, I've created dummy variables for all the categorical variables, i.e. gender_male and gender_female, smoking_status_smokes and smoking_status_unknown, and so on. Now to check for multicollinearity for all the…

IndigoChild
- 842
- 3
- 11
- 29
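For the infinite-VIF question above, a minimal sketch of why keeping every dummy level can produce infinite values. It uses plain NumPy rather than any particular library's VIF routine, and the data, the column names (`age`, `male`, `female`), and the helper `vif` are all hypothetical stand-ins, not taken from the stroke dataset. The idea is that `male = 1 - female`, so regressing one on the other (with an intercept) gives R² = 1 and VIF = 1 / (1 - R²) blows up.

```python
import numpy as np

def vif(X):
    """VIF of each column of X: regress it on the other columns plus an intercept."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    out = []
    for j in range(p):
        y = X[:, j]
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        beta, *_ = np.linalg.lstsq(others, y, rcond=None)
        resid = y - others @ beta
        r2 = 1.0 - (resid @ resid) / ((y - y.mean()) ** 2).sum()
        # Treat R^2 numerically equal to 1 as perfect collinearity (infinite VIF).
        out.append(np.inf if np.isclose(r2, 1.0) else 1.0 / (1.0 - r2))
    return np.array(out)

# Hypothetical data: one continuous column and one two-level categorical.
rng = np.random.default_rng(0)
gender = rng.integers(0, 2, size=200)
age = rng.normal(50, 10, size=200)
male = (gender == 1).astype(float)
female = (gender == 0).astype(float)

# Both dummy levels kept: male and female are perfectly collinear.
vif_all = vif(np.column_stack([age, male, female]))

# One level dropped as the reference category: the dependence disappears.
vif_dropped = vif(np.column_stack([age, male]))

print(vif_all)
print(vif_dropped)
```

Under these assumptions, dropping one level per categorical variable (a reference category) is what removes the exact linear dependence; whether that matches the asker's setup cannot be told from the truncated excerpt.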
0
votes
0 answers
Error performing collinearity check with findCorrelation() function (creating an SDM)
I'm trying to create a species distribution model (SDM) with the presence-absence methodology. I have done all the necessary steps (below the complete code of one of the interested species).
To do this, I downloaded the 19 bioclimatic variables from…

Ludovico
- 5
- 2
0
votes
0 answers
When to use VIF to detect multicollinearity for the variables in a GLM and how to handle transformed variables?
This is my first project creating my own models. I have 12 possible variables for a habitat model. I am using GLMs (binomial, logit). I want to check for multicollinearity using the VIF. I have variables on which I will use a log transformation,…

Nicole
- 1
- 1
0
votes
1 answer
Regression with several dummy variables
I am running a logistic regression and I want to control for the country of the respondents. I have 12 countries. I used the "fastDummy" package to create dummies for each country:
ALL<-dummy_cols(ALL, select_columns = "country")
I get something like…

Yacila
- 13
- 3
0
votes
1 answer
Collinearity between intercept and slopes that vary by time coding - linear mixed effects model
I'm currently trying to run a linear mixed effects model to estimate how stress changes as a function of time (over 6 timepoints). I've noticed that when the intercepts and slopes of stress trajectories are extracted for each individual in my…

M_Oxford
- 361
- 4
- 11
0
votes
0 answers
How to drop specific instances of a factor variable in R from a regression
I'm running this regression.
model1 <- lm(DV ~ IV1 + IV1 + IV3 + SubjectID, data = df)
I'm checking for multicollinearity between the variables. The SubjectID is the ID of each subject. Each subject has 8 observations and there are about 300…

Eric Tim
- 53
- 8
0
votes
1 answer
Principal Component Analysis (collinear predictors) and predict function in R
I have a dataset which has 3 collinear predictors.
I extracted these predictors and used principal component analysis to reduce multicollinearity.
What I want is to use these predictors for further modelling.
Is it incorrect to use the…

Srivats Chari
- 75
- 6
0
votes
0 answers
Logistic Model Error: Singular matrix while having highly correlated categorical dummy
Similar to Question here:
If one of the dummies of a categorical variable has a high VIF (multicollinearity), I would assume it should not be removed from the predictor list.
But the logistic regression of statsmodels has the 'Singular…

Bridget Huang
- 83
- 1
- 7
0
votes
0 answers
Why do vif() results from the car package differ from those in lmridge R?
I'm not a frequent poster so I apologize if this format is not correct.
If you tell me how to show the data I will make that change.
In the meantime, here is the code.
The vif() generates very different scores at the k=0 level in lmridge. Why are…

Regis Maria O'Connor
- 43
- 5
0
votes
0 answers
alias() returns nothing and vif() returns NAN in R
I am running a linear regression model. I have 33 continuous explanatory variables.
The result of linear regression is:
ESTF<-lm(log(HousePrice_2$price.yen.m2.)~.,data = HousePrice_2)
Call:
lm(formula = log(HousePrice_2$V1) ~ E1 + E3 + E4 + E5 +…

Zhan
- 1
- 2
0
votes
1 answer
Stata omitting 'collinear' variables following interaction term
In Stata, I've recently found that when I use the same variable across multiple interaction terms in one regression model, Stata flags that variable for collinearity. For instance, running:
regress dep i.gender##c.age i.ethnicity##c.age
Flags the…

Alice
- 99
- 1
- 9
0
votes
0 answers
Unable to use the package called ‘car’ in RStudio and unable to use vif() because of that
I want to check the multicollinearity of the variables in my lm model using vif().
It throws an error, so I am not able to use it:
> library(car)
Error in library(car) : there is no package called ‘car’
> vif(mymodel3)
Error in vif(mymodel3) :…
0
votes
0 answers
linear regression scaling dependent variable when constant creates multicollinearity warning
I'm running a linear regression with just one IV. When I run the regression with a constant using statsmodels I get a Multi-Collinearity warning. After searching on here I can see it could be a scaling issue. The coefficients are
constant: 14.0202 …

em456
- 359
- 2
- 11
0
votes
2 answers
Understanding Coefficient of Determination
I was going through the documentation to understand the Coefficient of Determination, and from the document I got the understanding that the Coefficient of Determination is nothing but R x R (correlation coefficient),
so I took the housing price dataset…

Lijin Durairaj
- 4,910
- 15
- 52
- 85
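For the coefficient-of-determination question above, a small numerical check of the "R x R" statement, assuming a simple linear regression with one predictor and an intercept (the case where the identity holds). The data are synthetic and hypothetical, not the housing price dataset mentioned in the excerpt.

```python
import numpy as np

# Hypothetical data: one predictor, linear signal plus noise.
rng = np.random.default_rng(1)
x = rng.normal(size=100)
y = 3.0 * x + rng.normal(size=100)

# Ordinary least squares fit of y on x with an intercept.
slope, intercept = np.polyfit(x, y, 1)
pred = slope * x + intercept

# Coefficient of determination: 1 - SS_residual / SS_total.
r2 = 1.0 - ((y - pred) ** 2).sum() / ((y - y.mean()) ** 2).sum()

# Pearson correlation coefficient between x and y.
r = np.corrcoef(x, y)[0, 1]

print(r2, r ** 2)
```

With several predictors, R² equals the squared correlation between `y` and the fitted values rather than between `y` and any single predictor, so the shortcut should not be assumed to carry over unchanged.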