Questions tagged [multicollinearity]

62 questions
0
votes
0 answers

linear regression: Would these variables be considered multicollinear?

My linear model would be Score ~ Age + Collection1 + Collection3. I transform the Collection column into dummy variables and leave out the Collection5 column to avoid the dummy variable trap. For Collection 1, 3, and 5, I am sampling the same people…
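
The dummy variable trap mentioned in this excerpt can be sketched with a small rank check. This is only an illustration under assumed data: the six ages and collection labels below are made up, and `design_matrix` is a hypothetical helper, not something from the question.

```python
import numpy as np

# Hypothetical data: six respondents spread over three collections (1, 3, 5).
collection = np.array([1, 3, 5, 1, 3, 5])
age = np.array([23.0, 35.0, 41.0, 52.0, 29.0, 60.0])

def design_matrix(levels):
    """Build [intercept, Age, one 0/1 dummy per listed collection level]."""
    dummies = [(collection == lvl).astype(float) for lvl in levels]
    return np.column_stack([np.ones(len(age)), age, *dummies])

# All three dummies: they sum to the intercept column, so one column is redundant.
X_trap = design_matrix([1, 3, 5])
# Collection5 dropped and used as the reference level.
X_ok = design_matrix([1, 3])

rank_trap = np.linalg.matrix_rank(X_trap)
rank_ok = np.linalg.matrix_rank(X_ok)

print(X_trap.shape[1], rank_trap)  # number of columns vs. rank (rank is lower)
print(X_ok.shape[1], rank_ok)      # number of columns equals rank
```

With the full set of dummies the design matrix has more columns than its rank, which is the perfect collinearity the trap refers to. Dropping one level restores full column rank. This check says nothing about correlation that comes from sampling the same people repeatedly.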
0
votes
0 answers

Python VIF returns infinity values for dummy variables

In the stroke prediction dataset, I've created dummy variables for all the categorical variables, i.e., gender_male and gender_female, smoking_status_smokes and smoking_status_unknown, and so on. Now to check for multicollinearity for all the…
IndigoChild
  • 842
  • 3
  • 11
  • 29
0
votes
0 answers

Error when checking collinearity with the findCorrelation() function (creating an SDM)

I'm trying to create a species distribution model (SDM) with the presence-absence methodology. I have done all the necessary steps (the complete code for one of the species of interest is below). To do this, I downloaded the 19 bioclimatic variables from…
0
votes
0 answers

When to use VIF to detect multicollinearity for the variables in a GLM and how to handle transformed variables?

This is my first project creating my own models. I have 12 possible variables for a habitat model. I am using GLMs (binomial, logit). I want to check for multicollinearity using the VIF. I have variables on which I will use a log transformation,…
Nicole
  • 1
  • 1
0
votes
1 answer

Regression with several dummy variables

I am running a logistic regression and I want to control for the country of the respondents. I have 12 countries. I used the "fastDummies" package to create dummies for each country: ALL<-dummy_cols(ALL, select_columns = "country"). I get something like…
0
votes
1 answer

Collinearity between intercept and slopes that vary by time coding - linear mixed effects model

I'm currently trying to run a linear mixed effects model to estimate how stress changes as a function of time (over 6 timepoints). I've noticed that when the intercepts and slopes of stress trajectories are extracted for each individual in my…
M_Oxford
  • 361
  • 4
  • 11
0
votes
0 answers

How to drop specific instances of a factor variable in R from a regression

I'm running this regression: model1 <- lm(DV ~ IV1 + IV1 + IV3 + SubjectID, data = df). I'm checking for multicollinearity between the variables. The SubjectID is the ID of each subject. Each subject has 8 observations and there are about 300…
Eric Tim
  • 53
  • 8
0
votes
1 answer

Principal Component Analysis (collinear predictors) and predict function in R

I have a dataset that has 3 collinear predictors. I end up extracting these predictors and using a principal component analysis to reduce multicollinearity. What I want is to use these predictors for further modelling. Is it incorrect to use the…
0
votes
0 answers

Logistic Model Error: Singular matrix while having highly correlated categorical dummy

Similar to Question here: If I have one of the dummies of the categorical variables which has high VIF (multicollinearity), I would assume it should not be removed from the predictor list. But the logistic regression of statsmodels has the 'Singular…
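
The link between a very high VIF and a singular-matrix error can be sketched as below. This is a minimal illustration, not the statsmodels implementation: `vif` is a hypothetical helper, the data are simulated, and the cutoff used to report an infinite VIF is an arbitrary choice.

```python
import numpy as np

def vif(X):
    """VIF of each column of X (predictors only, no intercept column).

    Each column is regressed on the others plus an intercept, and
    VIF_j = 1 / (1 - R_j^2). An R_j^2 numerically equal to 1 is
    reported as infinity (np.isclose tolerance, an arbitrary cutoff).
    """
    X = np.asarray(X, dtype=float)
    out = []
    for j in range(X.shape[1]):
        y = X[:, j]
        others = np.column_stack([np.ones(len(y)), np.delete(X, j, axis=1)])
        beta, *_ = np.linalg.lstsq(others, y, rcond=None)
        resid = y - others @ beta
        r2 = 1.0 - (resid @ resid) / ((y - y.mean()) ** 2).sum()
        out.append(np.inf if np.isclose(r2, 1.0) else 1.0 / (1.0 - r2))
    return np.array(out)

# Simulated predictors (assumed data, for illustration only).
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = rng.normal(size=200)

vif_independent = vif(np.column_stack([x1, x2]))
# x3 is an exact linear combination of x1 and x2: perfect collinearity.
vif_exact = vif(np.column_stack([x1, x2, x1 + x2]))

print(vif_independent)
print(vif_exact)
```

When one column is an exact linear combination of the others, every column involved gets an infinite VIF, and the cross-product matrix of the design cannot be inverted. That is the same condition a fitting routine reports as a singular matrix.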
0
votes
0 answers

Why do vif() results from the car package differ from those in lmridge R?

I'm not a frequent poster so I apologize if this format is not correct. If you tell me how to show the data I will make that change. In the meantime, here is the code. The vif() generates very different scores at the k=0 level in lmridge. Why are…
0
votes
0 answers

alias() returns nothing and vif() returns NaN in R

I am running a linear regression model. I have 33 continuous explanatory variables. The result of linear regression is: ESTF<-lm(log(HousePrice_2$price.yen.m2.)~.,data = HousePrice_2) Call: lm(formula = log(HousePrice_2$V1) ~ E1 + E3 + E4 + E5 +…
Zhan
  • 1
  • 2
0
votes
1 answer

Stata omitting 'collinear' variables following interaction term

In Stata, I've recently found that when I use the same variable across multiple interaction terms in one regression model, Stata flags that variable for collinearity. For instance, running: regress dep i.gender##c.age i.ethnicity##c.age flags the…
Alice
  • 99
  • 1
  • 9
0
votes
0 answers

Unable to use the package called ‘car’ in RStudio and unable to use vif() because of that

I want to check the multicollinearity of variables in my lm model using vif(). It throws an error, so I am not able to use it: > library(car) Error in library(car) : there is no package called ‘car’ > vif(mymodel3) Error in vif(mymodel3) :…
0
votes
0 answers

linear regression scaling dependent variable when constant creates multicollinearity warning

I'm running a linear regression with just one IV. When I run the regression with a constant using statsmodels, I get a multicollinearity warning. After searching on here, I can see it could be a scaling issue. The coefficients are constant: 14.0202 …
em456
  • 359
  • 2
  • 11
0
votes
2 answers

Understanding Coefficient of Determination

I was going through the documentation to understand the coefficient of determination, and from the document I understood that the coefficient of determination is nothing but R x R (the correlation coefficient squared), so I took the housing price dataset…
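
The "R x R" claim in this excerpt can be checked numerically for the simple case of one predictor with an intercept. The sketch below uses made-up toy numbers, not the housing price dataset from the question. With several predictors the identity changes: R squared then equals the squared correlation between observed and fitted values.

```python
# Hypothetical toy data, one predictor and one response.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Centered sums of squares and cross-products.
sxx = sum((xi - mean_x) ** 2 for xi in x)
syy = sum((yi - mean_y) ** 2 for yi in y)
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

# Pearson correlation coefficient between x and y.
r = sxy / (sxx * syy) ** 0.5

# Ordinary least squares fit of y = intercept + slope * x.
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Coefficient of determination: 1 - SS_res / SS_tot.
ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
r_squared = 1.0 - ss_res / syy

print(r ** 2, r_squared)  # the two values agree up to floating-point error
```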