
I would appreciate help interpreting the following pairwise scatterplots of predictor variables to check for multicollinearity, and then adjusting the data so that it does not occur.

[Image: pairwise scatterplot matrix of the variables]

Background: I am working on a task where I have to carry out multiple linear regression. I have three explanatory variables, tar, nico and weight, and want to predict CO, so CO is the response (dependent) variable. The data come from 25 American cigarette brands, where tar, nico and weight are each brand's tar content, nicotine content and weight per cigarette, and CO is how much carbon monoxide a cigarette emits.

Question: The task asks me to plot all the explanatory variables in pairs against each other to look for multicollinearity and to find an observation that is questionable to include in the regression. I have made the plot (see the picture above), but how should I interpret it?
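As a numerical complement to the scatterplots, the pairwise Pearson correlations between the predictors can be computed directly. The following is a minimal sketch using only the Python standard library. The values are made up for illustration and are not the real cigarette measurements, and the helper names `pearson` and `pairwise_correlations` are hypothetical.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def pairwise_correlations(columns):
    """Correlation for every unordered pair of named columns."""
    return {
        (a, b): pearson(columns[a], columns[b])
        for a, b in combinations(columns, 2)
    }

# Made-up illustration values; NOT the real cigarette data.
predictors = {
    "tar":    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "nico":   [2.1, 3.9, 6.2, 7.8, 10.1, 12.0],
    "weight": [5.0, 3.0, 6.0, 2.0, 7.0, 4.0],
}

correlations = pairwise_correlations(predictors)
for (a, b), r in correlations.items():
    print(f"{a} vs {b}: r = {r:.3f}")
```

A panel whose points fall close to a straight line corresponds to a correlation near +1 or -1, which is the pattern to look for between pairs of predictors.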

My thoughts: As I understand it, there would be no multicollinearity if all the panels in this plot looked different, but that is clearly not the case here. For example, three out of four plots after "tar" are similar, and the same shape also appears in one plot after "nico" and two plots after "weight". Does this mean that the three predictor variables are multicollinear? Or that some data in "tar" is collinear with other data in "tar"? Once I figure out where this collinearity (possibly) arises, I need to run a new multiple linear regression on the reduced data set, with the questionable observation removed. I think this is done by setting the value of the dubious observation to NA, but I have to find that observation first.
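To illustrate the "set the observation to missing, then recompute" step, here is a minimal sketch in plain Python. It marks one row as missing (using `None` in place of NA), drops incomplete rows, and compares variance inflation factors (VIFs) before and after. For exactly three predictors, the VIFs are the diagonal of the inverse correlation matrix, which has a closed form. The rows are made up and are not the real cigarette data, the choice of the last row as the questionable one is purely an assumption, and `drop_missing` and `vif_three` are hypothetical helper names.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def drop_missing(rows):
    """Listwise deletion: keep only rows that contain no None entries."""
    return [row for row in rows if None not in row]

def vif_three(rows):
    """VIFs for exactly three predictors, taken from the diagonal of the
    inverse of their 3x3 correlation matrix (closed form via cofactors)."""
    a, b, c = (list(col) for col in zip(*rows))
    r_ab, r_ac, r_bc = pearson(a, b), pearson(a, c), pearson(b, c)
    det = 1 + 2 * r_ab * r_ac * r_bc - r_ab**2 - r_ac**2 - r_bc**2
    return (
        (1 - r_bc**2) / det,  # VIF of the first predictor
        (1 - r_ac**2) / det,  # VIF of the second predictor
        (1 - r_ab**2) / det,  # VIF of the third predictor
    )

# Made-up rows of (tar, nico, weight); NOT the real cigarette data.
rows = [
    (1.0, 2.1, 5.0),
    (2.0, 3.9, 3.0),
    (3.0, 6.2, 6.0),
    (4.0, 7.8, 2.0),
    (5.0, 10.1, 7.0),
    (6.0, 12.0, 4.0),
    (15.0, 30.0, 12.0),  # assumed questionable observation
]

vif_all = vif_three(rows)

# Mark the suspect row as missing, then drop incomplete rows before refitting.
masked = list(rows)
masked[6] = (None, None, None)
reduced = drop_missing(masked)
vif_reduced = vif_three(reduced)

print("VIF, all rows:    ", [round(v, 1) for v in vif_all])
print("VIF, reduced rows:", [round(v, 1) for v in vif_reduced])
```

A VIF well above the common rule-of-thumb thresholds (5 or 10) suggests that a predictor is largely explained by the others. Removing a single observation does not necessarily change this, which is why the comparison before and after is worth making.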

Finally: How should I interpret the image, and how should I then adjust the data to get rid of any collinearity? Any thoughts and tips are welcome. Thanks in advance!

idlatva
  • This is not an ideal method of checking assumptions. Check out the packages `mvn` or `gvlma` to test the assumptions. It looks like you're somewhat new to the forum, welcome to Stack Overflow! Did you know? Questions like this (not programming code specific), are typically better suited for the Cross Validated forum on the Stack Exchange. (Stack Overflow is for programming.) – Kat Sep 17 '22 at 15:10
  • Hi @Kat and thank you! I will check out those packages and I appreciate you letting me know where these types of questions are better suited, I'll definitely keep that in mind. – idlatva Sep 17 '22 at 20:37

0 Answers