I am trying to identify which variables, out of a list of binary variables, are statistically significant. I have a conceptual doubt about the two approaches below for finding the relevant variables.

Dependent variable: Height of a person

Independent variables:

  1. Gender(Male or Female)
  2. Financial_Status(Below Poverty Line or not)
  3. College_Graduate(Yes or No)

Approach 1: Fit a linear regression with height as the dependent variable and these three as independent variables, then check which coefficients are statistically significant.

Approach 2: Perform an individual statistical test for each independent variable (a t-test or some other relevant test) to determine which ones are statistically significant.

Are these two approaches equivalent, and will they give similar results? If not, what exactly is the difference?

ShubhamA

1 Answer

No. Since you have multiple independent variables, the two approaches are not equivalent.

If you go for the t-test approach, you run one test per independent variable (Gender, Financial_Status and College_Graduate), which means you'll perform 3 different tests. Performing multiple tests inflates the risk of false positive results, so the p-values should be adjusted with a multiple comparison adjustment method (Bonferroni, FDR, among others).
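To make the adjustment step concrete, here is a minimal sketch of Bonferroni and Benjamini-Hochberg (one common FDR method) applied to three p-values. The p-values are made-up placeholders, not results from any real data, and the function names are just illustrative helpers.

```python
def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values (controls the false discovery rate)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices, smallest p first
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values from three separate t-tests (illustration only).
names = ["Gender", "Financial_Status", "College_Graduate"]
raw = [0.001, 0.04, 0.03]

for name, p, pb, pf in zip(names, raw, bonferroni(raw), benjamini_hochberg(raw)):
    print(f"{name:18s} raw={p:.3f}  bonferroni={pb:.3f}  bh={pf:.3f}")
```

With these placeholder values, Financial_Status and College_Graduate fall below 0.05 before adjustment but not after Bonferroni, which is the kind of change the correction is meant to produce.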

On the other hand, if you fit a single multiple linear regression with all three variables in one model, you wouldn't have to correct for multiple comparisons, which is why, in my opinion, it is the better approach.
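As a rough sketch of the regression approach, the snippet below fits one model with all three binary predictors using plain NumPy. The data are simulated purely for illustration (only Gender is given a real effect), the `ols` helper is hypothetical, and the p-values use a normal approximation to the t distribution, which is reasonable only because the sample is large. In practice a library such as statsmodels would report exact t-based p-values.

```python
import math

import numpy as np

# Simulated data, for illustration only: 0/1 coding for each binary variable.
rng = np.random.default_rng(0)
n = 500
gender = rng.integers(0, 2, n)   # assumed coding: 1 = Male
poor = rng.integers(0, 2, n)     # assumed coding: 1 = below poverty line
grad = rng.integers(0, 2, n)     # assumed coding: 1 = college graduate
# Only gender affects height in this simulation (values are arbitrary).
height = 162.0 + 13.0 * gender + rng.normal(0.0, 7.0, n)


def ols(predictors, y):
    """Ordinary least squares with an intercept; returns beta, se, t, p."""
    X = np.column_stack([np.ones(len(y)), predictors])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = X.shape[0] - X.shape[1]
    sigma2 = resid @ resid / dof                  # residual variance
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    t = beta / se
    # Two-sided p-values via a normal approximation (adequate for large n).
    p = np.array([math.erfc(abs(v) / math.sqrt(2.0)) for v in t])
    return beta, se, t, p


beta, se, t, p = ols(np.column_stack([gender, poor, grad]), height)
labels = ["Intercept", "Gender", "Financial_Status", "College_Graduate"]
for label, b, s, pv in zip(labels, beta, se, p):
    print(f"{label:18s} coef={b:8.3f}  se={s:6.3f}  p={pv:.4g}")
```

Each coefficient here is estimated with the other two variables held in the model, whereas a separate t-test per variable looks at each one in isolation.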

EyalItskovits