
I am currently running multiple linear regressions in R and want to summarize them using the stargazer package. In the regressions I remove outliers of the respective variables (e.g. outliers of GDP growth, of the unemployment rate, etc.) and control for things like level of development. I identify the outliers with boxplot statistics:

    outlier_val_GDPGrowth = boxplot.stats(data$GDPpCGrowth1991)$out

Then I run multiple regressions, with and without outliers:

    LRUnemploymentwOutliers = lm(unemployment_1991_2009 ~ degrees_consensus, data = data)
    LRUnemployment = lm(unemployment_1991_2009[!(data$unemployment_1991_2009 %in% outlier_val_Unemployment)] ~ degrees_consensus[!(data$unemployment_1991_2009 %in% outlier_val_Unemployment)] + developed[!(data$unemployment_1991_2009 %in% outlier_val_Unemployment)], data = data)
    LRBudgetwOutliers = lm(budget_balance_2003_2007 ~ degrees_consensus + developed, data = data)
    LRBudget = lm(budget_balance_2003_2007[!(data$budget_balance_2003_2007 %in% outlier_val_Budget)] ~ degrees_consensus[!(data$budget_balance_2003_2007 %in% outlier_val_Budget)] + developed[!(data$budget_balance_2003_2007 %in% outlier_val_Budget)], data = data)

Afterwards, I run stargazer:

    stargazer(LRGDPGrowthwOutliers, LRGDPGrowth, LRCPI, LRGDPDeflator,
              LRUnemploymentwOutliers, LRUnemployment, LRBudgetwOutliers, LRBudget,
              column.separate = c(2,1,1,2,2))

The problem, however, is that stargazer seems to treat the three different versions of, e.g., developed (developed with outliers, developed without budget balance outliers, developed without growth outliers) as different variables and creates a separate row for each of them in the table. Is there some way to fix this, or must I give up the outlier exclusion?

Thank you!

JPR
  • After a lot of searching and trial and error I managed to find a solution (which might be unclean, but OK). The key is to set the coefficient names equal to each other, i.e. if you have two regressions `r.1` and `r.2`, run `names(r.1$coefficients) = names(r.2$coefficients)`. – JPR Apr 23 '18 at 17:20
  • Better: don't put subsets inside the `formula` argument to `lm`, use the `subset` argument instead, it will save you duplication. –  Apr 24 '18 at 11:59
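
The `subset` suggestion in the comment above could look roughly like the sketch below. The data frame here is made-up toy data standing in for the asker's `data` (the real values are not shown in the question), and `developed` is added to both models purely for illustration. The point is only that the formula stays identical across both fits, so the coefficient names match.

```r
# Toy data standing in for the asker's `data`; all values are invented for illustration.
set.seed(1)
data <- data.frame(
  degrees_consensus = rnorm(40),
  developed = rep(c(0, 1), 20)
)
data$unemployment_1991_2009 <- 5 + 0.5 * data$degrees_consensus + rnorm(40)
data$unemployment_1991_2009[1] <- 40  # one artificial outlier

# Same outlier detection as in the question
outlier_val_Unemployment <- boxplot.stats(data$unemployment_1991_2009)$out

# Fit with all observations
LRUnemploymentwOutliers <- lm(unemployment_1991_2009 ~ degrees_consensus + developed,
                              data = data)

# Same formula; the outliers are dropped through `subset` instead of indexing
# each variable inside the formula
LRUnemployment <- lm(unemployment_1991_2009 ~ degrees_consensus + developed,
                     data = data,
                     subset = !(unemployment_1991_2009 %in% outlier_val_Unemployment))

# Both models now report the same coefficient names
names(coef(LRUnemploymentwOutliers))
names(coef(LRUnemployment))
```

Because the term labels are now the same in both models, passing them to `stargazer()` should put `developed` and `degrees_consensus` on shared rows, without having to overwrite `names(...$coefficients)` by hand.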
