I am trying to build a regression table with the R package stargazer that contains four different lm models. My data contain x, y, and two categorical variables. Depending on the model, I include none, one, or both of the categorical variables. I don't want the coefficients of the categorical variables to show up in the table; instead, I pass them to the omit argument of the stargazer command and label them with omit.labels. However, for model 4 I don't get the expected Yes/No output stating whether the categorical variables are included in the model.
Here is a minimal working example:
library(stargazer)
set.seed(42)
# Simulate a linear relationship: y = 1.5 * x + noise
x <- rnorm(100, mean = 100, sd = 5)
e <- rnorm(100, mean = 0, sd = 10)
y <- x * 1.5 + e
# Two categorical variables, drawn at random
countries <- sample(c("CAN", "GRC", "PRT", "THA", "NZL"), size = 100, replace = TRUE)
birth_cohorts <- sample(c("1980", "1990", "2000", "2010"), size = 100, replace = TRUE)
model1 <- lm(y ~ x)
sum1 <- summary(model1)
sum1
model2 <- lm(y ~ x + countries - 1)
sum2 <- summary(model2)
sum2
model3 <- lm(y ~ x + birth_cohorts - 1)
sum3 <- summary(model3)
sum3
model4 <- lm(y ~ x + countries + birth_cohorts - 1)
sum4 <- summary(model4)
sum4
stargazer(model1, model2, model3, model4,
          type = "text",
          omit = c("countries", "birth_cohorts"),
          omit.labels = c("Country-fixed effects", "Cohort-fixed effects"),
          omit.yes.no = c("Yes", "No"))
Expected Output:
==========================================================================================================================
                                                              Dependent variable:
                      ----------------------------------------------------------------------------------------------------
                                                                       y
                               (1)                      (2)                       (3)                       (4)
--------------------------------------------------------------------------------------------------------------------------
x                            1.554***                1.534***                  1.541***                  1.517***
                             (0.175)                  (0.173)                   (0.174)                   (0.171)
Constant                      -6.316
                             (17.585)
--------------------------------------------------------------------------------------------------------------------------
Cohort-fixed effects            No                      No                        Yes                       Yes
Country-fixed effects           No                      Yes                       No                        Yes
--------------------------------------------------------------------------------------------------------------------------
Observations                   100                      100                       100                       100
R2                            0.445                    0.997                     0.997                     0.997
Adjusted R2                   0.439                    0.996                     0.996                     0.997
Residual Std. Error      9.083 (df = 98)          8.916 (df = 94)           8.984 (df = 95)           8.764 (df = 91)
F Statistic           78.590*** (df = 1; 98) 4,692.164*** (df = 6; 94) 5,546.079*** (df = 5; 95) 3,238.705*** (df = 9; 91)
==========================================================================================================================
Note:                                                                                          *p<0.1; **p<0.05; ***p<0.01
Output I get:
==========================================================================================================================
                                                              Dependent variable:
                      ----------------------------------------------------------------------------------------------------
                                                                       y
                               (1)                      (2)                       (3)                       (4)
--------------------------------------------------------------------------------------------------------------------------
x                            1.554***                1.534***                  1.541***                  1.517***
                             (0.175)                  (0.173)                   (0.174)                   (0.171)
Constant                      -6.316
                             (17.585)
--------------------------------------------------------------------------------------------------------------------------
Cohort-fixed effects            No                      No                        Yes                       No
Country-fixed effects           No                      Yes                       No                        Yes
--------------------------------------------------------------------------------------------------------------------------
Observations                   100                      100                       100                       100
R2                            0.445                    0.997                     0.997                     0.997
Adjusted R2                   0.439                    0.996                     0.996                     0.997
Residual Std. Error      9.083 (df = 98)          8.916 (df = 94)           8.984 (df = 95)           8.764 (df = 91)
F Statistic           78.590*** (df = 1; 98) 4,692.164*** (df = 6; 94) 5,546.079*** (df = 5; 95) 3,238.705*** (df = 9; 91)
==========================================================================================================================
Note:                                                                                          *p<0.1; **p<0.05; ***p<0.01
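To rule out a mismatch between the omit patterns and the coefficient names, here is a quick base-R check. It is only a sketch: it assumes that stargazer treats each entry of omit as a regular expression matched against the coefficient names, and the helper name matches_omit is my own, not part of stargazer.

```r
# Recreate the simulated data and model 4 from the example above
set.seed(42)
x <- rnorm(100, mean = 100, sd = 5)
e <- rnorm(100, mean = 0, sd = 10)
y <- x * 1.5 + e
countries <- sample(c("CAN", "GRC", "PRT", "THA", "NZL"), size = 100, replace = TRUE)
birth_cohorts <- sample(c("1980", "1990", "2000", "2010"), size = 100, replace = TRUE)
model4 <- lm(y ~ x + countries + birth_cohorts - 1)

# Hypothetical helper (not a stargazer function): for each omit pattern,
# report whether it matches at least one coefficient name of the model.
matches_omit <- function(model, patterns) {
  coef_names <- names(coef(model))
  vapply(patterns, function(p) any(grepl(p, coef_names)), logical(1))
}

matches_omit(model4, c("countries", "birth_cohorts"))
```

If both entries come back TRUE for model 4, the patterns themselves are fine, which would suggest that the wrong "No" arises when stargazer assembles the Yes/No rows rather than from the regular expressions.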
In an older post (Dummy variables in several regressions using Stargazer in R), somebody suggested simply reversing the order of the models, like this:
stargazer(model4, model3, model2, model1,
          type = "text",
          omit = c("countries", "birth_cohorts"),
          omit.labels = c("Country-fixed effects", "Cohort-fixed effects"),
          omit.yes.no = c("Yes", "No"))
This does work, and I get the correct Yes/No values:
==========================================================================================================================
                                                              Dependent variable:
                      ----------------------------------------------------------------------------------------------------
                                                                       y
                                 (1)                       (2)                       (3)                     (4)
--------------------------------------------------------------------------------------------------------------------------
x                             1.517***                  1.541***                  1.534***                 1.554***
                               (0.171)                   (0.174)                   (0.173)                 (0.175)
Constant                                                                                                    -6.316
                                                                                                           (17.585)
--------------------------------------------------------------------------------------------------------------------------
Cohort-fixed effects             Yes                       Yes                       No                       No
Country-fixed effects            Yes                       No                        Yes                      No
--------------------------------------------------------------------------------------------------------------------------
Observations                     100                       100                       100                     100
R2                              0.997                     0.997                     0.997                   0.445
Adjusted R2                     0.997                     0.996                     0.996                   0.439
Residual Std. Error        8.764 (df = 91)           8.984 (df = 95)           8.916 (df = 94)         9.083 (df = 98)
F Statistic           3,238.705*** (df = 9; 91) 5,546.079*** (df = 5; 95) 4,692.164*** (df = 6; 94) 78.590*** (df = 1; 98)
==========================================================================================================================
Note:                                                                                          *p<0.1; **p<0.05; ***p<0.01
However, I would like to stick to the original sorting of the models. Any idea what might be the problem and how I could solve it without changing the order of the models? Thanks.
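For reference, one order-preserving fallback is to write the rows by hand with the add.lines argument of stargazer, at the cost of typing the Yes/No entries manually rather than having them derived from the models. A minimal sketch, assuming the same simulated data as above (the name fe_rows is just an illustrative choice):

```r
library(stargazer)

# Same simulated data and models as in the example above
set.seed(42)
x <- rnorm(100, mean = 100, sd = 5)
e <- rnorm(100, mean = 0, sd = 10)
y <- x * 1.5 + e
countries <- sample(c("CAN", "GRC", "PRT", "THA", "NZL"), size = 100, replace = TRUE)
birth_cohorts <- sample(c("1980", "1990", "2000", "2010"), size = 100, replace = TRUE)

model1 <- lm(y ~ x)
model2 <- lm(y ~ x + countries - 1)
model3 <- lm(y ~ x + birth_cohorts - 1)
model4 <- lm(y ~ x + countries + birth_cohorts - 1)

# Hand-written rows: first element is the label, then one entry per model,
# in the same order in which the models are passed to stargazer().
fe_rows <- list(
  c("Country-fixed effects", "No", "Yes", "No", "Yes"),
  c("Cohort-fixed effects", "No", "No", "Yes", "Yes")
)

# Still omit the dummy coefficients, but drop omit.labels / omit.yes.no
# and supply the fixed-effects rows manually instead.
stargazer(model1, model2, model3, model4,
          type = "text",
          omit = c("countries", "birth_cohorts"),
          add.lines = fe_rows)
```

This keeps the models in their original order, but the entries in fe_rows have to be kept in sync with the model specifications by hand, so it is a workaround rather than an explanation of the behaviour.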