I am trying to build a regression table with the R package stargazer that contains four different lm models. My data contains x, y, and two categorical variables. Depending on the model, I include none, one, or both of the categorical variables. I don't want the coefficients of the categorical variables to show up in the table, so I pass the variables to the omit argument of the stargazer command and label them with omit.labels. However, for model 4 I don't get the expected Yes/No output stating whether the categorical variables are included in the model.

Here is a minimal working example:

library(stargazer)

set.seed(42)
x <- rnorm(100, mean = 100, sd = 5)
e <- rnorm(100, mean = 0, sd = 10)
y <- x*1.5+e
countries <- sample(c("CAN", "GRC", "PRT", "THA", "NZL"), size = 100, replace = TRUE)
birth_cohorts <- sample(c("1980", "1990", "2000", "2010"), size = 100, replace = TRUE)

model1 <- lm(y ~ x)
sum1 <- summary(model1)
sum1
model2 <- lm(y ~ x + countries - 1)
sum2 <- summary(model2)
sum2
model3 <- lm(y ~ x + birth_cohorts - 1)
sum3 <- summary(model3)
sum3
model4 <- lm(y ~ x + countries + birth_cohorts - 1)
sum4 <- summary(model4)
sum4

stargazer(model1, model2, model3, model4,
          type = "text",
          omit = c("countries", "birth_cohorts"),
          omit.labels = c("Country-fixed effects", "Cohort-fixed effects"),
          omit.yes.no = c("Yes", "No"))

Expected Output:

==========================================================================================================================
                                                              Dependent variable:                                         
                      ----------------------------------------------------------------------------------------------------
                                                                       y                                                  
                               (1)                      (2)                       (3)                       (4)           
--------------------------------------------------------------------------------------------------------------------------
x                            1.554***                1.534***                  1.541***                  1.517***         
                             (0.175)                  (0.173)                   (0.174)                   (0.171)         
                                                                                                                          
Constant                      -6.316                                                                                      
                             (17.585)                                                                                     
                                                                                                                          
--------------------------------------------------------------------------------------------------------------------------
Cohort-fixed effects            No                      No                        Yes                       Yes            
Country-fixed effects           No                      Yes                       No                        Yes           
--------------------------------------------------------------------------------------------------------------------------
Observations                   100                      100                       100                       100           
R2                            0.445                    0.997                     0.997                     0.997          
Adjusted R2                   0.439                    0.996                     0.996                     0.997          
Residual Std. Error      9.083 (df = 98)          8.916 (df = 94)           8.984 (df = 95)           8.764 (df = 91)     
F Statistic           78.590*** (df = 1; 98) 4,692.164*** (df = 6; 94) 5,546.079*** (df = 5; 95) 3,238.705*** (df = 9; 91)
==========================================================================================================================
Note:                                                                                          *p<0.1; **p<0.05; ***p<0.01

Output I get:

==========================================================================================================================
                                                              Dependent variable:                                         
                      ----------------------------------------------------------------------------------------------------
                                                                       y                                                  
                               (1)                      (2)                       (3)                       (4)           
--------------------------------------------------------------------------------------------------------------------------
x                            1.554***                1.534***                  1.541***                  1.517***         
                             (0.175)                  (0.173)                   (0.174)                   (0.171)         
                                                                                                                          
Constant                      -6.316                                                                                      
                             (17.585)                                                                                     
                                                                                                                          
--------------------------------------------------------------------------------------------------------------------------
Cohort-fixed effects            No                      No                        Yes                       No           
Country-fixed effects           No                      Yes                       No                        Yes           
--------------------------------------------------------------------------------------------------------------------------
Observations                   100                      100                       100                       100           
R2                            0.445                    0.997                     0.997                     0.997          
Adjusted R2                   0.439                    0.996                     0.996                     0.997          
Residual Std. Error      9.083 (df = 98)          8.916 (df = 94)           8.984 (df = 95)           8.764 (df = 91)     
F Statistic           78.590*** (df = 1; 98) 4,692.164*** (df = 6; 94) 5,546.079*** (df = 5; 95) 3,238.705*** (df = 9; 91)
==========================================================================================================================
Note:                                                                                          *p<0.1; **p<0.05; ***p<0.01

In an older post (Dummy variables in several regressions using Stargazer in R), somebody suggested simply reversing the order of the models, like this:

stargazer(model4, model3, model2, model1,
          type = "text",
          omit = c("countries", "birth_cohorts"),
          omit.labels = c("Country-fixed effects", "Cohort-fixed effects"),
          omit.yes.no = c("Yes", "No"))

This actually works, and I get the correct Yes/No values:

==========================================================================================================================
                                                              Dependent variable:                                         
                      ----------------------------------------------------------------------------------------------------
                                                                       y                                                  
                                 (1)                       (2)                       (3)                     (4)          
--------------------------------------------------------------------------------------------------------------------------
x                             1.517***                  1.541***                  1.534***                 1.554***       
                               (0.171)                   (0.174)                   (0.173)                 (0.175)        
                                                                                                                          
Constant                                                                                                    -6.316        
                                                                                                           (17.585)       
                                                                                                                          
--------------------------------------------------------------------------------------------------------------------------
Cohort-fixed effects             Yes                       Yes                       No                       No          
Country-fixed effects            Yes                       No                        Yes                      No          
--------------------------------------------------------------------------------------------------------------------------
Observations                     100                       100                       100                     100          
R2                              0.997                     0.997                     0.997                   0.445         
Adjusted R2                     0.997                     0.996                     0.996                   0.439         
Residual Std. Error        8.764 (df = 91)           8.984 (df = 95)           8.916 (df = 94)         9.083 (df = 98)    
F Statistic           3,238.705*** (df = 9; 91) 5,546.079*** (df = 5; 95) 4,692.164*** (df = 6; 94) 78.590*** (df = 1; 98)
==========================================================================================================================
Note:                                                                                          *p<0.1; **p<0.05; ***p<0.01

However, I would like to stick to the original sorting of the models. Any idea what might be the problem and how I could solve it without changing the order of the models? Thanks.

intedgar

1 Answer


My clunky solution would be to add the rows manually with add.lines, i.e.:

stargazer(model1, model2, model3, model4,
          type = "text",
          omit = c("countries", "birth_cohorts"),
          # One row per fixed effect: label first, then one entry per model.
          # model2 and model4 include countries; model3 and model4 include birth_cohorts.
          add.lines = list(c('Country FE', 'No', 'Yes', 'No', 'Yes'),
                           c('Birth cohort FE', 'No', 'No', 'Yes', 'Yes'))
)


------------------------------------------------------------------------------------------------------------------------
Country FE                    No                      Yes                       No                        Yes
Birth cohort FE               No                      No                        Yes                       Yes
Observations                 100                      100                       100                       100
R2                          0.445                    0.997                     0.997                     0.997
Adjusted R2                 0.439                    0.996                     0.996                     0.997
Residual Std. Error    9.083 (df = 98)          8.916 (df = 94)           8.984 (df = 95)           8.764 (df = 91)
F Statistic         78.590*** (df = 1; 98) 4,692.164*** (df = 6; 94) 5,546.079*** (df = 5; 95) 3,238.705*** (df = 9; 91)
========================================================================================================================
Note:                                                                                        *p<0.1; **p<0.05; ***p<0.01

Based on this discussion, there might be a bug in how stargazer handles omit.labels, and, sadly, the package has not been updated in a while.
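
With many tables, typing the Yes/No rows by hand gets tedious and is easy to get wrong. A minimal sketch of one way around that is to derive the rows from the fitted models by checking each model's coefficient names against a pattern. Note that fe_rows() and its arguments are hypothetical names made up for this illustration, not part of stargazer, and the sketch assumes the fixed-effect coefficients are named after the variables, as they are for the lm models in the question.

```r
# Sketch only: fe_rows() is a hypothetical helper, not part of stargazer.
# It returns one add.lines row per pattern, with "Yes" where a model has
# at least one coefficient whose name matches the pattern.
fe_rows <- function(models, patterns, labels, yes = "Yes", no = "No") {
  rows <- Map(function(pattern, label) {
    included <- vapply(
      models,
      function(m) any(grepl(pattern, names(coef(m)))),
      logical(1)
    )
    c(label, ifelse(included, yes, no))
  }, patterns, labels)
  unname(rows)
}

# Same simulated data and models as in the question
set.seed(42)
x <- rnorm(100, mean = 100, sd = 5)
e <- rnorm(100, mean = 0, sd = 10)
y <- x * 1.5 + e
countries <- sample(c("CAN", "GRC", "PRT", "THA", "NZL"), size = 100, replace = TRUE)
birth_cohorts <- sample(c("1980", "1990", "2000", "2010"), size = 100, replace = TRUE)

models <- list(
  lm(y ~ x),
  lm(y ~ x + countries - 1),
  lm(y ~ x + birth_cohorts - 1),
  lm(y ~ x + countries + birth_cohorts - 1)
)

rows <- fe_rows(
  models,
  patterns = c("^countries", "^birth_cohorts"),
  labels   = c("Country FE", "Birth cohort FE")
)

# Only build the table if stargazer is installed
if (requireNamespace("stargazer", quietly = TRUE)) {
  stargazer::stargazer(models, type = "text",
                       omit = c("countries", "birth_cohorts"),
                       add.lines = rows)
}
```

Because the rows are computed per model, the models can stay in their original order, and the same call can be reused for other tables by changing only the patterns and labels.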

If you are working with LaTeX, you could also look into starpolishr. An example of how to add lines with starpolishr can be found in the answer to this question.

Otto Kässi
    Thanks for the suggestion. That would actually work, yes. I didn't want to do that because I have a lot of regression tables and didn't want to specify the rows for each one. What I did instead was to put the models in reversed order, as suggested in the link mentioned above (https://stackoverflow.com/questions/36022621/dummy-variables-in-several-regressions-using-stargazer-in-r), and then I wrote a short Python script to reverse the columns back to the original order for many tables at once. In case it helps anyone, I published it on my GitHub (same name as here on Stack Overflow). – intedgar Nov 05 '21 at 10:42