0

I'm trying to compute summary statistics for the following variable into one column in Stata.

eststo: quietly estpost sum q58_amount_1 if wave == 1 & q56_save1 == 1
eststo: quietly estpost sum q58_amount_2 if wave == 1 & q56_save2 == 1
eststo: quietly estpost sum q58_amount_3 if wave == 1 & q56_save3 == 1
eststo: quietly estpost sum q58_amount_4 if wave == 1 & q56_save4 == 1
esttab, replace cells("mean(fmt(2)) sd(fmt(2))") label title(Savings II) noobs
eststo clear

q58_amount_1 - 4 are saving amounts for different option (e.g. using bank accounts, putting the money underneath a mattress, etc.). q56_save1 == 1 indicates that there are indeed positive savings for the given option. wave indicates the time period.

Here is some sample data

     | id   wave   q58_~t_1   q58_~t_2   q58_~t_3   q58_~t_4   q56_sa~1   q56_sa~2   q56_sa~3   q56_sa~4 |
     |---------------------------------------------------------------------------------------------------|
  1. |  1      1          0    1020000          0          0          0          1          0          0 |
  2. |  1      2      60000     380000          0          0          1          1          0          0 |
  3. |  2      1          0      50000          0          0          0          1          0          0 |
  4. |  2      2          0    1500000          0      15000          0          1          0          1 |
  5. |  3      1          0          0          0     150000          0          0          0          1 |
     |---------------------------------------------------------------------------------------------------|
  6. |  3      2     200000          0      30000      11000          1          0          1          1 |
  7. |  4      1          0          0          0          0          0          0          0          0 |
  8. |  4      2          0     220000      40000      20000          0          1          1          1 |
  9. |  5      1     200000          0          0          0          1          0          0          0 |
 10. |  5      2          0          0          0          0          0          0          0          0 |
     |---------------------------------------------------------------------------------------------------|
 11. |  6      1          0     100000          0      10000          0          1          0          1 |
 12. |  6      2          0          0          0     300000          0          0          0          1 |
 13. |  7      1          0    3000000     700000          0          0          1          1          0 |
 14. |  7      2     500000          0    1500000          0          1          0          1          0 |
 15. |  8      1      60000     320000          0          0          1          1          0          0 |
     |---------------------------------------------------------------------------------------------------|
 16. |  8      2          0     350000          0          0          0          1          0          0 |
 17. |  9      1          0     250000          0          0          0          1          0          0 |
 18. |  9      2          .          .          .          .          .          .          .          . |
 19. | 10      1          0     200000          0          0          0          1          0          0 |
 20. | 10      2     130000     800000    1000000          0          1          1          1          0 |

Using the above command I get the following output:


Savings II
--------------------------------------------------------------------------------------------------------------------
> --------
                              (1)                       (2)                       (3)                       (4)     
>         
                                                                                                                    
>         
                             mean           sd         mean           sd         mean           sd         mean     
>       sd
--------------------------------------------------------------------------------------------------------------------
> --------
q58_amount_1            288669.40    987614.42                                                                      
>         
q58_amount_2                                      664938.81   1609076.48                                            
>         
q58_amount_3                                                               2448495.47   5439977.76                  
>         
q58_amount_4                                                                                          243138.35    5
> 33569.69
--------------------------------------------------------------------------------------------------------------------
> --------

. eststo clear

. 
end of do-file

. 

What I would like to have are two columns: One for the mean and one for the sd, i.e. not 8 separate columns. I'm not familiar with esttab so this might seem obvious for regular Stata users. I`d be grateful for some advice.

  • Cross-posted at https://www.statalist.org/forums/forum/general-stata-discussion/general/1640324-combining-summary-statistics-into-one-column-using-esttab It's always courteous to tell people about cross-posting – Nick Cox Dec 10 '21 at 15:50
  • Thanks for the hint! I will do that in the future. – python_begins Dec 12 '21 at 15:04

1 Answers1

3

I don't personally use esttab, but it appears your data is apt for a reshape to long. Doing this will occlude the need for 4 separate estpost calls.

Sample Data (ps. please in the future include a reproducible example)

Here I am assuming that q56_save{i} is binary as well as 2 waves.

clear all
set obs 100

forval i = 1/4 {
    gen q58_amount_`i' = rnormal()
    gen q56_save`i' = mod(_n,`i'+1)
    replace q56_save`i' = 0 if q56_save`i' > 1
}
gen wave = _n < _N/2

Reshape and esttab

Note here I substitute tabstat for summarize.

gen row_id = _n
reshape long q56_save q58_amount_, i(wave row_id) j(q)
gen str_quarter = "q58_amount_" + string(q)

* add "nototal" option to suppress Total line
eststo: quietly estpost tabstat q58_amount if wave == 1 & q56_save == 1, s(mean sd) by(str_quarter) 
esttab, replace cells("Mean(fmt(2)) SD(fmt(2))") label title(Savings II) noobs

Savings II
----------------------------------------------
                              (1)             
                                              
                             Mean           SD
----------------------------------------------
q58_amount_1                 0.62         0.98
q58_amount_2                -0.04         1.19
q58_amount_3                 0.40         0.90
q58_amount_4                -0.07         0.91
Total                        0.30         1.04
----------------------------------------------

JR96
  • 953
  • 5
  • 12
  • Thank you for the answer. Unfortunately when I run your code I get an empty table. Might this be because of by Stata version? Also, I think you did not get the q56_save dummies entirely right. I included some of my data (apologies for not doing that in the first place). – python_begins Dec 14 '21 at 10:44
  • Stata version is most unlikely to be an issue here. – Nick Cox Dec 14 '21 at 11:47
  • The form of your q56_save dummies isn't important to the method. so long as they are sometimes equal to 1. Did you change the case of `mean(fmt(2)) sd(fmt(2))` as I have shown? If left all lowercase this will result in an empty table. – JR96 Dec 14 '21 at 15:05
  • If I change `Mean(fmt(2)) SD(fmt(2))` to `mean(fmt(2)) sd(fmt(2))` the table is not empty anymore, not the other way around. Thanks for the hint, though. And thanks for your solution. – python_begins Dec 15 '21 at 09:36