0

I would like to compute a summary statistic of a set of groups within my data (such as age,sex) with confidence intervals. For that purpose I use monte carlo simulation drawing values from a Poisson distribution for every row in my data and then collapsing the rows to have the summary. The whole procedure works fine if the result of the simulation is just one value (using return scalar in rclass) but as soon as I try to simulate more than one result (using ereturn matrix in eclass) it is not working (see Stata code below). I get the error message: "type mismatch error in expression: e(A)". How could I simulate a whole vector or even matrix of results without more complex loops etc.?

Many thanks! Fred

program bootPGW, eclass
use "C:\Users\649007\Desktop\Demetriq_PWG_edu.dta", replace
gen id=_n
sort id
gen N=_N
by id: gen    DL2=floor(rpoisson(calung))
by id:gen     D02=floor(rpoisson(D0))
by id:gen     Dsmoking=floor(rpoisson(smoking))
by id:gen     ML2=(DL2/numpyr)*1000
by id:gen     AL2=(ML2-CPSIIrate)/ML2
by id:replace AL2=0 if AL2<0
by id:gen     A02=1-exp(-PWGcoef*(ML2-CPSIIrate))
by id:gen     A2=(AL2*DL2+A02*D02)/(DL2+D02)
gen Adeaths=totdeath*A2
collapse (sum) Adeaths=Adeaths totdeath=totdeath Dsmoking=Dsmoking, by(edu_3cat sex country year)
gen AF_PWG=Adeaths/totdeath
gen AF_simple=Dsmoking/totdeath
mkmat AF_PWG, matrix(A)
ereturn matrix A=A
end

simulate a=e(A), reps(1000) nodots seed(123): bootPGW 
  • crossposted at: [link](http://www.statalist.org/forums/forum/general-stata-discussion/general/2438-monte-carlo-simulation-of-a-whole-matrix-using-simulate-and-eclass) I will post the answer here, if we solve the problem in the Stata forum. – user3568812 Apr 24 '14 at 15:28

1 Answers1

5

The key part is that you can return the matrix in e(b) using ereturn post, in which case you can use the short-cut _b in simulate.

Other than that I cleaned up the code a bit: you don't need to do it by _n, as that is just the same as a regular generate. you also don't need the floor() function around rpoisson() as it already returns integers. You are not using N, so you don't need it, but even if you do you can better store it as a local macro (or if you prefer a scalar) instead of a variable (column in the dataset) as it is a single number, so storing it in a variable just wastes memory.

program bootPGW, eclass
    use "C:\Users\649007\Desktop\Demetriq_PWG_edu.dta", replace
    gen     DL2=rpoisson(calung)
    gen     D02=rpoisson(D0)
    gen     Dsmoking=rpoisson(smoking)
    gen     ML2=(DL2/numpyr)*1000
    gen     AL2=(ML2-CPSIIrate)/ML2
    replace AL2=0 if AL2<0
    gen     A02=1-exp(-PWGcoef*(ML2-CPSIIrate))
    gen     A2=(AL2*DL2+A02*D02)/(DL2+D02)
    gen Adeaths=totdeath*A2
    collapse (sum) Adeaths=Adeaths totdeath=totdeath Dsmoking=Dsmoking, ///
        by(edu_3cat sex country year)
    gen AF_PWG=Adeaths/totdeath
    gen AF_simple=Dsmoking/totdeath
    mkmat AF_PWG, matrix(A)
    ereturn post A
end

simulate _b, reps(1000) nodots seed(123): bootPGW 
Maarten Buis
  • 2,684
  • 12
  • 17
  • Thanks Maarten! This would indeed be very elegant. But I get the following error: `conformability error an error occurred when simulate executed bootPGW r(503);` I guess there is something wrong with the matrix from mkmat? – user3568812 Apr 24 '14 at 15:12
  • A needs to be a rowvector, so you probably need to restructure the matrix A. – Maarten Buis Apr 25 '14 at 07:25
  • Alright it works. I transposed the matrix using `matrix B=A'` and then returned the rowvector B with `ereturn post B`. Although I still cannot simulate a whole matrix, simulating vectors already makes life much easier. Thanks a lot! – user3568812 Apr 25 '14 at 12:50
  • Btw, `gen AF_simple=Dsmoking/totdeath` is also unnecessary as it does not seem to be doing anything. – Francis Smart May 04 '14 at 22:43