I am new to the plm package, but I need to estimate a system GMM model on a dataset for my bachelor thesis in economics. I am trying to regress GDP on the number of railway stations built. The model of course includes a lagged GDP variable and other controls, such as human capital.
I use the pgmm function from the plm package, but I always get the following error message:
Error in solve.default(crossprod(WX, t.CP.WX.A1)) :
Lapack routine dgesv: system is exactly singular: U[1,1] = 0
I understand that this can occur when some variables are highly correlated, but I am fairly sure that this is not the case here.
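A rough way to check for collinearity on the level variables, as a minimal sketch in base R. It uses only the rows for ids 101 and 102 copied from the data sample in this post. The names sample_df, vars, cor_mat, X and rank_X are made up for illustration. Pairwise correlations and a rank check on the levels are only a first look: they say nothing about linear dependence among the differenced regressors, the instruments and the time dummies that pgmm builds internally.

```r
# Rows for ids 101 and 102, copied from the panel sample in this post.
sample_df <- data.frame(
  id = rep(c(101, 102), each = 5),
  year = rep(seq(1870, 1910, by = 10), times = 2),
  GDP = c(370.5480, 430.2257, 512.6674, 657.8438, 887.6366,
          383.6571, 431.9505, 479.9470, 659.2220, 743.7952),
  Stations = c(5, 0, 0, 0, 0, 3, 2, 0, 2, 0),
  Humancapital = c(NA, 0.40458015, 0.31274131, 0.36095965, 0.48163265,
                   NA, 0.53590734, 0.30722892, 0.42857143, 0.49295775)
)

# Level variables as they enter the model formula (hypothetical helper frame).
vars <- data.frame(logGDP = log(sample_df$GDP),
                   Stations = sample_df$Stations,
                   Humancapital = sample_df$Humancapital)

# Pairwise correlations, computed on rows without missing values only.
cor_mat <- cor(vars, use = "complete.obs")
print(round(cor_mat, 2))

# Rank check: if the rank is below the number of columns, the columns
# (intercept plus the three variables) are exactly linearly dependent.
# model.matrix() drops rows with NA under the default na.action.
X <- model.matrix(~ logGDP + Stations + Humancapital, data = vars)
rank_X <- qr(X)$rank
cat("rank:", rank_X, "columns:", ncol(X), "\n")
```

On the full data, the same two checks would be run on Bachelor_DF instead of the copied sample rows.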
This is the code I tried and a sample from the panel data:
gmm_estimator <- pgmm(log(GDP) ~ lag(log(GDP), 1:2) + lag(Stations, 0:1) + lag(Humancapital, 0:2)
                      | lag(log(GDP), 2:5),
                      data = Bachelor_DF,
                      effect = "twoways",
                      model = "twosteps")
id year GDP Stations Humancapital
GDP_1860 101 1870 370.5480 5 NA
GDP_1870 101 1880 430.2257 0 0.40458015
GDP_1880 101 1890 512.6674 0 0.31274131
GDP_1900 101 1900 657.8438 0 0.36095965
GDP_1910 101 1910 887.6366 0 0.48163265
GDP_1860.1 102 1870 383.6571 3 NA
GDP_1870.1 102 1880 431.9505 2 0.53590734
GDP_1880.1 102 1890 479.9470 0 0.30722892
GDP_1900.1 102 1900 659.2220 2 0.42857143
GDP_1910.1 102 1910 743.7952 0 0.49295775
GDP_1860.2 103 1870 327.8067 1 NA
GDP_1870.2 103 1880 411.4266 5 0.34935065
GDP_1880.2 103 1890 491.3625 0 0.23244552
GDP_1900.2 103 1900 727.0378 3 0.42086835
GDP_1910.2 103 1910 879.2722 1 0.44058745
GDP_1860.3 104 1870 364.6654 3 NA
GDP_1870.3 104 1880 456.8578 4 0.40669241
GDP_1880.3 104 1890 535.1647 0 0.23443223
GDP_1900.3 104 1900 767.6347 2 0.25925926
GDP_1910.3 104 1910 918.7938 0 0.41682975
GDP_1860.4 105 1870 358.7054 1 NA
GDP_1870.4 105 1880 461.3142 6 0.31396588
GDP_1880.4 105 1890 551.4780 2 0.28403361
GDP_1900.4 105 1900 764.8562 0 0.33770492
GDP_1910.4 105 1910 1081.6443 3 0.45477599
GDP_1860.5 106 1870 433.6760 0 NA
GDP_1870.5 106 1880 500.1974 3 0.40723982
GDP_1880.5 106 1890 582.9612 0 0.32718894
GDP_1900.5 106 1900 827.3129 2 0.38211788
GDP_1910.5 106 1910 1132.2033 0 0.51446945
GDP_1860.6 107 1870 439.5436 0 NA
GDP_1870.6 107 1880 520.0694 0 0.53041695
GDP_1880.6 107 1890 581.3917 0 0.37232143
GDP_1900.6 107 1900 767.7895 5 0.45877551
GDP_1910.6 107 1910 929.2661 0 0.57142857
GDP_1860.7 108 1870 391.5621 0 NA
GDP_1870.7 108 1880 446.5804 7 0.37020906
GDP_1880.7 108 1890 572.8922 0 0.29817833
GDP_1900.7 108 1900 771.3239 0 0.42473634
GDP_1910.7 108 1910 1065.9663 0 0.46185065
GDP_1860.8 109 1870 390.2888 3 NA
GDP_1870.8 109 1880 461.9698 0 0.44285714
GDP_1880.8 109 1890 567.3165 0 0.33956044
GDP_1900.8 109 1900 782.6338 0 0.39285714
GDP_1910.8 109 1910 944.6441 0 0.52597403
GDP_1860.9 110 1870 394.4520 3 NA
GDP_1870.9 110 1880 454.7100 8 0.51518560
GDP_1880.9 110 1890 638.3998 0 0.38557994
GDP_1900.9 110 1900 874.0908 0 0.42964286
GDP_1910.9 110 1910 1209.1216 4 0.54148282
I also experimented with different lags as instruments (unfortunately, I only have 5 time periods for GDP, so the time frame is quite limited in itself).
Furthermore, I tried to change the effect to "individual", but the error persists.