
I want to run Panel OLS regressions with 3+ fixed effects and clustered standard errors, but linearmodels.PanelOLS only allows ≤2 fixed effects, and my implementation with statsmodels.OLS doesn't handle error clustering.

Context

Dataset

I have a panel dataset consisting of two measured variables (population and gdp) and multiple indices:

  • The country these data points refer to (e.g. Angola, Benin, Chad, etc.),
  • The source these data come from (e.g. OECD, IMF, WorldBank),
  • The year_data, i.e. what year the data point refers to (e.g. the population in 2018),
  • The year_publication, i.e. when these data were published (e.g. the population in 2018 as published in 2019). Indeed, actual values might be revised over time once estimates are consolidated.

Additionally, we have continent, within which country is nested.

Thus, each record can be described as:

The year_data population/gdp of country (that is part of continent), as estimated by source in year_publication is….

The 2018 population/gdp of Benin (that is part of Africa), as estimated by the IMF in 2020 is….
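The record structure described above can be sanity-checked with a minimal pandas sketch. The mini-sample below is hypothetical (made-up rows that only mirror the column names of the dataset); it checks that each country belongs to exactly one continent and that the four indices identify a record uniquely.

```python
import pandas as pd

# Hypothetical mini-sample mirroring the index columns described above (rows are made up).
records = pd.DataFrame(
    {
        "continent": ["Africa", "Africa", "Europe", "Europe"],
        "country": ["Benin", "Benin", "Denmark", "Denmark"],
        "source": ["IMF", "OECD", "IMF", "OECD"],
        "year_publication": [2020, 2020, 2021, 2021],
        "year_data": [2018, 2018, 2019, 2019],
    }
)

# country is nested in continent if every country maps to exactly one continent.
continents_per_country = records.groupby("country")["continent"].nunique()
is_nested = bool((continents_per_country == 1).all())

# The four indices together should identify each record uniquely.
keys = ["country", "source", "year_publication", "year_data"]
is_unique = not records.duplicated(subset=keys).any()

print(is_nested, is_unique)
```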

Actual problem

I have the working script below in Stata that I want to port to Python:

/* Install dependencies */

ssc install ftools
ssc install reghdfe


/* Prepare data */

* Load dataset
clear
input str6 continent str7 country str9 source int(year_publication year_data population) float gdp
"Africa" "Angola"  "OECD"      2020 2018 972  52.69
"Africa" "Angola"  "OECD"      2020 2019 986  802.7
"Africa" "Angola"  "OECD"      2020 2020 641 568.74
"Africa" "Angola"  "OECD"      2021 2018 438 168.83
"Africa" "Angola"  "OECD"      2021 2019 958 310.57
"Africa" "Angola"  "OECD"      2021 2020 270 144.02
"Africa" "Angola"  "OECD"      2022 2018 528 359.71
"Africa" "Angola"  "OECD"      2022 2019 974 582.98
"Africa" "Angola"  "OECD"      2022 2020 835 820.49
"Africa" "Angola"  "IMF"       2020 2018 168 148.85
"Africa" "Angola"  "IMF"       2020 2019 460 236.21
"Africa" "Angola"  "IMF"       2020 2020 360 297.15
"Africa" "Angola"  "IMF"       2021 2018 381 249.13
"Africa" "Angola"  "IMF"       2021 2019 648 128.05
"Africa" "Angola"  "IMF"       2021 2020 206 179.05
"Africa" "Angola"  "IMF"       2022 2018 282 150.29
"Africa" "Angola"  "IMF"       2022 2019 125  23.42
"Africa" "Angola"  "IMF"       2022 2020 410 247.35
"Africa" "Angola"  "WorldBank" 2020 2018 553 182.06
"Africa" "Angola"  "WorldBank" 2020 2019 847 698.87
"Africa" "Angola"  "WorldBank" 2020 2020 844 126.61
"Africa" "Angola"  "WorldBank" 2021 2018 307 239.76
"Africa" "Angola"  "WorldBank" 2021 2019 659 510.73
"Africa" "Angola"  "WorldBank" 2021 2020 548 331.89
"Africa" "Angola"  "WorldBank" 2022 2018 448 122.76
"Africa" "Angola"  "WorldBank" 2022 2019 768 761.41
"Africa" "Angola"  "WorldBank" 2022 2020 324 163.57
"Africa" "Benin"   "OECD"      2020 2018 513  196.9
"Africa" "Benin"   "OECD"      2020 2019 590   83.7
"Africa" "Benin"   "OECD"      2020 2020 791 511.09
"Africa" "Benin"   "OECD"      2021 2018 799 474.43
"Africa" "Benin"   "OECD"      2021 2019 455 234.21
"Africa" "Benin"   "OECD"      2021 2020 549 238.83
"Africa" "Benin"   "OECD"      2022 2018 235 229.33
"Africa" "Benin"   "OECD"      2022 2019 347  46.51
"Africa" "Benin"   "OECD"      2022 2020 532 392.13
"Africa" "Benin"   "IMF"       2020 2018 138 137.05
"Africa" "Benin"   "IMF"       2020 2019 978 239.82
"Africa" "Benin"   "IMF"       2020 2020 821  33.41
"Africa" "Benin"   "IMF"       2021 2018 453 291.93
"Africa" "Benin"   "IMF"       2021 2019 526 381.88
"Africa" "Benin"   "IMF"       2021 2020 467 313.57
"Africa" "Benin"   "IMF"       2022 2018 948 555.23
"Africa" "Benin"   "IMF"       2022 2019 323 289.91
"Africa" "Benin"   "IMF"       2022 2020 421  62.35
"Africa" "Benin"   "WorldBank" 2020 2018 983 271.69
"Africa" "Benin"   "WorldBank" 2020 2019 138  23.55
"Africa" "Benin"   "WorldBank" 2020 2020 636 623.65
"Africa" "Benin"   "WorldBank" 2021 2018 653 534.99
"Africa" "Benin"   "WorldBank" 2021 2019 564  368.8
"Africa" "Benin"   "WorldBank" 2021 2020 741 312.02
"Africa" "Benin"   "WorldBank" 2022 2018 328 292.11
"Africa" "Benin"   "WorldBank" 2022 2019 653 429.21
"Africa" "Benin"   "WorldBank" 2022 2020 951 242.73
"Africa" "Chad"    "OECD"      2020 2018 176  95.06
"Africa" "Chad"    "OECD"      2020 2019 783 425.34
"Africa" "Chad"    "OECD"      2020 2020 885  461.6
"Africa" "Chad"    "OECD"      2021 2018 673  15.87
"Africa" "Chad"    "OECD"      2021 2019 131  74.46
"Africa" "Chad"    "OECD"      2021 2020 430  61.58
"Africa" "Chad"    "OECD"      2022 2018 593 211.34
"Africa" "Chad"    "OECD"      2022 2019 647 550.37
"Africa" "Chad"    "OECD"      2022 2020 154 105.65
"Africa" "Chad"    "IMF"       2020 2018 160  32.41
"Africa" "Chad"    "IMF"       2020 2019 654  27.84
"Africa" "Chad"    "IMF"       2020 2020 616 468.92
"Africa" "Chad"    "IMF"       2021 2018 996   22.4
"Africa" "Chad"    "IMF"       2021 2019 126  93.18
"Africa" "Chad"    "IMF"       2021 2020 879 547.87
"Africa" "Chad"    "IMF"       2022 2018 663    520
"Africa" "Chad"    "IMF"       2022 2019 681 544.76
"Africa" "Chad"    "IMF"       2022 2020 101   55.6
"Africa" "Chad"    "WorldBank" 2020 2018 786 757.22
"Africa" "Chad"    "WorldBank" 2020 2019 599 593.69
"Africa" "Chad"    "WorldBank" 2020 2020 641 529.84
"Africa" "Chad"    "WorldBank" 2021 2018 343 287.89
"Africa" "Chad"    "WorldBank" 2021 2019 438 340.83
"Africa" "Chad"    "WorldBank" 2021 2020 762 594.67
"Africa" "Chad"    "WorldBank" 2022 2018 430 128.69
"Africa" "Chad"    "WorldBank" 2022 2019 260 242.59
"Africa" "Chad"    "WorldBank" 2022 2020 607  216.1
"Europe" "Denmark" "OECD"      2020 2018 114  86.75
"Europe" "Denmark" "OECD"      2020 2019 937 373.29
"Europe" "Denmark" "OECD"      2020 2020 866 392.93
"Europe" "Denmark" "OECD"      2021 2018 296  41.04
"Europe" "Denmark" "OECD"      2021 2019 402  32.67
"Europe" "Denmark" "OECD"      2021 2020 306   7.88
"Europe" "Denmark" "OECD"      2022 2018 540 379.51
"Europe" "Denmark" "OECD"      2022 2019 108  26.72
"Europe" "Denmark" "OECD"      2022 2020 752  307.2
"Europe" "Denmark" "IMF"       2020 2018 157  24.24
"Europe" "Denmark" "IMF"       2020 2019 303  79.04
"Europe" "Denmark" "IMF"       2020 2020 286 122.36
"Europe" "Denmark" "IMF"       2021 2018 569  69.32
"Europe" "Denmark" "IMF"       2021 2019 808 642.67
"Europe" "Denmark" "IMF"       2021 2020 157   5.58
"Europe" "Denmark" "IMF"       2022 2018 147 112.21
"Europe" "Denmark" "IMF"       2022 2019 414 311.16
"Europe" "Denmark" "IMF"       2022 2020 774 230.46
"Europe" "Denmark" "WorldBank" 2020 2018 695 350.03
"Europe" "Denmark" "WorldBank" 2020 2019 511 209.84
"Europe" "Denmark" "WorldBank" 2020 2020 181  29.27
"Europe" "Denmark" "WorldBank" 2021 2018 503 176.89
"Europe" "Denmark" "WorldBank" 2021 2019 710 609.02
"Europe" "Denmark" "WorldBank" 2021 2020 264 165.78
"Europe" "Denmark" "WorldBank" 2022 2018 670 638.99
"Europe" "Denmark" "WorldBank" 2022 2019 651  354.6
"Europe" "Denmark" "WorldBank" 2022 2020 632 623.94
"Europe" "Estonia" "OECD"      2020 2018 838 263.67
"Europe" "Estonia" "OECD"      2020 2019 638 533.95
"Europe" "Estonia" "OECD"      2020 2020 898 638.73
"Europe" "Estonia" "OECD"      2021 2018 262  98.16
"Europe" "Estonia" "OECD"      2021 2019 569 552.54
"Europe" "Estonia" "OECD"      2021 2020 868 252.48
"Europe" "Estonia" "OECD"      2022 2018 927 264.65
"Europe" "Estonia" "OECD"      2022 2019 205  150.6
"Europe" "Estonia" "OECD"      2022 2020 828 752.61
"Europe" "Estonia" "IMF"       2020 2018 841 176.31
"Europe" "Estonia" "IMF"       2020 2019 614 230.55
"Europe" "Estonia" "IMF"       2020 2020 500  41.19
"Europe" "Estonia" "IMF"       2021 2018 510 169.68
"Europe" "Estonia" "IMF"       2021 2019 765 401.85
"Europe" "Estonia" "IMF"       2021 2020 751  319.6
"Europe" "Estonia" "IMF"       2022 2018 314  58.81
"Europe" "Estonia" "IMF"       2022 2019 155   2.24
"Europe" "Estonia" "IMF"       2022 2020 734  187.6
"Europe" "Estonia" "WorldBank" 2020 2018 332 160.17
"Europe" "Estonia" "WorldBank" 2020 2019 466 385.33
"Europe" "Estonia" "WorldBank" 2020 2020 487 435.06
"Europe" "Estonia" "WorldBank" 2021 2018 461 249.19
"Europe" "Estonia" "WorldBank" 2021 2019 932 763.38
"Europe" "Estonia" "WorldBank" 2021 2020 650 463.91
"Europe" "Estonia" "WorldBank" 2022 2018 570 549.97
"Europe" "Estonia" "WorldBank" 2022 2019 909  80.48
"Europe" "Estonia" "WorldBank" 2022 2020 523 242.22
"Europe" "Finland" "OECD"      2020 2018 565 561.64
"Europe" "Finland" "OECD"      2020 2019 646 161.62
"Europe" "Finland" "OECD"      2020 2020 194 133.69
"Europe" "Finland" "OECD"      2021 2018 529  39.76
"Europe" "Finland" "OECD"      2021 2019 800 680.12
"Europe" "Finland" "OECD"      2021 2020 418 399.19
"Europe" "Finland" "OECD"      2022 2018 591 253.12
"Europe" "Finland" "OECD"      2022 2019 457 272.58
"Europe" "Finland" "OECD"      2022 2020 157  105.1
"Europe" "Finland" "IMF"       2020 2018 860 445.03
"Europe" "Finland" "IMF"       2020 2019 108  47.72
"Europe" "Finland" "IMF"       2020 2020 523 500.58
"Europe" "Finland" "IMF"       2021 2018 560  81.47
"Europe" "Finland" "IMF"       2021 2019 830 664.64
"Europe" "Finland" "IMF"       2021 2020 903 762.62
"Europe" "Finland" "IMF"       2022 2018 179 167.73
"Europe" "Finland" "IMF"       2022 2019 137  98.98
"Europe" "Finland" "IMF"       2022 2020 666 524.86
"Europe" "Finland" "WorldBank" 2020 2018 319 146.01
"Europe" "Finland" "WorldBank" 2020 2019 401 219.56
"Europe" "Finland" "WorldBank" 2020 2020 711  45.35
"Europe" "Finland" "WorldBank" 2021 2018 828  20.97
"Europe" "Finland" "WorldBank" 2021 2019 180   66.3
"Europe" "Finland" "WorldBank" 2021 2020 682  92.57
"Europe" "Finland" "WorldBank" 2022 2018 254   81.2
"Europe" "Finland" "WorldBank" 2022 2019 619 159.08
"Europe" "Finland" "WorldBank" 2022 2020 191  184.4
end

* Encode categorical variables
foreach variable in "continent" "country" "source" {
    encode `variable', generate(`variable'_id)
}


/* Run regression */

* Panel OLS with `year_publication' & `country' & `source' fixed effects + `source'-clustered errors
regress gdp population i.year_publication i.country_id i.source_id, vce(cluster source)
reghdfe gdp population, absorb(year_publication country source) vce(cluster source)

Problem

I would have liked to use linearmodels.panel.model.PanelOLS, yet it supports at most 2 effects, whereas I'd like 3+:

Model supports at most 2 effects. These can be entity-time, entity-other, time-other or 2 other.

According to Kevin S.'s comment to “Python panel data regression with more than two fixed effects”, I should either:

This would give the same coefficients, yet different standard errors, as using Instrumental Variables relaxes some of the assumptions that PanelOLS is based upon.
So it is not what I'm looking for.

  • or OneHot encode the categorical variables:

I have been able to replicate the result, yet without the clustering of errors (cf. working example below) — i.e. simply reghdfe gdp population, absorb(year_publication country source), without the vce(cluster source) part.
Moreover, my approach seems quite cumbersome and doesn't leverage the work Statsmodels/Linearmodels have done for panel datasets. I wouldn't want to (erroneously) reinvent the wheel.

Question

How can I run a Panel OLS regression with 3+ fixed effects and clustered standard errors?

In other words: How to port reghdfe gdp population, absorb(year_publication country source) vce(cluster source) from Stata to Python?

Working example in Python

from io import StringIO

import pandas as pd
import statsmodels.api as sm

DATA = """
"continent","country","source","year_publication","year_data","population","gdp"
"Africa","Angola","OECD",2020,2018,972,52.69
"Africa","Angola","OECD",2020,2019,986,802.7
"Africa","Angola","OECD",2020,2020,641,568.74
"Africa","Angola","OECD",2021,2018,438,168.83
"Africa","Angola","OECD",2021,2019,958,310.57
"Africa","Angola","OECD",2021,2020,270,144.02
"Africa","Angola","OECD",2022,2018,528,359.71
"Africa","Angola","OECD",2022,2019,974,582.98
"Africa","Angola","OECD",2022,2020,835,820.49
"Africa","Angola","IMF",2020,2018,168,148.85
"Africa","Angola","IMF",2020,2019,460,236.21
"Africa","Angola","IMF",2020,2020,360,297.15
"Africa","Angola","IMF",2021,2018,381,249.13
"Africa","Angola","IMF",2021,2019,648,128.05
"Africa","Angola","IMF",2021,2020,206,179.05
"Africa","Angola","IMF",2022,2018,282,150.29
"Africa","Angola","IMF",2022,2019,125,23.42
"Africa","Angola","IMF",2022,2020,410,247.35
"Africa","Angola","WorldBank",2020,2018,553,182.06
"Africa","Angola","WorldBank",2020,2019,847,698.87
"Africa","Angola","WorldBank",2020,2020,844,126.61
"Africa","Angola","WorldBank",2021,2018,307,239.76
"Africa","Angola","WorldBank",2021,2019,659,510.73
"Africa","Angola","WorldBank",2021,2020,548,331.89
"Africa","Angola","WorldBank",2022,2018,448,122.76
"Africa","Angola","WorldBank",2022,2019,768,761.41
"Africa","Angola","WorldBank",2022,2020,324,163.57
"Africa","Benin","OECD",2020,2018,513,196.9
"Africa","Benin","OECD",2020,2019,590,83.7
"Africa","Benin","OECD",2020,2020,791,511.09
"Africa","Benin","OECD",2021,2018,799,474.43
"Africa","Benin","OECD",2021,2019,455,234.21
"Africa","Benin","OECD",2021,2020,549,238.83
"Africa","Benin","OECD",2022,2018,235,229.33
"Africa","Benin","OECD",2022,2019,347,46.51
"Africa","Benin","OECD",2022,2020,532,392.13
"Africa","Benin","IMF",2020,2018,138,137.05
"Africa","Benin","IMF",2020,2019,978,239.82
"Africa","Benin","IMF",2020,2020,821,33.41
"Africa","Benin","IMF",2021,2018,453,291.93
"Africa","Benin","IMF",2021,2019,526,381.88
"Africa","Benin","IMF",2021,2020,467,313.57
"Africa","Benin","IMF",2022,2018,948,555.23
"Africa","Benin","IMF",2022,2019,323,289.91
"Africa","Benin","IMF",2022,2020,421,62.35
"Africa","Benin","WorldBank",2020,2018,983,271.69
"Africa","Benin","WorldBank",2020,2019,138,23.55
"Africa","Benin","WorldBank",2020,2020,636,623.65
"Africa","Benin","WorldBank",2021,2018,653,534.99
"Africa","Benin","WorldBank",2021,2019,564,368.8
"Africa","Benin","WorldBank",2021,2020,741,312.02
"Africa","Benin","WorldBank",2022,2018,328,292.11
"Africa","Benin","WorldBank",2022,2019,653,429.21
"Africa","Benin","WorldBank",2022,2020,951,242.73
"Africa","Chad","OECD",2020,2018,176,95.06
"Africa","Chad","OECD",2020,2019,783,425.34
"Africa","Chad","OECD",2020,2020,885,461.6
"Africa","Chad","OECD",2021,2018,673,15.87
"Africa","Chad","OECD",2021,2019,131,74.46
"Africa","Chad","OECD",2021,2020,430,61.58
"Africa","Chad","OECD",2022,2018,593,211.34
"Africa","Chad","OECD",2022,2019,647,550.37
"Africa","Chad","OECD",2022,2020,154,105.65
"Africa","Chad","IMF",2020,2018,160,32.41
"Africa","Chad","IMF",2020,2019,654,27.84
"Africa","Chad","IMF",2020,2020,616,468.92
"Africa","Chad","IMF",2021,2018,996,22.4
"Africa","Chad","IMF",2021,2019,126,93.18
"Africa","Chad","IMF",2021,2020,879,547.87
"Africa","Chad","IMF",2022,2018,663,520
"Africa","Chad","IMF",2022,2019,681,544.76
"Africa","Chad","IMF",2022,2020,101,55.6
"Africa","Chad","WorldBank",2020,2018,786,757.22
"Africa","Chad","WorldBank",2020,2019,599,593.69
"Africa","Chad","WorldBank",2020,2020,641,529.84
"Africa","Chad","WorldBank",2021,2018,343,287.89
"Africa","Chad","WorldBank",2021,2019,438,340.83
"Africa","Chad","WorldBank",2021,2020,762,594.67
"Africa","Chad","WorldBank",2022,2018,430,128.69
"Africa","Chad","WorldBank",2022,2019,260,242.59
"Africa","Chad","WorldBank",2022,2020,607,216.1
"Europe","Denmark","OECD",2020,2018,114,86.75
"Europe","Denmark","OECD",2020,2019,937,373.29
"Europe","Denmark","OECD",2020,2020,866,392.93
"Europe","Denmark","OECD",2021,2018,296,41.04
"Europe","Denmark","OECD",2021,2019,402,32.67
"Europe","Denmark","OECD",2021,2020,306,7.88
"Europe","Denmark","OECD",2022,2018,540,379.51
"Europe","Denmark","OECD",2022,2019,108,26.72
"Europe","Denmark","OECD",2022,2020,752,307.2
"Europe","Denmark","IMF",2020,2018,157,24.24
"Europe","Denmark","IMF",2020,2019,303,79.04
"Europe","Denmark","IMF",2020,2020,286,122.36
"Europe","Denmark","IMF",2021,2018,569,69.32
"Europe","Denmark","IMF",2021,2019,808,642.67
"Europe","Denmark","IMF",2021,2020,157,5.58
"Europe","Denmark","IMF",2022,2018,147,112.21
"Europe","Denmark","IMF",2022,2019,414,311.16
"Europe","Denmark","IMF",2022,2020,774,230.46
"Europe","Denmark","WorldBank",2020,2018,695,350.03
"Europe","Denmark","WorldBank",2020,2019,511,209.84
"Europe","Denmark","WorldBank",2020,2020,181,29.27
"Europe","Denmark","WorldBank",2021,2018,503,176.89
"Europe","Denmark","WorldBank",2021,2019,710,609.02
"Europe","Denmark","WorldBank",2021,2020,264,165.78
"Europe","Denmark","WorldBank",2022,2018,670,638.99
"Europe","Denmark","WorldBank",2022,2019,651,354.6
"Europe","Denmark","WorldBank",2022,2020,632,623.94
"Europe","Estonia","OECD",2020,2018,838,263.67
"Europe","Estonia","OECD",2020,2019,638,533.95
"Europe","Estonia","OECD",2020,2020,898,638.73
"Europe","Estonia","OECD",2021,2018,262,98.16
"Europe","Estonia","OECD",2021,2019,569,552.54
"Europe","Estonia","OECD",2021,2020,868,252.48
"Europe","Estonia","OECD",2022,2018,927,264.65
"Europe","Estonia","OECD",2022,2019,205,150.6
"Europe","Estonia","OECD",2022,2020,828,752.61
"Europe","Estonia","IMF",2020,2018,841,176.31
"Europe","Estonia","IMF",2020,2019,614,230.55
"Europe","Estonia","IMF",2020,2020,500,41.19
"Europe","Estonia","IMF",2021,2018,510,169.68
"Europe","Estonia","IMF",2021,2019,765,401.85
"Europe","Estonia","IMF",2021,2020,751,319.6
"Europe","Estonia","IMF",2022,2018,314,58.81
"Europe","Estonia","IMF",2022,2019,155,2.24
"Europe","Estonia","IMF",2022,2020,734,187.6
"Europe","Estonia","WorldBank",2020,2018,332,160.17
"Europe","Estonia","WorldBank",2020,2019,466,385.33
"Europe","Estonia","WorldBank",2020,2020,487,435.06
"Europe","Estonia","WorldBank",2021,2018,461,249.19
"Europe","Estonia","WorldBank",2021,2019,932,763.38
"Europe","Estonia","WorldBank",2021,2020,650,463.91
"Europe","Estonia","WorldBank",2022,2018,570,549.97
"Europe","Estonia","WorldBank",2022,2019,909,80.48
"Europe","Estonia","WorldBank",2022,2020,523,242.22
"Europe","Finland","OECD",2020,2018,565,561.64
"Europe","Finland","OECD",2020,2019,646,161.62
"Europe","Finland","OECD",2020,2020,194,133.69
"Europe","Finland","OECD",2021,2018,529,39.76
"Europe","Finland","OECD",2021,2019,800,680.12
"Europe","Finland","OECD",2021,2020,418,399.19
"Europe","Finland","OECD",2022,2018,591,253.12
"Europe","Finland","OECD",2022,2019,457,272.58
"Europe","Finland","OECD",2022,2020,157,105.1
"Europe","Finland","IMF",2020,2018,860,445.03
"Europe","Finland","IMF",2020,2019,108,47.72
"Europe","Finland","IMF",2020,2020,523,500.58
"Europe","Finland","IMF",2021,2018,560,81.47
"Europe","Finland","IMF",2021,2019,830,664.64
"Europe","Finland","IMF",2021,2020,903,762.62
"Europe","Finland","IMF",2022,2018,179,167.73
"Europe","Finland","IMF",2022,2019,137,98.98
"Europe","Finland","IMF",2022,2020,666,524.86
"Europe","Finland","WorldBank",2020,2018,319,146.01
"Europe","Finland","WorldBank",2020,2019,401,219.56
"Europe","Finland","WorldBank",2020,2020,711,45.35
"Europe","Finland","WorldBank",2021,2018,828,20.97
"Europe","Finland","WorldBank",2021,2019,180,66.3
"Europe","Finland","WorldBank",2021,2020,682,92.57
"Europe","Finland","WorldBank",2022,2018,254,81.2
"Europe","Finland","WorldBank",2022,2019,619,159.08
"Europe","Finland","WorldBank",2022,2020,191,184.4
"""

df = pd.read_csv(StringIO(DATA))
# df.set_index(["continent", "country", "source", "year_publication", "year_data"], inplace=True)
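# dtype=float keeps the dummies numeric: newer pandas versions return bool dummies by default,
# which can make statsmodels reject the mixed-dtype design matrix.
df_onehot = pd.get_dummies(data=df, columns=["continent", "country", "source", "year_publication", "year_data"], dtype=float)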
df_onehot = sm.add_constant(df_onehot, has_constant="add")

# OLS regression with mocked fixed-effects, yet without error clustering
model = sm.OLS(
    endog=df_onehot[["gdp"]],
    exog=pd.concat(
        [
            df_onehot[["population"]],
            df_onehot.loc[:, df_onehot.columns.str.startswith("year_publication")],
            df_onehot.loc[:, df_onehot.columns.str.startswith("country")],
            df_onehot.loc[:, df_onehot.columns.str.startswith("source")],
            df_onehot[["const"]],
        ],
        axis=1,
    ),
    hasconst=False,
)
result_nonclustered = model.fit()
print(result_nonclustered.summary())
ebosi
  • Robust covariance standard errors can be selected in `fit` in statsmodels, e.g. `model.fit(cov_type="cluster", cov_kwds={'groups': df['source']})`. Clustering by one or two groups is available. – Josef Oct 20 '22 at 14:46
  • Formulas can also be used to create fixed effects: `OLS.from_formula("gdp ~ population + C(year_publication) + C(country_id) + source_id", df)` – Josef Oct 20 '22 at 14:51
  • See also https://github.com/vgreg/python-se/blob/master/Standard%20errors%20in%20Python.ipynb for many examples with robust standard errors for clustered or panel data. – Josef Oct 20 '22 at 16:40
  • @Josef thanks, that's a few pointers I'll gladly explore! – ebosi Oct 20 '22 at 19:47
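
The two suggestions from the comments (formula-based dummies for the fixed effects, plus `cov_type="cluster"` in `fit`) can be combined roughly as follows. This is a minimal sketch, not a verified port of the reghdfe call: the data is synthetic (random values on the same country/source/year grid as the question), and the cluster labels are converted to integer codes with `pd.factorize` as a precaution. The standard errors may not match Stata's exactly, since the small-sample corrections applied by statsmodels and reghdfe may differ.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic stand-in for the question's dataset (random values, NOT the real data).
rng = np.random.default_rng(0)
countries = ["Angola", "Benin", "Chad", "Denmark", "Estonia", "Finland"]
sources = ["OECD", "IMF", "WorldBank"]
rows = [
    (country, source, year_pub, year_data)
    for country in countries
    for source in sources
    for year_pub in (2020, 2021, 2022)
    for year_data in (2018, 2019, 2020)
]
df = pd.DataFrame(rows, columns=["country", "source", "year_publication", "year_data"])
df["population"] = rng.integers(100, 1000, size=len(df))
df["gdp"] = 0.5 * df["population"] + rng.normal(0, 100, size=len(df))

# C(...) expands each categorical into dummy columns, i.e. the three fixed effects.
model = smf.ols(
    "gdp ~ population + C(year_publication) + C(country) + C(source)",
    data=df,
)

# Integer cluster labels, one per distinct source.
groups = pd.factorize(df["source"])[0]

plain = model.fit()  # classical (non-clustered) standard errors
clustered = model.fit(cov_type="cluster", cov_kwds={"groups": groups})

# Clustering changes the standard errors, not the point estimates.
print("coef:", clustered.params["population"])
print("se (plain):", plain.bse["population"])
print("se (clustered by source):", clustered.bse["population"])
```

With only three sources there are only three clusters, so the clustered standard errors from such a fit should be read with caution.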

0 Answers