I have a dataframe with 7 variables:
         RACA   pca   pp  pcx  psc     lp     csc
0     BARBUDA  1915  470  150  140  87.65   91.41
1     BARBUDA  1345  305  100  110  79.32   98.28
2     BARBUDA  1185  295   80   85  62.19   83.12
3     BARBUDA  1755  385  120  130  80.65   90.01
4     BARBUDA  1570  325  120  120  77.96   87.99
5    CANELUDA  1640  365  110  115  81.38   87.26
6    CANELUDA  1960  525  135  145  89.21   99.37
7    CANELUDA  1715  410  100  120  79.35   99.84
8    CANELUDA  1615  380  100  110  76.32   99.27
9    CANELUDA  2230  500  165  160  90.22   99.56
10   CANELUDA  1570  400  105   95  85.24   83.95
11  COMERCIAL  1815  380  145   90  73.32   92.81
12  COMERCIAL  2475  345  180  140  71.77  105.64
13  COMERCIAL  1870  295  125  125  72.36   97.89
14  COMERCIAL  2435  565  185  160  73.24  107.39
15  COMERCIAL  1705  315  115  125  72.03   96.11
16  COMERCIAL  2220  495  165  150  87.63   96.89
17     PELOCO  1145  250   75   85  50.57   77.90
18     PELOCO   705   85   55   50  38.26   78.09
19     PELOCO  1140  195   80   75  66.15   96.35
20     PELOCO  1355  250   90   95  50.60   91.39
21     PELOCO  1095  220   80   80  53.03   84.57
22     PELOCO  1580  255  125  120  59.30   95.57
I want to fit a GLM for each dependent variable (pca through csc), with RACA as the predictor. In R this is quite simple, but I don't know how to get it working in Python. I tried writing a for loop that passes the column name into the formula, but so far it hasn't worked:
for column in df:
    col = str(column)
    model = sm.formula.glm(paste(col,"~ RACA"), data=df).fit()
    print(model.summary())
I am using pandas and statsmodels:
import pandas as pd
import statsmodels.api as sm
I imagine the solution is simple, but I haven't been able to figure it out yet.
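For reference, here is a minimal sketch of one way the loop could be written. It rests on two assumptions: `paste` is an R function with no Python built-in of that name, so the formula string is built with an f-string instead, and RACA has to be skipped so it is not regressed on itself. The helper names `build_formulas` and `fit_glms` are hypothetical, and the fit uses the default GLM family (Gaussian in statsmodels), which may not be the family actually intended here.

```python
def build_formulas(columns, predictor="RACA"):
    """Return {response: "response ~ predictor"} for every column except the predictor."""
    return {col: f"{col} ~ {predictor}" for col in columns if col != predictor}


def fit_glms(df, predictor="RACA"):
    """Fit one GLM per response column and return the fitted results keyed by column name."""
    # Imported here so the formula helper above works without statsmodels installed.
    import statsmodels.formula.api as smf

    results = {}
    for col, formula in build_formulas(df.columns, predictor).items():
        # No family is passed, so statsmodels falls back to its default (Gaussian).
        results[col] = smf.glm(formula, data=df).fit()
    return results


# The formula strings for the columns shown in the question (hypothetical helper above).
formulas = build_formulas(["RACA", "pca", "pp", "pcx", "psc", "lp", "csc"])
print(formulas["pca"])  # pca ~ RACA
```

With the dataframe from the question, `results = fit_glms(df)` followed by `print(results["pca"].summary())` would show the fit for one response. Keeping the results in a dict rather than printing inside the loop makes it easier to compare models afterwards.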