To do a linear regression in Julia
we can use the function lm
like this:
using DataFrames
using GLM
df = DataFrame(x=[2,3,2,1,3,5,7,4,2],
y=[2,3,5,1,5,6,4,2,3])
9×2 DataFrame
Row │ x y
│ Int64 Int64
─────┼──────────────
1 │ 2 2
2 │ 3 3
3 │ 2 5
4 │ 1 1
5 │ 3 5
6 │ 5 6
7 │ 7 4
8 │ 4 2
9 │ 2 3
lm(@formula(y~x), df)
StatsModels.TableRegressionModel{LinearModel{GLM.LmResp{Vector{Float64}}, GLM.DensePredChol{Float64, LinearAlgebra.CholeskyPivoted{Float64, Matrix{Float64}, Vector{Int64}}}}, Matrix{Float64}}
y ~ 1 + x
Coefficients:
───────────────────────────────────────────────────────────────────────
Coef. Std. Error t Pr(>|t|) Lower 95% Upper 95%
───────────────────────────────────────────────────────────────────────
(Intercept) 2.14516 1.11203 1.93 0.0951 -0.484383 4.77471
x 0.403226 0.303282 1.33 0.2254 -0.313923 1.12037
───────────────────────────────────────────────────────────────────────
But I was wondering how to do a linear regression per group in Julia
. Here is some reproducible data:
df = DataFrame(group = ['A', 'A', 'A', 'A', 'A', 'A', 'A', 'A', 'A',
'B', 'B', 'B', 'B', 'B', 'B', 'B', 'B', 'B'],
x=[2,3,2,1,3,5,7,4,2,3,4,5,2,6,3,1,6,1],
y=[2,3,5,1,5,6,4,2,3,3,2,4,7,1,8,4,3,1])
18×3 DataFrame
Row │ group x y
│ Char Int64 Int64
─────┼─────────────────────
1 │ A 2 2
2 │ A 3 3
3 │ A 2 5
4 │ A 1 1
5 │ A 3 5
6 │ A 5 6
7 │ A 7 4
8 │ A 4 2
⋮ │ ⋮ ⋮ ⋮
12 │ B 5 4
13 │ B 2 7
14 │ B 6 1
15 │ B 3 8
16 │ B 1 4
17 │ B 6 3
18 │ B 1 1
3 rows omitted
So I was wondering if anyone knows how to perform a linear regression per group (in this case for group A and B in df
) and get the statistical coefficients like p-value and R Square per group in Julia
?