
I need to do some multinomial regression in Julia. In R I get the following result:

library(nnet)
data <- read.table("Dropbox/scripts/timeseries.txt",header=TRUE)
multinom(y~X1+X2,data)

# weights:  12 (6 variable)
initial  value 10985.024274 
iter  10 value 10438.503738
final  value 10438.503529 
converged
Call:
multinom(formula = y ~ X1 + X2, data = data)

Coefficients:
  (Intercept)        X1        X2
2   0.4877087 0.2588725 0.2762119
3   0.4421524 0.5305649 0.3895339

Residual Deviance: 20877.01 
AIC: 20889.01 
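
For reference, nnet::multinom uses the first level of the response (here class 1) as the baseline, so each coefficient row above is the log-odds of that class relative to class 1. A minimal Julia sketch of how those estimates map to probabilities (the function name is mine):

B = [0.4877087 0.2588725 0.2762119;   # class 2: (Intercept), X1, X2
     0.4421524 0.5305649 0.3895339]   # class 3: (Intercept), X1, X2

function multinom_probs(x1, x2)
    eta = [0.0; B * [1.0, x1, x2]]    # logits for classes 1, 2, 3; baseline fixed at 0
    exp.(eta) ./ sum(exp.(eta))       # softmax
end

multinom_probs(1, 0)                  # P(y = 1), P(y = 2), P(y = 3) at X1 = 1, X2 = 0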

Here is my data.

My first attempt used Regression.jl. The documentation for this package is quite sparse, so I am not sure which category is used as the baseline, which parameters the entries of the output correspond to, etc. I filed an issue asking about these things here.

using DataFrames
using Regression
import Regression: solve, Options, predict
dat = readtable("timeseries.txt", separator='\t')
X = convert(Matrix{Float64},dat[:,2:3])
y = convert(Vector{Int64},dat[:,1])

ret = solve(mlogisticreg(X', y, 3), reg=ZeroReg(), options=Options(verbosity=:iter))

the result is

julia> ret.sol
3x2 Array{Float64,2}:
 -0.573027  -0.531819
  0.173453   0.232029
  0.399575   0.29979 

but again, I am not sure what this corresponds to.
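
For what it's worth, here is how I would read that output, assuming (unconfirmed, given the sparse docs) that each row is a class and the columns are the X1 and X2 weights, with no intercept since none was requested:

W = ret.sol                   # 3x2: one row per class, columns assumed to be (X1, X2)
eta = W * [1.0, 0.0]          # linear predictors at X1 = 1, X2 = 0
exp.(eta) ./ sum(exp.(eta))   # class probabilities via softmax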

Next I tried ScikitLearn.jl, the Julia wrapper for Python's scikit-learn:

using ScikitLearn
@sk_import linear_model: LogisticRegression

model = ScikitLearn.fit!(LogisticRegression(multi_class="multinomial", solver = "lbfgs"), X, y)
model[:coef_]
3x2 Array{Float64,2}:
 -0.261902    -0.220771 
 -0.00453731   0.0540354
  0.266439     0.166735 

Update: I have since figured out how to extract the coefficients (model[:coef_], shown above), but these also don't look like the R results.
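
Two notes that may matter here. First, scikit-learn's LogisticRegression applies L2 regularization by default (C=1.0), so its estimates will never exactly match an unpenalized fit like R's. Second, the intercepts live in a separate attribute. A sketch using the standard scikit-learn attributes, accessed the same way as model[:coef_] above:

model[:intercept_]                            # per-class intercepts (class order in model[:classes_])
probs = ScikitLearn.predict_proba(model, X)   # n x 3 matrix of class probabilities
probs[1:3, :]                                 # probabilities for the first three rows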

Any help replicating R's results would be appreciated (using whatever package!).

Note that the predictors are just a dummy encoding of the discretized, time-lagged response, i.e.

julia> dat[1:3,:]
3x3 DataFrames.DataFrame
| Row | y | X1 | X2 |
|-----|---|----|----|
| 1   | 3 | 1  | 0  |
| 2   | 3 | 0  | 1  |
| 3   | 1 | 0  | 1  |

For row 2 you can see that (X1, X2) = (0, 1) means the previous observation was a 3. Similarly, (1, 0) means the previous observation was a 2, and (0, 0) means the previous observation was a 1.
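
For concreteness, here is a hypothetical sketch of how such predictors could be built from a raw series s with values in 1:3 (the series is made up for illustration):

s = [1, 3, 3, 1, 2]           # made-up example series
y = s[2:end]                  # current observation
X1 = Int.(s[1:end-1] .== 2)   # 1 if the previous observation was a 2
X2 = Int.(s[1:end-1] .== 3)   # 1 if the previous observation was a 3; (0, 0) encodes a 1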

Update: It seems Regression.jl does not fit an intercept by default (it calls it a "bias" rather than an intercept). Adding this term gives results very similar to Python's (I'm not sure what the third column is, though; a guess is sketched after the output below).

julia> ret = solve(mlogisticreg(X', y, 3, bias=1.0), reg=ZeroReg(), options=Options(verbosity=:iter))
julia> ret.sol
3x3 Array{Float64,2}:
 -0.263149    -0.221923   -0.309949
 -0.00427033   0.0543008   0.177753
  0.267419     0.167622    0.132196
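
My guess (unconfirmed by the docs) is that the third column holds the bias weights, i.e. the per-class intercepts. Under that assumption, with rows as classes, the fitted probabilities would be computed as:

W = ret.sol                   # 3x3: assuming columns are (X1, X2, bias), one row per class
eta = W * [1.0, 0.0, 1.0]     # X1 = 1, X2 = 0, constant 1 feeding the bias
exp.(eta) ./ sum(exp.(eta))   # class probabilities via softmax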

UPDATE: Since the model coefficients are not identifiable (only differences between the classes' coefficients are), I should not expect them to be the same across these different implementations. However, the predicted probabilities should be the same, and in fact they are (using R, Regression.jl, or ScikitLearn).
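
One way to see this concretely: adding the same vector to every class's coefficients leaves the softmax probabilities unchanged, so only between-class differences are identified. Rebasing the Regression.jl solution from the update above against its class-1 row recovers R's reference-category estimates almost exactly (which also suggests the third column is indeed the bias term; note R lists the intercept first, while here it comes last):

W = [-0.263149   -0.221923  -0.309949;    # Regression.jl solution with bias, from above
     -0.00427033  0.0543008  0.177753;
      0.267419    0.167622   0.132196]
W .- W[1:1, :]    # row 2 ≈ (0.2589, 0.2762, 0.4877), row 3 ≈ (0.5306, 0.3895, 0.4422)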

  • I'll be honest that I'm a little confused about what you're attempting. Does your y data correspond to class labels 1/2/3? If so, you probably want to fit a multinomial logistic regression to the "one-hot" matrix of y (i.e. y is an n×3 matrix of 0's and 1's where y[i,j] is 1 when the ith observation's class is j). – Tom Breloff Apr 07 '16 at 00:15
  • That's exactly the case. I will give that a try. As I mentioned, the documentation for these packages is quite lacking. – bdeonovic Apr 07 '16 at 11:31
  • The Regression.jl package specifically states for multinomial regression: "[y is a vector of class labels (values in 1:k)](https://github.com/lindahua/Regression.jl)". Similarly, Python's scikit-learn states that y should be "[array-like, shape (n_samples,)](http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html#sklearn.linear_model.LogisticRegression.fit)". – bdeonovic Apr 07 '16 at 19:30
  • Can you check which level each implementation picks as the reference? A different baseline would lead to different parameters, even though they are theoretically equivalent. – Gang Liang Jun 10 '16 at 22:31
