
I am trying to implement multinomial logistic regression (a.k.a. softmax regression). In this example I am trying to classify the iris dataset.

I have a problem specifying the model: I get an optimization error with `find_MAP()`. If I avoid using `find_MAP()`, I get a "sample" of all-zero vectors when I use a Categorical for the likelihood, or a posterior exactly the same as the priors when I use `Multinomial(n=1, p=p)`.

import pymc3 as pm
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd

iris = sns.load_dataset("iris")
y_2 = pd.Categorical(iris['species']).codes  # .labels was renamed to .codes in newer pandas
x_n = iris.columns[:-1]
x_2 = iris[x_n].values
x_2 = (x_2 - x_2.mean(axis=0)) / x_2.std(axis=0)  # standardize the predictors
indice = list(set(y_2))

with pm.Model() as modelo_s:

    alfa = pm.Normal('alfa', mu=0, sd=100, shape=3)
    beta = pm.Normal('beta', mu=0, sd=100, shape=(4, 3))

    # linear predictor, then softmax across the class axis
    mu = (alfa[indice] + pm.math.dot(x_2, beta[:, indice])).T
    p = pm.math.exp(mu) / pm.math.sum(pm.math.exp(mu), axis=0)

    yl = pm.Categorical('yl', p=p, observed=y_2)
    #yl = pm.Multinomial('yl', n=1, p=p, observed=y_2)

    start = pm.find_MAP()
    step = pm.Metropolis()
    trace_s = pm.sample(1000, step, start)
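For reference, the softmax the model is trying to compute can be checked in plain NumPy: after the transpose, `exp(mu)/sum(exp(mu), axis=0)` should make the class probabilities for each observation sum to one along the class axis. A minimal, numerically stable sketch (all names here are illustrative, not from PyMC3):

```python
import numpy as np

def softmax(mu, axis):
    """Numerically stable softmax along the given axis."""
    z = mu - mu.max(axis=axis, keepdims=True)  # subtract the max for stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

# Toy linear predictor: 5 observations, 3 classes
rng = np.random.default_rng(0)
mu = rng.normal(size=(5, 3))

p = softmax(mu, axis=1)
# Each row of p is a probability vector over the 3 classes,
# so the row sums should all be (numerically) 1.
print(p.sum(axis=1))
```

If the normalization axis does not match the shape that `Categorical` expects for `p`, the per-observation probabilities will not sum to one, which is one easy thing to verify when debugging a model like this.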
aloctavodia

1 Answer


The issue is probably the lack of Gibbs updating of vector-valued variables: a Metropolis jump is only accepted if all elements of the vector produce a good logp at once. This PR might be helpful: #799

So you can try `pip install git+https://github.com/pymc-devs/pymc3@gibbs` and then do `Metropolis(gibbs='random')`.

twiecki