I'm trying to locate a switchpoint and am getting extremely high values for my posteriors. Specifically, lambda_1 and tau don't seem to make much sense. The dataset looks like this:
I've been using a method similar to the cellphone data example found here: https://nbviewer.jupyter.org/github/CamDavidsonPilon/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers/blob/master/Chapter1_Introduction/Ch1_Introduction_PyMC2.ipynb
My model looks like this:
import numpy as np
import pymc3 as pm

with pm.Model() as model:
    # Prior rate for both lambdas: inverse of the mean daily count
    alpha = 1.0 / np.array(pos_df['positiveIncrease']).mean()
    lambda_1 = pm.Exponential("lambda_1", alpha)
    lambda_2 = pm.Exponential("lambda_2", alpha)
    tau = pm.DiscreteUniform("tau", lower=0, upper=n_date)
    idx = np.arange(n_date)
    # lambda_1 applies to days before tau, lambda_2 from tau onward
    lambda_ = pm.math.switch(tau > idx, lambda_1, lambda_2)
    observation = pm.Poisson("obs", lambda_, observed=pos_df['positiveIncrease'])
    step = pm.Metropolis()
    trace = pm.sample(10000, tune=5000, step=step)
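To make explicit which days get which rate, here is a minimal plain-Python sketch of what the switch line does. The helper name switch_rates and the small numbers are hypothetical, chosen only for illustration; this is not PyMC3 code.

```python
def switch_rates(tau, n_days, lam_before, lam_after):
    """Plain-Python analogue of switch(tau > idx, lam_before, lam_after)."""
    # Day i gets lam_before while tau > i, and lam_after from day tau onward.
    return [lam_before if tau > i else lam_after for i in range(n_days)]

rates = switch_rates(tau=4, n_days=6, lam_before=715.0, lam_after=61.0)
print(rates)  # [715.0, 715.0, 715.0, 715.0, 61.0, 61.0]
```

So with this setup lambda_1 describes the days before tau and lambda_2 the days from tau onward.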
When I run model.check_test_point(), I get the following:
lambda_1_log__ -1.06
lambda_2_log__ -1.06
tau -5.03
obs -26857.07
Name: Log-probability of test_point, dtype: float64
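As a sanity check on the prior terms, here is a minimal sketch that reproduces them by hand. It assumes the Exponential priors are evaluated at their median on the log scale (density plus the log-transform Jacobian, so alpha cancels) and that tau is uniform over n_date + 1 values. The value n_date = 152 is inferred from the -5.03 and is not stated anywhere above.

```python
import math

# Exponential prior at its median x = ln(2) / alpha, on the log scale:
#   log(alpha) - alpha * x + log(x) = log(ln 2) - ln 2   (alpha cancels out)
exp_logp_at_median = math.log(math.log(2)) - math.log(2)

# DiscreteUniform(lower=0, upper=n_date) puts mass 1 / (n_date + 1) on each value.
n_date = 152  # assumption: inferred from the reported -5.03, not given in the post
tau_logp = -math.log(n_date + 1)

print(round(exp_logp_at_median, 2), round(tau_logp, 2))  # -1.06 -5.03
```

If those assumptions hold, the two lambda terms and the tau term look unremarkable, and nearly all of the log-probability comes from the obs term.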
My lambda_2_samples are [61.56487732, 61.56487732, 60.23909822, ..., 61.21167046, 61.39722331, 61.39722331]
Whereas my lambda_1_samples are [715.19559043, 715.19559043, 716.98035641, ..., 717.35203171, 717.35203171, 717.35203171]
Also my tau_samples are: ([125, 125, 125, ..., 125, 125, 125], dtype=int64)
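One possible way to check whether a constant tau is plausible is to score every candidate switchpoint directly. The sketch below is only an illustration under stated assumptions: the counts are made up (30 high days, then 10 low days), the function names are hypothetical, and the segment means are plugged in as the two rates instead of being sampled.

```python
import math

def poisson_loglik(counts, lam):
    """Sum of Poisson log-pmf values for integer counts at rate lam (> 0)."""
    return sum(c * math.log(lam) - lam - math.lgamma(c + 1) for c in counts)

def profile_loglik(counts):
    """Log-likelihood for each candidate tau, using segment means as rates."""
    scores = {}
    for tau in range(1, len(counts)):
        before, after = counts[:tau], counts[tau:]
        lam1 = sum(before) / len(before)
        lam2 = sum(after) / len(after)
        scores[tau] = poisson_loglik(before, lam1) + poisson_loglik(after, lam2)
    return scores

# Made-up counts: 30 high days followed by 10 low days (not the real data).
counts = [700 + 10 * (i % 5) for i in range(30)] + [60 + (i % 4) for i in range(10)]

scores = profile_loglik(counts)
ranked = sorted(scores, key=scores.get, reverse=True)
best_tau, runner_up = ranked[0], ranked[1]
gap = scores[best_tau] - scores[runner_up]
print(best_tau, runner_up, round(gap, 1))
```

On data with a jump this large, the gap between the best tau and its nearest competitor is hundreds of log units, so a sampler would essentially never accept a move away from the best value. That would be consistent with tau_samples sitting at a single value, though whether the real data behaves this way is not something this sketch can confirm.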
I expected the two distributions to fall somewhere within the range of the data, much like the following example:
However, my results look like this mess:
I've been blindly tweaking variables like sample size, tuning amount, and testvals, but they don't seem to improve the results in any meaningful way. I would appreciate advice on how to fix the problem, as well as information to help me better understand why it occurred in the first place.