This is a properly articulated version of an old question.
I'm working on condensing some code written for a neuroscience paper: https://doi.org/10.1371/journal.pcbi.1007481. Without delving into unnecessary detail, here are some definitions:
Definitions
A
is the number of neuronal assemblies, an arbitrary assembly depicted by mu
. Its distribution is unknown and its value at any point in time should change as the system organizes the neurons into the correct number of assemblies.
p_mu
is the probability of a neuronal assembly mu
to activate. Its prior distribution is a Beta distribution whose parameters alpha_p
and beta_p
are of unknown distribution and are specific to mu
.
w
is a binary matrix containing all assemblies A
over all time M
. 0
corresponds to "off" and 1
corresponds to "on". Its prior is a Bernoulli distribution with assembly-specific probability p_mu
.
n
is the relative size of each assembly. Its prior distribution is a Dirichlet distribution with A
parameters {gamma_1, ..., gamma_n}
of unknown distribution. I'm not sure if this is necessary.
lambda(0/1)_mu
is the likelihood that a specific neuron is on at a specific time, given that the assembly is (on/off) at that time. Its prior is a Beta distribution whose parameters delta(0/1)_mu
and kappa(0/1)_mu
are unknown and specific to each assembly
s
is a binary matrix containing all neurons N
over all time M
. 0
corresponds to "off" and 1
corresponds to "on". This is our input data, and its distribution is lambda(0/1)_mu
for every point in time.
The Goal
Use Bayesian Inference with PyMC3 to uncover the number of assemblies A
.
Assumptions
I have decided to give A
a discrete uniform prior between 0
and the total_number_of_cells
, because its value must be in that set and I have no knowledge which makes me prefer a bias in any direction.
I have decided to assume alpha
, beta
, gamma
, delta(0/1)
, and kappa(0/1)
for a specific mu
are from a half normal distribution because they must be positive.
My Initial Plan
Assume the number of assemblies, A
, is known and not from a distribution, just to make sure I can find it again.
Loop over each assembly in order to populate the set of distributions for alpha
, beta
, gamma
, delta(0/1)
, and kappa(0/1)
, which are different for each assembly.
Loop over every time frame and create a custom distribution based on lambda(0/1)_mu
from which to sample the observed data, s
at every point in time.
My Questions
If the number of assemblies is unknown, how do I integrate that uncertainty into my diagram? Currently I am planning to loop over a known number of assemblies, but I don't see a straightforward way to make that process dynamic with PyMC3. The paper uses a Dirichlet Process for this - how does PyMC3 handle this dynamic situation?
Is there a way to not have to loop over every neuron and time step? This seems painfully slow.