
I'm working with a dataset that contains data from IoT devices, and I have found that Hidden Markov Models work pretty well for my use case. As such, I'm trying to alter some code from a TensorFlow tutorial I've found here. My dataset contains real values for the observed variable, rather than the count data shown in the tutorial.

In particular, I believe the following needs to be changed so that the HMM has Normally distributed emissions. Unfortunately, I can't find any code showing how to give the model an emission distribution other than Poisson.

How should I change the code to emit normally distributed values?

import numpy as np
import tensorflow as tf
import tensorflow_probability as tfp

tfd = tfp.distributions

# Define variable to represent the unknown log rates.
trainable_log_rates = tf.Variable(
  np.log(np.mean(observed_counts)) + tf.random.normal([num_states]),
  name='log_rates')

hmm = tfd.HiddenMarkovModel(
  initial_distribution=tfd.Categorical(
      logits=initial_state_logits),
  transition_distribution=tfd.Categorical(probs=transition_probs),
  observation_distribution=tfd.Poisson(log_rate=trainable_log_rates),
  num_steps=len(observed_counts))

rate_prior = tfd.LogNormal(5, 5)

def log_prob():
  return (tf.reduce_sum(rate_prior.log_prob(tf.math.exp(trainable_log_rates))) +
          hmm.log_prob(observed_counts))

optimizer = tf.keras.optimizers.Adam(learning_rate=0.1)

@tf.function(autograph=False)
def train_op():
  with tf.GradientTape() as tape:
    neg_log_prob = -log_prob()
  grads = tape.gradient(neg_log_prob, [trainable_log_rates])[0]
  optimizer.apply_gradients([(grads, trainable_log_rates)])
  return neg_log_prob, tf.math.exp(trainable_log_rates)
    Sorry if this is obvious... but couldn't you just pass a Normal distribution to `observation_distribution`? (e.g. [MultivariateNormalDiag](https://www.tensorflow.org/probability/api_docs/python/tfp/distributions/MultivariateNormalDiag) or [MultivariateNormalTriL](https://www.tensorflow.org/probability/api_docs/python/tfp/distributions/MultivariateNormalTriL)) – rvinas Dec 06 '20 at 19:53
  • @rvinas Unfortunately not, as some of the functions in their example need to be changed – Black Dec 08 '20 at 03:06
  • What functions? I might be able to help if you show what the precise issue is – rvinas Dec 08 '20 at 09:03

2 Answers


@mCoding's answer is right. In the example posted by TensorFlow, you have a Hidden Markov Model with a uniform initial state distribution (logits [0., 0., 0., 0.]), a heavily diagonal transition matrix, and Poisson-distributed emission probabilities.

To adapt it to your Normal case, you only have to swap that emission distribution for a Normal one. As a starting point, suppose your emission probabilities are Normally distributed with parameters:

training_loc = tf.Variable([0., 0., 0., 0.])
training_scale = tf.Variable([1., 1., 1., 1.])

then your observation_distribution will be:

observation_distribution = tfd.Normal(loc=training_loc, scale=training_scale)

Finally, you also have to change your prior knowledge about these parameters by setting a prior_loc and a prior_scale. You might want to consider uninformative or weakly informative priors, since you are fitting the model afterwards.

So your code should be similar to:

# Define the emission probabilities.
training_loc = tf.Variable([0., 0., 0., 0.])
training_scale = tf.Variable([1., 1., 1., 1.])
observation_distribution = tfd.Normal(loc=training_loc, scale=training_scale)  # Change this to your desired distribution

hmm = tfd.HiddenMarkovModel(
  initial_distribution=tfd.Categorical(
      logits=initial_state_logits),
  transition_distribution=tfd.Categorical(probs=transition_probs),
  observation_distribution=observation_distribution,
  num_steps=len(data))  # num_steps must match the length of your training data

# Prior distributions
prior_loc = tfd.Normal(loc=0., scale=1.)
prior_scale = tfd.HalfNormal(scale=1.)

def log_prob():
  log_probability = hmm.log_prob(data)  # Use your training data right here
  # Add the log probability of the priors on the mean and standard
  # deviation of the observation distribution.
  log_probability += tf.reduce_sum(prior_loc.log_prob(observation_distribution.loc))
  log_probability += tf.reduce_sum(prior_scale.log_prob(observation_distribution.scale))
  # Return the log probability; it is negated below, since we want to minimize.
  return log_probability

optimizer = tf.keras.optimizers.Adam(learning_rate=0.1)

# Finally, train the model as in the example.
losses = tfp.math.minimize(
    lambda: -log_prob(),
    optimizer=optimizer,
    num_steps=100)

Now, if you look at your parameters training_loc and training_scale, they should hold the fitted values.
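
For example, you can inspect the fitted parameters and recover the most likely hidden-state sequence with the model's built-in Viterbi decoding (a minimal sketch, assuming `data` is the same observation tensor used in `log_prob` above):

# Inspect the fitted emission parameters, one entry per hidden state.
print("Fitted means:", training_loc.numpy())
print("Fitted scales:", training_scale.numpy())

# Most likely sequence of hidden states given the observations (Viterbi decoding).
most_likely_states = hmm.posterior_mode(data)
print("State sequence:", most_likely_states.numpy())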


The example model assumes that the emissions x are Poisson distributed, with one of four rates determined by the latent variable z. It therefore defines trainable rates (or log rates) and builds the HMM from a uniform initial distribution over z, the transition probabilities, and observations drawn from a Poisson distribution whose log rates are the trainable ones.

To change to a normal distribution, you are saying that x should be Normally distributed, with trainable mean and standard deviation determined by the latent variable z. Thus, you need to replace trainable_log_rates with a trainable_loc and a trainable_scale, and change

observation_distribution=tfd.Poisson(log_rate=trainable_log_rates)

to

observation_distribution=tfd.Normal(loc=trainable_loc, scale=trainable_scale)

You then need to replace your rate_prior with a loc_prior and a scale_prior of your choosing and use them to calculate your new log_prob function, as in the sketch below.
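
Putting those pieces together, a minimal sketch might look like this (the particular priors, the `observed_data` name, and the softplus reparameterization via `tfp.util.TransformedVariable` are my own choices to keep the scale positive during optimization, not part of the original tutorial):

import tensorflow as tf
import tensorflow_probability as tfp

tfd = tfp.distributions
tfb = tfp.bijectors

# Trainable emission parameters, one entry per hidden state.
trainable_loc = tf.Variable(tf.random.normal([num_states]), name='loc')
# TransformedVariable keeps the scale strictly positive while training.
trainable_scale = tfp.util.TransformedVariable(
    tf.ones([num_states]), bijector=tfb.Softplus(), name='scale')

hmm = tfd.HiddenMarkovModel(
    initial_distribution=tfd.Categorical(logits=initial_state_logits),
    transition_distribution=tfd.Categorical(probs=transition_probs),
    observation_distribution=tfd.Normal(loc=trainable_loc, scale=trainable_scale),
    num_steps=len(observed_data))

# Priors of your choosing; these are weakly informative placeholders.
loc_prior = tfd.Normal(loc=0., scale=10.)
scale_prior = tfd.HalfNormal(scale=10.)

def log_prob():
  return (tf.reduce_sum(loc_prior.log_prob(trainable_loc)) +
          tf.reduce_sum(scale_prior.log_prob(tf.convert_to_tensor(trainable_scale))) +
          hmm.log_prob(observed_data))

optimizer = tf.keras.optimizers.Adam(learning_rate=0.1)

@tf.function(autograph=False)
def train_op():
  with tf.GradientTape() as tape:
    neg_log_prob = -log_prob()
  # The scale's gradient flows to its underlying unconstrained variable.
  variables = [trainable_loc] + list(trainable_scale.trainable_variables)
  grads = tape.gradient(neg_log_prob, variables)
  optimizer.apply_gradients(zip(grads, variables))
  return neg_log_prob

Calling train_op() in a loop then fits both the per-state means and scales, mirroring the structure of the original training loop.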

mCoding