
I would like to use the Hidden Markov Model implemented in Pomegranate (a Python library: https://pomegranate.readthedocs.io/en/latest/index.html), and I would like to initialize the model by specifying a discrete distribution.

Since the distribution is discrete, when I fit the learned model to new data (of string type), I may encounter characters that do not appear in the distributions of the learned model. Is there a way to 'parse' my input or distribution so that anything not in the 'learned' distribution is assigned to a new group with its own probability?

For example, I may want to define a discrete distribution like this to avoid the problem:

d1 = DiscreteDistribution({'A' : 0.35, 'B' : 0.20, 'C' : 0.05, 'the-rest-of-char' : 0.40})

So basically, how can I define something like a regular expression when passing the discrete distribution to the HMM?
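
To make the idea concrete, here is a minimal sketch of the kind of preprocessing I have in mind. It is plain Python and does not use any Pomegranate feature: the catch-all symbol 'the-rest-of-char' and the helper name map_unseen are hypothetical, and the probabilities are just the ones from my example. I do not know whether Pomegranate offers something built in for this.

```python
# Hypothetical catch-all symbol for every character not seen during training.
OTHER = 'the-rest-of-char'

# Probabilities from the example above; OTHER holds the remaining mass.
probs = {'A': 0.35, 'B': 0.20, 'C': 0.05, OTHER: 0.40}

# Symbols the learned distribution actually knows about.
KNOWN = set(probs) - {OTHER}


def map_unseen(sequence, known=KNOWN, other=OTHER):
    """Return the sequence as a list, with unknown symbols replaced by `other`."""
    return [symbol if symbol in known else other for symbol in sequence]


# 'X' and 'Z' are not in the distribution, so they collapse into OTHER.
mapped = map_unseen('ABXCZ')
print(mapped)
```

The mapped list, rather than the raw string, would then be what the model sees, so every symbol in it has a probability in the distribution.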

Any help is appreciated!

Wiktor Stribiżew
starry1990
