
I am using the RcppHMM package to build a GHMM (a multivariate Gaussian mixture HMM) with continuous observations.

I want to train the model with the EM algorithm on continuous observations whose sequences have different lengths. Specifically, each sequence has a length between 3 and 6. I first tried to fit the model on the whole dataset at once (I built it with ncol = 6, the maximum sequence length, and filled the missing positions with zeros), but that didn't work. So I split the observations into groups of equal length [O3, O4, O5, O6] and updated the model with one group at a time.

Each observation group looks like this:

O3
           [,1]       [,2]       [,3]
[1,]  0.8550940  0.3231340  0.8639223
[2,]  0.4453262  0.5840305  0.4356958
[3,]  0.4344789 -1.2234760  0.4344789
[4,] -0.5003085  3.0322560 -0.5003085
[5,] -0.1459598 -0.4661041 -0.1459598
[6,] -0.1977263 -0.6352724 -0.1977263

O4
           [,1]       [,2]       [,3]       [,4]
[1,]  0.8965332  0.3338220  0.7270241  0.8824540
[2,]  0.4033438  0.4131293  0.1593136  0.4187023
[3,] -0.7329015 -1.6828296 -0.1550487 -0.1550487
[4,] -0.3213490  7.3449076 -0.2787857 -0.2787857
[5,] -0.2868067 -0.3743332 -0.1340566 -0.1340566
[6,]  2.6832742 -0.5844305  0.2320774  0.2320774

O5
            [,1]       [,2]        [,3]       [,4]        [,5]
[1,]  0.83401341  0.2492370  0.47493190  0.6440035  0.84985396
[2,]  0.37988234  0.2335883  0.17043570  0.2116066  0.36260248
[3,] -0.05240445 -0.3034002 -0.05240445 -0.3034002 -0.05240445
[4,] -0.37240867  1.1500528 -0.37240867  1.1500528 -0.37240867
[5,] -0.02056839  0.9343497 -0.02056839  0.9343497 -0.02056839
[6,] -0.27586584 -0.4406833 -0.27586584 -0.4406833 -0.27586584

O6
           [,1]        [,2]       [,3]       [,4]       [,5]       [,6]
[1,]  0.9287066  0.35065802  0.4493442  0.6142040  0.7423286  0.9217381
[2,]  0.3852644  0.09612516  0.1623447  0.1320334  0.1875127  0.3928661
[3,]  0.1436024 -0.08326038  0.7800491  0.1436024  0.1926751  0.1436024
[4,] -0.4284304 -0.27916609 -0.5224586 -0.4284304  0.1267840 -0.4284304
[5,] -0.8846364 -0.81131525 -0.1781479 -0.8846364 -0.1266250 -0.8846364
[6,] -0.2141231 -0.78377461 -0.4440142 -0.2141231 -0.7888260 -0.2141231

The number of rows (nrow) is the dimension of each observation, and the number of columns (ncol) is the sequence length.
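To make the grouping step concrete, here is a minimal base-R sketch of how sequences of different lengths could be collected into one 3D array per length. It assumes each sequence is a matrix with rows as dimensions and columns as time steps, as described above. The helper name `group_by_length` and the random data are purely illustrative and are not part of RcppHMM.

```r
# Hypothetical helper: group variable-length sequences into one 3D array per length.
# Assumes each sequence is a k x T matrix (rows = dimensions, columns = time steps).
group_by_length <- function(seqs) {
  lens <- vapply(seqs, ncol, integer(1))
  groups <- split(seqs, lens)
  lapply(groups, function(g) {
    # Stack the equal-sized matrices along a third dimension (one slice per sequence).
    array(unlist(g), dim = c(nrow(g[[1]]), ncol(g[[1]]), length(g)))
  })
}

# Synthetic stand-in data: six sequences of 6-dimensional observations, lengths 3 to 6.
set.seed(1)
k <- 6
seqs <- lapply(c(3, 4, 3, 6, 5, 4), function(len) matrix(rnorm(k * len), nrow = k))

grouped <- group_by_length(seqs)
sapply(grouped, dim)   # one column per length: dimensions, time steps, sequences
```

Each element of `grouped` is named by its sequence length, so `grouped[["4"]]` would play the role of O4 here.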

When I updated the model with the first group (sequence length 3), it ran. But when I tried to update the model again with the second group (sequence length 4), I got the following warning:

In learnEM(newModel, O4[, 1:4, ], iter = 20, delta = 1e-05, print = TRUE) : It is recommended to have a covariance matrix with a determinant bigger than 1/ ((2*PI)^k) .
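For reference, the threshold in the warning can be computed directly. A multivariate Gaussian density peaks at 1/sqrt((2*pi)^k * det(Sigma)), which exceeds 1 once det(Sigma) drops below 1/((2*pi)^k), so that is presumably what the message guards against. The base-R sketch below only evaluates the threshold and shows one way a covariance estimate can fall under it (fewer observations than dimensions). The helpers `det_threshold` and `cov_ok` are hypothetical names, the data is synthetic, and whether this is what actually happens inside learnEM is an assumption, not something confirmed here.

```r
# Threshold from the warning text: 1 / ((2*pi)^k), with k = number of dimensions.
det_threshold <- function(k) 1 / ((2 * pi)^k)

# Hypothetical helper: TRUE if a covariance matrix clears the threshold.
cov_ok <- function(sigma) det(sigma) > det_threshold(nrow(sigma))

k <- 6
det_threshold(k)          # roughly 1.6e-05 for k = 6

cov_ok(diag(k))           # identity matrix, determinant 1
cov_ok(diag(0.1, k))      # determinant 1e-06, below the threshold

# A sample covariance built from fewer observations than dimensions is singular.
set.seed(1)
x <- matrix(rnorm(k * 4), nrow = k)   # 6 dimensions, only 4 time steps
sigma_hat <- cov(t(x))                # cov() expects observations in rows
qr(sigma_hat)$rank                    # strictly less than k
cov_ok(sigma_hat)
```

If the fitted covariance matrices do turn out to be near-singular, checking them this way after each update would at least show which group triggers the warning.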

Does anyone know how to fix this warning? And is there a proper way to run EM training on observations with different sequence lengths using this package?

dobby