Questions tagged [expectation-maximization]

Expectation Maximization (often abbreviated EM) is an iterative algorithm that can be used for maximum likelihood estimation in the presence of missing data or hidden variables.
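
As a rough illustration of the iteration described above, here is a minimal sketch of EM for a two-component one-dimensional Gaussian mixture, where the hidden variable is which component generated each point. The function names (`normal_pdf`, `em_two_gaussians`), the quartile-based initialisation, and the stopping tolerance are assumptions made for this example, not part of any question listed here.

```python
import math
import random


def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def em_two_gaussians(xs, n_iter=200, tol=1e-9):
    """Fit a two-component 1-D Gaussian mixture by EM (illustrative sketch)."""
    xs = sorted(xs)
    n = len(xs)
    # Crude initialisation (an assumption): quartiles as means,
    # the pooled standard deviation as both spreads, equal weights.
    mu = [xs[n // 4], xs[(3 * n) // 4]]
    mean = sum(xs) / n
    spread = math.sqrt(sum((x - mean) ** 2 for x in xs) / n)
    sigma = [spread, spread]
    weight = [0.5, 0.5]
    log_liks = []
    for _ in range(n_iter):
        # E-step: responsibility of each component for each point,
        # i.e. P(component k | x) under the current parameters.
        resp = []
        log_lik = 0.0
        for x in xs:
            joint = [weight[k] * normal_pdf(x, mu[k], sigma[k]) for k in range(2)]
            total = joint[0] + joint[1]
            log_lik += math.log(total)
            resp.append([j / total for j in joint])
        log_liks.append(log_lik)
        # M-step: responsibility-weighted maximum likelihood updates.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            weight[k] = nk / n
            mu[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var = sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, xs)) / nk
            sigma[k] = math.sqrt(var)
        # Stop once the log-likelihood has effectively stopped changing.
        if len(log_liks) > 1 and abs(log_liks[-1] - log_liks[-2]) < tol:
            break
    return weight, mu, sigma, log_liks


# Usage on synthetic data drawn from two well-separated components.
random.seed(0)
data = [random.gauss(-3, 1) for _ in range(300)] + [random.gauss(3, 1) for _ in range(300)]
weight, mu, sigma, log_liks = em_two_gaussians(data)
print(weight, mu, sigma)
```

The recorded log-likelihood should not decrease from one iteration to the next, which is a handy sanity check when debugging an EM implementation.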

125 questions
3
votes
1 answer

Should duplicate entries be removed prior to running an EM Record Linkage algorithm?

Example Set up I am linking a dataset to find duplicate entries within it. I do not know the number of times a duplicate may appear within my dataset. Following my blocking, I end up with the following dataset: [This is an example dataset, not my…
Chuck
3
votes
3 answers

How to speed this kind of double for-loop?

I am programming an expectation-maximization algorithm in R. In order to speed up the computation, I would like to vectorize this bottleneck. I know that N is about a hundred times k. MyLoglik = 0 for (i in c(1:N)) { for (j in c(1:k)) { …
Wok
3
votes
2 answers

python - Recolor image

I'd like to implement an image recoloring algorithm to produce results similar to what is shown here: http://www.morethantechnical.com/2010/06/24/image-recoloring-using-gaussian-mixture-model-and-expectation-maximization-opencv-wcode/ but using…
3
votes
1 answer

OpenCV Expectation Maximization

I am trying to use EM in OpenCV 2.4.5 for background and foreground image separation. However, unlike the C class in the previous version, the C++ class is very confusing to me, and several routines are hard to follow due to lack of documentation (from my…
kcc__
3
votes
1 answer

MAP Expectation Maximization for mixture models

I am trying to write down the MAP updates for the EM in case of mixtures of Bernoulli distributions. I know that for ML estimates, we have: E-step: compute P(Z|X,p,t) M-Step: (p,t)<-argmax sum(over Z): p(Z|X,p,t)log p(X,Z|p,t) where p are the…
2
votes
3 answers

Expectation Maximization in Matlab on Missing Data

I have to use EM to estimate the mean and covariance of the Gaussian distribution for each of the two classes. Some attributes are missing too. The class of each object is known. Therefore the problem basically reduces to fitting a Gaussian…
2
votes
0 answers

sketching a Gaussian Mixture plot

So basically I have these points and weights and covariance matrices and 1st mean=x1 and 2nd mean=x2. What I am trying to do is to plot the clusters of that Gaussian distribution with ellipses around the clusters, but for some reason I think the…
2
votes
0 answers

Using Kmeans to initialize EM-Algorithm

I've been reading recently about Expectation Maximization (EM), and it keeps coming up that initializing EM using K-Means is a good idea, but I'm having difficulty grasping this notion. As far as I know, when using K-Means, the result you get is…
Cesarior
2
votes
3 answers

The supplied model is not a clustering estimator in YellowBrick

I am trying to visualize an elbow plot for my data using YellowBrick's KElbowVisualizer and SKLearn's Expectation Maximization algorithm class: GaussianMixture. When I run this, I get the error in the title. (I have also tried ClassificationReport,…
2
votes
1 answer

How to vectorize likelihood calculation under multiple parameters?

I am trying to implement a Bernoulli mixture and was wondering how to vectorize the calculations correctly without looping. I have tried various versions of apply but can't get the desired output (dim = c(5,4,2)). Should my component parameters be…
amerikashka
2
votes
0 answers

Encoding record samples for expectation maximization algorithm

First, I'm a programmer without a data science background, so my working knowledge of statistics is quite limited. I'm creating an entity matching tool to match records across internal datasets. I want to use the probabilistic matching technique…
Casey
2
votes
1 answer

Got different EM::predict() results after EM::read() saved model in OpenCV

I'm new to OpenCV and C++ and I'm trying to build a classifier using a Gaussian Mixture Model in OpenCV. I figured out how it works and got it working ... maybe. I have something like this now: If I classify the training samples just after the…
Artoria
2
votes
1 answer

EM algorithm for two sets of latent variables

In a typical clustering problem, the probability of a data point x is p(x) = sum_k p(k)p(x|k), where k is a latent variable specifying the cluster that x belongs to. We can use the EM algorithm to maximize the log likelihood of the objective function…
2
votes
1 answer

Weka EM cluster get "Error: Could not find or load main class test" in eclipse

I want to use Weka to cluster tweets from the database in JSP. In the GUI, I find only HierarchiccalClusterer and Filteredcluster available for string clustering. Then I found this clusteringdemo sample code on the official Weka website:…
1
vote
1 answer

EM algorithm for a mixture of three normal distributions throws errors

I need to run an EM algorithm for a mixture of three normal distributions with unknown means and variances. My data points are a column with 500 rows, which I will call 'S'. First I need to write a function for the negative log-likelihood of the…
user11607046