I am studying the source code of the Mixture of Gaussians algorithm implemented in OpenCV 3.0 (the source code is in the contrib repository). I am focusing on the following code segment (retrieved from `opencv_contrib-master\modules\bgsegm\src\bgfg_gaussmix.cpp`, using the single-channel version `process8uC1()` as an example):

    for( y = 0; y < rows; y++ )
    {
        const uchar* src = image.ptr<uchar>(y);
        uchar* dst = fgmask.ptr<uchar>(y);
        // ... removed some code as I only consider learning rate within [0, 1]
        for( x = 0; x < cols; x++, mptr += K )
        {
            float wsum = 0;
            float pix = src[x];
            int kHit = -1, kForeground = -1;
            for( k = 0; k < K; k++ )
            {
                float w = mptr[k].weight;
                wsum += w;
                if( w < FLT_EPSILON )
                    break;
                float mu = mptr[k].mean;
                float var = mptr[k].var;
                float diff = pix - mu;
                float d2 = diff*diff;
                if( d2 < vT*var )
                {
                    wsum -= w;
                    float dw = alpha*(1.f - w);
                    mptr[k].weight = w + dw;
                    mptr[k].mean = mu + alpha*diff;
                    var = std::max(var + alpha*(d2 - var), minVar);
                    mptr[k].var = var;
                    mptr[k].sortKey = w/std::sqrt(var);
                    for( k1 = k-1; k1 >= 0; k1-- )
                    {
                        if( mptr[k1].sortKey >= mptr[k1+1].sortKey )
                            break;
                        std::swap( mptr[k1], mptr[k1+1] );
                    }
                    kHit = k1+1;
                    break;
                }
            }
            if( kHit < 0 ) // no appropriate gaussian mixture found at all, remove the weakest mixture and create a new one
            {
                kHit = k = std::min(k, K-1);
                wsum += w0 - mptr[k].weight;
                mptr[k].weight = w0;
                mptr[k].mean = pix;
                mptr[k].var = var0;
                mptr[k].sortKey = sk0;
            }
            else
                for( ; k < K; k++ )
                    wsum += mptr[k].weight;
            float wscale = 1.f/wsum;
            wsum = 0;
            for( k = 0; k < K; k++ )
            {
                wsum += mptr[k].weight *= wscale;
                mptr[k].sortKey *= wscale;
                if( wsum > T && kForeground < 0 )
                    kForeground = k+1;
            }
            dst[x] = (uchar)(-(kHit >= kForeground));
        }
    }

As far as I understand, the pixel is considered foreground if `kHit >= kForeground` at the end of the code segment. This may occur in 2 cases:

- when `kHit == -1` after the first `for(k)` loop. This is easy to interpret, as none of the Gaussian curves can fit the current pixel, i.e. all Gaussian curves treat the current pixel as foreground. Or
- when `kHit > -1` after the first `for(k)` loop, and `kHit >= kForeground`, where `kForeground` is found in the final `for(k)` loop.
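
To make the role of `kForeground` concrete, here is a minimal standalone sketch of the final `for(k)` loop and the foreground test, as I read them. The helper names `findKForeground` and `isForeground` are my own and are not part of OpenCV, and the sketch assumes the weights are already sorted by descending `sortKey` and already normalised to sum to 1.

```cpp
#include <vector>

// Hypothetical helper (not an OpenCV function): mirrors the final for(k) loop.
// Returns the count of leading components needed for the cumulative weight
// to exceed T for the first time, or -1 if it never exceeds T.
int findKForeground(const std::vector<float>& weights, float T)
{
    float wsum = 0.f;
    int kForeground = -1;
    for (int k = 0; k < static_cast<int>(weights.size()); k++)
    {
        wsum += weights[k];
        if (wsum > T && kForeground < 0)
            kForeground = k + 1;
    }
    return kForeground;
}

// Hypothetical helper: the same comparison as
// dst[x] = (uchar)(-(kHit >= kForeground));
bool isForeground(int kHit, int kForeground)
{
    return kHit >= kForeground;
}
```

For example, with made-up weights {0.5, 0.25, 0.125, 0.125} and T = 0.7, the cumulative sum first exceeds T at the second component, so `findKForeground` returns 2. A hit on component 0 or 1 then gives background, while a hit on component 2 or 3 gives foreground.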
What I do not understand is:

- What is the physical interpretation of the condition `if( wsum > T && kForeground < 0 )`? I know that `wsum` is the running cumulative sum of the weights of the first `k` curves, and `T` is the background ratio set by the user within [0, 1], but why do we need to calculate the cumulative sum of the weights and compare it with the background ratio?
- Why do we need to sort the curves in descending order of `sortKey`, which is calculated as `w/std::sqrt(var)` (weight / standard deviation)? Why not simply sort the curves in descending order of weight?