
I am trying to study the source code of the Mixture of Gaussians algorithm implemented in OpenCV 3.0 (the source code is in the contrib repository). I am focusing on the following code segment (retrieved from opencv_contrib-master\modules\bgsegm\src\bgfg_gaussmix.cpp, using the single-channel version process8uC1() as an example):

for( y = 0; y < rows; y++ )
{
    const uchar* src = image.ptr<uchar>(y);
    uchar* dst = fgmask.ptr<uchar>(y);
    // ... some code removed here, as I only consider a learning rate within [0, 1]
        for( x = 0; x < cols; x++, mptr += K )
        {
            float wsum = 0;
            float pix = src[x];
            int kHit = -1, kForeground = -1;

            for( k = 0; k < K; k++ )
            {
                float w = mptr[k].weight;
                wsum += w;
                if( w < FLT_EPSILON )
                    break;
                float mu = mptr[k].mean;
                float var = mptr[k].var;
                float diff = pix - mu;
                float d2 = diff*diff;
                if( d2 < vT*var )
                {
                    wsum -= w;
                    float dw = alpha*(1.f - w);
                    mptr[k].weight = w + dw;
                    mptr[k].mean = mu + alpha*diff;
                    var = std::max(var + alpha*(d2 - var), minVar);
                    mptr[k].var = var;
                    mptr[k].sortKey = w/std::sqrt(var);

                    for( k1 = k-1; k1 >= 0; k1-- )
                    {
                        if( mptr[k1].sortKey >= mptr[k1+1].sortKey )
                            break;
                        std::swap( mptr[k1], mptr[k1+1] );
                    }

                    kHit = k1+1;
                    break;
                }
            }

            if( kHit < 0 ) // no appropriate gaussian mixture found at all, remove the weakest mixture and create a new one
            {
                kHit = k = std::min(k, K-1);
                wsum += w0 - mptr[k].weight;
                mptr[k].weight = w0;
                mptr[k].mean = pix;
                mptr[k].var = var0;
                mptr[k].sortKey = sk0;
            }
            else
                for( ; k < K; k++ )
                    wsum += mptr[k].weight;

            float wscale = 1.f/wsum;
            wsum = 0;
            for( k = 0; k < K; k++ )
            {
                wsum += mptr[k].weight *= wscale;
                mptr[k].sortKey *= wscale;
                if( wsum > T && kForeground < 0 )
                    kForeground = k+1;
            }

            dst[x] = (uchar)(-(kHit >= kForeground));
        }
}
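
For context, this per-pixel routine is reached through the public bgsegm interface. Below is a minimal driver of my own that I use while stepping through it; the video path, the parameter values, and the learning rate are placeholders I chose, not values the library mandates (as far as I can tell from the source, nmixtures corresponds to K and backgroundRatio to T in the snippet above):

#include <opencv2/core.hpp>
#include <opencv2/imgproc.hpp>
#include <opencv2/videoio.hpp>
#include <opencv2/video.hpp>
#include <opencv2/bgsegm.hpp>

int main()
{
    cv::VideoCapture cap("input.avi");   // placeholder input path
    cv::Ptr<cv::bgsegm::BackgroundSubtractorMOG> mog =
        cv::bgsegm::createBackgroundSubtractorMOG(/*history=*/200,
                                                  /*nmixtures=*/5,         // K in the snippet above
                                                  /*backgroundRatio=*/0.7, // T in the snippet above
                                                  /*noiseSigma=*/0);
    cv::Mat frame, gray, fgmask;
    while (cap.read(frame))
    {
        cv::cvtColor(frame, gray, cv::COLOR_BGR2GRAY);   // 8-bit single channel -> process8uC1()
        mog->apply(gray, fgmask, /*learningRate=*/0.01); // learning rate within [0, 1]
    }
    return 0;
}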

As far as I understand, the pixel is classified as foreground if kHit >= kForeground at the end of the code segment. This can occur in two cases:

  1. kHit == -1 after the first for(k) loop. This is easy to interpret: none of the Gaussian curves fits the current pixel, i.e. all of them treat the current pixel as foreground. Or
  2. kHit > -1 after the first for(k) loop, and kHit >= kForeground, where kForeground is determined in the final for(k) loop (a toy numeric sketch of this case follows below).
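
To make case 2 concrete, here is a toy sketch with hypothetical, already-normalized weights; the numbers and the matched index are made up, and only the kForeground computation is copied from the snippet above:

#include <cstdio>

int main()
{
    const int K = 5;
    float weight[K] = { 0.55f, 0.25f, 0.10f, 0.06f, 0.04f }; // hypothetical, sorted by sortKey
    float T = 0.7f;          // background ratio
    int kHit = 3;            // suppose the pixel matched the 4th Gaussian
    int kForeground = -1;

    float wsum = 0;
    for (int k = 0; k < K; k++)
    {
        wsum += weight[k];
        if (wsum > T && kForeground < 0)
            kForeground = k + 1;  // here: 0.55 + 0.25 = 0.80 > 0.7, so kForeground = 2
    }
    std::printf("kForeground = %d, foreground = %d\n", kForeground, kHit >= kForeground);
    // prints: kForeground = 2, foreground = 1
    return 0;
}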

What I do not understand is the following:

  1. What is the physical interpretation of the condition if( wsum > T && kForeground < 0 )? I know that wsum is the running cumulative sum of the weights of the first k curves, and that T is the background ratio, set by the user within [0, 1], but why do we need to compute this cumulative sum of the weights and compare it with the background ratio?
  2. Why do we need to sort the curves in descending order of sortKey, which is computed as w/std::sqrt(var) (weight / standard deviation)? Why not simply sort them in descending order of weight? (A made-up example where the two orderings disagree follows this list.)
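
To make question 2 concrete, here is a made-up pair of Gaussians for which the two orderings disagree; the weights and variances are purely hypothetical:

#include <cmath>
#include <cstdio>

int main()
{
    // Gaussian A: heavy weight but very spread out; Gaussian B: lighter but tight.
    float wA = 0.30f, varA = 400.f;   // standard deviation = 20
    float wB = 0.25f, varB = 25.f;    // standard deviation = 5
    std::printf("by weight:  A = %g, B = %g\n", wA, wB);  // A would rank first
    std::printf("by sortKey: A = %g, B = %g\n",
                wA / std::sqrt(varA),                     // 0.015
                wB / std::sqrt(varB));                    // 0.05, so B ranks first
    return 0;
}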