
I am new to DirectShow API.

I want to decode a media file and get uncompressed RGB video frames using DirectShow.

I noted that all such operations should be completed through a GraphBuilder. Also, every processing block is called a filter, and there are many different filters for different media files. For example, for decoding H264 we should use "Microsoft MPEG-2 Video Decoder", for AVI files the "AVI Splitter Filter", etc.

I would like to know if there is a general way (decoder) that can handle all those file types?

I would really appreciate it if someone could point out an example that goes from importing a local file to decoding it into uncompressed RGB frames. All the examples I found deal with window handles: they just configure the graph and call pGraph->Run(). I have also looked through the Windows SDK samples, but couldn't find a useful one.

Thanks very much in advance.

mbaros

2 Answers


A universal DirectShow decoder is, in general, against the concept of the DirectShow API. The whole idea is that individual filters are responsible for individual tasks (esp. decoding a certain encoding or demultiplexing a certain container format). The registry of filters and Intelligent Connect let one have filters built into a chain to do certain requested processing, in particular decoding from a compressed format to 24-bit RGB for video.

From this standpoint you don't need a universal decoder, and it is not expected that such a decoder exists. However, such a decoder (or close to it) does exist: ffdshow or one of its derivatives. Presently, you might want to look at LAVFilters, for example. They wrap FFmpeg, which itself can handle many formats, and connect it to the DirectShow API so that, as a filter, ffdshow can handle many formats/encodings.

There is no general rule to use or not to use such a codec pack; in most cases you take various factors into consideration and decide what to do. If your application handles various scenarios, a good starting point into graph building would be Overview of Graph Building.
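To make the graph-building idea concrete, here is a minimal sketch of doing it programmatically: create the Filter Graph Manager and let RenderFile invoke Intelligent Connect to assemble the splitter/decoder chain. This assumes a Windows build environment with dshow.h and strmiids.lib; error handling is trimmed for brevity.

```cpp
// Minimal sketch: build a playback graph for a file and let
// Intelligent Connect pick the splitter/decoder filters from the registry.
#include <dshow.h>

HRESULT RenderMediaFile(const wchar_t* path)
{
    IGraphBuilder* pGraph = nullptr;
    IMediaControl* pControl = nullptr;

    HRESULT hr = CoCreateInstance(CLSID_FilterGraph, nullptr,
                                  CLSCTX_INPROC_SERVER, IID_IGraphBuilder,
                                  reinterpret_cast<void**>(&pGraph));
    if (FAILED(hr))
        return hr;

    // Intelligent Connect: consults the filter registry and builds the
    // splitter -> decoder -> renderer chain appropriate for this file.
    hr = pGraph->RenderFile(path, nullptr);
    if (SUCCEEDED(hr))
    {
        pGraph->QueryInterface(IID_IMediaControl,
                               reinterpret_cast<void**>(&pControl));
        hr = pControl->Run();
        pControl->Release();
    }
    pGraph->Release();
    return hr;
}
```

The same chain you see GraphEdit/GraphStudioNext assemble interactively is built here by the single RenderFile call.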

> My goal is to accomplish the task using DirectShow in order to have no external dependencies. Do you know a particular example that decompresses frames for some file type?

Your request is too broad and at the same time typical and, to some extent, fairly simple. If you spend some time playing with the GraphEdit SDK tool, or rather GraphStudioNext, which is a more powerful version of the former, you will be able to build filter graphs interactively, render media files of different types, and see which filters participate in rendering. You can accomplish the very same thing programmatically, since the interactive actions basically all have matching API calls.

You will be able to see that specific formats are handled by different filters, and that Intelligent Connect, mentioned above, builds chains of filters in combinations in order to satisfy the request and get the pipeline together.

The default use case is playback, and if you want to get video rendered to 24/32-bit RGB, your course of action is pretty much similar: you are to build a graph, which just needs to terminate with something other than a video renderer window. A more flexible and sophisticated approach, typical for advanced development, is to supply a custom video renderer filter and accept decompressed RGB frames on it.

A simple and very popular version of the solution is to use the Sample Grabber filter: initialize it to accept RGB, set up a callback on it so that your SampleCB callback method is called every time an RGB frame is decompressed, and use the Sample Grabber in the graph. (You will find a lot of attempts to accomplish this if you search open source code and/or the web for the keywords ISampleGrabber, ISampleGrabberCB, SampleCB or BufferCB, MEDIASUBTYPE_RGB24.)
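A rough sketch of that setup, under stated assumptions: ISampleGrabber lives in the deprecated qedit.h header, the callback class here is deliberately simplified (stack-allocated, trivial COM bookkeeping), and graph wiring plus error handling are omitted.

```cpp
// Sketch: configure a Sample Grabber to deliver 24-bit RGB frames
// to a SampleCB callback. Assumes qedit.h (deprecated) is available.
#include <dshow.h>
#include <qedit.h>  // ISampleGrabber, ISampleGrabberCB

class FrameGrabberCB : public ISampleGrabberCB
{
public:
    // Called on the streaming thread for every decompressed frame.
    STDMETHODIMP SampleCB(double sampleTime, IMediaSample* pSample) override
    {
        BYTE* pData = nullptr;
        pSample->GetPointer(&pData);
        // pData now points at a packed, bottom-up RGB24 frame.
        return S_OK;
    }
    STDMETHODIMP BufferCB(double, BYTE*, long) override { return E_NOTIMPL; }

    // Simplified COM plumbing for a stack-allocated callback object.
    STDMETHODIMP QueryInterface(REFIID, void** ppv) override
        { *ppv = this; return S_OK; }
    STDMETHODIMP_(ULONG) AddRef() override  { return 1; }
    STDMETHODIMP_(ULONG) Release() override { return 1; }
};

void ConfigureGrabber(ISampleGrabber* pGrabber, FrameGrabberCB* pCallback)
{
    AM_MEDIA_TYPE mt = {};
    mt.majortype = MEDIATYPE_Video;
    mt.subtype   = MEDIASUBTYPE_RGB24;    // force uncompressed 24-bit RGB
    pGrabber->SetMediaType(&mt);
    pGrabber->SetCallback(pCallback, 0);  // 0 selects SampleCB
}
```

After configuring the grabber, insert it into the graph before a renderer (a Null Renderer works if no window is wanted) and render the file pin-by-pin or via RenderFile.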

Another more or less popular approach is to set up a playback pipeline, play the file, and read frames back from the video presenter. This is suggested in the other answer to the question, is relatively easy to do, and does the job if you don't have performance requirements or a requirement to extract every single frame. That is, it is a good way to grab a random RGB frame from the feed, but not every frame.

Roman R.
  • Thanks for the answer. My goal is to accomplish the task using DirectShow in order to have no external dependencies. Do you know a particular example that decompresses frames for some file type? – mbaros Aug 08 '17 at 12:07
  • Again, thank you for your answer. I will dig more into this. The hints I received should be enough to solve the problem soon. Let me ask a final question about this. I also want to extract some metadata from the media file: FrameRate, BitRate and FrameCount. Can you give me a hint on where and how I can find this information? – mbaros Aug 08 '17 at 17:23
  • When you see the graphs built, you will notice pin connections and media types on them. They typically contain the data you are looking for. There is also a separate non-DirectShow API for metadata (property handlers - [see here](https://stackoverflow.com/a/41859499/868014)). Note that data such as frame count is not necessarily available via metadata (that is, you might need to go through all frames to count them). – Roman R. Aug 08 '17 at 17:32
  • Look, here is the problem. For a while, I was working on the same task through Media Foundation. I succeeded. But the problem was that there were many files that could not be opened via MF, since they need external codecs which are available only through DirectShow. So I doubt I can get metadata of those files through Media Foundation. Anyway, I really appreciate your help all this time. – mbaros Aug 08 '17 at 18:04
  • Thanks. I have successfully implemented it through SampleGrabber – mbaros Aug 11 '17 at 14:53

You are looking for the VMR9 example in the DirectShow samples.

In your Windows SDK install, look for this example:

Microsoft SDKs\Windows\v7.0\Samples\multimedia\directshow\vmr9\windowless\windowless.sln

Then search for the function CaptureImage. In this method, the call to IVMRWindowlessControl9::GetCurrentImage is exactly what you want.

This method captures a video frame in bitmap (RGB) format. Here is the CaptureImage code:

BOOL CaptureImage(LPCTSTR szFile)
{
    HRESULT hr;

    if (pWC && !g_bAudioOnly)
    {
        BYTE* lpCurrImage = NULL;

        // Read the current video frame into a byte buffer.  The information
        // will be returned in a packed Windows DIB and will be allocated
        // by the VMR.
        if (SUCCEEDED(hr = pWC->GetCurrentImage(&lpCurrImage)))
        {
            BITMAPFILEHEADER    hdr;
            DWORD               dwSize, dwWritten;
            LPBITMAPINFOHEADER  pdib = (LPBITMAPINFOHEADER) lpCurrImage;

            // Create a new file to store the bitmap data
            HANDLE hFile = CreateFile(szFile, GENERIC_WRITE, FILE_SHARE_READ, NULL,
                                      CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, 0);

            if (hFile == INVALID_HANDLE_VALUE)
            {
                // Free the VMR-allocated image before the early return
                CoTaskMemFree(lpCurrImage);
                return FALSE;
            }

            // Initialize the bitmap header
            dwSize = DibSize(pdib);
            hdr.bfType      = BFT_BITMAP;
            hdr.bfSize      = dwSize + sizeof(BITMAPFILEHEADER);
            hdr.bfReserved1 = 0;
            hdr.bfReserved2 = 0;
            hdr.bfOffBits   = (DWORD)sizeof(BITMAPFILEHEADER) + pdib->biSize +
                              DibPaletteSize(pdib);

            // Write the bitmap header and bitmap bits to the file
            WriteFile(hFile, (LPCVOID) &hdr, sizeof(BITMAPFILEHEADER), &dwWritten, 0);
            WriteFile(hFile, (LPCVOID) pdib, dwSize, &dwWritten, 0);

            // Close the file
            CloseHandle(hFile);

            // The app must free the image data returned from GetCurrentImage()
            CoTaskMemFree(lpCurrImage);

            // Give user feedback that the write has completed
            TCHAR szDir[MAX_PATH];

            GetCurrentDirectory(MAX_PATH, szDir);

            // Strip off the trailing slash, if it exists
            int nLength = (int) _tcslen(szDir);
            if (szDir[nLength-1] == TEXT('\\'))
                szDir[nLength-1] = TEXT('\0');

            Msg(TEXT("Captured current image to %s\\%s."), szDir, szFile);
            return TRUE;
        }
        else
        {
            Msg(TEXT("Failed to capture image!  hr=0x%x"), hr);
            return FALSE;
        }
    }

    return FALSE;
}
Ing. Gerardo Sánchez
  • Thanks for the answer. This is almost what I want. The only issue is that the pWC (IVMRWindowlessControl9) is being initialized on top of a HWND object. Is there a way to avoid it and get the current image right from the graph? So we don't have any window handles? – mbaros Aug 09 '17 at 11:36
  • ok, you have 2 options: 1) simple one: use the IVMRWindowlessControl9::SetVideoPosition method to move the video off-screen. 2) complex and elegant solution: build your own filter (and catch IMediaSample data) and use the "Null Renderer" filter to hide the video – Ing. Gerardo Sánchez Aug 09 '17 at 15:02
  • Thanks for the suggestions. I went with the ISampleGrabber method which Roman suggested. It seemed the most accurate and the easiest solution. Hopefully, that API will not get removed from Windows. Otherwise, I will have to follow your advice and implement a filter from scratch. – mbaros Aug 11 '17 at 14:54