
Given that FFmpeg is the leading multimedia framework and most video/audio players use it, I'm wondering a few things about players that use FFmpeg as an intermediate layer.

I'm studying how audio/video players work, and I have some questions.

I was reading the ffplay source code and saw that ffplay handles the subtitle stream. I tried an MKV file with a subtitle in it, and it didn't work; I tried arguments such as -sst, but nothing happened. Reading about subtitles and how video files (or should I say containers?) carry them, I saw there are two ways to include a subtitle: hardsubs and softsubs. Roughly speaking, hardsubs are burned in and become part of the video, while softsubs are carried as a separate subtitle stream (I might be wrong; please correct me).

  • The question is: how do players handle this? I mean, when the subtitle is part of the video there's nothing to do, since the video stream itself shows the subtitle, but what about softsubs? How are they handled? (I've heard about text subs as well.) How does the subtitle appear on the screen, and how can it be configured (fonts, size, colors) without re-encoding everything?

  • I was studying the source code of some video players; some (or most) of them use OpenGL to render the frame, and others use a kind of canvas (such as Qt's QWidget). Which is the most used, and which one is faster and better? OpenGL with shaders and such? Handling YUV or RGB, and so on? How does that work?

  • It might be a dumb question, but what format does AVFrame hold? For example, when we want to save frames as images, first we need the frame and then we convert it; which format are we converting from? Does it change according to the video codec, or is it always the same?

  • Most of the videos I've been handling use YUV420P, and to save the frames as PNG I need to convert them to RGB first. I did a test with several players: I paused them all at the same frame, took screenshots, and compared. The video players show the frame more colorfully, while with ffplay, which uses SDL (OpenGL), the colors (quality) of the frame seem really dull. What might cause this? What do they do? Is it shaders (or some kind of magic? haha)?

Well, I think that's it for now. I hope you can help me with this.

If this isn't the right place to ask, please point me to the right one; I haven't found a better fit among the Stack Exchange communities.

yayuj
  • ffmpeg can use libass to render softsubs; you can also just use ffmpeg to demux the subtitle stream without rendering it. The "optimal" solution for rendering video streams is platform dependent; OpenGL is just pretty much the best "common denominator" for hardware-accelerated rendering. As for the color magic, OSes and drivers sometimes apply filters to video streams; go into their settings and disable the "enhancements". – PeterT Oct 26 '14 at 14:03
  • Thank you for answering, @PeterT. Does subtitle rendering turn the subtitle into an image, or must rendering happen after getting the subtitle stream? Is it text or images? For example, if ffmpeg demuxes the subtitle stream and returns a text stream, is the programmer responsible for rendering it somehow? How is that normally done? Is it possible to use OpenGL to draw the subtitle text? I hope you understand my question. – yayuj Oct 26 '14 at 14:12
  • If you've already studied `ffplay`'s sources, why not go ahead and dive deep into FFmpeg's libs themselves? – alk Oct 26 '14 at 16:01
  • @alk - I'm doing that (or at least trying to), but any help is welcome. – yayuj Oct 26 '14 at 16:06
  • FFmpeg's support for subtitles is not entirely neat. – UmNyobe May 31 '15 at 20:44

1 Answer


There are a lot of questions in one post:

How are 'soft subtitles' handled?

The same way as any other stream:

  1. Read the packets of that stream from the container.
  2. Give each packet to the matching decoder.
  3. Use the decoded result as you wish. With most containers that support subtitles, a presentation time will be present; all you need to do at that time is take the text and burn it onto the image. There are a lot of ways to draw the text on the video, with ffmpeg or another library (a sketch of steps 1-3 follows below).
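
As an illustration, here is a minimal sketch of those three steps using the FFmpeg libraries, written against a recent FFmpeg API (the function name `dump_subtitles` is mine, and all error handling is omitted):

```c
#include <stdio.h>
#include <libavformat/avformat.h>
#include <libavcodec/avcodec.h>

/* Decode every packet of the "best" subtitle stream and print its text.
 * Sketch only: every call below can fail and should be checked. */
static void dump_subtitles(const char *path)
{
    AVFormatContext *fmt = NULL;
    avformat_open_input(&fmt, path, NULL, NULL);
    avformat_find_stream_info(fmt, NULL);

    int idx = av_find_best_stream(fmt, AVMEDIA_TYPE_SUBTITLE, -1, -1, NULL, 0);
    const AVCodec *dec =
        avcodec_find_decoder(fmt->streams[idx]->codecpar->codec_id);
    AVCodecContext *ctx = avcodec_alloc_context3(dec);
    avcodec_parameters_to_context(ctx, fmt->streams[idx]->codecpar);
    avcodec_open2(ctx, dec, NULL);

    AVPacket pkt;
    while (av_read_frame(fmt, &pkt) >= 0) {                   /* step 1 */
        if (pkt.stream_index == idx) {
            AVSubtitle sub;
            int got = 0;
            avcodec_decode_subtitle2(ctx, &sub, &got, &pkt);  /* step 2 */
            if (got) {                                        /* step 3 */
                /* A decoded subtitle holds one or more "rects": plain
                 * text, an ASS event line, or a pre-rendered bitmap. */
                for (unsigned i = 0; i < sub.num_rects; i++) {
                    if (sub.rects[i]->type == SUBTITLE_ASS)
                        printf("ASS:  %s\n", sub.rects[i]->ass);
                    else if (sub.rects[i]->type == SUBTITLE_TEXT)
                        printf("text: %s\n", sub.rects[i]->text);
                }
                avsubtitle_free(&sub);
            }
        }
        av_packet_unref(&pkt);
    }
    avcodec_free_context(&ctx);
    avformat_close_input(&fmt);
}
```

Note the two cases: text subs (SRT, ASS) come out as markup that the player must rasterize itself, typically via libass, which is also why fonts, sizes, and colors can be restyled without re-encoding anything; bitmap subs (DVD, PGS) come out pre-rendered and only need to be blended onto the frame at the right presentation time.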

What is the most used renderer and which one is fastest and better?

  1. The most used depends on the underlying system. For instance, Qt only wraps native renderers, and it even has an OpenGL version.
  2. You can only be as fast as the underlying system allows. Does it support double-buffering? Can it render your decoded pixel format directly, or do you have to perform a color conversion first (see the shader sketch below)? This topic is too broad.
  3. "Better" depends entirely on the use case; this is too broad.
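
To make the color-conversion point concrete: a player can upload the decoded Y, U, and V planes as three textures and convert to RGB on the GPU with a tiny fragment shader. A sketch (full-range BT.601 coefficients; real players also handle limited-range video and BT.709, so take the constants as illustrative):

```c
/* GLSL fragment shader (as a C string) that a player might compile to
 * convert planar YUV to RGB at render time. Coefficients assume
 * full-range BT.601. */
static const char *yuv_to_rgb_frag =
    "uniform sampler2D tex_y, tex_u, tex_v;            \n"
    "varying vec2 uv;                                  \n"
    "void main() {                                     \n"
    "    float y = texture2D(tex_y, uv).r;             \n"
    "    float u = texture2D(tex_u, uv).r - 0.5;       \n"
    "    float v = texture2D(tex_v, uv).r - 0.5;       \n"
    "    gl_FragColor = vec4(y + 1.402 * v,            \n"
    "                        y - 0.344 * u - 0.714 * v,\n"
    "                        y + 1.772 * u,            \n"
    "                        1.0);                     \n"
    "}                                                 \n";
```

Converting in the shader avoids a CPU-side conversion pass entirely, which is one reason OpenGL renderers are popular. Differences in the assumed matrix and range (BT.601 vs BT.709, limited vs full) are also a common reason the "same" frame looks more or less saturated across players.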

What format does AVFrame hold? It is a raw format (see enum AVPixelFormat), and it depends on the codec. There is a list of YUV and RGB FOURCCs which covers most of the formats in ffmpeg. Programmatically, you can read the table AVCodec::pix_fmts to obtain the pixel formats a specific codec supports.
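
A short sketch of both of those points, assuming a recent FFmpeg (the helper names are mine; error handling omitted):

```c
#include <stdio.h>
#include <libavutil/frame.h>
#include <libavutil/pixdesc.h>
#include <libavcodec/avcodec.h>
#include <libswscale/swscale.h>

/* Print every pixel format a codec declares (list ends at AV_PIX_FMT_NONE). */
static void list_pix_fmts(const AVCodec *codec)
{
    for (const enum AVPixelFormat *p = codec->pix_fmts;
         p && *p != AV_PIX_FMT_NONE; p++)
        printf("  %s\n", av_get_pix_fmt_name(*p));
}

/* Convert a decoded frame, whatever its native format, to RGB24,
 * i.e. the step needed before saving it as a PNG. */
static AVFrame *to_rgb24(const AVFrame *src)
{
    printf("decoded as: %s\n",
           av_get_pix_fmt_name((enum AVPixelFormat)src->format));

    AVFrame *rgb = av_frame_alloc();
    rgb->format = AV_PIX_FMT_RGB24;
    rgb->width  = src->width;
    rgb->height = src->height;
    av_frame_get_buffer(rgb, 0);

    struct SwsContext *sws = sws_getContext(
        src->width, src->height, (enum AVPixelFormat)src->format,
        rgb->width, rgb->height, AV_PIX_FMT_RGB24,
        SWS_BILINEAR, NULL, NULL, NULL);
    sws_scale(sws, (const uint8_t * const *)src->data, src->linesize,
              0, src->height, rgb->data, rgb->linesize);
    sws_freeContext(sws);
    return rgb;
}
```

For a typical H.264 file the decoded format will be AV_PIX_FMT_YUV420P, which matches what you observed when saving PNGs.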

UmNyobe