
My goal is to splice together video fragments from several video files. Fragments are defined by arbitrary start and end times. Initially I wanted to do it using a library like mp4parser, but it can only cut streams at sync (I-frame) points, while I need higher precision.

My scenario is: extract the encoded stream from a file -> decode -> encode -> mux the result into an mp4 file. Right now the code generally works, but the resulting video is white noise. Tested on a Nexus S and a Galaxy S3. My code is a combination of several examples:

  • Read previously recorded files based on MoviePlayer.java
  • Decode-Encode: DecodeEditEncodeTest.java
  • Mux video stream into mp4 - yet another example, not relevant here

I want to simplify the examples because I do not need to process frames in the middle. I've tried to feed buffers from the decoder output to the encoder input without a Surface in the middle. The overall process worked in the sense that the code ran to completion and produced a playable video file. However, the contents of the file are white noise.

Here is the snippet of code that feeds frames from the decoder to the encoder. What is wrong, and how do I make it work?

...
} else { // decoderStatus >= 0
    if (VERBOSE) Log.d(TAG, "surface decoder given buffer "
            + decoderStatus + " (size=" + info.size + ")");
    boolean doRender = (info.size != 0);
    if (doRender) {
        // The Surface path from DecodeEditEncodeTest is commented out:
        // frames are copied to the encoder as ByteBuffers instead.
        // outputSurface.awaitNewImage();
        // outputSurface.drawImage();
        // inputSurface.setPresentationTime(info.presentationTimeUs * 1000);
        // inputSurface.swapBuffers();

        encoderStatus = encoder.dequeueInputBuffer(-1);
        if (encoderStatus >= 0) {
            encoderInputBuffers[encoderStatus].clear();

            decoderOutputBuffers[decoderStatus].position(info.offset);
            decoderOutputBuffers[decoderStatus].limit(info.offset + info.size);

            encoderInputBuffers[encoderStatus].put(decoderOutputBuffers[decoderStatus]);
            // Note: queueInputBuffer() takes microseconds; the "* 1000" here
            // was for setPresentationTime(), which takes nanoseconds.
            encoder.queueInputBuffer(encoderStatus, 0, info.size,
                    info.presentationTimeUs * 1000, 0);
        }
    }
    decoder.releaseOutputBuffer(decoderStatus, false);
...
Kirill K
  • `grafika` is an invalid tag (a Russianism, from the Russian word for "graphics"); it should be `graphics` instead – Stan Apr 21 '15 at 13:30
  • @Stan, grafika is a correct term. It doesn't refer to a Polish or Russian word but rather to this: [https://github.com/google/grafika](https://github.com/google/grafika) – Kirill K Apr 21 '15 at 14:36
  • Oh, it's a lib. Thank you for pointing me to it. – Stan Apr 21 '15 at 14:45
  • My understanding is that this is device dependent - on some devices, the decoder will output frame formats which the encoder cannot accept without intermediate conversion. Check the format of the decoded data you are getting, and see if it is one that the encoder reports as accepting. If you wanted to do it right, you would process the stream being added forward from the last preceding keyframe, and synthesize a new one, but that will require really digging into the details of each compressed stream format to be supported. – Chris Stratton Apr 21 '15 at 14:52
  • @ChrisStratton, what aspect of the format would you recommend checking, and how do I query for the encoder input format? In any case, `MediaCodec` only allows configuration of the encoder output format in `MediaCodec.configure(MediaFormat format, Surface surface, MediaCrypto crypto, int flags)`. Even if the answer is somewhere in `MediaCodecInfo.CodecCapabilities`, I can't figure out how to determine/distinguish between input and output capabilities. – Kirill K Apr 21 '15 at 15:03
  • See MediaCodec's getOutputFormat() and getInputFormat() methods. – Chris Stratton Apr 21 '15 at 15:09
  • Thanks @ChrisStratton, but getInputFormat() is only available from API 21, which is not yet supported on the majority of devices :(. – Kirill K Apr 21 '15 at 15:21
  • FWIW, Grafika was named for the Polish word for "graphics". – fadden Apr 21 '15 at 15:46
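
Regarding the exchange above about querying encoder capabilities before API 21: `MediaCodecInfo.CodecCapabilities` has been available since API 16, and its `colorFormats` array lists the color formats an encoder accepts at its input. A sketch of the check (the `"video/avc"` MIME type and the method name are illustrative, not from the thread):

import android.media.MediaCodecInfo;
import android.media.MediaCodecList;
import android.util.Log;

static void logEncoderColorFormats(String mimeType) {
    for (int i = 0; i < MediaCodecList.getCodecCount(); i++) {
        MediaCodecInfo info = MediaCodecList.getCodecInfoAt(i);
        if (!info.isEncoder()) continue;
        for (String type : info.getSupportedTypes()) {
            if (!type.equalsIgnoreCase(mimeType)) continue;
            MediaCodecInfo.CodecCapabilities caps =
                    info.getCapabilitiesForType(type);
            for (int cf : caps.colorFormats) {
                // Compare these against the color format the decoder
                // emits before wiring ByteBuffers together directly.
                Log.d("Caps", info.getName() + " accepts color format " + cf);
            }
        }
    }
}
// Usage: logEncoderColorFormats("video/avc");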

1 Answer

It's much better to use a Surface than a ByteBuffer. It's faster as well as more portable. Surfaces are queues of buffers, not just framebuffers for pixel data; decoded video frames are passed around by handle. If you use ByteBuffers, the video data has to be copied a couple of times, which will slow you down.

Create the MediaCodec encoder, get the input surface, and pass that to the decoder as its output surface.
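
A minimal sketch of that wiring (API 18+); the bitrate, frame rate, and `"video/avc"` MIME type are placeholder choices, and `srcFormat` is assumed to come from `MediaExtractor.getTrackFormat()`:

import android.media.MediaCodec;
import android.media.MediaCodecInfo;
import android.media.MediaFormat;
import android.view.Surface;
import java.io.IOException;

static void wireDecoderToEncoder(MediaFormat srcFormat, int width, int height)
        throws IOException {
    MediaFormat encFormat = MediaFormat.createVideoFormat("video/avc", width, height);
    encFormat.setInteger(MediaFormat.KEY_COLOR_FORMAT,
            MediaCodecInfo.CodecCapabilities.COLOR_FormatSurface);
    encFormat.setInteger(MediaFormat.KEY_BIT_RATE, 2000000);
    encFormat.setInteger(MediaFormat.KEY_FRAME_RATE, 30);
    encFormat.setInteger(MediaFormat.KEY_I_FRAME_INTERVAL, 1);

    // Encoder first: configure for Surface input, then hand its input
    // Surface to the decoder as the render target.
    MediaCodec encoder = MediaCodec.createEncoderByType("video/avc");
    encoder.configure(encFormat, null, null, MediaCodec.CONFIGURE_FLAG_ENCODE);
    Surface encoderInputSurface = encoder.createInputSurface();  // API 18+
    encoder.start();

    MediaCodec decoder = MediaCodec.createDecoderByType(
            srcFormat.getString(MediaFormat.KEY_MIME));
    decoder.configure(srcFormat, encoderInputSurface, null, 0);
    decoder.start();

    // In the drain loop, decoder.releaseOutputBuffer(index, true) now
    // renders each decoded frame straight onto the encoder's input
    // surface; no copy through ByteBuffers.
}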

If you need to work with API 16/17, you're stuck with ByteBuffers. If you search around you can find reverse-engineered converters for the wacky Qualcomm formats, but bear in mind that there were no CTS tests until API 18, so there are no guarantees.
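
If you do go the ByteBuffer route, it is worth at least logging what the decoder emits so you can compare it against the encoder's accepted formats. A sketch of the check inside the usual drain loop, assuming `decoder`, `info` (the `MediaCodec.BufferInfo` passed to `dequeueOutputBuffer()`), `TIMEOUT_US`, and `TAG` are defined elsewhere:

int status = decoder.dequeueOutputBuffer(info, TIMEOUT_US);
if (status == MediaCodec.INFO_OUTPUT_FORMAT_CHANGED) {
    MediaFormat fmt = decoder.getOutputFormat();
    int colorFormat = fmt.getInteger(MediaFormat.KEY_COLOR_FORMAT);
    // 19 = COLOR_FormatYUV420Planar, 21 = COLOR_FormatYUV420SemiPlanar;
    // Qualcomm decoders often report vendor-specific values (in the
    // 0x7F000000 range) that a stock encoder will not accept without
    // a manual conversion step.
    Log.d(TAG, "decoder output color format: 0x"
            + Integer.toHexString(colorFormat));
}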

fadden
  • Thank you! I will stay with Surfaces. Did I understand correctly that by saying "Create the MediaCodec encoder, get the input surface, and pass that to the decoder as its output surface." you mean using the `InputSurface` and `OutputSurface` helper classes from the examples? (I've tried to use a "naked" `Surface` and it did not work.) Otherwise, how do I set `presentationTimeUs` for the next encoder frame, wait for the decoder to complete its drawing, etc.? I apologize if my questions are too naive: I am trying to implement apparently straightforward functionality with fewer lines/simpler code. – Kirill K Apr 21 '15 at 22:48
  • You shouldn't need the helper classes. Call http://developer.android.com/reference/android/media/MediaCodec.html#createInputSurface() on the encoder, and pass the Surface it returns to the decoder's `configure()`. You can have only one decoder feeding the encoder at a time, but if you `stop()` and `release()` the decoder, it should release the Surface and allow you to attach a different MediaCodec decoder instance. – fadden Apr 21 '15 at 22:56
  • Thanks again, it's amazing that the world's No. 1 expert on the matter answers in real time. – Kirill K Apr 21 '15 at 23:13
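
For the original splicing goal, here is a sketch of the stop()/release() pattern described in the comment above: one encoder kept alive across clips, with a fresh decoder attached to its input Surface per clip. `clipPaths`, `videoTrackIndex`, and `transcodeClip()` are placeholders, not code from this thread:

import android.media.MediaCodec;
import android.media.MediaExtractor;
import android.media.MediaFormat;
import android.view.Surface;
import java.io.IOException;
import java.util.List;

static void spliceClips(MediaCodec encoder, Surface encoderSurface,
                        List<String> clipPaths, int videoTrackIndex)
        throws IOException {
    for (String path : clipPaths) {
        MediaExtractor extractor = new MediaExtractor();
        extractor.setDataSource(path);
        extractor.selectTrack(videoTrackIndex);
        // To start a fragment at an arbitrary time, seek to the keyframe
        // before the cut point and decode forward from there:
        // extractor.seekTo(startUs, MediaExtractor.SEEK_TO_PREVIOUS_SYNC);

        MediaFormat format = extractor.getTrackFormat(videoTrackIndex);
        MediaCodec decoder = MediaCodec.createDecoderByType(
                format.getString(MediaFormat.KEY_MIME));
        decoder.configure(format, encoderSurface, null, 0);
        decoder.start();

        transcodeClip(extractor, decoder);  // feed/drain loop, not shown

        decoder.stop();
        decoder.release();   // frees the Surface for the next decoder
        extractor.release();
    }
    encoder.signalEndOfInputStream();  // flush the encoder when done
}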