I am using the V4L2 API to grab images from a Microsoft LifeCam and then transferring these images over TCP to a remote computer. I am also encoding the video frames as MPEG-2 video (AV_CODEC_ID_MPEG2VIDEO) using the ffmpeg API. The recorded videos play too fast, probably because not enough frames are being captured and because the FPS settings are incorrect.
The following code converts a YUV422 source to an RGB888 image. This fragment is the bottleneck in my code: it takes nearly 100-150 ms to execute, which means I can't log more than 6-10 FPS at 1280x720 resolution. The CPU usage is 100% as well.
    for (int line = 0; line < image_height; line++) {
        for (int column = 0; column < image_width; column++) {
            *dst++ = CLAMP((double)*py + 1.402*((double)*pv - 128.0)); // R - first byte
            *dst++ = CLAMP((double)*py - 0.344*((double)*pu - 128.0) - 0.714*((double)*pv - 128.0)); // G - next byte
            *dst++ = CLAMP((double)*py + 1.772*((double)*pu - 128.0)); // B - next byte
            // copy the luma sample into the Y plane of the frame that is encoded
            vid_frame->data[0][line * frame->linesize[0] + column] = *py;
            // increment py, pu, pv here
        }
    }
'dst' is then compressed as JPEG and sent over TCP, and 'vid_frame' is saved to disk.
How can I make this code fragment faster so that I can get at least 30 FPS at 1280x720 resolution, compared to the present 5-6 FPS?
I've tried parallelizing the for loop across three threads using pthreads, processing one third of the rows in each thread.
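For reference, a fixed-point variant of the loop above might look like the following sketch. It assumes the camera delivers packed YUYV (byte order Y0 U Y1 V, two pixels per four bytes) with no row padding, which may not match the actual buffer layout. The names yuyv_to_rgb24 and clamp_u8 are hypothetical, and the sketch has not been benchmarked against this camera. The coefficients are the same ones used above, scaled by 2^16, so the chroma terms are computed once per pixel pair with integer math only.

```c
#include <stddef.h>
#include <stdint.h>

/* Clamp an int to the 0..255 range of one colour byte. */
static inline uint8_t clamp_u8(int v)
{
    return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

/* Convert packed YUYV (assumed byte order: Y0 U Y1 V) to RGB24.
 * Coefficients are scaled by 2^16:
 * 1.402 -> 91881, 0.344 -> 22544, 0.714 -> 46793, 1.772 -> 116130.
 * Assumes >> on a negative int is an arithmetic shift (true for gcc/clang). */
void yuyv_to_rgb24(const uint8_t *src, uint8_t *dst, int width, int height)
{
    size_t pairs = (size_t)width * (size_t)height / 2;

    for (size_t i = 0; i < pairs; i++) {
        int y0 = src[0];
        int u  = src[1] - 128;
        int y1 = src[2];
        int v  = src[3] - 128;

        /* Chroma contributions are shared by both pixels of the pair. */
        int r = (91881 * v + 32768) >> 16;
        int g = (-22544 * u - 46793 * v + 32768) >> 16;
        int b = (116130 * u + 32768) >> 16;

        dst[0] = clamp_u8(y0 + r);
        dst[1] = clamp_u8(y0 + g);
        dst[2] = clamp_u8(y0 + b);
        dst[3] = clamp_u8(y1 + r);
        dst[4] = clamp_u8(y1 + g);
        dst[5] = clamp_u8(y1 + b);

        src += 4;
        dst += 6;
    }
}
```

It would be called once per captured frame, for example yuyv_to_rgb24(buffer, rgb, 1280, 720), where buffer and rgb stand in for the capture and output buffers.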
    for (int line = 0; line < image_height/3; line++) // thread 1
    for (int line = image_height/3; line < 2*image_height/3; line++) // thread 2
    for (int line = 2*image_height/3; line < image_height; line++) // thread 3
This gave me only a minor improvement of 20-30 milliseconds per frame. What would be the best way to parallelize such loops? Can I use GPU computing or something like OpenMP, say by spawning some 100 threads to do the calculations?
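To make the OpenMP idea concrete, here is a minimal sketch of splitting the rows across cores. The names row_fn, for_each_row_parallel and copy_luma_row are hypothetical, and the strides are assumed to be given in bytes. It needs -fopenmp (gcc) to actually run in parallel; without that flag the pragma is ignored and the loop runs serially. On a quad-core CPU, a CPU-bound loop like this would not be expected to gain from many more threads than cores.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-row worker: converts one row of packed source pixels. */
typedef void (*row_fn)(const uint8_t *src_row, uint8_t *dst_row, int width);

/* Run convert_row on every row. Rows do not depend on each other,
 * so OpenMP can hand out contiguous blocks of rows to each thread. */
void for_each_row_parallel(const uint8_t *src, size_t src_stride,
                           uint8_t *dst, size_t dst_stride,
                           int width, int height, row_fn convert_row)
{
    #pragma omp parallel for schedule(static)
    for (int line = 0; line < height; line++) {
        convert_row(src + (size_t)line * src_stride,
                    dst + (size_t)line * dst_stride, width);
    }
}

/* Example worker: copies the luma bytes of a YUYV row (every second byte),
 * which is what the Y-plane assignment in the original loop does. */
void copy_luma_row(const uint8_t *src_row, uint8_t *dst_row, int width)
{
    for (int x = 0; x < width; x++)
        dst_row[x] = src_row[2 * x];
}
```

Because each row gets its own source and destination pointers, no shared dst++ style pointer is needed, which is what makes the rows safe to process concurrently.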
I also noticed higher frame rates with my laptop webcam as compared to the Microsoft USB Lifecam.
Here are other details:
- Ubuntu 12.04, ffmpeg 2.6
- AMD A8 quad-core processor with 6 GB RAM
- Encoder settings:
  - codec: AV_CODEC_ID_MPEG2VIDEO
  - bitrate: 4000000
  - time_base: (AVRational){1, 20}
  - pix_fmt: AV_PIX_FMT_YUV420P
  - gop: 10
  - max_b_frames: 1
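The relation between the time_base above and the too-fast playback can be sketched with a little arithmetic. This assumes every captured frame is written with consecutive timestamps, so the player shows frames at the encoder's nominal rate regardless of how slowly they were captured. The function name playback_speedup is hypothetical.

```c
/* Ratio between the encoder's nominal frame rate (1 / time_base)
 * and the rate at which frames are actually captured.
 * A result of 4.0 means the recording plays four times too fast. */
double playback_speedup(int tb_num, int tb_den, double capture_fps)
{
    double nominal_fps = (double)tb_den / (double)tb_num;
    return nominal_fps / capture_fps;
}
```

With time_base {1, 20} and roughly 5 captured frames per second, playback_speedup(1, 20, 5.0) gives 4.0, which is consistent with the videos playing too fast.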