
I use OpenGL ES to display BGR24 data on an iPad. I am new to OpenGL ES, so for the video display part I use code from RosyWriter, an Apple sample. It works, but the CVOpenGLESTextureCacheCreateTextureFromImage function takes more than 30 ms, while in RosyWriter its cost is negligible. What I do is first convert BGR24 to the BGRA pixel format, then create a CVPixelBufferRef with CVPixelBufferCreateWithBytes, and then get a CVOpenGLESTextureRef with CVOpenGLESTextureCacheCreateTextureFromImage. My code is as follows:

- (void)transformBGRToBGRA:(const UInt8 *)pict width:(int)width height:(int)height
{
    // rgb, argb, and bgra are vImage_Buffer instance variables whose
    // width/height/rowBytes (and the argb/bgra data planes) are set up elsewhere.
    rgb.data = (void *)pict;

    // The source is really BGR, so this call produces ABGR in the 'argb' buffer.
    vImage_Error error = vImageConvert_RGB888toARGB8888(&rgb, NULL, 0, &argb, NO, kvImageNoFlags);
    if (error != kvImageNoError) {
        NSLog(@"vImageConvert_RGB888toARGB8888 error");
    }

    // Permute ABGR (A=0, B=1, G=2, R=3) into BGRA.
    const uint8_t permuteMap[4] = {1, 2, 3, 0};
    error = vImagePermuteChannels_ARGB8888(&argb, &bgra, permuteMap, kvImageNoFlags);
    if (error != kvImageNoError) {
        NSLog(@"vImagePermuteChannels_ARGB8888 error");
    }

    free((void *)pict);
}
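
(For context: rgb, argb, and bgra are vImage_Buffer instance variables that this post does not show being set up. A minimal sketch of that setup, where the method name and the allocate-once strategy are my assumptions, might look like this.)

// Assumed one-time setup for the conversion buffers; not from the original post.
- (void)setupConversionBuffersWithWidth:(int)width height:(int)height
{
    rgb.width    = width;
    rgb.height   = height;
    rgb.rowBytes = width * 3;                      // BGR24: 3 bytes per pixel
    rgb.data     = NULL;                           // assigned per frame

    argb.width    = width;
    argb.height   = height;
    argb.rowBytes = width * 4;                     // 4 bytes per pixel
    argb.data     = malloc(argb.rowBytes * height);

    bgra.width    = width;
    bgra.height   = height;
    bgra.rowBytes = width * 4;
    bgra.data     = malloc(bgra.rowBytes * height);
}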

After the conversion, I create a CVPixelBufferRef as follows:

[self transformBGRToBGRA:pict width:width height:height];

// bgraData and bytesByRow describe the converted BGRA pixels
// (bgra.data and bgra.rowBytes above).
CVPixelBufferRef pixelBuffer;
CVReturn err = CVPixelBufferCreateWithBytes(NULL,
                                            width,
                                            height,
                                            kCVPixelFormatType_32BGRA,
                                            (void *)bgraData,
                                            bytesByRow,
                                            NULL,
                                            0,
                                            NULL,
                                            &pixelBuffer);

if (!pixelBuffer || err) {
    NSLog(@"CVPixelBufferCreateWithBytes failed (error: %d)", err);
    return;
}

CVOpenGLESTextureRef texture = NULL;
err = CVOpenGLESTextureCacheCreateTextureFromImage(kCFAllocatorDefault,
                                                   videoTextureCache,
                                                   pixelBuffer,
                                                   NULL,
                                                   GL_TEXTURE_2D,
                                                   GL_RGBA,
                                                   width,
                                                   height,
                                                   GL_BGRA,
                                                   GL_UNSIGNED_BYTE,
                                                   0,
                                                   &texture);

if (!texture || err) {
    NSLog(@"CVOpenGLESTextureCacheCreateTextureFromImage failed (error: %d)", err);
    CVPixelBufferRelease(pixelBuffer);
    return;
}
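
(For completeness, here is a sketch of the typical bind-and-release pattern from RosyWriter that the rest of the code follows; the exact calls below are assumed, not shown in the post.)

// Bind the cache-backed texture for drawing.
glActiveTexture(GL_TEXTURE0);
glBindTexture(CVOpenGLESTextureGetTarget(texture), CVOpenGLESTextureGetName(texture));
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);

// ... draw the textured quad here ...

// Release the per-frame objects and flush the cache afterwards.
CFRelease(texture);
CVPixelBufferRelease(pixelBuffer);
CVOpenGLESTextureCacheFlush(videoTextureCache, 0);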

The rest of the code, including the shaders, is almost identical to the RosyWriter sample. So I want to know why this call is so slow and how to fix the problem.

zhzhy
  • What size is the image you are trying to upload? Are you sure that you're not measuring the time of your `-transformBGRToBGRA:` method in that 30 ms? – Brad Larson Jul 23 '12 at 15:00
  • Yes, I am sure. It is 1024 × 768, and the time of transformBGRToBGRA: that I measured is 10 ms. – zhzhy Jul 24 '12 at 01:01
  • OK, so the 30 ms that you measure is from right before the `CVPixelBufferCreateWithBytes()` call to right after the `CVOpenGLESTextureCacheCreateTextureFromImage()` call? That seems extremely high, because I've seen an iPad 2 upload 1080p frames (2.6X more pixels) much faster than in 30 ms. What are your times if you just use `glTexImage2D()` with this data? – Brad Larson Jul 24 '12 at 15:25
  • It is really high. In fact, the 30 ms is measured only for the call to **CVOpenGLESTextureCacheCreateTextureFromImage()**; the function **CVPixelBufferCreateWithBytes()** costs nothing. I just rewrote the code with **glTexImage2D()**, and it costs about 5 ms, so I think even that is high. I have spent more than one day trying to find out why and to resolve this problem, but no answer. – zhzhy Jul 25 '12 at 00:50
  • It seems I have found the answer; could you give me some suggestions? – zhzhy Jul 30 '12 at 05:33

1 Answer


With my research over these past days, I found why CVOpenGLESTextureCacheCreateTextureFromImage costs so much time: when the data is big (here about 3 MB), the allocation, copy, and move operations are considerable, especially the copy. Using a pixel buffer pool greatly improves the performance of CVOpenGLESTextureCacheCreateTextureFromImage, from 30 ms to 5 ms, the same level as glTexImage2D(). My solution is as follows:

NSMutableDictionary *attributes = [NSMutableDictionary dictionary];
[attributes setObject:[NSNumber numberWithInt:kCVPixelFormatType_32BGRA] forKey:(NSString *)kCVPixelBufferPixelFormatTypeKey];
[attributes setObject:[NSNumber numberWithInt:videoWidth] forKey:(NSString *)kCVPixelBufferWidthKey];
[attributes setObject:[NSNumber numberWithInt:videoHeight] forKey:(NSString *)kCVPixelBufferHeightKey];

// Create the pool once; then draw recycled, pre-allocated buffers from it per frame.
CVPixelBufferPoolCreate(kCFAllocatorDefault, NULL, (CFDictionaryRef)attributes, &bufferPool);

CVPixelBufferPoolCreatePixelBuffer(NULL, bufferPool, &pixelBuffer);

CVPixelBufferLockBaseAddress(pixelBuffer, 0);

UInt8 *baseAddress = (UInt8 *)CVPixelBufferGetBaseAddress(pixelBuffer);

// This assumes the pool's buffer has the same bytes-per-row as the source;
// see the stride-safe variant below if that may not hold.
memcpy(baseAddress, bgraData, bytesByRow * videoHeight);

CVPixelBufferUnlockBaseAddress(pixelBuffer, 0);

With this pool-created pixelBuffer, CVOpenGLESTextureCacheCreateTextureFromImage becomes fast.
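
(One caveat, not from the original answer: pool-allocated buffers may pad each row, so the single memcpy above is only safe when CVPixelBufferGetBytesPerRow matches the source stride. A defensive row-by-row copy, reusing the answer's variable names, would look like this.)

CVPixelBufferLockBaseAddress(pixelBuffer, 0);

size_t destStride = CVPixelBufferGetBytesPerRow(pixelBuffer);
UInt8 *dest = (UInt8 *)CVPixelBufferGetBaseAddress(pixelBuffer);

// Copy one row at a time so any row padding in the destination is skipped;
// videoWidth * 4 is the BGRA payload per row.
for (int row = 0; row < videoHeight; row++) {
    memcpy(dest + row * destStride, bgraData + row * bytesByRow, videoWidth * 4);
}

CVPixelBufferUnlockBaseAddress(pixelBuffer, 0);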

Adding the following properties to attributes improves performance further, to less than 1 ms, presumably because an IOSurface-backed buffer can be mapped directly by the texture cache instead of being copied:

NSDictionary *IOSurfaceProperties = [NSDictionary dictionaryWithObjectsAndKeys:
                                     [NSNumber numberWithBool:YES], @"IOSurfaceOpenGLESFBOCompatibility",
                                     [NSNumber numberWithBool:YES], @"IOSurfaceOpenGLESTextureCompatibility",
                                     nil];

[attributes setObject:IOSurfaceProperties forKey:(NSString *)kCVPixelBufferIOSurfacePropertiesKey];
zhzhy
  • I implemented this suggestion, adding the surface properties and making use of a pixel buffer pool. But, I am seeing no improvement in the performance on an iPad2. Now my code just runs memcpy() 89% of the time. I do not see how this approach could improve things, since the texture upload is the time critical part of this code. If the pixel data is already sitting in a user buffer, what would be gained by doing another copy to move the memory into a CVPixelBuffer? – MoDJ Jul 30 '13 at 18:57
  • Same question here. If you've already got BGRA data in memory, why can't you just wrap this with a CVPixelBufferRef? – jjxtra Jan 25 '15 at 03:44