
I'm currently working on an application which displays many images one after another. Unfortunately I don't have the luxury of using video for this; however, I can choose the image codec in use. The data is sent from a server to the application, already encoded.

If I use PNG or JPEG, for example, I can convert the data I receive into a UIImage using [[UIImage alloc] initWithData:some_data]. When I use a raw byte array, or another custom codec which first has to decode to a raw byte array, I have to create a bitmap context, then use CGBitmapContextCreateImage(bitmapContext), which gives a CGImageRef, which is then fed into [[UIImage alloc] initWithCGImage:cg_image]. This is much slower.
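The bitmap-context path described above looks roughly like this (a sketch, not my exact code; the RGBA layout, premultiplied alpha, and the width/height parameters are assumptions about the data):

```objc
#import <UIKit/UIKit.h>

// Sketch: wrap a raw RGBA buffer in a UIImage via a bitmap context.
// Assumes 8 bits per channel, 4 channels, premultiplied alpha, tightly packed rows.
static UIImage *ImageFromRGBAData(NSData *data, size_t width, size_t height)
{
    CGColorSpaceRef colorSpace = CGColorSpaceCreateDeviceRGB();
    CGContextRef context = CGBitmapContextCreate((void *)data.bytes,
                                                 width, height,
                                                 8,           // bits per component
                                                 width * 4,   // bytes per row
                                                 colorSpace,
                                                 (CGBitmapInfo)kCGImageAlphaPremultipliedLast);
    CGColorSpaceRelease(colorSpace);
    if (context == NULL) {
        return nil;
    }

    CGImageRef cgImage = CGBitmapContextCreateImage(context); // copies the pixels
    CGContextRelease(context);

    UIImage *image = [UIImage imageWithCGImage:cgImage];
    CGImageRelease(cgImage);
    return image;
}
```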

[Chart: image conversion times]

The above chart (time is measured in seconds) shows how long the conversion from NSData to UIImage takes. PNG, JPEG, BMP, and GIF all take approximately the same time. Null means skipping the conversion entirely and returning nil instead. Raw is a raw RGBA byte array converted using the bitmap context method. Custom decompresses to the Raw format first and then does the same thing. LZ4 is the raw data compressed with the LZ4 algorithm, so it also runs through the bitmap context method.

A PNG, for example, is simply a compressed bitmapped image, yet that decompression plus render takes less time than my render of Raw images. iOS must be doing something behind the scenes to make this faster.

If we look at the chart of how long it takes to convert each type as well as how long it takes to draw (to a graphics context) we get the following:

[Chart: drawing and conversion times]

We can see that the different types take very different times to convert, but fairly similar times to draw. This rules out the possibility that UIImage's advantage comes simply from being lazy and only converting when needed.

My question is essentially: are the faster speeds for well-known codecs something I can exploit? Or, if not, is there another way I can render my Raw data faster?

Edit: For the record, I am drawing these images on top of another UIImage whenever I get a new one. There may be a faster alternative, which I am willing to look into; however, OpenGL is unfortunately not an option.
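The compositing step is roughly this kind of thing (a sketch with made-up names, not my actual code):

```objc
#import <UIKit/UIKit.h>

// Sketch: draw a newly decoded image on top of an existing canvas image.
// `canvas`, `newTile`, and `destRect` are hypothetical names.
static UIImage *ComposeTileOntoCanvas(UIImage *canvas, UIImage *newTile, CGRect destRect)
{
    UIGraphicsBeginImageContextWithOptions(canvas.size, YES /* opaque */, canvas.scale);
    [canvas drawAtPoint:CGPointZero];
    [newTile drawInRect:destRect];
    UIImage *result = UIGraphicsGetImageFromCurrentImageContext();
    UIGraphicsEndImageContext();
    return result;
}
```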

Further edit: This question is fairly important and I would like the best possible answer. The bounty will not be awarded until the time expires to ensure the best possible answers are given.

Final edit: My question was essentially why decompressing and drawing a raw RGBA array isn't faster than drawing, say, a PNG, given that a PNG has to be decompressed to an RGBA array before drawing anyway. The result is that it is in fact faster, but only in release builds. Debug builds are not optimised for this, whereas the UIImage code that runs behind the scenes clearly is. Compiled as a release build, RGBA array images were much faster than the other codecs.

Dale Myers
    This question cannot be answered in depth without more information. What are you doing with the images exactly (post some code)? Why is conversion from NSData to UIImage relevant to you (seems that you are excluding the most relevant part, display)? How do you measure the time needed? – Nikolai Ruhe Aug 11 '14 at 09:10
  • Am I understanding correct that you want to find the fastest way to send images over the network, decompress and display them on the screen, without caching on disk? – Nikolai Ruhe Aug 11 '14 at 09:14
  • It can be assumed I already have the images on the device. They are a raw RGBA byte array. I want to turn that into a UIImage as quickly as possible. – Dale Myers Aug 11 '14 at 09:18
  • Why is turning data into `UIImage` of relevance for you? Are you aware that the time it takes to display a `UIImage` on screen is heavily dependent on how it was created? You might find the fastest way to create a `UIImage` but it's unlikely that this is the fastest way to display this image. – Nikolai Ruhe Aug 11 '14 at 09:22
  • I was not aware of that. Do you have more information about this? – Dale Myers Aug 11 '14 at 09:23
  • How do you draw images on top of each other? Why? – Nikolai Ruhe Aug 11 '14 at 09:52
  • You really give us a hard time answering your question. On one hand you provide specific measurements of several competing options. Then you want us to find out about the reasons for the differences without disclosing what happens. Your description of what you're doing is not at all reproducible. Please post a small example which at least compares the raw memory with the PNG approaches. – Nikolai Ruhe Aug 12 '14 at 11:21
  • I'm trying to make this as simple as possible. What I've given you is entirely reproducible. Converting a RGBA array to a UIImage vs converting the same data represented as a PNG to a UIImage for example. What more do you need? – Dale Myers Aug 12 '14 at 11:34
  • I already presented the reason for why "conversion to UIImage" (by using `initWithContentsOfFile:` on a PNG) is faster than working with raw bytes and a CGBitmapContext. But your conversion seems to include other steps. You don't say how and what you draw, for example. – Nikolai Ruhe Aug 12 '14 at 11:40

3 Answers


When measuring performance it's important to measure the full pipeline in order to find the bottleneck.

In your case that means you cannot isolate UIImage creation. You will have to include image display; otherwise you fall into the trap of measuring only part of what you're interested in.

UIImage is not a thin wrapper around bitmap data but a rather complex and optimized system. The underlying CGImage can, for example, be only a reference to some compressed data on disk. That's why initializing a UIImage using initWithContentsOfFile: or initWithData: is fast. There are more hidden performance optimizations in the ImageIO and Quartz frameworks on iOS that will all add to your measurements.

The only reliable way to get solid measurements is to do what you really want to do (get the data from the network or disk, create a UIImage somehow, and display it on screen for at least one frame).
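Something along these lines (a sketch; drawing into a tiny bitmap context stands in for on-screen display, which is what you'd really want to measure, e.g. with Instruments):

```objc
#import <UIKit/UIKit.h>
#import <QuartzCore/QuartzCore.h> // CACurrentMediaTime()

// Sketch: time creation *and* forced decoding of an image, not just creation.
static CFTimeInterval MeasureDecode(NSData *encodedData)
{
    CFTimeInterval start = CACurrentMediaTime();

    UIImage *image = [[UIImage alloc] initWithData:encodedData];

    // Drawing forces the lazy decompression that initWithData: defers.
    UIGraphicsBeginImageContextWithOptions(CGSizeMake(1, 1), YES, 1);
    [image drawInRect:CGRectMake(0, 0, 1, 1)];
    UIGraphicsEndImageContext();

    return CACurrentMediaTime() - start;
}
```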

Here are some considerations you should be aware of:

  • Apple's graphics frameworks go to great lengths to perform the minimal work necessary. If an image is not displayed it might never be decompressed.

  • If an image is displayed at a lower resolution than its original pixels, it might be only partly decompressed (especially possible with JPEGs; see the ImageIO sketch after this list). This can help with optimization but of course can't be exploited when you create the images yourself from a CGBitmapContext at full image resolution. So don't do this unless necessary.

  • When measuring with Instruments you might not see all relevant CPU cycles. Decompression of images can happen in backboardd (the kind-of-window-server used in iOS).

  • Using uncompressed images might seem like the fastest possible idea, but this ignores the fact that memory might be the bottleneck, and less data (compressed images) can help with that.
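If downsampling is acceptable, ImageIO can decode a compressed image directly to a smaller size, for example (a sketch; the maximum pixel size is arbitrary):

```objc
#import <UIKit/UIKit.h>
#import <ImageIO/ImageIO.h>

// Sketch: decode a compressed image (JPEG/PNG) at a reduced size via ImageIO.
static UIImage *DownsampledImageFromData(NSData *data, CGFloat maxPixelSize)
{
    CGImageSourceRef source = CGImageSourceCreateWithData((__bridge CFDataRef)data, NULL);
    if (source == NULL) {
        return nil;
    }

    NSDictionary *options = @{
        (__bridge id)kCGImageSourceCreateThumbnailFromImageAlways : @YES,
        (__bridge id)kCGImageSourceCreateThumbnailWithTransform   : @YES,
        (__bridge id)kCGImageSourceThumbnailMaxPixelSize          : @(maxPixelSize),
    };
    CGImageRef cgImage = CGImageSourceCreateThumbnailAtIndex(source, 0,
                                                             (__bridge CFDictionaryRef)options);
    CFRelease(source);
    if (cgImage == NULL) {
        return nil;
    }

    UIImage *image = [UIImage imageWithCGImage:cgImage];
    CGImageRelease(cgImage);
    return image;
}
```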

Conclusion:

Your aim should be to find the bottleneck for your real scenario. So don't test using made-up test data and contrived code. You might end up optimizing performance for a code path not taken in your app.

When you change your testing code to measure the full pipeline, it would be nice if you could update your question with the results.

Nikolai Ruhe
  • I believe that CIImages are the underlying data types which may be in memory or on disk. It is those which can cause UIImage initWithContentsOfFile: to be extremely fast. They also discard the data when there is memory pressure. This is also a benchmark of a real world scenario. The images are decompressed and drawn to a completely different UIImage. This will force a decompression no matter what. Drawing these converted UIImages to the larger UIImage takes the same time, no matter what the original image compression was, except GIF, which takes up to twice as long. – Dale Myers Aug 11 '14 at 10:32
  • @Velox `CIImage` is used in Core Image for GPU based image manipulations. It's usually not used during loading or decompression. – Nikolai Ruhe Aug 11 '14 at 11:14
  • @Velox When optimizing for speed it seems like a bad idea to draw a CPU based intermediate image. Can't you use views/layers to compose the images? – Nikolai Ruhe Aug 11 '14 at 11:15
  • It is a possibility. But for now, we are more interested in why the image decompression takes longer than it really should. – Dale Myers Aug 11 '14 at 11:31
  • @Velox If you want help with that please post code or reproducible instructions. – Nikolai Ruhe Aug 11 '14 at 11:33
  • That's what this entire question is about? – Dale Myers Aug 11 '14 at 11:41
  • @Velox I can't see reproducible instructions of your setup. What are the sizes of the images? How are they being displayed? What is it with the composed background image? – Nikolai Ruhe Aug 11 '14 at 19:45
  • To reproduce, have many images in RAM on the device as an RGBA array. Then convert to a UIImage, and draw 4 of them to a larger UIImage. I can't go into any further details than this unfortunately. – Dale Myers Aug 12 '14 at 08:20
  • @Velox If images are already in RGBA arrays (decompressed) how can I compare against JPEG or other compressed formats? – Nikolai Ruhe Aug 12 '14 at 09:01
  • That is for rendering the raw ones. For JPEG, just use the same data, but pre-rendered as a JPEG. – Dale Myers Aug 12 '14 at 10:51
  • So the graphs I had displayed didn't support this; however, when running in release instead of debug mode a lot more of it fell into place. It was much faster to draw the RGBA images compared to the others. This clearly isn't optimised at all when in debug mode. However, the remaining parts of the answer weren't relevant and this was not a contrived example. I had measured everything, but I didn't provide any other information as none of it was correlated with the results being discussed. – Dale Myers Aug 16 '14 at 10:35

UIImage uses an abstract internal representation best suited to the actual source, hence the good performance. PNG images are not converted to a bitmap and then displayed by UIImage; a more performant drawing path is used instead.

Bitmaps, on the other hand, are the biggest, heaviest, and least efficient way to handle images, so there's not much you can do about it besides converting them to another format.

Rivera
  • This is what I am trying to get around though. I know that iOS does something else behind the scenes. I'm trying to find out what, and how to exploit that. – Dale Myers Aug 11 '14 at 09:06

[UIImage initWithData:] does not copy any memory around. It just leaves the memory where it is, and then when you draw, it hands that memory to the GPU to do its thing, without the CPU or RAM being involved much in decoding the image. It's all being done in the GPU's dedicated hardware.

Remember, Apple designs their own CPU/GPU by licensing other manufacturers' technology and customising it to suit their needs. They've got more than a thousand CPU hardware engineers working on just a single chipset, and efficiently processing images is a priority.

Your lower level code is probably doing lots of memory copying and math, and that's why it's so much slower.
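If you do have to stay with raw bytes, one way to avoid at least one copy is to wrap the buffer in a CGDataProvider instead of going through a bitmap context (a sketch under the assumption of tightly packed, premultiplied RGBA; whether this actually beats the high-level APIs in your case is something only measurement will tell):

```objc
#import <UIKit/UIKit.h>

// Sketch: create a CGImage that reads directly from an existing RGBA buffer,
// avoiding the extra copy that CGBitmapContextCreateImage() makes.
static UIImage *ImageWrappingRGBAData(NSData *data, size_t width, size_t height)
{
    CGDataProviderRef provider = CGDataProviderCreateWithCFData((__bridge CFDataRef)data);
    CGColorSpaceRef colorSpace = CGColorSpaceCreateDeviceRGB();

    CGImageRef cgImage = CGImageCreate(width, height,
                                       8,           // bits per component
                                       32,          // bits per pixel
                                       width * 4,   // bytes per row
                                       colorSpace,
                                       (CGBitmapInfo)kCGImageAlphaPremultipliedLast,
                                       provider,
                                       NULL,        // decode array
                                       NO,          // should interpolate
                                       kCGRenderingIntentDefault);
    CGColorSpaceRelease(colorSpace);
    CGDataProviderRelease(provider);
    if (cgImage == NULL) {
        return nil;
    }

    UIImage *image = [UIImage imageWithCGImage:cgImage];
    CGImageRelease(cgImage);
    return image;
}
```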

UIImage and NSData are very intelligent, high-performance APIs that have been developed over decades by people who truly understand (or even built) the hardware and kernel. They're much more efficient than anything you can achieve with lower-level APIs unless you're prepared to write many thousands of lines of code and spend months or even years testing and tweaking to get better performance.

NSData, for example, can effortlessly work with terabytes of data with good performance even though only a few gigabytes of RAM might be available. Used correctly, it will seamlessly combine RAM and SSD/HDD storage, often with performance similar to what you'd get if you actually had terabytes of RAM. UIImage can detect low-memory situations and free almost all of its RAM without any code on your behalf, provided it knows the URL the image was originally loaded from (this works better for file:// URLs than http:// URLs).

If you can do what you want with UIImage and NSData, then you should. Only go with the lower level APIs if you have a feature you can't otherwise implement.

Abhi Beckert
  • I believe this is what is happening; however, I have no proof of this. Is there anything to support this anywhere? As for your final paragraph, I'm not entirely sure what you mean. – Dale Myers Aug 12 '14 at 08:19
  • I'm just saying in the final paragraph you should use the high level APIs if you can. – Abhi Beckert Aug 12 '14 at 22:56