I am trying to accelerate video frame conversion using NEON on an ARM-based embedded system (Gumstix Overo). The source is monochrome (Y12 or Y10) and the destination is RGB565, RGB888 or RGB32. Are there specific techniques or tricks for using ARM NEON to accelerate such a conversion, and how can I benchmark it against a standard C implementation?
-
You'll need to learn to write ARM NEON SIMD code, either using asm or C/C++ intrinsics. Search for the `[neon]` tag here on SO as there are plenty of examples. – Paul R Jul 25 '12 at 09:27
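To make the intrinsics route concrete, here is a minimal sketch of a Y10 to RGB565 conversion with a plain C baseline to benchmark against. It is an illustration under stated assumptions, not a tuned implementation: it assumes each Y10 sample sits in the low 10 bits of a 16-bit word, and the function names (`y10_to_rgb565_scalar`, `y10_to_rgb565_neon`, `y10_to_rgb565`) are hypothetical. The NEON path is only compiled when the compiler reports NEON support, so the same file builds on other targets using the scalar path.

```c
#include <stddef.h>
#include <stdint.h>

/* Plain C baseline. Assumption: Y10 samples occupy the low 10 bits of
 * each 16-bit word; any higher bits are masked off. */
static void y10_to_rgb565_scalar(const uint16_t *src, uint16_t *dst, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        uint16_t y  = src[i] & 0x03FF;
        uint16_t rb = (uint16_t)(y >> 5);   /* top 5 bits for red and blue */
        uint16_t g  = (uint16_t)(y >> 4);   /* top 6 bits for green */
        dst[i] = (uint16_t)((rb << 11) | (g << 5) | rb);
    }
}

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>

/* NEON version: converts 8 pixels per iteration, then hands any
 * leftover pixels to the scalar routine. */
static void y10_to_rgb565_neon(const uint16_t *src, uint16_t *dst, size_t n)
{
    const uint16x8_t mask = vdupq_n_u16(0x03FF);
    size_t i = 0;

    for (; i + 8 <= n; i += 8) {
        uint16x8_t y  = vandq_u16(vld1q_u16(src + i), mask);
        uint16x8_t rb = vshrq_n_u16(y, 5);
        uint16x8_t g  = vshrq_n_u16(y, 4);
        uint16x8_t px = vorrq_u16(vshlq_n_u16(rb, 11), vshlq_n_u16(g, 5));
        vst1q_u16(dst + i, vorrq_u16(px, rb));
    }
    y10_to_rgb565_scalar(src + i, dst + i, n - i);
}
#endif

/* Picks the NEON path when it was compiled in, otherwise the baseline. */
static void y10_to_rgb565(const uint16_t *src, uint16_t *dst, size_t n)
{
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    y10_to_rgb565_neon(src, dst, n);
#else
    y10_to_rgb565_scalar(src, dst, n);
#endif
}
```

For a benchmark, one option is to run both functions over the same frame buffer many times and compare wall-clock time, after first checking that their outputs match. With GCC on a Cortex-A8 target, flags along the lines of `-O2 -mcpu=cortex-a8 -mfpu=neon` are typically needed for the NEON path to be compiled. Y12 input would need different shift amounts, and RGB888 or RGB32 output would need a different store pattern.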
-
SIMD programming with NEON is very similar to programming with SSE on x86. Properly written code should get you a 3-5x speed improvement in your specific function. ARM provides some example code and you can probably find some good examples here on StackOverflow. – BitBank Jul 25 '12 at 17:11
-
You can also check [libyuv](http://code.google.com/p/libyuv/) to see if they do what you want already. – auselen Sep 12 '12 at 11:59