NEON shortcut for flat colour blending into 16-bit

This is a shortcut for the needs descriptor
00000077:03515104_00000000_00000000.  It requires blending a single 32-bit
colour value into a 16-bit framebuffer.
It's used when fading out the screen, eg. when a modal requester pops-up.

The PF JIT produces code for this using 24 instructions/pixel. The NEON
implementation requires 2.1 instructions/pixel. Performance hasn't been
benchmarked, but the improvement is quite visible.

This code has only been tested by inspection of the fading effect described
above, when press+holding a finger on the home screen to pop up the
Shortcuts/Widgets/Folders/Wallpaper requester.

Along with the NEON version, a fallback v5TE implementation is also provided.

This ARM version of col32cb16blend is not fully optimised, but is a reasonable
implementation, and better than the version produced by the JIT. It is here as
a fallback, if NEON is not available.
4 files changed