Optimize/cleanup BPIALL workaround

In the initial implementation of this workaround we used a dedicated
workaround context to save/restore state.  This patch reduces the
footprint as no additional context is needed.

Additionally, this patch reduces the memory loads and stores by 20%,
reduces the instruction count and exploits static branch prediction to
optimize the SMC path.

Change-Id: Ia9f6bf06fbf8a9037cfe7f1f1fb32e8aec38ec7d
Signed-off-by: Dimitris Papastamos <dimitris.papastamos@arm.com>
2 files changed