Align all allocations to a 16-byte boundary.

This change also fixes a bug in the Blur intrinsic, which cast a float
array to float4. A float array is only guaranteed float alignment, not the
16 bytes float4 expects, so the cast triggered new alignment errors with
the updated LLVM.
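
For illustration only, a minimal C11 sketch of the two ideas in this change,
not the code from the patch. The names vec4f, alloc16, is_aligned16 and load4
are hypothetical stand-ins: vec4f plays the role of float4, alloc16 rounds the
request up so every allocation starts on a 16-byte boundary, and load4 copies
four floats out of a plain float array instead of casting the pointer.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for float4: four floats that require 16-byte alignment. */
typedef struct { _Alignas(16) float v[4]; } vec4f;

/* Allocate on a 16-byte boundary. aligned_alloc requires the size to be a
 * multiple of the alignment, so round the request up first. */
static void *alloc16(size_t size) {
    size_t rounded = (size + 15u) & ~(size_t)15u;
    if (rounded == 0) rounded = 16;
    return aligned_alloc(16, rounded);
}

/* Report whether a pointer sits on a 16-byte boundary. */
static int is_aligned16(const void *p) {
    return ((uintptr_t)p & 15u) == 0;
}

/* Read four floats from a float array of any alignment. Casting src to
 * (const vec4f *) would promise 16-byte alignment that a float array does
 * not guarantee; memcpy makes no such promise. */
static vec4f load4(const float *src) {
    vec4f out;
    memcpy(out.v, src, sizeof out.v);
    return out;
}
```

With helpers like these, a buffer from alloc16 can be accessed as vec4f
directly, while data that lives in an ordinary float array goes through load4.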

Change-Id: I3955b38f156c35f4d160652c75ab416bae09b2c8
2 files changed