Replace _mm_cvtps_epi32(x) with _mm_cvttps_epi32(_mm_add_ps(0.5f), x).

We don't have control over which way _mm_cvtps_epi32 rounds.

  - This makes the SSE SkPMFloat rounding consistent with _neon and _none.
  - Sk4f::cast<Sk4i>() is closer to (int)float's behavior.  (Correct when >=0).

Add tests that would fail at head.

BUG=skia:

Review URL: https://codereview.chromium.org/1029163002
5 files changed