cut another multiply in SSE2 bilerp

I think that's as good as the SSE2 path gets for now,
but it's still not as fast as the SSSE3 path.

Change-Id: I3bcfefeddfc2940eca66dfdeb8a0876d768e7d3d
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/244242
Commit-Queue: Mike Klein <mtklein@google.com>
Reviewed-by: Mike Reed <reed@google.com>
Reviewed-by: Florin Malita <fmalita@chromium.org>
1 file changed