crc32 x86: Reverse the order of the 128 to 64 bit fold mask

A longer explanation has been added on the bug. To better align the
x86 SSE crc32 code with the NEON (pmull) implementation of the same,
and to avoid possible confusion, reverse the sense of the fold mask.

No change in behavior, no new tests: reversing the sense does not
change the value of the computed crc32.
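
The two spellings of the mask build the same 128-bit value, because
_mm_set_epi32 takes its arguments from the highest 32-bit lane down to
the lowest, while _mm_setr_epi32 takes them from the lowest up. A minimal
standalone sketch of that equivalence, assuming an x86 target with SSE2;
the helper names below are hypothetical and are not part of crc32_simd.c:

```c
#include <emmintrin.h>  /* SSE2: _mm_set_epi32, _mm_setr_epi32, _mm_storeu_si128 */
#include <stdint.h>
#include <string.h>

/* Old spelling: arguments run from the highest 32-bit lane to the lowest.
 * After the store, lanes[0] holds the lowest 32 bits of the register. */
static void old_fold_mask(uint32_t lanes[4])
{
    __m128i m = _mm_set_epi32(0, ~0, 0, ~0);
    _mm_storeu_si128((__m128i *)lanes, m);
}

/* New spelling: arguments run from the lowest 32-bit lane to the highest. */
static void new_fold_mask(uint32_t lanes[4])
{
    __m128i m = _mm_setr_epi32(~0, 0, ~0, 0);
    _mm_storeu_si128((__m128i *)lanes, m);
}

/* Returns 1 when both spellings produce the same 128-bit value. */
static int fold_masks_identical(void)
{
    uint32_t a[4], b[4];
    old_fold_mask(a);
    new_fold_mask(b);
    return memcmp(a, b, sizeof a) == 0;
}
```

Calling fold_masks_identical() from a small test program is one way to
check the claim locally; it exercises only the mask constant, not the
crc32 fold itself.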

Bug: 796178
Change-Id: I35f772ab4df414d6f5f808d92bc7f896528e07ef
Reviewed-on: https://chromium-review.googlesource.com/901223
Reviewed-by: Mike Klein <mtklein@chromium.org>
Commit-Queue: Noel Gordon <noel@chromium.org>
Cr-Original-Commit-Position: refs/heads/master@{#534488}
Cr-Mirrored-From: https://chromium.googlesource.com/chromium/src
Cr-Mirrored-Commit: 5fdfc741f157923aa5f10415de243ae1c1f9ba72
diff --git a/crc32_simd.c b/crc32_simd.c
index c2d4255..6538652 100644
--- a/crc32_simd.c
+++ b/crc32_simd.c
@@ -126,7 +126,7 @@
      * Fold 128-bits to 64-bits.
      */
     x2 = _mm_clmulepi64_si128(x1, x0, 0x10);
-    x3 = _mm_set_epi32(0, ~0, 0, ~0);
+    x3 = _mm_setr_epi32(~0, 0, ~0, 0);
     x1 = _mm_srli_si128(x1, 8);
     x1 = _mm_xor_si128(x1, x2);