Cherry-pick of "Fix AArch64 ABI conformance issue in SIMD code"

In the AArch64 ABI, the upper (unused) 32 bits of the 64-bit register
that holds a 32-bit argument are undefined, so it was incorrect to use
64-bit instructions to transfer a JDIMENSION argument in the 64-bit NEON
SIMD functions.  The code worked thus far only because the existing
compiler optimizers never placed anything else in the upper half of the
register in question, so the upper 32 bits happened to be all zeroes.

The latest builds of Clang/LLVM have a smarter optimizer, and under
certain circumstances, it will attempt to load-combine adjacent 32-bit
integers from one of the libjpeg structures into a single 64-bit integer
and pass that 64-bit integer as a 32-bit argument to one of the SIMD
functions (which is allowed by the ABI, since the upper 32 bits of the
32-bit argument's register are undefined).  This caused the
libjpeg-turbo regression tests to crash.

This patch tries to use the Wn registers whenever possible.  Otherwise,
it uses a zero-extend instruction to avoid using the upper 32 bits of
the 64-bit registers, which are not guaranteed to be valid for 32-bit
arguments.
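
The failure mode and the fix can be modeled in plain C.  This is only an
illustrative sketch, not the NEON assembly changed by this patch: the
struct and function names below are hypothetical, and a uint64_t stands
in for an Xn register whose low half (Wn) carries the 32-bit argument.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for two adjacent 32-bit fields in a libjpeg
 * structure; the real field names are not taken from the patch. */
struct dims {
    uint32_t col;       /* the value meant to be passed as the argument */
    uint32_t neighbor;  /* unrelated field that sits next to it */
};

/* Models a caller that load-combines both fields into one 64-bit
 * register: the argument lands in the low 32 bits and the neighboring
 * field ends up in the upper 32 bits. */
static uint64_t load_combined(const struct dims *d)
{
    return ((uint64_t)d->neighbor << 32) | d->col;
}

/* Models the old callee: it consumes the whole 64-bit register, so
 * whatever is in the upper 32 bits leaks into the value. */
static uint64_t arg_from_x(uint64_t reg)
{
    return reg;
}

/* Models the fixed callee: it reads only the low 32 bits and
 * zero-extends them, which is what using Wn (or an explicit
 * zero-extend) achieves. */
static uint64_t arg_from_w(uint64_t reg)
{
    return (uint64_t)(uint32_t)reg;
}

/* Small self-check with assumed sample values. */
static void demo(void)
{
    struct dims d = { 8, 3 };
    uint64_t reg = load_combined(&d);

    assert(arg_from_x(reg) != 8);  /* upper bits corrupt the value */
    assert(arg_from_w(reg) == 8);  /* low 32 bits recover it */
}
```

With col = 8 and neighbor = 3, the full-register read yields a value far
larger than 8, which in the SIMD code would become an out-of-range
offset, while the 32-bit read yields 8 as intended.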

Based on sebpop@1fbae13

Closes #91.  Refer also to android-ndk/ndk#110 and
https://llvm.org/bugs/show_bug.cgi?id=28393

BUG:31780857

Change-Id: Id80143ac13ba8d427196daf04f00be2214f85c86
2 files changed
tree: dc70acf69ac3d2dd550732958b2d6ec5816324c1
  1. cmakescripts/
  2. doc/
  3. java/
  4. md5/
  5. release/
  6. sharedlib/
  7. simd/
  8. testimages/
  9. win/
  10. .gitignore
  11. acinclude.m4
  12. Android.mk
  13. bmp.c
  14. bmp.h
  15. BUILDING.txt
  16. cderror.h
  17. cdjpeg.c
  18. cdjpeg.h
  19. change.log
  20. ChangeLog.txt
  21. cjpeg.1
  22. cjpeg.c
  23. CMakeLists.txt
  24. coderules.txt
  25. configure.ac
  26. djpeg.1
  27. djpeg.c
  28. doxygen-extra.css
  29. doxygen.config
  30. example.c
  31. jaricom.c
  32. jcapimin.c
  33. jcapistd.c
  34. jcarith.c
  35. jccoefct.c
  36. jccolext.c
  37. jccolor.c
  38. jcdctmgr.c
  39. jchuff.c
  40. jchuff.h
  41. jcinit.c
  42. jcmainct.c
  43. jcmarker.c
  44. jcmaster.c
  45. jcomapi.c
  46. jconfig.h
  47. jconfig.txt
  48. jconfigint.h
  49. jcparam.c
  50. jcphuff.c
  51. jcprepct.c
  52. jcsample.c
  53. jcstest.c
  54. jctrans.c
  55. jdapimin.c
  56. jdapistd.c
  57. jdarith.c
  58. jdatadst-tj.c
  59. jdatadst.c
  60. jdatasrc-tj.c
  61. jdatasrc.c
  62. jdcoefct.c
  63. jdcoefct.h
  64. jdcol565.c
  65. jdcolext.c
  66. jdcolor.c
  67. jdct.h
  68. jddctmgr.c
  69. jdhuff.c
  70. jdhuff.h
  71. jdinput.c
  72. jdmainct.c
  73. jdmainct.h
  74. jdmarker.c
  75. jdmaster.c
  76. jdmaster.h
  77. jdmerge.c
  78. jdmrg565.c
  79. jdmrgext.c
  80. jdphuff.c
  81. jdpostct.c
  82. jdsample.c
  83. jdsample.h
  84. jdtrans.c
  85. jerror.c
  86. jerror.h
  87. jfdctflt.c
  88. jfdctfst.c
  89. jfdctint.c
  90. jidctflt.c
  91. jidctfst.c
  92. jidctint.c
  93. jidctred.c
  94. jinclude.h
  95. jmemmgr.c
  96. jmemnobs.c
  97. jmemsys.h
  98. jmorecfg.h
  99. jpeg_nbits_table.h
  100. jpegcomp.h
  101. jpegint.h
  102. jpeglib.h
  103. jpegtran.1
  104. jpegtran.c
  105. jquant1.c
  106. jquant2.c
  107. jsimd.h
  108. jsimd_none.c
  109. jsimddct.h
  110. jstdhuff.c
  111. jutils.c
  112. jversion.h
  113. libjpeg.map.in
  114. libjpeg.txt
  115. LICENSE.txt
  116. Makefile.am
  117. rdbmp.c
  118. rdcolmap.c
  119. rdgif.c
  120. rdjpgcom.1
  121. rdjpgcom.c
  122. rdppm.c
  123. rdrle.c
  124. rdswitch.c
  125. rdtarga.c
  126. README
  127. README-turbo.txt
  128. README.android
  129. README.version
  130. structure.txt
  131. tjbench.c
  132. tjbenchtest.in
  133. tjbenchtest.java.in
  134. tjexampletest.in
  135. tjunittest.c
  136. tjutil.c
  137. tjutil.h
  138. transupp.c
  139. transupp.h
  140. turbojpeg-jni.c
  141. turbojpeg-mapfile
  142. turbojpeg-mapfile.jni
  143. turbojpeg.c
  144. turbojpeg.h
  145. usage.txt
  146. wizard.txt
  147. wrbmp.c
  148. wrgif.c
  149. wrjpgcom.1
  150. wrjpgcom.c
  151. wrppm.c
  152. wrppm.h
  153. wrrle.c
  154. wrtarga.c