Remove sk_memcpy32

It's only implemented on x86, where the exisiting benchmark says memcpy() is
faster for all cases:

Timer overhead: 24ns
curr/maxrss    loops    min    median    mean    max    stddev    samples       config    bench
  10/10  MB    1    35.9µs    36.2µs    36.2µs    36.6µs    1%    ▁▂▄▅▅▃█▄▄▅    nonrendering    sk_memcpy32_100000
  10/10  MB    13    2.27µs    2.28µs    2.28µs    2.29µs    0%    █▄▃▅▃▁▃▅▁▄    nonrendering    sk_memcpy32_10000
  11/11  MB    677    91.6ns    95.9ns    94.5ns    99.4ns    3%    ▅▅▅▅▅█▁▁▁▁    nonrendering    sk_memcpy32_1000
  11/11  MB    1171    20ns    20.9ns    21.3ns    23.4ns    6%    ▁▁▇▃▃▃█▇▃▃    nonrendering    sk_memcpy32_100
  11/11  MB    1952    14ns    14ns    14.3ns    15.2ns    3%    ▁▁██▁▁▁▁▁▁    nonrendering    sk_memcpy32_10
  11/11  MB    5    33.6µs    33.7µs    34.1µs    35.2µs    2%    ▆▇█▁▁▁▁▁▁▁    nonrendering    memcpy32_memcpy_100000
  11/11  MB    18    2.12µs    2.22µs    2.24µs    2.39µs    5%    ▂█▄▇█▄▇▁▁▁    nonrendering    memcpy32_memcpy_10000
  11/11  MB    1112    87.3ns    87.3ns    89.1ns    93.7ns    3%    ▄██▄▁▁▁▁▁▁    nonrendering    memcpy32_memcpy_1000
  11/11  MB    2124    12.8ns    13.3ns    13.5ns    14.8ns    6%    ▁▁▁█▃▃█▇▃▃    nonrendering    memcpy32_memcpy_100
  11/11  MB    3077    9ns    9.41ns    9.52ns    10.2ns    4%    ▃█▁█▃▃▃▃▃▃    nonrendering    memcpy32_memcpy_10

(Why?  One fewer thing to port to SkOpts.)

BUG=skia:4117

Review URL: https://codereview.chromium.org/1256763003
10 files changed