libFLAC/lpc_intrin_sse.c : New SSE code to calculate autocorrelation.

Accelerate the FLAC__lpc_compute_autocorrelation_intrin_sse_lag_NN routines
on AMD CPUs and on newer Intel CPUs (Core i, i.e. Nehalem, and later).
Unfortunately the new code is slower on older Intel CPUs such as Core 2.
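
For reference, the quantity these routines compute can be sketched in plain
C. This is a hypothetical scalar version for illustration only, not the SSE
code from this patch: the name autocorrelation_ref, the float sample type
and the size_t lengths are all assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical scalar reference, not the libFLAC source.
 * For each l in [0, lag):
 *     autoc[l] = sum of data[i] * data[i - l] for i in [l, data_len). */
void autocorrelation_ref(const float data[], size_t data_len,
                         size_t lag, float autoc[])
{
    /* Assumed preconditions: at least one lag, and no lag longer
     * than the input block. */
    assert(lag > 0 && lag <= data_len);

    for (size_t l = 0; l < lag; l++) {
        float sum = 0.0f;
        for (size_t i = l; i < data_len; i++)
            sum += data[i] * data[i - l];
        autoc[l] = sum;
    }
}
```

An SSE version can keep several lags in one vector register and update
them together, which is where a speedup over a loop like this would come
from.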

According to tests posted on the Hydrogenaudio (HA) forum:

    <http://www.hydrogenaud.io/forums/index.php?s=&showtopic=101082&view=findpost&p=870753>

  CPU                 flac -5           flac -8

  Athlon XP           +5 %              +2.4 %
  Athlon 64 X2        +9 %              +4 %
  Core i              +7 %              +1 % ... +2.7 %
  Core 2              ?                 -3.5 %

According to the Steam hardware survey <http://store.steampowered.com/hwsurvey/>,
69% of Steam users have CPUs with SSE4.2 (on Intel, Nehalem or newer), so the
new code is faster for them. AMD users whose CPUs lack SSE4.2 benefit as well,
so about 75% of Steam users should benefit from this patch.

Patch-from: lvqcl <lvqcl.mail@gmail.com>
1 file changed