stream_encoder : Improve selection of residual accumulator width

In the precompute_partition_info_sums_ function, instead of selecting the 64-bit accumulator whenever the signal bps is larger than 16, revert to the original approach based on partition size, but leave room for a few extra bits so that it does not overflow with unusual signals where the average residual magnitude may need more than bps bits.

This slightly improves performance with standard encoding levels and 16-bit files, as the 17-bit side channel can still be processed with the 32-bit accumulator, and it correctly selects the 64-bit accumulator for very large 16-bit partitions.

This is related to commits 6f7ec60c and 187e596e.

Signed-off-by: Erik de Castro Lopo <erikd@mega-nerd.com>

commit: f081524c19eeafd08f4db6ee5d52a9634c60f475
author: Miroslav Lichvar <mlichvar@redhat.com> Thu Jun 19 13:04:33 2014 +0200
committer: Erik de Castro Lopo <erikd@mega-nerd.com> Fri Jul 04 21:22:44 2014 +1000
tree: f5c8bd631a69072019e127d576ca7a8b2d38b105
parent: 71246dcc8146ea0b8152e18d2627e6b6b0f56273
diff --git a/src/libFLAC/stream_encoder_intrin_sse2.c b/src/libFLAC/stream_encoder_intrin_sse2.c
index bef5545..4e9d5db 100644
--- a/src/libFLAC/stream_encoder_intrin_sse2.c
+++ b/src/libFLAC/stream_encoder_intrin_sse2.c

@@ -37,6 +37,7 @@
 #ifndef FLAC__NO_ASM
 #if (defined FLAC__CPU_IA32 || defined FLAC__CPU_X86_64) && defined FLAC__HAS_X86INTRIN
 #include "private/stream_encoder.h"
+#include "private/bitmath.h"
 #ifdef FLAC__SSE2_SUPPORTED
 
 #include <stdlib.h>    /* for abs() */
@@ -58,7 +59,7 @@
 		unsigned e1, e3;
 		__m128i mm_res, mm_sum, mm_mask;
 
-		if(bps <= 16) {
+		if(FLAC__bitmath_ilog2(default_partition_samples) + bps + FLAC__MAX_EXTRA_RESIDUAL_BPS < 32) {
 			for(partition = residual_sample = 0; partition < partitions; partition++) {
 				end += default_partition_samples;
 				mm_sum = _mm_setzero_si128();
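
The width check introduced by this change can be sketched in isolation as follows. This is a minimal illustration, not the libFLAC code: ilog2_u32 and fits_in_32bit_accumulator are hypothetical names, ilog2_u32 only stands in for FLAC__bitmath_ilog2, and the value 4 used for EXTRA_RESIDUAL_BPS is an assumption made for the example (the diff above does not show the value of FLAC__MAX_EXTRA_RESIDUAL_BPS). The reasoning behind the bound: if each absolute residual is below 2^(bps + extra) and a partition holds N samples, the sum is below 2^(ilog2(N) + 1 + bps + extra), so requiring ilog2(N) + bps + extra < 32 keeps it within an unsigned 32-bit accumulator.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed headroom in bits; hypothetical stand-in for
 * FLAC__MAX_EXTRA_RESIDUAL_BPS, whose value is not shown in the diff. */
#define EXTRA_RESIDUAL_BPS 4u

/* floor(log2(v)) for v > 0; illustrative stand-in for FLAC__bitmath_ilog2. */
static unsigned ilog2_u32(uint32_t v)
{
	unsigned l = 0;

	assert(v > 0);
	while(v >>= 1)
		l++;
	return l;
}

/* Returns 1 when the sum of absolute residuals over one partition is
 * expected to fit a 32-bit accumulator, 0 when the 64-bit path is needed.
 * Mirrors the shape of the condition added in the diff. */
static int fits_in_32bit_accumulator(uint32_t partition_samples, unsigned bps)
{
	return ilog2_u32(partition_samples) + bps + EXTRA_RESIDUAL_BPS < 32;
}
```

Under these assumptions, a 17-bit side channel with 128-sample partitions gives 7 + 17 + 4 = 28, so the 32-bit path is taken, while the same channel with 4096-sample partitions gives 12 + 17 + 4 = 33 and falls back to the 64-bit path. A 24-bit signal only stays on the 32-bit path for partitions smaller than 16 samples.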