Blame - drivers/staging/echo/echo.c - kernel/msm-4.9

blob: 79d15c65bfe458a1f236a5960fdbba7ebcfce352

David Rowe	10602db	2008-10-06 21:41:46 -0700	[diff] [blame]	1	/*
				2	* SpanDSP - a series of DSP components for telephony
				3	*
				4	* echo.c - A line echo canceller. This code is being developed
				5	* against and partially complies with G168.
				6	*
				7	* Written by Steve Underwood <steveu@coppice.org>
				8	* and David Rowe <david_at_rowetel_dot_com>
				9	*
				10	* Copyright (C) 2001, 2003 Steve Underwood, 2007 David Rowe
				11	*
				12	* Based on a bit from here, a bit from there, eye of toad, ear of
				13	* bat, 15 years of failed attempts by David and a few fried brain
				14	* cells.
				15	*
				16	* All rights reserved.
				17	*
				18	* This program is free software; you can redistribute it and/or modify
				19	* it under the terms of the GNU General Public License version 2, as
				20	* published by the Free Software Foundation.
				21	*
				22	* This program is distributed in the hope that it will be useful,
				23	* but WITHOUT ANY WARRANTY; without even the implied warranty of
				24	* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
				25	* GNU General Public License for more details.
				26	*
				27	* You should have received a copy of the GNU General Public License
				28	* along with this program; if not, write to the Free Software
				29	* Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
David Rowe	10602db	2008-10-06 21:41:46 -0700	[diff] [blame]	30	*/
				31
				32	/*! \file */
				33
				34	/* Implementation Notes
				35	David Rowe
				36	April 2007
				37
				38	This code started life as Steve's NLMS algorithm with a tap
				39	rotation algorithm to handle divergence during double talk. I
				40	added a Geigel Double Talk Detector (DTD) [2] and performed some
				41	G168 tests. However I had trouble meeting the G168 requirements,
				42	especially for double talk - there were always cases where my DTD
				43	failed, for example where near end speech was under the 6dB
				44	threshold required for declaring double talk.
				45
				46	So I tried a two path algorithm [1], which has so far given better
				47	results. The original tap rotation/Geigel algorithm is available
				48	in SVN http://svn.rowetel.com/software/oslec/tags/before_16bit.
				49	It's probably possible to make it work if someone wants to put some
				50	serious work into it.
				51
				52	At present no special treatment is provided for tones, which
				53	generally cause NLMS algorithms to diverge. Initial runs of a
				54	subset of the G168 tests for tones (e.g. ./echo_test 6) show the
				55	current algorithm is passing OK, which is kind of surprising. The
				56	full set of tests needs to be performed to confirm this result.
				57
				58	One other interesting change is that I have managed to get the NLMS
				59	code to work with 16 bit coefficients, rather than the original 32
				60	bit coefficients. This reduces the MIPs and storage required.
				61	I evaluated the 16 bit port using g168_tests.sh and listening tests
				62	on 4 real-world samples.
				63
				64	I also attempted the implementation of a block based NLMS update
				65	[2] but although this passes g168_tests.sh it didn't converge well
				66	on the real-world samples. I have no idea why, perhaps a scaling
				67	problem. The block based code is also available in SVN
				68	http://svn.rowetel.com/software/oslec/tags/before_16bit. If this
				69	code can be debugged, it will lead to further reduction in MIPS, as
				70	the block update code maps nicely onto DSP instruction sets (it's a
				71	dot product) compared to the current sample-by-sample update.
				72
				73	Steve also has some nice notes on echo cancellers in echo.h
				74
David Rowe	10602db	2008-10-06 21:41:46 -0700	[diff] [blame]	75	References:
				76
				77	[1] Ochiai, Araseki, and Ogihara, "Echo Canceller with Two Echo
				78	Path Models", IEEE Transactions on Communications, COM-25,
				79	No. 6, June
				80	1977.
				81	http://www.rowetel.com/images/echo/dual_path_paper.pdf
				82
				83	[2] The classic, very useful paper that tells you how to
				84	actually build a real world echo canceller:
				85	Messerschmitt, Hedberg, Cole, Haoui, Winship, "Digital Voice
				86	Echo Canceller with a TMS32020",
				87	http://www.rowetel.com/images/echo/spra129.pdf
				88
				89	[3] I have written a series of blog posts on this work, here is
				90	Part 1: http://www.rowetel.com/blog/?p=18
				91
				92	[4] The source code http://svn.rowetel.com/software/oslec/
				93
				94	[5] A nice reference on LMS filters:
				95	http://en.wikipedia.org/wiki/Least_mean_squares_filter
				96
				97	Credits:
				98
				99	Thanks to Steve Underwood, Jean-Marc Valin, and Ramakrishnan
				100	Muthukrishnan for their suggestions and email discussions. Thanks
				101	also to those people who collected echo samples for me such as
				102	Mark, Pawel, and Pavel.
				103	*/
				104
J.R. Mauro	4460a86	2008-10-20 19:01:31 -0400	[diff] [blame]	105	#include <linux/kernel.h> /* We're doing kernel work */
David Rowe	10602db	2008-10-06 21:41:46 -0700	[diff] [blame]	106	#include <linux/module.h>
David Rowe	10602db	2008-10-06 21:41:46 -0700	[diff] [blame]	107	#include <linux/slab.h>
David Rowe	10602db	2008-10-06 21:41:46 -0700	[diff] [blame]	108
				109	#include "bit_operations.h"
				110	#include "echo.h"
				111
				112	#define MIN_TX_POWER_FOR_ADAPTION 64
				113	#define MIN_RX_POWER_FOR_ADAPTION 64
Alexander Beregalov	dc57a3e	2009-03-12 03:32:45 +0300	[diff] [blame]	114	#define DTD_HANGOVER 600 /* 600 samples, or 75ms */
				115	#define DC_LOG2BETA 3 /* log2() of DC filter Beta */
David Rowe	10602db	2008-10-06 21:41:46 -0700	[diff] [blame]	116
				117	/*-----------------------------------------------------------------------*\
Alexander Beregalov	dc57a3e	2009-03-12 03:32:45 +0300	[diff] [blame]	118	FUNCTIONS
David Rowe	10602db	2008-10-06 21:41:46 -0700	[diff] [blame]	119	\*-----------------------------------------------------------------------*/
				120
				121	/* adapting coeffs using the traditional stochastic descent (N)LMS algorithm */
				122
Tzafrir Cohen	f55ccbf	2008-10-12 08:13:21 +0200	[diff] [blame]	123	#ifdef __bfin__
Alexander Beregalov	dc57a3e	2009-03-12 03:32:45 +0300	[diff] [blame]	124	static inline void lms_adapt_bg(struct oslec_state *ec, int clean,
J.R. Mauro	4460a86	2008-10-20 19:01:31 -0400	[diff] [blame]	125	int shift)
David Rowe	10602db	2008-10-06 21:41:46 -0700	[diff] [blame]	126	{
J.R. Mauro	4460a86	2008-10-20 19:01:31 -0400	[diff] [blame]	127	int i, j;
				128	int offset1;
				129	int offset2;
				130	int factor;
				131	int exp;
				132	int16_t *phist;
				133	int n;
David Rowe	10602db	2008-10-06 21:41:46 -0700	[diff] [blame]	134
J.R. Mauro	4460a86	2008-10-20 19:01:31 -0400	[diff] [blame]	135	if (shift > 0)
				136	factor = clean << shift;
				137	else
				138	factor = clean >> -shift;
David Rowe	10602db	2008-10-06 21:41:46 -0700	[diff] [blame]	139
J.R. Mauro	4460a86	2008-10-20 19:01:31 -0400	[diff] [blame]	140	/* Update the FIR taps */
David Rowe	10602db	2008-10-06 21:41:46 -0700	[diff] [blame]	141
J.R. Mauro	4460a86	2008-10-20 19:01:31 -0400	[diff] [blame]	142	offset2 = ec->curr_pos;
				143	offset1 = ec->taps - offset2;
				144	phist = &ec->fir_state_bg.history[offset2];
David Rowe	10602db	2008-10-06 21:41:46 -0700	[diff] [blame]	145
J.R. Mauro	4460a86	2008-10-20 19:01:31 -0400	[diff] [blame]	146	/* st: and en: help us locate the assembler in echo.s */
David Rowe	10602db	2008-10-06 21:41:46 -0700	[diff] [blame]	147
Alexander Beregalov	dc57a3e	2009-03-12 03:32:45 +0300	[diff] [blame]	148	/* asm("st:"); */
J.R. Mauro	4460a86	2008-10-20 19:01:31 -0400	[diff] [blame]	149	n = ec->taps;
				150	for (i = 0, j = offset2; i < n; i++, j++) {
				151	exp = *phist++ * factor;
				152	ec->fir_taps16[1][i] += (int16_t) ((exp + (1 << 14)) >> 15);
				153	}
Alexander Beregalov	dc57a3e	2009-03-12 03:32:45 +0300	[diff] [blame]	154	/* asm("en:"); */
David Rowe	10602db	2008-10-06 21:41:46 -0700	[diff] [blame]	155
J.R. Mauro	4460a86	2008-10-20 19:01:31 -0400	[diff] [blame]	156	/* Note the asm for the inner loop above generated by Blackfin gcc
				157	4.1.1 is pretty good (note even parallel instructions used):
David Rowe	10602db	2008-10-06 21:41:46 -0700	[diff] [blame]	158
J.R. Mauro	4460a86	2008-10-20 19:01:31 -0400	[diff] [blame]	159	R0 = W [P0++] (X);
				160	R0 *= R2;
				161	R0 = R0 + R3 (NS) ||
				162	R1 = W [P1] (X) ||
				163	nop;
				164	R0 >>>= 15;
				165	R0 = R0 + R1;
				166	W [P1++] = R0;
David Rowe	10602db	2008-10-06 21:41:46 -0700	[diff] [blame]	167
J.R. Mauro	4460a86	2008-10-20 19:01:31 -0400	[diff] [blame]	168	A block based update algorithm would be much faster but the
				169	above can't be improved on much. Every instruction saved in
				170	the loop above is 2 MIPs/ch! The for loop above is where the
				171	Blackfin spends most of its time - about 17 MIPs/ch measured
				172	with speedtest.c with 256 taps (32ms). Write-back and
				173	Write-through cache gave about the same performance.
				174	*/
David Rowe	10602db	2008-10-06 21:41:46 -0700	[diff] [blame]	175	}
				176
				177	/*
				178	IDEAS for further optimisation of lms_adapt_bg():
				179
				180	1/ The rounding is quite costly. Could we keep as 32 bit coeffs
				181	then make filter pluck the MS 16-bits of the coeffs when filtering?
				182	However this would lower potential optimisation of filter, as I
				183	think the dual-MAC architecture requires packed 16 bit coeffs.
				184
				185	2/ Block based update would be more efficient, as per comments above,
				186	could use dual MAC architecture.
				187
				188	3/ Look for same sample Blackfin LMS code, see if we can get dual-MAC
				189	packing.
				190
				191	4/ Execute the whole e/c in a block of say 20ms rather than sample
				192	by sample. Processing a few samples every ms is inefficient.
				193	*/
				194
				195	#else
Alexander Beregalov	dc57a3e	2009-03-12 03:32:45 +0300	[diff] [blame]	196	static inline void lms_adapt_bg(struct oslec_state *ec, int clean,
J.R. Mauro	4460a86	2008-10-20 19:01:31 -0400	[diff] [blame]	197	int shift)
David Rowe	10602db	2008-10-06 21:41:46 -0700	[diff] [blame]	198	{
J.R. Mauro	4460a86	2008-10-20 19:01:31 -0400	[diff] [blame]	199	int i;
David Rowe	10602db	2008-10-06 21:41:46 -0700	[diff] [blame]	200
J.R. Mauro	4460a86	2008-10-20 19:01:31 -0400	[diff] [blame]	201	int offset1;
				202	int offset2;
				203	int factor;
				204	int exp;
David Rowe	10602db	2008-10-06 21:41:46 -0700	[diff] [blame]	205
J.R. Mauro	4460a86	2008-10-20 19:01:31 -0400	[diff] [blame]	206	if (shift > 0)
				207	factor = clean << shift;
				208	else
				209	factor = clean >> -shift;
David Rowe	10602db	2008-10-06 21:41:46 -0700	[diff] [blame]	210
J.R. Mauro	4460a86	2008-10-20 19:01:31 -0400	[diff] [blame]	211	/* Update the FIR taps */
David Rowe	10602db	2008-10-06 21:41:46 -0700	[diff] [blame]	212
J.R. Mauro	4460a86	2008-10-20 19:01:31 -0400	[diff] [blame]	213	offset2 = ec->curr_pos;
				214	offset1 = ec->taps - offset2;
David Rowe	10602db	2008-10-06 21:41:46 -0700	[diff] [blame]	215
J.R. Mauro	4460a86	2008-10-20 19:01:31 -0400	[diff] [blame]	216	for (i = ec->taps - 1; i >= offset1; i--) {
				217	exp = (ec->fir_state_bg.history[i - offset1] * factor);
				218	ec->fir_taps16[1][i] += (int16_t) ((exp + (1 << 14)) >> 15);
				219	}
				220	for (; i >= 0; i--) {
				221	exp = (ec->fir_state_bg.history[i + offset2] * factor);
				222	ec->fir_taps16[1][i] += (int16_t) ((exp + (1 << 14)) >> 15);
				223	}
David Rowe	10602db	2008-10-06 21:41:46 -0700	[diff] [blame]	224	}
				225	#endif
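The portable lms_adapt_bg() above pairs tap i with the history sample at (i + curr_pos) modulo the number of taps, using two loops to avoid a modulo in the inner loop. The following is a minimal userspace sketch of the same rounded update. It assumes a history buffer of exactly ntaps samples. The name lms_update_sketch and its parameters are hypothetical and are not part of echo.c.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical userspace sketch of the tap update in lms_adapt_bg().
 * Adds round(history * factor / 2^15) to each 16 bit tap, walking the
 * circular history buffer starting at curr_pos.
 */
static void lms_update_sketch(int16_t *taps, const int16_t *history,
			      int ntaps, int curr_pos, int factor)
{
	int i;

	assert(ntaps > 0 && curr_pos >= 0 && curr_pos < ntaps);

	for (i = 0; i < ntaps; i++) {
		int idx = i + curr_pos;
		int32_t exp;

		if (idx >= ntaps)	/* wrap around the circular buffer */
			idx -= ntaps;

		exp = (int32_t)history[idx] * factor;
		/* Adding 2^14 before the shift rounds to nearest.  The shift
		 * of a negative value is implementation-defined in C; like
		 * echo.c, this sketch assumes an arithmetic shift. */
		taps[i] += (int16_t)((exp + (1 << 14)) >> 15);
	}
}
```

A single loop with an explicit wrap is easier to read than the two-loop form, at the cost of one compare per tap. The sketch makes no claim about which form is faster on a given target.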
				226
Tzafrir Cohen	9d8f2d5	2008-10-12 07:17:26 +0200	[diff] [blame]	227	struct oslec_state *oslec_create(int len, int adaption_mode)
David Rowe	10602db	2008-10-06 21:41:46 -0700	[diff] [blame]	228	{
J.R. Mauro	4460a86	2008-10-20 19:01:31 -0400	[diff] [blame]	229	struct oslec_state *ec;
				230	int i;
David Rowe	10602db	2008-10-06 21:41:46 -0700	[diff] [blame]	231
J.R. Mauro	4460a86	2008-10-20 19:01:31 -0400	[diff] [blame]	232	ec = kzalloc(sizeof(*ec), GFP_KERNEL);
				233	if (!ec)
				234	return NULL;
David Rowe	10602db	2008-10-06 21:41:46 -0700	[diff] [blame]	235
J.R. Mauro	4460a86	2008-10-20 19:01:31 -0400	[diff] [blame]	236	ec->taps = len;
				237	ec->log2taps = top_bit(len);
				238	ec->curr_pos = ec->taps - 1;
David Rowe	10602db	2008-10-06 21:41:46 -0700	[diff] [blame]	239
J.R. Mauro	4460a86	2008-10-20 19:01:31 -0400	[diff] [blame]	240	for (i = 0; i < 2; i++) {
				241	ec->fir_taps16[i] =
				242	kcalloc(ec->taps, sizeof(int16_t), GFP_KERNEL);
				243	if (!ec->fir_taps16[i])
				244	goto error_oom;
				245	}
David Rowe	10602db	2008-10-06 21:41:46 -0700	[diff] [blame]	246
J.R. Mauro	4460a86	2008-10-20 19:01:31 -0400	[diff] [blame]	247	fir16_create(&ec->fir_state, ec->fir_taps16[0], ec->taps);
				248	fir16_create(&ec->fir_state_bg, ec->fir_taps16[1], ec->taps);
David Rowe	10602db	2008-10-06 21:41:46 -0700	[diff] [blame]	249
Alexander Beregalov	dc57a3e	2009-03-12 03:32:45 +0300	[diff] [blame]	250	for (i = 0; i < 5; i++)
J.R. Mauro	4460a86	2008-10-20 19:01:31 -0400	[diff] [blame]	251	ec->xvtx[i] = ec->yvtx[i] = ec->xvrx[i] = ec->yvrx[i] = 0;
David Rowe	10602db	2008-10-06 21:41:46 -0700	[diff] [blame]	252
J.R. Mauro	4460a86	2008-10-20 19:01:31 -0400	[diff] [blame]	253	ec->cng_level = 1000;
				254	oslec_adaption_mode(ec, adaption_mode);
David Rowe	10602db	2008-10-06 21:41:46 -0700	[diff] [blame]	255
J.R. Mauro	4460a86	2008-10-20 19:01:31 -0400	[diff] [blame]	256	ec->snapshot = kcalloc(ec->taps, sizeof(int16_t), GFP_KERNEL);
				257	if (!ec->snapshot)
				258	goto error_oom;
David Rowe	10602db	2008-10-06 21:41:46 -0700	[diff] [blame]	259
J.R. Mauro	4460a86	2008-10-20 19:01:31 -0400	[diff] [blame]	260	ec->cond_met = 0;
				261	ec->Pstates = 0;
				262	ec->Ltxacc = ec->Lrxacc = ec->Lcleanacc = ec->Lclean_bgacc = 0;
				263	ec->Ltx = ec->Lrx = ec->Lclean = ec->Lclean_bg = 0;
				264	ec->tx_1 = ec->tx_2 = ec->rx_1 = ec->rx_2 = 0;
				265	ec->Lbgn = ec->Lbgn_acc = 0;
				266	ec->Lbgn_upper = 200;
				267	ec->Lbgn_upper_acc = ec->Lbgn_upper << 13;
David Rowe	10602db	2008-10-06 21:41:46 -0700	[diff] [blame]	268
J.R. Mauro	4460a86	2008-10-20 19:01:31 -0400	[diff] [blame]	269	return ec;
Pekka Enberg	db2af14	2008-10-17 20:55:03 +0300	[diff] [blame]	270
Alexander Beregalov	dc57a3e	2009-03-12 03:32:45 +0300	[diff] [blame]	271	error_oom:
J.R. Mauro	4460a86	2008-10-20 19:01:31 -0400	[diff] [blame]	272	for (i = 0; i < 2; i++)
				273	kfree(ec->fir_taps16[i]);
Pekka Enberg	db2af14	2008-10-17 20:55:03 +0300	[diff] [blame]	274
J.R. Mauro	4460a86	2008-10-20 19:01:31 -0400	[diff] [blame]	275	kfree(ec);
				276	return NULL;
David Rowe	10602db	2008-10-06 21:41:46 -0700	[diff] [blame]	277	}
Tzafrir Cohen	9d8f2d5	2008-10-12 07:17:26 +0200	[diff] [blame]	278	EXPORT_SYMBOL_GPL(oslec_create);
David Rowe	10602db	2008-10-06 21:41:46 -0700	[diff] [blame]	279
Tzafrir Cohen	9d8f2d5	2008-10-12 07:17:26 +0200	[diff] [blame]	280	void oslec_free(struct oslec_state *ec)
David Rowe	10602db	2008-10-06 21:41:46 -0700	[diff] [blame]	281	{
				282	int i;
				283
				284	fir16_free(&ec->fir_state);
				285	fir16_free(&ec->fir_state_bg);
J.R. Mauro	4460a86	2008-10-20 19:01:31 -0400	[diff] [blame]	286	for (i = 0; i < 2; i++)
David Rowe	10602db	2008-10-06 21:41:46 -0700	[diff] [blame]	287	kfree(ec->fir_taps16[i]);
				288	kfree(ec->snapshot);
				289	kfree(ec);
				290	}
Tzafrir Cohen	9d8f2d5	2008-10-12 07:17:26 +0200	[diff] [blame]	291	EXPORT_SYMBOL_GPL(oslec_free);
David Rowe	10602db	2008-10-06 21:41:46 -0700	[diff] [blame]	292
Tzafrir Cohen	9d8f2d5	2008-10-12 07:17:26 +0200	[diff] [blame]	293	void oslec_adaption_mode(struct oslec_state *ec, int adaption_mode)
David Rowe	10602db	2008-10-06 21:41:46 -0700	[diff] [blame]	294	{
J.R. Mauro	4460a86	2008-10-20 19:01:31 -0400	[diff] [blame]	295	ec->adaption_mode = adaption_mode;
David Rowe	10602db	2008-10-06 21:41:46 -0700	[diff] [blame]	296	}
Tzafrir Cohen	9d8f2d5	2008-10-12 07:17:26 +0200	[diff] [blame]	297	EXPORT_SYMBOL_GPL(oslec_adaption_mode);
David Rowe	10602db	2008-10-06 21:41:46 -0700	[diff] [blame]	298
Tzafrir Cohen	9d8f2d5	2008-10-12 07:17:26 +0200	[diff] [blame]	299	void oslec_flush(struct oslec_state *ec)
David Rowe	10602db	2008-10-06 21:41:46 -0700	[diff] [blame]	300	{
J.R. Mauro	4460a86	2008-10-20 19:01:31 -0400	[diff] [blame]	301	int i;
David Rowe	10602db	2008-10-06 21:41:46 -0700	[diff] [blame]	302
J.R. Mauro	4460a86	2008-10-20 19:01:31 -0400	[diff] [blame]	303	ec->Ltxacc = ec->Lrxacc = ec->Lcleanacc = ec->Lclean_bgacc = 0;
				304	ec->Ltx = ec->Lrx = ec->Lclean = ec->Lclean_bg = 0;
				305	ec->tx_1 = ec->tx_2 = ec->rx_1 = ec->rx_2 = 0;
David Rowe	10602db	2008-10-06 21:41:46 -0700	[diff] [blame]	306
J.R. Mauro	4460a86	2008-10-20 19:01:31 -0400	[diff] [blame]	307	ec->Lbgn = ec->Lbgn_acc = 0;
				308	ec->Lbgn_upper = 200;
				309	ec->Lbgn_upper_acc = ec->Lbgn_upper << 13;
David Rowe	10602db	2008-10-06 21:41:46 -0700	[diff] [blame]	310
J.R. Mauro	4460a86	2008-10-20 19:01:31 -0400	[diff] [blame]	311	ec->nonupdate_dwell = 0;
David Rowe	10602db	2008-10-06 21:41:46 -0700	[diff] [blame]	312
J.R. Mauro	4460a86	2008-10-20 19:01:31 -0400	[diff] [blame]	313	fir16_flush(&ec->fir_state);
				314	fir16_flush(&ec->fir_state_bg);
				315	ec->fir_state.curr_pos = ec->taps - 1;
				316	ec->fir_state_bg.curr_pos = ec->taps - 1;
				317	for (i = 0; i < 2; i++)
				318	memset(ec->fir_taps16[i], 0, ec->taps * sizeof(int16_t));
David Rowe	10602db	2008-10-06 21:41:46 -0700	[diff] [blame]	319
J.R. Mauro	4460a86	2008-10-20 19:01:31 -0400	[diff] [blame]	320	ec->curr_pos = ec->taps - 1;
				321	ec->Pstates = 0;
David Rowe	10602db	2008-10-06 21:41:46 -0700	[diff] [blame]	322	}
Tzafrir Cohen	9d8f2d5	2008-10-12 07:17:26 +0200	[diff] [blame]	323	EXPORT_SYMBOL_GPL(oslec_flush);
David Rowe	10602db	2008-10-06 21:41:46 -0700	[diff] [blame]	324
J.R. Mauro	4460a86	2008-10-20 19:01:31 -0400	[diff] [blame]	325	void oslec_snapshot(struct oslec_state *ec)
				326	{
				327	memcpy(ec->snapshot, ec->fir_taps16[0], ec->taps * sizeof(int16_t));
David Rowe	10602db	2008-10-06 21:41:46 -0700	[diff] [blame]	328	}
Tzafrir Cohen	9d8f2d5	2008-10-12 07:17:26 +0200	[diff] [blame]	329	EXPORT_SYMBOL_GPL(oslec_snapshot);
David Rowe	10602db	2008-10-06 21:41:46 -0700	[diff] [blame]	330
				331	/* Dual Path Echo Canceller ------------------------------------------------*/
				332
Tzafrir Cohen	9d8f2d5	2008-10-12 07:17:26 +0200	[diff] [blame]	333	int16_t oslec_update(struct oslec_state *ec, int16_t tx, int16_t rx)
David Rowe	10602db	2008-10-06 21:41:46 -0700	[diff] [blame]	334	{
J.R. Mauro	4460a86	2008-10-20 19:01:31 -0400	[diff] [blame]	335	int32_t echo_value;
				336	int clean_bg;
				337	int tmp, tmp1;
David Rowe	10602db	2008-10-06 21:41:46 -0700	[diff] [blame]	338
J.R. Mauro	4460a86	2008-10-20 19:01:31 -0400	[diff] [blame]	339	/* Input scaling was found to be required to prevent problems when tx
				340	starts clipping. Another possible way to handle this would be the
				341	filter coefficient scaling. */
David Rowe	10602db	2008-10-06 21:41:46 -0700	[diff] [blame]	342
J.R. Mauro	4460a86	2008-10-20 19:01:31 -0400	[diff] [blame]	343	ec->tx = tx;
				344	ec->rx = rx;
				345	tx >>= 1;
				346	rx >>= 1;
David Rowe	10602db	2008-10-06 21:41:46 -0700	[diff] [blame]	347
J.R. Mauro	4460a86	2008-10-20 19:01:31 -0400	[diff] [blame]	348	/*
				349	Filter DC, 3dB point is 160Hz (I think), note 32 bit precision required
				350	otherwise values do not track down to 0. Zero at DC, Pole at (1-Beta)
				351	on the real axis. Some chip sets (like Si labs) don't need
				352	this, but something like a $10 X100P card does. Any DC really slows
				353	down convergence.
David Rowe	10602db	2008-10-06 21:41:46 -0700	[diff] [blame]	354
J.R. Mauro	4460a86	2008-10-20 19:01:31 -0400	[diff] [blame]	355	Note: removes some low frequency from the signal, this reduces
				356	the speech quality when listening to samples through headphones
				357	but may not be obvious through a telephone handset.
David Rowe	10602db	2008-10-06 21:41:46 -0700	[diff] [blame]	358
J.R. Mauro	4460a86	2008-10-20 19:01:31 -0400	[diff] [blame]	359	Note that the 3dB frequency in radians is approx Beta, e.g. for
				360	Beta = 2^(-3) = 0.125, 3dB freq is 0.125 rads = 159Hz.
				361	*/
David Rowe	10602db	2008-10-06 21:41:46 -0700	[diff] [blame]	362
J.R. Mauro	4460a86	2008-10-20 19:01:31 -0400	[diff] [blame]	363	if (ec->adaption_mode & ECHO_CAN_USE_RX_HPF) {
				364	tmp = rx << 15;
David Rowe	10602db	2008-10-06 21:41:46 -0700	[diff] [blame]	365	#if 1
J.R. Mauro	4460a86	2008-10-20 19:01:31 -0400	[diff] [blame]	366	/* Make sure the gain of the HPF is 1.0. This can still saturate a little under
				367	impulse conditions, and it might roll to 32768 and need clipping on sustained peak
				368	level signals. However, the scale of such clipping is small, and the error due to
				369	any saturation should not markedly affect the downstream processing. */
				370	tmp -= (tmp >> 4);
David Rowe	10602db	2008-10-06 21:41:46 -0700	[diff] [blame]	371	#endif
J.R. Mauro	4460a86	2008-10-20 19:01:31 -0400	[diff] [blame]	372	ec->rx_1 += -(ec->rx_1 >> DC_LOG2BETA) + tmp - ec->rx_2;
David Rowe	10602db	2008-10-06 21:41:46 -0700	[diff] [blame]	373
J.R. Mauro	4460a86	2008-10-20 19:01:31 -0400	[diff] [blame]	374	/* hard limit filter to prevent clipping. Note that at this stage
				375	rx should be limited to +/- 16383 due to right shift above */
				376	tmp1 = ec->rx_1 >> 15;
				377	if (tmp1 > 16383)
				378	tmp1 = 16383;
				379	if (tmp1 < -16383)
				380	tmp1 = -16383;
				381	rx = tmp1;
				382	ec->rx_2 = tmp;
David Rowe	10602db	2008-10-06 21:41:46 -0700	[diff] [blame]	383	}
David Rowe	10602db	2008-10-06 21:41:46 -0700	[diff] [blame]	384
J.R. Mauro	4460a86	2008-10-20 19:01:31 -0400	[diff] [blame]	385	/* Block average of power in the filter states. Used for
				386	adaption power calculation. */
David Rowe	10602db	2008-10-06 21:41:46 -0700	[diff] [blame]	387
David Rowe	10602db	2008-10-06 21:41:46 -0700	[diff] [blame]	388	{
J.R. Mauro	4460a86	2008-10-20 19:01:31 -0400	[diff] [blame]	389	int new, old;
David Rowe	10602db	2008-10-06 21:41:46 -0700	[diff] [blame]	390
J.R. Mauro	4460a86	2008-10-20 19:01:31 -0400	[diff] [blame]	391	/* efficient "out with the old and in with the new" algorithm so
				392	we don't have to recalculate over the whole block of
				393	samples. */
Alexander Beregalov	dc57a3e	2009-03-12 03:32:45 +0300	[diff] [blame]	394	new = (int)tx * (int)tx;
J.R. Mauro	4460a86	2008-10-20 19:01:31 -0400	[diff] [blame]	395	old = (int)ec->fir_state.history[ec->fir_state.curr_pos] *
				396	(int)ec->fir_state.history[ec->fir_state.curr_pos];
				397	ec->Pstates +=
David Rowe	0f51010	2009-05-20 11:18:27 +0930	[diff] [blame]	398	((new - old) + (1 << (ec->log2taps-1))) >> ec->log2taps;
J.R. Mauro	4460a86	2008-10-20 19:01:31 -0400	[diff] [blame]	399	if (ec->Pstates < 0)
				400	ec->Pstates = 0;
David Rowe	10602db	2008-10-06 21:41:46 -0700	[diff] [blame]	401	}
David Rowe	10602db	2008-10-06 21:41:46 -0700	[diff] [blame]	402
J.R. Mauro	4460a86	2008-10-20 19:01:31 -0400	[diff] [blame]	403	/* Calculate short term average levels using simple single pole IIRs */
David Rowe	10602db	2008-10-06 21:41:46 -0700	[diff] [blame]	404
J.R. Mauro	4460a86	2008-10-20 19:01:31 -0400	[diff] [blame]	405	ec->Ltxacc += abs(tx) - ec->Ltx;
				406	ec->Ltx = (ec->Ltxacc + (1 << 4)) >> 5;
				407	ec->Lrxacc += abs(rx) - ec->Lrx;
				408	ec->Lrx = (ec->Lrxacc + (1 << 4)) >> 5;
David Rowe	10602db	2008-10-06 21:41:46 -0700	[diff] [blame]	409
J.R. Mauro	4460a86	2008-10-20 19:01:31 -0400	[diff] [blame]	410	/* Foreground filter --------------------------------------------------- */
David Rowe	10602db	2008-10-06 21:41:46 -0700	[diff] [blame]	411
J.R. Mauro	4460a86	2008-10-20 19:01:31 -0400	[diff] [blame]	412	ec->fir_state.coeffs = ec->fir_taps16[0];
				413	echo_value = fir16(&ec->fir_state, tx);
				414	ec->clean = rx - echo_value;
				415	ec->Lcleanacc += abs(ec->clean) - ec->Lclean;
				416	ec->Lclean = (ec->Lcleanacc + (1 << 4)) >> 5;
				417
				418	/* Background filter --------------------------------------------------- */
				419
				420	echo_value = fir16(&ec->fir_state_bg, tx);
				421	clean_bg = rx - echo_value;
				422	ec->Lclean_bgacc += abs(clean_bg) - ec->Lclean_bg;
				423	ec->Lclean_bg = (ec->Lclean_bgacc + (1 << 4)) >> 5;
				424
				425	/* Background Filter adaption ----------------------------------------- */
				426
				427	/* Almost always adapt bg filter, just simple DT and energy
				428	detection to minimise adaption in cases of strong double talk.
				429	However this is not critical for the dual path algorithm.
				430	*/
				431	ec->factor = 0;
				432	ec->shift = 0;
				433	if ((ec->nonupdate_dwell == 0)) {
				434	int P, logP, shift;
				435
				436	/* Determine:
				437
				438	f = Beta * clean_bg_rx/P ------ (1)
				439
				440	where P is the total power in the filter states.
				441
				442	The Boffins have shown that if we obey (1) we converge
				443	quickly and avoid instability.
				444
				445	The correct factor f must be in Q30, as this is the fixed
				446	point format required by the lms_adapt_bg() function,
				447	therefore the scaled version of (1) is:
				448
				449	(2^30) * f = (2^30) * Beta * clean_bg_rx/P
				450	factor = (2^30) * Beta * clean_bg_rx/P ----- (2)
				451
				452	We have chosen Beta = 0.25 by experiment, so:
				453
				454	factor = (2^30) * (2^-2) * clean_bg_rx/P
				455
				456	
				457	factor = clean_bg_rx * 2^(30 - 2 - log2(P)) ----- (3)
				458
				459	To avoid a divide we approximate log2(P) as top_bit(P),
				460	which returns the position of the highest non-zero bit in
				461	P. This approximation introduces an error as large as a
				462	factor of 2, but the algorithm seems to handle it OK.
				463
				464	Come to think of it a divide may not be a big deal on a
				465	modern DSP, so its probably worth checking out the cycles
				466	for a divide versus a top_bit() implementation.
				467	*/
				468
				469	P = MIN_TX_POWER_FOR_ADAPTION + ec->Pstates;
				470	logP = top_bit(P) + ec->log2taps;
				471	shift = 30 - 2 - logP;
				472	ec->shift = shift;
				473
				474	lms_adapt_bg(ec, clean_bg, shift);
				475	}
				476
				477	/* very simple DTD to make sure we don't try to adapt with strong
				478	near end speech */
				479
				480	ec->adapt = 0;
				481	if ((ec->Lrx > MIN_RX_POWER_FOR_ADAPTION) && (ec->Lrx > ec->Ltx))
				482	ec->nonupdate_dwell = DTD_HANGOVER;
				483	if (ec->nonupdate_dwell)
				484	ec->nonupdate_dwell--;
				485
				486	/* Transfer logic ------------------------------------------------------ */
				487
				488	/* These conditions are from the dual path paper [1], I messed with
				489	them a bit to improve performance. */
				490
				491	if ((ec->adaption_mode & ECHO_CAN_USE_ADAPTION) &&
				492	(ec->nonupdate_dwell == 0) &&
Alexander Beregalov	dc57a3e	2009-03-12 03:32:45 +0300	[diff] [blame]	493	/* (ec->Lclean_bg < 0.875*ec->Lclean) */
				494	(8 * ec->Lclean_bg < 7 * ec->Lclean) &&
				495	/* (ec->Lclean_bg < 0.125*ec->Ltx) */
				496	(8 * ec->Lclean_bg < ec->Ltx)) {
J.R. Mauro	4460a86	2008-10-20 19:01:31 -0400	[diff] [blame]	497	if (ec->cond_met == 6) {
				498	/* BG filter has had better results for 6 consecutive samples */
				499	ec->adapt = 1;
				500	memcpy(ec->fir_taps16[0], ec->fir_taps16[1],
Alexander Beregalov	dc57a3e	2009-03-12 03:32:45 +0300	[diff] [blame]	501	ec->taps * sizeof(int16_t));
J.R. Mauro	4460a86	2008-10-20 19:01:31 -0400	[diff] [blame]	502	} else
				503	ec->cond_met++;
				504	} else
				505	ec->cond_met = 0;
				506
				507	/* Non-Linear Processing --------------------------------------------------- */
				508
				509	ec->clean_nlp = ec->clean;
				510	if (ec->adaption_mode & ECHO_CAN_USE_NLP) {
				511	/* Non-linear processor - a fancy way to say "zap small signals, to avoid
				512	residual echo due to (uLaw/ALaw) non-linearity in the channel.". */
				513
				514	if ((16 * ec->Lclean < ec->Ltx)) {
				515	/* Our e/c has improved echo by at least 24 dB (each factor of 2 is 6dB,
				516	so 2222=16 is the same as 6+6+6+6=24dB) /
				517	if (ec->adaption_mode & ECHO_CAN_USE_CNG) {
				518	ec->cng_level = ec->Lbgn;
				519
				520	/* Very elementary comfort noise generation. Just random
				521	numbers rolled off very vaguely Hoth-like. DR: This
				522	noise doesn't sound quite right to me - I suspect there
				523	are some overflow issues in the filtering as it's too
				524	"crackly". TODO: debug this, maybe just play noise at
				525	high level or look at spectrum.
				526	*/
				527
				528	ec->cng_rndnum =
				529	1664525U * ec->cng_rndnum + 1013904223U;
				530	ec->cng_filter =
				531	((ec->cng_rndnum & 0xFFFF) - 32768 +
				532	5 * ec->cng_filter) >> 3;
				533	ec->clean_nlp =
				534	(ec->cng_filter * ec->cng_level * 8) >> 14;
				535
				536	} else if (ec->adaption_mode & ECHO_CAN_USE_CLIP) {
				537	/* This sounds much better than CNG */
				538	if (ec->clean_nlp > ec->Lbgn)
				539	ec->clean_nlp = ec->Lbgn;
				540	if (ec->clean_nlp < -ec->Lbgn)
				541	ec->clean_nlp = -ec->Lbgn;
				542	} else {
				543	/* just mute the residual, doesn't sound very good, used mainly
				544	in G168 tests */
				545	ec->clean_nlp = 0;
				546	}
				547	} else {
				548	/* Background noise estimator. I tried a few algorithms
				549	here without much luck. This very simple one seems to
				550	work best, we just average the level using a slow (1 sec
				551	time const) filter if the current level is less than a
				552	(experimentally derived) constant. This means we don't
				553	include high level signals like near end speech. When
				554	combined with CNG or especially CLIP seems to work OK.
				555	*/
				556	if (ec->Lclean < 40) {
				557	ec->Lbgn_acc += abs(ec->clean) - ec->Lbgn;
				558	ec->Lbgn = (ec->Lbgn_acc + (1 << 11)) >> 12;
				559	}
				560	}
				561	}
				562
				563	/* Roll around the taps buffer */
				564	if (ec->curr_pos <= 0)
				565	ec->curr_pos = ec->taps;
				566	ec->curr_pos--;
				567
				568	if (ec->adaption_mode & ECHO_CAN_DISABLE)
				569	ec->clean_nlp = rx;
				570
				571	/* Output scaled back up again to match input scaling */
				572
				573	return (int16_t) ec->clean_nlp << 1;
David Rowe	10602db	2008-10-06 21:41:46 -0700	[diff] [blame]	574	}
Tzafrir Cohen	9d8f2d5	2008-10-12 07:17:26 +0200	[diff] [blame]	575	EXPORT_SYMBOL_GPL(oslec_update);
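The background adaption step in oslec_update() replaces the divide by P with a shift derived from top_bit(). The following is a minimal userspace sketch of that calculation. top_bit_sketch is a portable stand-in written for this example; the real top_bit() lives in bit_operations.h and may treat a zero argument differently. The function names and the ADAPT_MIN_POWER macro are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define ADAPT_MIN_POWER 64	/* same value as MIN_TX_POWER_FOR_ADAPTION */

/* Position of the highest set bit; returning -1 for zero is this
 * sketch's own choice. */
static int top_bit_sketch(uint32_t x)
{
	int pos = -1;

	while (x) {
		x >>= 1;
		pos++;
	}
	return pos;
}

/* shift = 30 - 2 - log2(total power), with Beta = 2^-2.  pstates is the
 * per-tap average power, so log2taps is added back to approximate the
 * log2 of the total power in the filter states. */
static int adapt_shift_sketch(int pstates, int log2taps)
{
	int p;

	assert(pstates >= 0 && log2taps >= 0);
	p = ADAPT_MIN_POWER + pstates;
	return 30 - 2 - (top_bit_sketch((uint32_t)p) + log2taps);
}

/* Apply the shift to the error sample, as lms_adapt_bg() does.  A left
 * shift of a negative value is undefined in ISO C; echo.c relies on the
 * compiler's usual behaviour here, and so does this sketch. */
static int scale_error_sketch(int clean, int shift)
{
	return shift > 0 ? clean << shift : clean >> -shift;
}
```

Because top_bit() rounds log2(P) down, the resulting step can be up to a factor of 2 larger than the exact value, which matches the error bound stated in the comment inside oslec_update().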
David Rowe	10602db	2008-10-06 21:41:46 -0700	[diff] [blame]	576
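The short term levels Ltx, Lrx, Lclean and Lclean_bg in oslec_update() all use the same single pole IIR on the absolute sample value. The following is a minimal userspace sketch of one such estimator. The struct and function names are hypothetical and do not appear in echo.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical state for one level estimator, e.g. Ltxacc and Ltx. */
struct level_sketch {
	int acc;	/* accumulator, 32 times the level */
	int level;	/* rounded short term average of |sample| */
};

/* One update step: the accumulator moves towards 32 * |sample| with a
 * time constant of about 32 samples, and the level is the accumulator
 * divided by 32 with rounding. */
static int level_sketch_step(struct level_sketch *st, int sample)
{
	assert(st != NULL);

	st->acc += abs(sample) - st->level;
	st->level = (st->acc + (1 << 4)) >> 5;
	return st->level;
}
```

Fed a constant magnitude, the level rises towards that magnitude and then stays there, since the accumulator stops changing once the level equals the input magnitude.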
				577	/* This function is separated from the echo canceller as it is usually called
				578	as part of the tx process. See rx HP (DC blocking) filter above, it's
				579	the same design.
				580
				581	Some soft phones send speech signals with a lot of low frequency
				582	energy, e.g. down to 20Hz. This can make the hybrid non-linear
				583	which causes the echo canceller to fall over. This filter can help
				584	by removing any low frequency before it gets to the tx port of the
				585	hybrid.
				586
				587	It can also help by removing any DC in the tx signal. DC is bad
				588	for LMS algorithms.
				589
				590	This is one of the classic DC removal filters, adjusted to provide sufficient
				591	bass rolloff to meet the above requirement to protect hybrids from things that
				592	upset them. The difference between successive samples produces a lousy HPF, and
				593	then a suitably placed pole flattens things out. The final result is a nicely
				594	rolled off bass end. The filtering is implemented with extended fractional
				595	precision, which noise shapes things, giving very clean DC removal.
				596	*/
				597
Alexander Beregalov	dc57a3e	2009-03-12 03:32:45 +0300	[diff] [blame]	598	int16_t oslec_hpf_tx(struct oslec_state *ec, int16_t tx)
J.R. Mauro	4460a86	2008-10-20 19:01:31 -0400	[diff] [blame]	599	{
				600	int tmp, tmp1;
David Rowe	10602db	2008-10-06 21:41:46 -0700	[diff] [blame]	601
J.R. Mauro	4460a86	2008-10-20 19:01:31 -0400	[diff] [blame]	602	if (ec->adaption_mode & ECHO_CAN_USE_TX_HPF) {
				603	tmp = tx << 15;
David Rowe	10602db	2008-10-06 21:41:46 -0700	[diff] [blame]	604	#if 1
J.R. Mauro	4460a86	2008-10-20 19:01:31 -0400	[diff] [blame]	605	/* Make sure the gain of the HPF is 1.0. The first can still saturate a little under
				606	impulse conditions, and it might roll to 32768 and need clipping on sustained peak
				607	level signals. However, the scale of such clipping is small, and the error due to
				608	any saturation should not markedly affect the downstream processing. */
				609	tmp -= (tmp >> 4);
David Rowe	10602db	2008-10-06 21:41:46 -0700	[diff] [blame]	610	#endif
J.R. Mauro	4460a86	2008-10-20 19:01:31 -0400	[diff] [blame]	611	ec->tx_1 += -(ec->tx_1 >> DC_LOG2BETA) + tmp - ec->tx_2;
				612	tmp1 = ec->tx_1 >> 15;
				613	if (tmp1 > 32767)
				614	tmp1 = 32767;
				615	if (tmp1 < -32767)
				616	tmp1 = -32767;
				617	tx = tmp1;
				618	ec->tx_2 = tmp;
				619	}
David Rowe	10602db	2008-10-06 21:41:46 -0700	[diff] [blame]	620
J.R. Mauro	4460a86	2008-10-20 19:01:31 -0400	[diff] [blame]	621	return tx;
David Rowe	10602db	2008-10-06 21:41:46 -0700	[diff] [blame]	622	}
Tzafrir Cohen	9d8f2d5	2008-10-12 07:17:26 +0200	[diff] [blame]	623	EXPORT_SYMBOL_GPL(oslec_hpf_tx);
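The DC blocking filter in oslec_hpf_tx() has the difference equation y[n] = (1 - Beta) * y[n-1] + x[n] - x[n-1], with Beta = 2^-3. Its gain at the Nyquist frequency is 2 / (2 - Beta) = 16/15, which is why the input is first scaled by 15/16. The following is a minimal userspace sketch of the same filter, with the state held in a hypothetical struct instead of struct oslec_state. It assumes inputs well below full scale, so the 32 bit state does not overflow.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HPF_LOG2BETA 3	/* same value as DC_LOG2BETA in echo.c */

/* Hypothetical filter state, mirroring ec->tx_1 and ec->tx_2. */
struct hpf_sketch {
	int32_t s1;	/* filter output state in Q15 */
	int32_t s2;	/* previous scaled input in Q15 */
};

static int16_t hpf_sketch_step(struct hpf_sketch *st, int16_t x)
{
	int32_t tmp, out;

	assert(st != NULL);

	tmp = (int32_t)x * 32768;	/* to Q15, avoids shifting a negative */
	tmp -= tmp >> 4;		/* scale by 15/16 for unity peak gain */
	st->s1 += -(st->s1 >> HPF_LOG2BETA) + tmp - st->s2;

	out = st->s1 >> 15;		/* back to 16 bit range */
	if (out > 32767)
		out = 32767;
	if (out < -32767)
		out = -32767;

	st->s2 = tmp;
	return (int16_t)out;
}
```

Fed a constant input, the output starts near the input value and decays by roughly 7/8 per sample until it reaches zero, which is the DC removal described in the comment above oslec_hpf_tx().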
Tzafrir Cohen	68b8d9f	2008-10-12 06:55:40 +0200	[diff] [blame]	624
				625	MODULE_LICENSE("GPL");
				626	MODULE_AUTHOR("David Rowe");
				627	MODULE_DESCRIPTION("Open Source Line Echo Canceller");
				628	MODULE_VERSION("0.3.0");