LibAAH_RTP: Fix a stuttering audio bug.

Fix a bug discovered while working on adding unicast mode to the TX/RX
players. Also some general cleanup/consolidation regarding timeout
code.

The bug went like this. When a TX player had hit EOS, it would send
an EOS command payload to its receivers. Later, when application
level code shut down and cleaned up the player, it would send another.
In situations where there is massive packet loss, there is a chance
that not only did both of the EOS packets get dropped, but that they
never got filled in by the retry algorithm because the receiver gave
up on the RTP gap due to an aboutToUnderflow situation in at least one
of its active substreams.

When this happens, there are two major problems. First, all of the
substreams associated with the TX player which has now gone away have
become effectively leaked. They will only get cleaned up if the
entire RTP stream (the TX Group) goes away for 10 seconds or more, or
when the RX Player itself is reset by application level code or a
fatal error. These substreams are holding decoder and renderer
resources which are probably in very short supply, which is a Bad
Thing.

Second, there is now at least one substream in the RX player which is
never going to receive another payload (its TX player source is gone),
but is still considered to be active by the RX player. Assuming that
this substream's program was in the play state when the track ended,
there is now at least one substream which is always
"aboutToUnderflow". From here on out, when the retry algorithm is
attempting to decide whether or not it has the time to attempt to fill
in a gap in the muxed RTP sequence, it always decides that it does not
have the time because of the orphaned substream which is stuck in its
about to underflow state. This effectively means that the retry
algorithm is completely shut off until the RX player gets reset
somehow (something which does not happen during normal operation).
Since the environment had to be extremely lossy to trigger this chain
of events in the first place, and it's probably no better now, your
playback is just going to be chock full of gaps, which produce
horrible stuttering in the presentation stage of the system.
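
To make the failure mode concrete, here is a minimal sketch of the
kind of check described above. The names (SubstreamStatus,
hasTimeToFillGap) are hypothetical and are not the library's actual
code; the only thing assumed is that a single active substream in the
about-to-underflow state is enough to veto a retry.

```cpp
#include <vector>

// Hypothetical per-substream status, for illustration only.
struct SubstreamStatus {
    bool active;              // RX player still considers it alive
    bool about_to_underflow;  // its buffered media is nearly exhausted
};

// Returns true only if no active substream is about to underflow,
// i.e. there is still time to ask for a retransmission.
inline bool hasTimeToFillGap(const std::vector<SubstreamStatus>& subs) {
    for (const auto& s : subs) {
        if (s.active && s.about_to_underflow) {
            return false;  // one starving substream vetoes the retry
        }
    }
    return true;
}
```

An orphaned substream never receives another payload, so in this
sketch its about_to_underflow flag never clears and the function
returns false on every call.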

Two new failsafes have been introduced to keep the double EOS drop
from causing this. First, a timeout has been introduced on the
substream level, in addition to the already existing RTP level
timeout. If a substream fails to receive any activity for 10 seconds
(same timeout as the master RTP timeout), it will be automatically
flushed and purged.
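
A rough sketch of the substream-level timeout, assuming a simple table
keyed by a substream identifier. SubstreamTable, onPayload and
purgeIdle are hypothetical names chosen for illustration; the real
change uses the common Timeout helper class in utils.cpp, whose
interface is not shown here.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <map>

using Clock = std::chrono::steady_clock;

// Same 10 second value as the master RTP timeout.
constexpr std::chrono::seconds kSubstreamTimeout{10};

// Hypothetical bookkeeping for one substream.
struct Substream {
    Clock::time_point last_activity;  // time of the most recent payload
};

class SubstreamTable {
public:
    // Record activity for a substream, creating its entry if needed.
    void onPayload(uint32_t id, Clock::time_point now) {
        substreams_[id].last_activity = now;
    }

    // Purge every substream that has been idle for the full timeout.
    // A real player would also flush its decoder and renderer here.
    // Returns the number of substreams purged.
    std::size_t purgeIdle(Clock::time_point now) {
        std::size_t purged = 0;
        for (auto it = substreams_.begin(); it != substreams_.end();) {
            if (now - it->second.last_activity >= kSubstreamTimeout) {
                it = substreams_.erase(it);
                ++purged;
            } else {
                ++it;
            }
        }
        return purged;
    }

    std::size_t size() const { return substreams_.size(); }

private:
    std::map<uint32_t, Substream> substreams_;
};
```

Passing the current time in as a parameter, rather than reading the
clock inside the class, is only a choice made to keep the sketch easy
to exercise.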

Second, the nature of the master RTP timeout on the transmitter side
has been changed. Instead of just sending an empty NOP command packet
to indicate that the main RTP stream is still alive, the transmitter
now sends a new type of command packet: the Active Program Update
packet. This packet contains a list of all the active program IDs
attached to this TX group. Upon receiving one of these APU packets,
RX players reset the inactivity timers for all substreams which are
members of the programs listed in the packet, but they also
immediately purge any substreams associated with programs not present
in the APU.
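
The receiver-side handling of an APU could look roughly like the
sketch below. The wire format of the packet is not described here, so
the program list is simply modelled as a std::set, and SubstreamState
and applyActiveProgramUpdate are hypothetical names.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <map>
#include <set>

using Clock = std::chrono::steady_clock;

// Hypothetical per-substream state, for illustration only.
struct SubstreamState {
    uint32_t program_id;              // program this substream belongs to
    Clock::time_point last_activity;  // drives the inactivity timeout
};

// Apply an Active Program Update: refresh the inactivity timer of
// every substream whose program is listed, and purge the rest.
// Returns the number of substreams purged.
inline std::size_t applyActiveProgramUpdate(
        std::map<uint32_t, SubstreamState>& substreams,
        const std::set<uint32_t>& active_programs,
        Clock::time_point now) {
    std::size_t purged = 0;
    for (auto it = substreams.begin(); it != substreams.end();) {
        if (active_programs.count(it->second.program_id) != 0) {
            it->second.last_activity = now;  // program is still alive
            ++it;
        } else {
            it = substreams.erase(it);       // program is gone, clean up now
            ++purged;
        }
    }
    return purged;
}
```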

Between the two of these, no matter how nasty and selective the packet
smashing gremlins in your system happen to be, substreams will always
eventually clean up and avoid getting stuck in a perma-stutter
situation.

Also in this CL:
+ Extract some common utility code into a utils.cpp file so that it
  can be shared across the library.
+ Stop using custom timeout logic in the RXPlayer. Instead, use the
  common Timeout helper class in utils.cpp.

Signed-off-by: John Grossman <johngro@google.com>
Change-Id: I350869942074f2cae020f719c2911d9092ba8055
13 files changed