Fix e100 on systems that have cache incoherent DMA

On systems that have cache-incoherent DMA, including ARM, there
is a race condition between software allocating a new receive buffer
and hardware writing into a buffer.  The two race on touching the last
Receive Frame Descriptor (RFD).  It has its el-bit set and its next
link equal to 0.  When hardware encounters this buffer it attempts to
write data to it and then update Status Word bits and Actual Count in
the RFD.  At the same time software may try to clear the el-bit and
set the link address to a new buffer.

Since the entire RFD is one cache line, the two write operations can
collide.  This can lead to the receive unit stalling or interpreting
random memory as its receive area.

The fix is to set the el-bit and set the size to 0 on the next-to-last
buffer in the chain.  When the hardware encounters this buffer it stops
and does not write to it at all.  The hardware issues an RNR interrupt
with the receive unit in the No Resources state.  Software can write
to the tail of the list because it knows hardware will stop on the
previous descriptor that was marked as the end of list.
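
As a rough illustration of the stop point described above, the sketch
below marks the next-to-last descriptor of a ring.  The struct layout,
the names rfd_sketch, RFD_EL and mark_stop_point, and the bit position
of the el-bit are assumptions made for this example and are not taken
from the patch.  Endianness conversion and cache synchronization are
left out on purpose.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed position of the el-bit in the RFD command word. */
#define RFD_EL 0x8000u

/* Simplified, hypothetical RFD layout used only for this sketch. */
struct rfd_sketch {
	uint16_t status;      /* Status Word, written by hardware */
	uint16_t command;     /* holds the el-bit */
	uint32_t link;        /* address of the next RFD */
	uint16_t actual_size; /* Actual Count, written by hardware */
	uint16_t size;        /* buffer size offered to hardware */
};

/*
 * Turn the next-to-last descriptor of an n-entry ring into the stop
 * point: size 0 and el-bit set.  The last descriptor is left alone,
 * so software stays free to modify it later.
 */
static struct rfd_sketch *mark_stop_point(struct rfd_sketch *ring, size_t n)
{
	struct rfd_sketch *stop;

	assert(n >= 2); /* a next-to-last entry must exist */
	stop = &ring[n - 2];

	stop->size = 0;          /* nothing may be written into this buffer */
	stop->command |= RFD_EL; /* hardware stops here with RNR */
	return stop;
}
```

In a real driver each of these writes would also have to be made
visible to the device before the receive unit reaches the descriptor.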

Once software has a new next-to-last buffer prepared, it can clear the
el-bit and set the size on the previous one.  The race on this buffer
is safe since the link already points to a valid next buffer and the
software can handle the race setting the size (assuming aligned 16-bit
writes are atomic with respect to the DMA read).  If the hardware sees
the el-bit cleared without the size set, it will move on to the next
buffer and skip this one.  If it sees the size set but the el-bit still
set, it will complete that buffer and then raise an RNR interrupt and
wait.
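
The cases above can be written down as a small model.  This is only a
sketch of the reasoning, not driver code: hw_view, the hw_action enum,
release_old_stop_point, rfd_sketch and the RFD_EL bit position are
hypothetical names and assumptions introduced here.  The order of the
two writes (el-bit first, then size) follows the wording of this
description.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed position of the el-bit in the RFD command word. */
#define RFD_EL 0x8000u

/* What the receive unit does with a descriptor, per the text above. */
enum hw_action {
	HW_STOP_NO_WRITE,     /* el set, size 0: stop, RNR, buffer untouched */
	HW_SKIP_BUFFER,       /* el clear, size 0: move on to the next buffer */
	HW_FILL_THEN_STOP,    /* el set, size set: complete buffer, then RNR */
	HW_FILL_AND_CONTINUE  /* el clear, size set: normal receive */
};

/* Model of what hardware sees when it reads command and size. */
static enum hw_action hw_view(uint16_t command, uint16_t size)
{
	int el = (command & RFD_EL) != 0;

	if (el && size == 0)
		return HW_STOP_NO_WRITE;
	if (!el && size == 0)
		return HW_SKIP_BUFFER;
	if (el)
		return HW_FILL_THEN_STOP;
	return HW_FILL_AND_CONTINUE;
}

/* Only the two fields that matter for the hand-over. */
struct rfd_sketch {
	uint16_t command;
	uint16_t size;
};

/*
 * Release the old stop point after a new one has been prepared.
 * Each store is a separate aligned 16-bit write, so hardware sees
 * one of the states handled by hw_view() at every instant.
 */
static void release_old_stop_point(struct rfd_sketch *old_stop,
				   uint16_t buf_size)
{
	assert(buf_size != 0);
	old_stop->command &= (uint16_t)~RFD_EL; /* may now be seen as "skip" */
	old_stop->size = buf_size;              /* now a normal buffer */
}
```

Every intermediate state maps to one of the four outcomes, and none of
them lets hardware follow an invalid link, which is why the race on
this descriptor is harmless.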

Signed-off-by: David Acker <dacker@roinet.com>
Signed-off-by: Auke Kok <auke-jan.h.kok@intel.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
1 file changed