RDS/IB: track signaled sends
We're seeing bugs today where IB connection shutdown clears the send
ring while the tasklet is processing completed sends. Implementation
details cause this to dereference a null pointer. Shutdown needs to
wait for send completion to stop before tearing down the connection. We
can't simply wait for the ring to empty because it may contain
unsignaled sends that will never be processed.
This patch tracks the number of signaled sends that we've posted and
waits for them to complete. It also makes sure that the tasklet has
finished executing.
Signed-off-by: Zach Brown <zach.brown@oracle.com>
diff --git a/net/rds/ib_cm.c b/net/rds/ib_cm.c
index 10f6a88..123c7d3 100644
--- a/net/rds/ib_cm.c
+++ b/net/rds/ib_cm.c
@@ -615,11 +615,18 @@
}
/*
- * Don't wait for the send ring to be empty -- there may be completed
- * non-signaled entries sitting on there. We unmap these below.
+ * We want to wait for tx and rx completion to finish
+ * before we tear down the connection, but we have to be
+ * careful not to get stuck waiting on a send ring that
+ * only has unsignaled sends in it. We've shutdown new
+ * sends before getting here so by waiting for signaled
+ * sends to complete we're ensured that there will be no
+ * more tx processing.
*/
wait_event(rds_ib_ring_empty_wait,
- rds_ib_ring_empty(&ic->i_recv_ring));
+ rds_ib_ring_empty(&ic->i_recv_ring) &&
+ (atomic_read(&ic->i_signaled_sends) == 0));
+ tasklet_kill(&ic->i_recv_tasklet);
if (ic->i_send_hdrs)
ib_dma_free_coherent(dev,
@@ -729,6 +736,7 @@
#ifndef KERNEL_HAS_ATOMIC64
spin_lock_init(&ic->i_ack_lock);
#endif
+ atomic_set(&ic->i_signaled_sends, 0);
/*
* rds_ib_conn_shutdown() waits for these to be emptied so they