IB/mthca: Fix potential AB-BA deadlock with CQ locks

When destroying a QP, mthca locks both the QP's send CQ and receive
CQ.  However, the following scenario is perfectly valid:

    QP_a: send_cq == CQ_x, recv_cq == CQ_y
    QP_b: send_cq == CQ_y, recv_cq == CQ_x

The old mthca code simply locked send_cq and then recv_cq, which in
this case could lead to an AB-BA deadlock if QP_a and QP_b were
destroyed simultaneously.

We can fix this by changing the locking code to lock the CQ with the
lower CQ number first, which will create a consistent lock ordering.
Also, the second CQ is locked with spin_lock_nested() to tell lockdep
that we know what we're doing with the lock nesting.

This bug was found by lockdep.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
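
The ordering rule described in the message can be sketched in user space as follows. This is only an illustration, not the driver code: `struct cq`, `struct qp`, `lock_cqs()`, `unlock_cqs()` and `first_to_lock()` are hypothetical names, and pthread mutexes stand in for the kernel spinlocks (so there is no equivalent of spin_lock_nested() here). Taking the lock only once when both pointers refer to the same CQ is an assumption of this sketch.

```c
#include <pthread.h>

/* Hypothetical user-space stand-ins for the driver's CQ and QP structures. */
struct cq {
    int cqn;                /* CQ number, used only to order lock acquisition */
    pthread_mutex_t lock;
};

struct qp {
    struct cq *send_cq;
    struct cq *recv_cq;
};

/* Report which CQ is locked first under the "lower cqn first" rule. */
static struct cq *first_to_lock(struct cq *send_cq, struct cq *recv_cq)
{
    return (send_cq->cqn <= recv_cq->cqn) ? send_cq : recv_cq;
}

/*
 * Lock both CQs attached to a QP.  The CQ with the lower cqn is always
 * taken first, so two QPs that share the same pair of CQs in opposite
 * roles still acquire the locks in the same order.
 */
static void lock_cqs(struct cq *send_cq, struct cq *recv_cq)
{
    if (send_cq == recv_cq) {
        /* Assumption of this sketch: a shared CQ is locked only once. */
        pthread_mutex_lock(&send_cq->lock);
    } else if (send_cq->cqn < recv_cq->cqn) {
        pthread_mutex_lock(&send_cq->lock);
        pthread_mutex_lock(&recv_cq->lock);
    } else {
        pthread_mutex_lock(&recv_cq->lock);
        pthread_mutex_lock(&send_cq->lock);
    }
}

/* Release the locks in the reverse of the acquisition order. */
static void unlock_cqs(struct cq *send_cq, struct cq *recv_cq)
{
    if (send_cq == recv_cq) {
        pthread_mutex_unlock(&send_cq->lock);
    } else if (send_cq->cqn < recv_cq->cqn) {
        pthread_mutex_unlock(&recv_cq->lock);
        pthread_mutex_unlock(&send_cq->lock);
    } else {
        pthread_mutex_unlock(&send_cq->lock);
        pthread_mutex_unlock(&recv_cq->lock);
    }
}
```

With QP_a and QP_b from the scenario above, both `lock_cqs(CQ_x, CQ_y)` and `lock_cqs(CQ_y, CQ_x)` take the lower-numbered CQ first, so neither caller can hold one lock while waiting on the other in the opposite order.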

commit: a19aa5c5fdda8b556ab238177ee27c5ef7873c94
author: Roland Dreier <rolandd@cisco.com> Fri Aug 11 08:56:57 2006 -0700
committer: Roland Dreier <rolandd@cisco.com> Fri Aug 11 08:56:57 2006 -0700
tree: 6c770c8fbbe3270bf1416c3edd8894d214e62657
parent: e54b82d739d4a2ef992976c8c0692cdf89286420
diff --git a/drivers/infiniband/hw/mthca/mthca_provider.h b/drivers/infiniband/hw/mthca/mthca_provider.h
index 8de2887..9a5bece 100644
--- a/drivers/infiniband/hw/mthca/mthca_provider.h
+++ b/drivers/infiniband/hw/mthca/mthca_provider.h

@@ -136,8 +136,8 @@
  * We have one global lock that protects dev->cq/qp_table.  Each
  * struct mthca_cq/qp also has its own lock.  An individual qp lock
  * may be taken inside of an individual cq lock.  Both cqs attached to
- * a qp may be locked, with the send cq locked first.  No other
- * nesting should be done.
+ * a qp may be locked, with the cq with the lower cqn locked first.
+ * No other nesting should be done.
  *
  * Each struct mthca_cq/qp also has an ref count, protected by the
  * corresponding table lock.  The pointer from the cq/qp_table to the