IB/uverbs: Don't serialize with ib_uverbs_idr_mutex

Currently, all userspace verbs operations that call into the kernel
are serialized by ib_uverbs_idr_mutex.  This can be a scalability
issue for some workloads, especially for devices driven by the ipath
driver, which needs to call into the kernel even for datapath
operations.

Fix this by adding reference counts to the userspace objects, and then
converting ib_uverbs_idr_mutex into a spinlock, ib_uverbs_idr_lock,
that only protects the idrs long enough to take a reference on the
object being looked up.
Because remove operations may fail, we have to do a slightly funky
two-step deletion, which is described in the comments at the top of
uverbs_cmd.c.

This still leaves ib_uverbs_idr_lock as a single lock that may be
subject to contention.  However, the lock is held only for the
duration of a single idr operation, so multiple threads should still
be able to make progress, even if ib_uverbs_idr_lock is being
ping-ponged.

Surprisingly, these changes even shrink the object code:

add/remove: 23/5 grow/shrink: 4/21 up/down: 633/-693 (-60)

Signed-off-by: Roland Dreier <rolandd@cisco.com>
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 7ced208..ee1f3a3 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -697,8 +697,12 @@
 struct ib_uobject {
 	u64			user_handle;	/* handle given to us by userspace */
 	struct ib_ucontext     *context;	/* associated user context */
+	void		       *object;		/* containing object */
 	struct list_head	list;		/* link to context's list */
 	u32			id;		/* index into kernel idr */
+	struct kref		ref;
+	struct rw_semaphore	mutex;		/* protects .live */
+	int			live;
 };
 
 struct ib_umem {