target: use a workqueue for I/O completions

Instead of abusing the target processing thread for offloading I/O
completion in the backends to user context, add a new workqueue. This means
completions can be processed as fast as available CPU time allows,
including in parallel with other completions and, more importantly, with I/O
submission or QUEUE FULL retries. This should give much better performance,
especially on loaded systems.

As a side effect, we can merge all the completed states into a single
one.

On the downside, this change complicates LUN reset handling a bit by
requiring us to cancel a work item only for those states that have it
initialized. The alternative would be to either always initialize the work
item to a dummy handler, or always use the same handler and do a switch on
the state. The long-term solution will be a flag that says that the command
has an initialized work item, but that's only going to be useful once we
have more users.
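
For illustration only, the "always use the same handler and do a switch on
the state" alternative could look roughly like the userspace sketch below.
It is not what this patch does. All names in it (struct work, struct cmd,
the CMD_* states, cmd_init, cmd_queue, cmd_run_pending, cmd_cancel) are
hypothetical stand-ins, not the kernel workqueue API, and the pending flag
merely models a queued work item.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-ins for illustration only. These are NOT the
 * kernel's struct work_struct, INIT_WORK() or cancel_work_sync().
 */
enum cmd_state { CMD_NEW, CMD_COMPLETE_OK, CMD_COMPLETE_FAILURE };

struct work {
	void (*func)(struct work *work);
	bool pending;		/* models "queued but not yet run" */
};

struct cmd {
	enum cmd_state state;
	struct work work;
	int ok_handled;
	int failure_handled;
};

/* Single handler shared by every state; it dispatches on cmd->state. */
static void cmd_work_handler(struct work *work)
{
	struct cmd *cmd =
		(struct cmd *)((char *)work - offsetof(struct cmd, work));

	switch (cmd->state) {
	case CMD_COMPLETE_OK:
		cmd->ok_handled++;
		break;
	case CMD_COMPLETE_FAILURE:
		cmd->failure_handled++;
		break;
	default:
		break;		/* no deferred work for other states */
	}
}

/* The work item is initialized once, when the command is set up. */
static void cmd_init(struct cmd *cmd)
{
	cmd->state = CMD_NEW;
	cmd->work.func = cmd_work_handler;
	cmd->work.pending = false;
	cmd->ok_handled = 0;
	cmd->failure_handled = 0;
}

/* Record the new state and mark the work item as queued. */
static void cmd_queue(struct cmd *cmd, enum cmd_state state)
{
	cmd->state = state;
	cmd->work.pending = true;
}

/* Stand-in for a worker thread picking up the item. */
static void cmd_run_pending(struct cmd *cmd)
{
	if (cmd->work.pending) {
		cmd->work.pending = false;
		cmd->work.func(&cmd->work);
	}
}

/*
 * Valid in any state because the work item is always initialized.
 * Returns true if a pending item was cancelled.
 */
static bool cmd_cancel(struct cmd *cmd)
{
	bool was_pending = cmd->work.pending;

	cmd->work.pending = false;
	return was_pending;
}
```

With this shape the reset path could call the cancel helper without first
checking the state, at the cost of a dispatch in the handler.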
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
diff --git a/drivers/target/target_core_tmr.c b/drivers/target/target_core_tmr.c
index 532ce31..570b144 100644
--- a/drivers/target/target_core_tmr.c
+++ b/drivers/target/target_core_tmr.c
@@ -255,6 +255,16 @@
atomic_read(&cmd->t_transport_stop),
atomic_read(&cmd->t_transport_sent));
+ /*
+ * If the command may be queued onto a workqueue cancel it now.
+ *
+ * This is equivalent to removal from the execute queue in the
+ * loop above, but we do it down here given that
+ * cancel_work_sync may block.
+ */
+ if (cmd->t_state == TRANSPORT_COMPLETE)
+ cancel_work_sync(&cmd->work);
+
spin_lock_irqsave(&cmd->t_state_lock, flags);
target_stop_task(task, &flags);