dmaengine: Add transfer termination synchronization support

The DMAengine API has a long standing race condition that is inherent to
the API itself. Calling dmaengine_terminate_all() is supposed to stop and
abort any pending or active transfers that have previously been submitted.
Unfortunately it is possible that this operation races against a currently
running (or with some drivers also scheduled) completion callback.

Since the API allows dmaengine_terminate_all() to be called from atomic
context as well as from within a completion callback it is not possible to
synchronize to the execution of the completion callback from within
dmaengine_terminate_all() itself.

This means that a user of the DMAengine API does not know when it is safe
to free resources used in the completion callback, which can result in a
use-after-free race condition.

This patch addresses the issue by introducing an explicit synchronization
primitive to the DMAengine API called dmaengine_synchronize().

The existing dmaengine_terminate_all() is deprecated in favor of
dmaengine_terminate_sync() and dmaengine_terminate_async(). The former
aborts all pending and active transfers and synchronizes to the current
context, meaning it will wait until all running completion callbacks have
finished. This means it is only possible to call this function from
non-atomic context. The later function does not synchronize, but can still
be used in atomic context or from within a complete callback. It has to be
followed up by dmaengine_synchronize() before a client can free the
resources used in a completion callback.

In addition to this the semantics of the device_terminate_all() callback
are slightly relaxed by this patch. It is now OK for a driver to only
schedule the termination of the active transfer, but does not necessarily
have to wait until the DMA controller has completely stopped. The driver
must ensure though that the controller has stopped and no longer accesses
any memory when the device_synchronize() callback returns.

This was in part done since most drivers do not pay attention to this
anyway at the moment and to emphasize that this needs to be done when the
device_synchronize() callback is implemented. But it also helps with
implementing support for devices where stopping the controller can require
operations that may sleep.

Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
diff --git a/Documentation/dmaengine/client.txt b/Documentation/dmaengine/client.txt
index 11fb87f..d9f9f46 100644
--- a/Documentation/dmaengine/client.txt
+++ b/Documentation/dmaengine/client.txt
@@ -128,7 +128,7 @@
 	transaction.
 
 	For cyclic DMA, a callback function may wish to terminate the
-	DMA via dmaengine_terminate_all().
+	DMA via dmaengine_terminate_async().
 
 	Therefore, it is important that DMA engine drivers drop any
 	locks before calling the callback function which may cause a
@@ -166,12 +166,29 @@
 
 Further APIs:
 
-1. int dmaengine_terminate_all(struct dma_chan *chan)
+1. int dmaengine_terminate_sync(struct dma_chan *chan)
+   int dmaengine_terminate_async(struct dma_chan *chan)
+   int dmaengine_terminate_all(struct dma_chan *chan) /* DEPRECATED */
 
    This causes all activity for the DMA channel to be stopped, and may
    discard data in the DMA FIFO which hasn't been fully transferred.
    No callback functions will be called for any incomplete transfers.
 
+   Two variants of this function are available.
+
+   dmaengine_terminate_async() might not wait until the DMA has been fully
+   stopped or until any running complete callbacks have finished. But it is
+   possible to call dmaengine_terminate_async() from atomic context or from
+   within a complete callback. dmaengine_synchronize() must be called before it
+   is safe to free the memory accessed by the DMA transfer or free resources
+   accessed from within the complete callback.
+
+   dmaengine_terminate_sync() will wait for the transfer and any running
+   complete callbacks to finish before it returns. But the function must not be
+   called from atomic context or from within a complete callback.
+
+   dmaengine_terminate_all() is deprecated and should not be used in new code.
+
 2. int dmaengine_pause(struct dma_chan *chan)
 
    This pauses activity on the DMA channel without data loss.
@@ -197,3 +214,20 @@
 	a running DMA channel.  It is recommended that DMA engine users
 	pause or stop (via dmaengine_terminate_all()) the channel before
 	using this API.
+
+5. void dmaengine_synchronize(struct dma_chan *chan)
+
+  Synchronize the termination of the DMA channel to the current context.
+
+  This function should be used after dmaengine_terminate_async() to synchronize
+  the termination of the DMA channel to the current context. The function will
+  wait for the transfer and any running complete callbacks to finish before it
+  returns.
+
+  If dmaengine_terminate_async() is used to stop the DMA channel this function
+  must be called before it is safe to free memory accessed by previously
+  submitted descriptors or to free any resources accessed within the complete
+  callback of previously submitted descriptors.
+
+  The behavior of this function is undefined if dma_async_issue_pending() has
+  been called between dmaengine_terminate_async() and this function.