[media] mem2mem: add support for hardware buffered queue

On mem2mem decoders with a hardware bitstream ringbuffer, to drain the
buffer at the end of the stream, remaining frames might need to be decoded
from the bitstream buffer without additional input buffers being provided.
To achieve this, allow a queue to be marked as buffered by the driver, and
allow scheduling of device_runs when buffered ready queues are empty.
This also allows a driver to copy input buffers into their bitstream
ringbuffer and immediately mark them as done to be dequeued.
The motivation for this patch is hardware assisted h.264 reordering support
in the coda driver. For high profile streams, the coda can hold back
out-of-order frames, causing a few mem2mem device runs in the beginning, that
don't produce any decompressed buffer at the v4l2 capture side. At the same
time, the last few frames can be decoded from the bitstream with mem2mem device
runs that don't need a new input buffer at the v4l2 output side. The decoder
command ioctl can be used to put the decoder into the ringbuffer draining
end-of-stream mode.

Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Acked-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
Signed-off-by: Kamil Debski <k.debski@samsung.com>
Signed-off-by: Mauro Carvalho Chehab <m.chehab@samsung.com>
diff --git a/drivers/media/v4l2-core/v4l2-mem2mem.c b/drivers/media/v4l2-core/v4l2-mem2mem.c
index e96497f..89b9067 100644
--- a/drivers/media/v4l2-core/v4l2-mem2mem.c
+++ b/drivers/media/v4l2-core/v4l2-mem2mem.c
@@ -196,6 +196,10 @@
  * 2) at least one destination buffer has to be queued,
  * 3) streaming has to be on.
  *
+ * If a queue is buffered (for example a decoder hardware ringbuffer that has
+ * to be drained before doing streamoff), allow scheduling without v4l2 buffers
+ * on that queue.
+ *
  * There may also be additional, custom requirements. In such case the driver
  * should supply a custom callback (job_ready in v4l2_m2m_ops) that should
  * return 1 if the instance is ready.
@@ -224,7 +228,8 @@
 	}
 
 	spin_lock_irqsave(&m2m_ctx->out_q_ctx.rdy_spinlock, flags_out);
-	if (list_empty(&m2m_ctx->out_q_ctx.rdy_queue)) {
+	if (list_empty(&m2m_ctx->out_q_ctx.rdy_queue)
+	    && !m2m_ctx->out_q_ctx.buffered) {
 		spin_unlock_irqrestore(&m2m_ctx->out_q_ctx.rdy_spinlock,
 					flags_out);
 		spin_unlock_irqrestore(&m2m_dev->job_spinlock, flags_job);
@@ -232,7 +237,8 @@
 		return;
 	}
 	spin_lock_irqsave(&m2m_ctx->cap_q_ctx.rdy_spinlock, flags_cap);
-	if (list_empty(&m2m_ctx->cap_q_ctx.rdy_queue)) {
+	if (list_empty(&m2m_ctx->cap_q_ctx.rdy_queue)
+	    && !m2m_ctx->cap_q_ctx.buffered) {
 		spin_unlock_irqrestore(&m2m_ctx->cap_q_ctx.rdy_spinlock,
 					flags_cap);
 		spin_unlock_irqrestore(&m2m_ctx->out_q_ctx.rdy_spinlock,