tty: Use unbound workqueue for all input workers

The commonly accepted wisdom that scheduling work on the same cpu
that handled interrupt i/o benefits from cache-locality is only
true if the cpu is idle (since bound kworkers are often the highest
vruntime and thus the lowest priority).

Measurements of scheduling via the unbound queue show lowered
worst-case latency responses of up to 5x over bound workqueue, without
increase in average latency or throughput.

pty i/o test measurements show >3x (!) reduced total running time; tests
previously taking ~8s now complete in <2.5s.

Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
diff --git a/drivers/tty/tty_buffer.c b/drivers/tty/tty_buffer.c
index 7cc16db..9a479e6 100644
--- a/drivers/tty/tty_buffer.c
+++ b/drivers/tty/tty_buffer.c
@@ -403,7 +403,7 @@
 	 * flush_to_ldisc() sees buffer data.
 	 */
 	smp_store_release(&buf->tail->commit, buf->tail->used);
-	schedule_work(&buf->work);
+	queue_work(system_unbound_wq, &buf->work);
 }
 EXPORT_SYMBOL(tty_schedule_flip);