mmc: core: Use delayed work in clock gating framework

Current clock gating framework disables the MCI clock as soon as the
request is completed and enables it when a request arrives. This aggressive
clock gating framework, when enabled, cause following issues:

When there are back-to-back requests from the Queue layer, we unnecessarily
end up disabling and enabling the clocks between these requests since 8MCLK
clock cycles is a very short duration compared to the time delay between
back to back requests reaching the MMC layer. This overhead can effect the
overall performance depending on how long the clock enable and disable
calls take which is platform dependent. For example on some platforms we
can have clock control not on the local processor, but on a different
subsystem and the time taken to perform the clock enable/disable can add
significant overhead.

Also if the host controller driver decides to disable the host clock too
when mmc_set_ios function is called with ios.clock=0, it adds additional
delay and it is highly possible that the next request had already arrived
and unnecessarily blocked in enabling the clocks. This is seen frequently
when the processor is executing at high speeds and in multi-core platforms
thus reduces the overall throughput compared to if clock gating is
disabled.

Fix this by delaying turning off the clocks by posting request on
delayed workqueue. Also cancel the unscheduled pending work, if any,
when there is access to card.

sysfs entry is provided to tune the delay as needed, default
value set to 200ms.

Signed-off-by: Sujit Reddy Thumma <sthumma@codeaurora.org>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Chris Ball <cjb@laptop.org>
diff --git a/include/linux/mmc/host.h b/include/linux/mmc/host.h
index cea064f..706f722 100644
--- a/include/linux/mmc/host.h
+++ b/include/linux/mmc/host.h
@@ -253,10 +253,12 @@
 	int			clk_requests;	/* internal reference counter */
 	unsigned int		clk_delay;	/* number of MCI clk hold cycles */
 	bool			clk_gated;	/* clock gated */
-	struct work_struct	clk_gate_work; /* delayed clock gate */
+	struct delayed_work	clk_gate_work; /* delayed clock gate */
 	unsigned int		clk_old;	/* old clock value cache */
 	spinlock_t		clk_lock;	/* lock for clk fields */
 	struct mutex		clk_gate_mutex;	/* mutex for clock gating */
+	struct device_attribute clkgate_delay_attr;
+	unsigned long           clkgate_delay;
 #endif
 
 	/* host specific block data */