Fix rate option with iodepth > 1

The rate option currently doesn't work with the libaio engine. The
existing math takes t2 (when the I/O completed) minus t1 (when the
io_u unit was created) as the time the I/O took, and derives the
bandwidth for the rate calculation from that. This is correct for the
sync engine, where only one I/O is in flight at a time, but with
libaio multiple I/Os are queued, so the same interval (t1 to t2)
overlaps other I/Os as well and the real bandwidth is higher.

I have a patch, though it is a brute-force approach: I take the total
bytes read/written, divide by the time since I/Os started to get the
bandwidth, and from that decide how long to sleep (if at all). This is
a little more heavyweight than the previous math. There are probably
simpler/cleaner solutions, but this is the current patch I have for it.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
diff --git a/init.c b/init.c
index 6ae78be..b9dee19 100644
--- a/init.c
+++ b/init.c
@@ -205,21 +205,19 @@
 static int __setup_rate(struct thread_data *td, enum fio_ddir ddir)
 {
 	unsigned int bs = td->o.min_bs[ddir];
-	unsigned long long rate;
-	unsigned long ios_per_msec;
+	unsigned long long bytes_per_sec;
 
-	if (td->o.rate[ddir]) {
-		rate = td->o.rate[ddir];
-		ios_per_msec = (rate * 1000LL) / bs;
-	} else
-		ios_per_msec = td->o.rate_iops[ddir] * 1000UL;
+	if (td->o.rate[ddir])
+		bytes_per_sec = td->o.rate[ddir];
+	else
+		bytes_per_sec = td->o.rate_iops[ddir] * bs;
 
-	if (!ios_per_msec) {
+	if (!bytes_per_sec) {
 		log_err("rate lower than supported\n");
 		return -1;
 	}
 
-	td->rate_usec_cycle[ddir] = 1000000000ULL / ios_per_msec;
+	td->rate_nsec_cycle[ddir] = 1000000000ULL / bytes_per_sec;
 	td->rate_pending_usleep[ddir] = 0;
 	return 0;
 }