Improve precision of the io_limit setting

For async engines, we check io_limit only against completed IO. With
a high queue depth, many IOs can still be in flight when the limit is
reached, so we end up issuing more than we should. Account for bytes
at issue time as well.
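
To illustrate the overshoot, here is a small stand-alone model. It is
not fio code: simulate(), struct sim_result and the check_issued flag
are made up for this example. It assumes fixed-size IOs and that the
oldest in-flight IO completes whenever the queue is full.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model, not fio's data structures. */
struct sim_result {
	uint64_t issued_bytes;		/* bytes handed to the engine */
	uint64_t completed_bytes;	/* bytes the engine has completed */
};

/*
 * Issue IOs of 'bs' bytes until the limit check trips, keeping at most
 * 'depth' IOs in flight.
 *
 * check_issued == 0: compare completed bytes against the limit.
 * check_issued != 0: compare issued bytes against the limit.
 */
static struct sim_result simulate(uint64_t limit, uint64_t bs,
				  unsigned int depth, int check_issued)
{
	struct sim_result r = { 0, 0 };

	assert(bs > 0 && depth > 0);

	for (;;) {
		uint64_t seen = check_issued ? r.issued_bytes
					     : r.completed_bytes;

		if (seen >= limit)
			break;

		if ((r.issued_bytes - r.completed_bytes) / bs >= depth)
			r.completed_bytes += bs;	/* queue full, reap one */
		else
			r.issued_bytes += bs;		/* room left, issue one */
	}

	return r;
}
```

In this model, a 64KiB limit with 4KiB IOs and a queue depth of 8
issues 94208 bytes (92KiB) when only completions are checked, and
exactly 65536 bytes when issued bytes are checked. At a depth of 1
the two checks agree.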

Signed-off-by: Jens Axboe <axboe@fb.com>
diff --git a/ioengines.c b/ioengines.c
index 6370a56..88f67d5 100644
--- a/ioengines.c
+++ b/ioengines.c
@@ -294,8 +294,10 @@
 					sizeof(struct timeval));
 	}
 
-	if (ddir_rw(acct_ddir(io_u)))
+	if (ddir_rw(acct_ddir(io_u))) {
 		td->io_issues[acct_ddir(io_u)]++;
+		td->io_issue_bytes[acct_ddir(io_u)] += io_u->xfer_buflen;
+	}
 
 	ret = td->io_ops->queue(td, io_u);