Add options to have fio latency profile a device

This adds three new options:

- latency_target. This defines a specific latency target, in usec.
- latency_window. This defines the period over which fio samples.
- latency_percentile. This defines the percentage of IOs that must
  meet the criteria specified by latency_target/latency_window.

With these options set, fio will run the described workload and
vary the queue depth between 1 and the value of iodepth= to find
the best-performing queue depth that still meets the criteria
specified by the three options.

A sample job file is also added to demonstrate how to use this.
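
As a rough illustration only, a job file using the three options
could look like the sketch below. It is hypothetical and not
necessarily the file added by this commit: the numeric values, the
job name and the /dev/sdX path are placeholders, and latency_window
is assumed to be given in usec like latency_target.

```ini
# Hypothetical sketch: 4k random reads, with fio searching for a
# queue depth between 1 and 128 that stays within the latency limits.
[global]
bs=4k
rw=randread
direct=1
ioengine=libaio
# Upper bound for the queue depth search
iodepth=128
# Latency target in usec (illustrative value: 500 msec)
latency_target=500000
# Sampling window, assumed to be in usec (illustrative value: 5 sec)
latency_window=5000000
# Percentage of IOs that must meet the target within the window
latency_percentile=99.9

[device]
# Placeholder path; point this at the device to profile
filename=/dev/sdX
```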

Signed-off-by: Jens Axboe <axboe@kernel.dk>
diff --git a/cconv.c b/cconv.c
index 82383b2..dd61d10 100644
--- a/cconv.c
+++ b/cconv.c
@@ -199,6 +199,9 @@
 	o->flow_watermark = __le32_to_cpu(top->flow_watermark);
 	o->flow_sleep = le32_to_cpu(top->flow_sleep);
 	o->sync_file_range = le32_to_cpu(top->sync_file_range);
+	o->latency_target = le64_to_cpu(top->latency_target);
+	o->latency_window = le64_to_cpu(top->latency_window);
+	o->latency_percentile.u.f = fio_uint64_to_double(le64_to_cpu(top->latency_percentile.u.i));
 	o->compress_percentage = le32_to_cpu(top->compress_percentage);
 	o->compress_chunk = le32_to_cpu(top->compress_chunk);
 
@@ -348,6 +351,9 @@
 	top->flow_watermark = __cpu_to_le32(o->flow_watermark);
 	top->flow_sleep = cpu_to_le32(o->flow_sleep);
 	top->sync_file_range = cpu_to_le32(o->sync_file_range);
+	top->latency_target = __cpu_to_le64(o->latency_target);
+	top->latency_window = __cpu_to_le64(o->latency_window);
+	top->latency_percentile.u.i = __cpu_to_le64(fio_double_to_uint64(o->latency_percentile.u.f));
 	top->compress_percentage = cpu_to_le32(o->compress_percentage);
 	top->compress_chunk = cpu_to_le32(o->compress_chunk);