Reuse filled pattern

I changed fio to re-use the already populated io_u buffer during writes
when a non-random verify pattern is in use, so that only the header is
re-calculated for every I/O. The buffer is populated once at the
beginning and is re-used as long as the subsequent I/Os on the same
io_u structure are writes with the same or a smaller block size. If a
subsequent I/O is a read or has a block size greater than the
pre-filled length, the buffer is invalidated and re-filled at the next
write.
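
A minimal standalone sketch of the re-use rule described above, not the
patch itself. The buf_filled and buf_filled_len fields mirror the ones
added to struct io_u; struct buf_state, fill_pattern_cached() and
invalidate_fill() are hypothetical names used only for illustration,
and the sketch assumes pattern_bytes is greater than zero.

```c
#include <string.h>

/* Hypothetical stand-in for the two fields the patch adds to struct io_u. */
struct buf_state {
	int buf_filled;              /* non-zero once the pattern is in place */
	unsigned int buf_filled_len; /* bytes covered by the last full fill */
};

/*
 * Fill buf with a repeating pattern unless an earlier fill already
 * covers at least len bytes. Returns 1 if the buffer was (re)filled,
 * 0 if the existing contents were re-used.
 * Assumption: pattern_bytes > 0.
 */
static int fill_pattern_cached(struct buf_state *st, unsigned char *buf,
			       unsigned int len, const unsigned char *pattern,
			       unsigned int pattern_bytes)
{
	unsigned int i = 0;

	/* Same or smaller block size than the last fill: nothing to do. */
	if (st->buf_filled && st->buf_filled_len >= len)
		return 0;

	while (i < len) {
		unsigned int size = pattern_bytes;

		/* Truncate the final copy so we never write past len. */
		if (size > len - i)
			size = len - i;
		memcpy(buf + i, pattern, size);
		i += size;
	}

	st->buf_filled = 1;
	st->buf_filled_len = len;
	return 1;
}

/*
 * Hypothetical invalidation hook: call this when the buffer is used
 * for a read, so that the next write re-fills it.
 */
static void invalidate_fill(struct buf_state *st)
{
	st->buf_filled = 0;
	st->buf_filled_len = 0;
}
```

In this sketch a write with a larger block size needs no explicit
invalidation: the length check fails and the buffer is simply re-filled
to the new length.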

Reason for this risky change: (Performance)
I tested this change on a tmpfs (with no swap backing), with the
following config file:
[sscan_write]
filename=/mytmpfs/datafile.tmp
rw=write
bs=64k
size=3G
ioengine=libaio
iodepth=1024
iodepth_low=512
runtime=10800
bwavgtime=5000
thread=1
do_verify=0
verify=meta
verify_pattern=0x55aaa55a
verify_interval=4k
continue_on_error=1

fio-1-41-6 gave 306MB/s; with this change the same job ran at 1546MB/s.

Side effects/Risks:
There is a risk with this change: if the buffer gets corrupted, all
subsequent writes that re-use it will be corrupt as well. For both
sequential writes and random writes (with verify, where the I/O log is
replayed) we should still be able to find the first I/O where the
corruption started, and if the buffer is getting corrupted, that points
to other issues anyway.

Testing:
I have tested this change with sequential writes (with verify) and with
a random read/write mix (with verify).

I think I have taken care of most of the cases, but please let me know
if there is anything I have missed. I have attached the patch to this
email. I think the performance improvement outweighs the risk
associated with the change, but I will let you decide whether you would
like to pick it up.

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
diff --git a/verify.c b/verify.c
index 265bd55..73c1262 100644
--- a/verify.c
+++ b/verify.c
@@ -22,7 +22,7 @@
 #include "crc/sha512.h"
 #include "crc/sha1.h"
 
-static void fill_pattern(struct thread_data *td, void *p, unsigned int len)
+void fill_pattern(struct thread_data *td, void *p, unsigned int len, struct io_u *io_u)
 {
 	switch (td->o.verify_pattern_bytes) {
 	case 0:
@@ -30,13 +30,24 @@
 		fill_random_buf(p, len);
 		break;
 	case 1:
+		if (io_u->buf_filled && io_u->buf_filled_len >= len) {
+			dprint(FD_VERIFY, "using already filled verify pattern b=0 len=%u\n", len);
+			return;
+		}
 		dprint(FD_VERIFY, "fill verify pattern b=0 len=%u\n", len);
 		memset(p, td->o.verify_pattern[0], len);
+		io_u->buf_filled = 1;
+		io_u->buf_filled_len = len;
 		break;
 	default: {
 		unsigned int i = 0, size = 0;
 		unsigned char *b = p;
 
+		if (io_u->buf_filled && io_u->buf_filled_len >= len) {
+			dprint(FD_VERIFY, "using already filled verify pattern b=%d len=%u\n",
+					td->o.verify_pattern_bytes, len);
+			return;
+		}
 		dprint(FD_VERIFY, "fill verify pattern b=%d len=%u\n",
 					td->o.verify_pattern_bytes, len);
 
@@ -47,6 +58,8 @@
 			memcpy(b+i, td->o.verify_pattern, size);
 			i += size;
 		}
+		io_u->buf_filled = 1;
+		io_u->buf_filled_len = len;
 		break;
 		}
 	}
@@ -675,7 +688,7 @@
 	if (td->o.verify == VERIFY_NULL)
 		return;
 
-	fill_pattern(td, p, io_u->buflen);
+	fill_pattern(td, p, io_u->buflen, io_u);
 
 	hdr_inc = io_u->buflen;
 	if (td->o.verify_interval)