Make the dprint() processing out-of-line

Instead of inlining the whole macro at every call site, inline
only the mask check and move the rest out-of-line. This reduces
the size of fio by 4% here, and speeds it up.
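
A minimal sketch of the pattern, for illustration only. The names
debug_mask, debug_print, debug_print_slow and slow_path_calls are
hypothetical and not taken from the fio sources, and output goes to
stdout rather than fio's own log streams:

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical names throughout; this is not the fio implementation. */
static unsigned long debug_mask;	/* one bit per debug category */
static int slow_path_calls;		/* counts entries into the slow path */

/* Out-of-line slow path: formatting and output exist once in the binary. */
void debug_print_slow(int type, const char *fmt, ...)
{
	va_list args;

	assert(fmt != NULL);
	slow_path_calls++;

	fprintf(stdout, "%-8d ", type);
	va_start(args, fmt);
	vfprintf(stdout, fmt, args);
	va_end(args);
}

/*
 * Inlined fast path: each call site expands to a single mask test
 * plus a function call, instead of the full printing logic.
 */
#define debug_print(type, ...)					\
	do {							\
		if (debug_mask & (1UL << (type)))		\
			debug_print_slow((type), __VA_ARGS__);	\
	} while (0)
```

With debug_mask left at 0, a call such as debug_print(3, "x=%d\n", 1)
costs only the mask test; setting bit 3 in debug_mask routes the same
call through debug_print_slow().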

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
diff --git a/log.h b/log.h
index 12c9a55..5ca37b3 100644
--- a/log.h
+++ b/log.h
@@ -1,6 +1,8 @@
 #ifndef FIO_LOG_H
 #define FIO_LOG_H
 
+#include <stdio.h>
+
 extern FILE *f_out;
 extern FILE *f_err;