i965: Add an INTEL_DEBUG=submit option for printing batch statistics.

When a batch is submitted, INTEL_DEBUG=bat prints a message indicating
which part of the code triggered the flush, and some statistics about
the batch/state buffer utilization.

In debug builds it also decodes the batchbuffer, which produces so much
output that it drowns out the utilization messages if those are all you
care about.

INTEL_DEBUG=submit now prints only the utilization messages.
INTEL_DEBUG=bat continues to do both, since the message is a good
indicator that decoding of a new batch is starting.
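
The split between the two options can be sketched roughly as below. This
is only an illustration, not code from the patch: the helper names
should_print_stats() and should_decode_batch() are hypothetical, and the
bit used for DEBUG_BATCH is an assumed placeholder because its definition
is not part of this diff. Only DEBUG_SUBMIT's value comes from the
gen_debug.h hunk.

```c
#include <stdbool.h>
#include <stdint.h>

/* DEBUG_SUBMIT matches the gen_debug.h hunk in this patch.
 * DEBUG_BATCH's bit is an assumption made for this sketch only. */
#define DEBUG_BATCH   (1ull << 6)   /* assumed placeholder value */
#define DEBUG_SUBMIT  (1ull << 16)

/* Hypothetical helper: utilization statistics are printed when either
 * option is set, mirroring the combined mask test in the patch. */
static bool
should_print_stats(uint64_t debug_flags)
{
   return (debug_flags & (DEBUG_BATCH | DEBUG_SUBMIT)) != 0;
}

/* Hypothetical helper: the (much noisier) batch decode stays tied to
 * the "bat" option alone. */
static bool
should_decode_batch(uint64_t debug_flags)
{
   return (debug_flags & DEBUG_BATCH) != 0;
}
```

With only DEBUG_SUBMIT set, the first helper returns true and the second
false, which is the "statistics without decode" behavior described above.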

v2: Rename from "flush" to "submit" (suggested by Chris) because we
    might want "flush" for PIPE_CONTROL debugging someday.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
diff --git a/docs/envvars.html b/docs/envvars.html
index 17d69dc..6c2bdab 100644
--- a/docs/envvars.html
+++ b/docs/envvars.html
@@ -197,6 +197,7 @@
    <li>spill_fs - force spilling of all registers in the scalar backend (useful to debug spilling code)</li>
    <li>spill_vec4 - force spilling of all registers in the vec4 backend (useful to debug spilling code)</li>
    <li>state - emit messages about state flag tracking</li>
+   <li>submit - emit batchbuffer usage statistics</li>
    <li>sync - after sending each batch, emit a message and wait for that batch to finish rendering</li>
    <li>tcs - dump shader assembly for tessellation control shaders</li>
    <li>tes - dump shader assembly for tessellation evaluation shaders</li>
diff --git a/src/intel/common/gen_debug.c b/src/intel/common/gen_debug.c
index b604d56..4677bfd 100644
--- a/src/intel/common/gen_debug.c
+++ b/src/intel/common/gen_debug.c
@@ -57,6 +57,7 @@
    { "vert",        DEBUG_VERTS },
    { "dri",         DEBUG_DRI },
    { "sf",          DEBUG_SF },
+   { "submit",      DEBUG_SUBMIT },
    { "wm",          DEBUG_WM },
    { "urb",         DEBUG_URB },
    { "vs",          DEBUG_VS },
diff --git a/src/intel/common/gen_debug.h b/src/intel/common/gen_debug.h
index d290303..da98f85 100644
--- a/src/intel/common/gen_debug.h
+++ b/src/intel/common/gen_debug.h
@@ -57,7 +57,7 @@
 #define DEBUG_VERTS               (1ull << 13)
 #define DEBUG_DRI                 (1ull << 14)
 #define DEBUG_SF                  (1ull << 15)
-/* Hole - feel free to reuse      (1ull << 16) */
+#define DEBUG_SUBMIT              (1ull << 16)
 #define DEBUG_WM                  (1ull << 17)
 #define DEBUG_URB                 (1ull << 18)
 #define DEBUG_VS                  (1ull << 19)
diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.c b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
index 08d35ac..515b595 100644
--- a/src/mesa/drivers/dri/i965/intel_batchbuffer.c
+++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
@@ -731,7 +731,7 @@
       brw_bo_reference(brw->throttle_batch[0]);
    }
 
-   if (unlikely(INTEL_DEBUG & DEBUG_BATCH)) {
+   if (unlikely(INTEL_DEBUG & (DEBUG_BATCH | DEBUG_SUBMIT))) {
       int bytes_for_commands = 4 * USED_BATCH(brw->batch);
       int bytes_for_state = brw->batch.bo->size - brw->batch.state_batch_offset;
       int total_bytes = bytes_for_commands + bytes_for_state;