intel: move intel_pipeline_rmap creation to the compiler

Move rmap_*() from the driver to the compiler.  Other than some casting fixes
(C++ requires explicit casts from void *), the only functional difference
should be that the render target (RT) count now comes from intel_ir instead of
XGL_PIPELINE_CB_ATTACHMENT_STATE.  The change can be seen in
intel_pipeline_shader_compile().

On the driver side, a generic pipeline_build_shader() is added to replace
pipeline_build_<stage>().  pipeline_tear_shader() is also removed in favor of
intel_pipeline_shader_cleanup() provided by the compiler.
diff --git a/icd/intel/pipeline.h b/icd/intel/pipeline.h
index 821984b..dd08e52 100644
--- a/icd/intel/pipeline.h
+++ b/icd/intel/pipeline.h
@@ -123,8 +123,9 @@
 
     XGL_GPU_SIZE per_thread_scratch_size;
 
-    /* these are set up by the driver */
     struct intel_pipeline_rmap *rmap;
+
+    /* these are set up by the driver */
     XGL_UINT max_threads;
     XGL_GPU_SIZE scratch_offset;
 };