3a-stats: run CPU and GPU parts of cl 3a-stats in parallel

 * cl-3a-stats: the GPU and CPU parts of the calculation now run
   in parallel to improve performance. Side effect: this may add
   one frame of latency.
 * cl-framework: allow the output buffer to be reset in
   post_execute.
 * fix compile warnings.
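
The one-frame latency follows from the overlap: while the GPU works on
frame N, the CPU is still finishing frame N-1. A minimal sketch of that
pattern, not the actual cl-3a-stats code; RawStats, Stats3A, gpu_stage,
cpu_stage and run_pipeline are hypothetical names used only for
illustration:

```cpp
#include <future>
#include <vector>

// Hypothetical stand-ins for the two stages; these are not libxcam APIs.
struct RawStats { int frame_id; };
struct Stats3A  { int frame_id; };

static RawStats gpu_stage(int frame_id) { return RawStats{frame_id}; }
static Stats3A  cpu_stage(RawStats raw) { return Stats3A{raw.frame_id}; }

// For each input frame, records which frame's final stats are available
// at that point (-1 if none yet).
std::vector<int> run_pipeline(int frame_count) {
    std::vector<int> ready;
    std::future<Stats3A> pending;  // CPU work for the previous frame

    for (int frame = 0; frame < frame_count; ++frame) {
        // GPU part for the current frame; the CPU part of the previous
        // frame (if any) is still running on another thread.
        RawStats raw = gpu_stage(frame);

        // Collect the previous frame's result: one frame behind.
        ready.push_back(pending.valid() ? pending.get().frame_id : -1);

        // Start the CPU part for the current frame asynchronously.
        pending = std::async(std::launch::async, cpu_stage, raw);
    }
    return ready;
}
```

Calling run_pipeline(3) yields {-1, 0, 1}: stats for a frame only become
available while the next frame is being processed.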

Signed-off-by: Wind Yuan <feng.yuan@intel.com>
22 files changed