aco/ngg: Put shader query reduction operand into a VGPR.

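The p_reduce pseudo-instruction only works if the reduction operand is
in a VGPR; otherwise it gets lowered to incorrect code. When
total_vtx_per_prim is 1, prm_cnt is just gs_vtx_cnt, which may not be
in a VGPR, so copy it into one with as_vgpr.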

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7232>
diff --git a/src/amd/compiler/aco_instruction_selection.cpp b/src/amd/compiler/aco_instruction_selection.cpp
index 6717c6e..9532179 100644
--- a/src/amd/compiler/aco_instruction_selection.cpp
+++ b/src/amd/compiler/aco_instruction_selection.cpp
@@ -11342,6 +11342,8 @@
       Temp prm_cnt = gs_vtx_cnt;
       if (total_vtx_per_prim > 1)
          prm_cnt = bld.vop3(aco_opcode::v_mad_i32_i24, bld.def(v1), gs_prm_cnt, Operand(-1u * (total_vtx_per_prim - 1)), gs_vtx_cnt);
+      else
+         prm_cnt = as_vgpr(ctx, prm_cnt);
 
       /* Reduction calculates the primitive count for the entire subgroup. */
       sg_prm_cnt = bld.tmp(s1);