57bf4c2028cffe24ffb55b96592f7e33aa18f1ce - platform/external/mesa3d

commit	57bf4c2028cffe24ffb55b96592f7e33aa18f1ce
author	Marek Olšák <marek.olsak@amd.com>	Fri Sep 04 05:55:25 2020 -0400
committer	Marge Bot <eric+marge@anholt.net>	Wed Sep 16 02:39:02 2020 +0000
tree	4d6f7fa77cf0f0e81c801e24a1703d86d1721a53
parent	a3512ddfdf7ff1dff0920568102bfaef99ab498e

nir,radeonsi: move ffma fusing to late optimizations for better codegen

The freedreno trace changes were suggested by Rob Clark.

ALU performance is higher because ffma is used more often, but register
usage is higher too, because ternary opcodes (such as ffma) usually need
at least 3 live registers.
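
As a rough illustration of the trade-off described above, the following toy
sketch fuses fadd(fmul(a, b), c) into a single ffma(a, b, c). It is not
Mesa's NIR pass: the tuple-based IR and the helper names (fuse_ffma,
count_ops, max_sources) are hypothetical and exist only for this example.

```python
# Toy IR (an assumption for illustration, not NIR): an expression is either
# a variable name (str) or a tuple of the form (opcode, source, source, ...).

def fuse_ffma(expr):
    """Rewrite fadd(fmul(a, b), c) into ffma(a, b, c), working bottom-up."""
    if isinstance(expr, str):
        return expr
    op, *srcs = expr
    srcs = [fuse_ffma(s) for s in srcs]
    if op == "fadd":
        for i, s in enumerate(srcs):
            if isinstance(s, tuple) and s[0] == "fmul":
                other = srcs[1 - i]  # the addend that is not the multiply
                return ("ffma", s[1], s[2], other)
    return (op, *srcs)

def count_ops(expr):
    """Number of ALU instructions in the expression tree."""
    if isinstance(expr, str):
        return 0
    return 1 + sum(count_ops(s) for s in expr[1:])

def max_sources(expr):
    """Largest number of source operands read by any single instruction."""
    if isinstance(expr, str):
        return 0
    return max([len(expr) - 1] + [max_sources(s) for s in expr[1:]])

before = ("fadd", ("fmul", "a", "b"), "c")
after = fuse_ffma(before)

print(after)
print(count_ops(before), "->", count_ops(after), "instructions")
print(max_sources(before), "->", max_sources(after), "sources per instruction")
```

In this sketch the fused form needs one instruction instead of two, but that
one instruction reads three sources at once, which is where the extra
register pressure comes from.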

54793 shaders in 33659 tests
Totals:
SGPRS: 2639746 -> 2642938 (0.12 %)
VGPRS: 1534120 -> 1536392 (0.15 %)
Spilled SGPRs: 3541 -> 3618 (2.17 %)
Spilled VGPRs: 33 -> 44 (33.33 %)
Scratch size: 292 -> 312 (6.85 %) dwords per thread
Code Size: 55639836 -> 55620116 (-0.04 %) bytes
Max Waves: 964785 -> 963977 (-0.08 %)

Totals from affected shaders:
SGPRS: 1105800 -> 1108992 (0.29 %)
VGPRS: 635292 -> 637564 (0.36 %)
Spilled SGPRs: 3193 -> 3270 (2.41 %)
Spilled VGPRs: 33 -> 44 (33.33 %)
Scratch size: 36 -> 56 (55.56 %) dwords per thread
Code Size: 31568708 -> 31548988 (-0.06 %) bytes
Max Waves: 319991 -> 319183 (-0.25 %)
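
The percentages in both tables are relative changes against the "before"
value. A minimal sketch that recomputes a few of them from the numbers above
(the helper name pct_change is hypothetical, and rounding to two decimals is
an assumption about how the report was formatted):

```python
def pct_change(before, after):
    """Relative change from before to after, in percent, rounded to 2 places."""
    return round((after - before) / before * 100, 2)

# Values copied from the totals listed above.
print("Spilled VGPRs:", pct_change(33, 44))
print("Scratch size (all shaders):", pct_change(292, 312))
print("Scratch size (affected shaders):", pct_change(36, 56))
print("Max Waves (all shaders):", pct_change(964785, 963977))
```

The same absolute scratch increase of 20 dwords shows up as a much larger
percentage in the affected-shader totals because the baseline there is
smaller.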

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6596>

4 files changed
