Avoid scaling by alpha when unnecessary

~15% improvement for S32_alpha_D32_filter_DX on Skylake-X.
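
A minimal scalar sketch of the idea, not the actual SIMD code in this change. It assumes the alpha scale lies in [0, 256], with 256 meaning fully opaque, so the per-pixel multiply is an identity and can be skipped. The names scale_pixel and write_span are hypothetical and exist only for illustration.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: scale each 8-bit channel of a packed 32-bit pixel
// by scale/256, where scale is assumed to be in [0, 256].
static inline uint32_t scale_pixel(uint32_t c, unsigned scale) {
    const uint32_t mask = 0x00FF00FFu;
    uint32_t rb = ((c & mask) * scale) >> 8;   // channels in bits 0-7 and 16-23
    uint32_t ag = ((c >> 8) & mask) * scale;   // channels in bits 8-15 and 24-31
    return (rb & mask) | (ag & ~mask);
}

// Hypothetical span writer: 'filtered' stands in for already-filtered pixels.
// When alphaScale is 256 the multiply changes nothing, so it is skipped.
static void write_span(uint32_t* dst, const uint32_t* filtered,
                       size_t count, unsigned alphaScale) {
    if (alphaScale >= 256) {
        // Opaque paint: copy the filtered result as-is.
        for (size_t i = 0; i < count; ++i) {
            dst[i] = filtered[i];
        }
    } else {
        // Translucent paint: apply the alpha scale per pixel.
        for (size_t i = 0; i < count; ++i) {
            dst[i] = scale_pixel(filtered[i], alphaScale);
        }
    }
}
```

Hoisting the check out of the loop keeps the opaque path free of the extra multiplies, which is presumably where the gain on the opaque (FF) benchmark comes from.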

nanobench results on 7900X (fixed frequency @ 3.2GHz)
                                before    after
bitmaprect_FF_filter_trans      524µs     453µs

Change-Id: I1c0c05915ecd3dc6f59da5eb49b5ae1c6cd98814
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/288436
Reviewed-by: Mike Klein <mtklein@google.com>
Commit-Queue: Mike Klein <mtklein@google.com>
1 file changed