f867d556dd8525fe6ff0d22a34249528e590f994 - kernel/msm-4.19

commit	f867d556dd8525fe6ff0d22a34249528e590f994
author	Christophe Leroy <christophe.leroy@c-s.fr>	Tue Sep 22 16:34:32 2015 +0200
committer	Scott Wood <oss@buserror.net>	Fri Mar 04 23:03:45 2016 -0600
tree	32ebba9cfc1b00d1f394b480d5cfab443382864e
parent	48821a34b1bdc5d89505cb814b3f7c166940f200

powerpc32: optimise csum_partial() loop

On the 8xx, load latency is 2 cycles and taking branches also takes
2 cycles. So let's unroll the loop.

This patch improves csum_partial() speed by around 10% on both:
* 8xx (single issue processor with parallel execution)
* 83xx (superscalar 6xx processor with dual instruction fetch and parallel execution)
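
The unrolling described above can be illustrated with a minimal C sketch. This is not the PowerPC assembly touched by the patch: the function names (`csum_words_simple`, `csum_words_unrolled`, `fold64`) are hypothetical, the unroll factor of four is an assumption, and a 64-bit accumulator stands in for the carry-propagating adds an assembly version would use. The idea it shows is that issuing several independent loads before the adds gives each load time to complete, and that one loop branch is paid per four words instead of per word.

```c
#include <stddef.h>
#include <stdint.h>

/* Fold a 64-bit accumulator into a 32-bit ones' complement sum
 * by adding the carries (upper half) back into the lower half. */
static uint32_t fold64(uint64_t acc)
{
    acc = (acc & 0xffffffffu) + (acc >> 32);
    acc = (acc & 0xffffffffu) + (acc >> 32);
    return (uint32_t)acc;
}

/* Reference loop: one load, one add, and one branch per word. */
static uint32_t csum_words_simple(const uint32_t *p, size_t nwords,
                                  uint32_t sum)
{
    uint64_t acc = sum;

    for (size_t i = 0; i < nwords; i++)
        acc += p[i];
    return fold64(acc);
}

/* Unrolled loop (illustrative factor of 4): the four loads are
 * independent of each other, and the loop branch is taken once
 * per four words. */
static uint32_t csum_words_unrolled(const uint32_t *p, size_t nwords,
                                    uint32_t sum)
{
    uint64_t acc = sum;
    size_t i = 0;

    for (; i + 4 <= nwords; i += 4) {
        uint32_t a = p[i];
        uint32_t b = p[i + 1];
        uint32_t c = p[i + 2];
        uint32_t d = p[i + 3];

        acc += a;
        acc += b;
        acc += c;
        acc += d;
    }
    /* Handle the 0 to 3 words left over after the unrolled part. */
    for (; i < nwords; i++)
        acc += p[i];
    return fold64(acc);
}
```

Both functions return the same value for any input; whether the unrolled form is actually faster depends on the target CPU and on what the compiler does with either loop.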

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <oss@buserror.net>

arch/powerpc/lib/checksum_32.S

1 file changed
