BACKPORT: crypto: chacha20 - Fix chacha20_block() keystream alignment (again)

In commit 9f480faec58c ("crypto: chacha20 - Fix keystream alignment for
chacha20_block()"), I had missed that chacha20_block() can be called
directly on the buffer passed to get_random_bytes(), which can have any
alignment.  So, while my commit didn't break anything, it didn't fully
solve the alignment problems.

Revert my solution and just update chacha20_block() to use
put_unaligned_le32(), so the output buffer need not be aligned.
This is simpler, and on many CPUs it's the same speed.

But, I kept the 'tmp' buffers in extract_crng_user() and
_get_random_bytes() 4-byte aligned, since that alignment is actually
needed for _crng_backtrack_protect() too.

Reported-by: Stephan Müller <smueller@chronox.de>
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

(cherry picked from commit a5e9f557098e54af44ade5d501379be18435bfbf)

Conflicts:
	drivers/char/random.c

Bug: 112008522
Test: As series, see Ic61c13b53facfd2173065be715a7ee5f3af8760b
Change-Id: Ic355d2416330ae2f4a50cb7064633810e35a93bf
Signed-off-by: Eric Biggers <ebiggers@google.com>
4 files changed