[ Upstream commit a5e9f557098e54af44ade5d501379be18435bfbf ]
In commit 9f480faec5 ("crypto: chacha20 - Fix keystream alignment for
chacha20_block()"), I had missed that chacha20_block() can be called
directly on the buffer passed to get_random_bytes(), which can have any
alignment. So, while my commit didn't break anything, it didn't fully
solve the alignment problems.
Revert my solution and just update chacha20_block() to use
put_unaligned_le32(), so the output buffer need not be aligned.
This is simpler, and on many CPUs it's the same speed.
But, I kept the 'tmp' buffers in extract_crng_user() and
_get_random_bytes() 4-byte aligned, since that alignment is actually
needed for _crng_backtrack_protect() too.
Reported-by: Stephan Müller <smueller@chronox.de>
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
When chacha20_block() outputs the keystream block, it uses 'u32' stores
directly. However, the callers (crypto/chacha20_generic.c and
drivers/char/random.c) declare the keystream buffer as a 'u8' array,
which is not guaranteed to have the needed alignment.
Fix it by having both callers declare the keystream as a 'u32' array.
For now this is preferable to switching over to the unaligned access
macros because chacha20_block() is only being used in cases where we can
easily control the alignment (stack buffers).
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Now that crypto_chacha20_setkey() and crypto_chacha20_init() use the
unaligned access macros and crypto_xor() also accepts unaligned buffers,
there is no need to have a cra_alignmask set for chacha20-generic.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The generic ChaCha20 implementation has a cra_alignmask of 3, which
ensures that the key passed into crypto_chacha20_setkey() and the IV
passed into crypto_chacha20_init() are 4-byte aligned. However, these
functions are also called from the ARM and ARM64 implementations of
ChaCha20, which intentionally do not have a cra_alignmask set. This is
broken because 32-bit words are being loaded from potentially-unaligned
buffers without the unaligned access macros.
Fix it by using the unaligned access macros when loading the key and IV.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The four 32-bit constants for the initial state of ChaCha20 were loaded
from a char array which is not guaranteed to have the needed alignment.
Fix it by just assigning the constants directly instead.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Commit 9ae433bc79 ("crypto: chacha20 - convert generic and x86 versions
to skcipher") ported the existing chacha20 code to use the new skcipher
API, and introduced a bug along the way. Unfortunately, the tcrypt tests
did not catch the error, and it was only found recently by Tobias.
Stefan kindly diagnosed the error, and proposed a fix which is similar
to the one below, with the exception that 'walk.stride' is used rather
than the hardcoded block size. This does not actually matter in this
case, but it's a better example of how to use the skcipher walk API.
Fixes: 9ae433bc79 ("crypto: chacha20 - convert generic and x86 ...")
Cc: <stable@vger.kernel.org> # v4.11+
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Reported-by: Tobias Brunner <tobias@strongswan.org>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This converts the ChaCha20 code from a blkcipher to a skcipher, which
is now the preferred way to implement symmetric block and stream ciphers.
This ports the generic and x86 versions at the same time because the
latter reuses routines of the former.
Note that the skcipher_walk() API guarantees that all presented blocks
except the final one are a multiple of the chunk size, so we can simplify
the encrypt() routine somewhat.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
As architecture specific drivers need a software fallback, export a
ChaCha20 en-/decryption function together with some helpers in a header
file.
Signed-off-by: Martin Willi <martin@strongswan.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
ChaCha20 is a high speed 256-bit key size stream cipher algorithm designed by
Daniel J. Bernstein. It is further specified in RFC7539 for use in IETF
protocols as a building block for the ChaCha20-Poly1305 AEAD.
This is a portable C implementation without any architecture specific
optimizations. It uses a 16-byte IV, which includes the 12-byte ChaCha20 nonce
prepended by the initial block counter. Some algorithms require an explicit
counter value, for example the mentioned AEAD construction.
Signed-off-by: Martin Willi <martin@strongswan.org>
Acked-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>