As reported by a static code analyzer, the code for the ordering of
the linked list can be simplified.
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
kvfree() helper is now available, use it instead of open code it.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
At few places in caamhash and caamalg, after allocating a dmable
buffer for sg table , the buffer was being modified. As per
definition of DMA_FROM_DEVICE ,afer allocation the memory should
be treated as read-only by the driver. This patch shifts the
allocation of dmable buffer for sg table after it is populated
by the driver, making it read-only as per the DMA API's requirement.
Signed-off-by: Ruchika Gupta <ruchika.gupta@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
CAAM IP has certain 64 bit registers . 32 bit architectures cannot force
atomic-64 operations. This patch adds definition of these atomic-64
operations for little endian platforms. The definitions which existed
previously were for big endian platforms.
Signed-off-by: Ruchika Gupta <ruchika.gupta@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
For platforms with virtualization enabled
1. The job ring registers can be written to only is the job ring has been
started i.e STARTR bit in JRSTART register is 1
2. For DECO's under direct software control, with virtualization enabled
PL, BMT, ICID and SDID values need to be provided. These are provided by
selecting a Job ring in start mode whose parameters would be used for the
DECO access programming.
Signed-off-by: Ruchika Gupta <ruchika.gupta@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Some registers like SECVID, CHAVID, CHA Revision Number,
CTPR were defined as 64 bit resgisters. The IP provides
a DWT bit(Double word Transpose) to transpose the two words when
a double word register is accessed. However setting this bit
would also affect the operation of job descriptors as well as
other registers which are truly double word in nature.
So, for the IP to work correctly on big-endian as well as
little-endian SoC's, change is required to access all 32 bit
registers as 32 bit quantities.
Signed-off-by: Ruchika Gupta <ruchika.gupta@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
qat adds -I to the ccflags. Unfortunately it uses CURDIR which
breaks when make is invoked with O=. This patch replaces CURDIR
with $(src) which should work with/without O=.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This adds 4 test vectors for GHASH (of which one for chunked mode), making
a total of 5.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch introduces "by8" AES CTR mode AVX optimization inspired by
Intel Optimized IPSEC Cryptograhpic library. For additional information,
please see:
http://downloadcenter.intel.com/Detail_Desc.aspx?agr=Y&DwnldID=22972
The functions aes_ctr_enc_128_avx_by8(), aes_ctr_enc_192_avx_by8() and
aes_ctr_enc_256_avx_by8() are adapted from
Intel Optimized IPSEC Cryptographic library. When both AES and AVX features
are enabled in a platform, the glue code in AESNI module overrieds the
existing "by4" CTR mode en/decryption with the "by8"
AES CTR mode en/decryption.
On a Haswell desktop, with turbo disabled and all cpus running
at maximum frequency, the "by8" CTR mode optimization
shows better performance results across data & key sizes
as measured by tcrypt.
The average performance improvement of the "by8" version over the "by4"
version is as follows:
For 128 bit key and data sizes >= 256 bytes, there is a 10-16% improvement.
For 192 bit key and data sizes >= 256 bytes, there is a 20-22% improvement.
For 256 bit key and data sizes >= 256 bytes, there is a 20-25% improvement.
A typical run of tcrypt with AES CTR mode encryption of the "by4" and "by8"
optimization shows the following results:
tcrypt with "by4" AES CTR mode encryption optimization on a Haswell Desktop:
---------------------------------------------------------------------------
testing speed of __ctr-aes-aesni encryption
test 0 (128 bit key, 16 byte blocks): 1 operation in 343 cycles (16 bytes)
test 1 (128 bit key, 64 byte blocks): 1 operation in 336 cycles (64 bytes)
test 2 (128 bit key, 256 byte blocks): 1 operation in 491 cycles (256 bytes)
test 3 (128 bit key, 1024 byte blocks): 1 operation in 1130 cycles (1024 bytes)
test 4 (128 bit key, 8192 byte blocks): 1 operation in 7309 cycles (8192 bytes)
test 5 (192 bit key, 16 byte blocks): 1 operation in 346 cycles (16 bytes)
test 6 (192 bit key, 64 byte blocks): 1 operation in 361 cycles (64 bytes)
test 7 (192 bit key, 256 byte blocks): 1 operation in 543 cycles (256 bytes)
test 8 (192 bit key, 1024 byte blocks): 1 operation in 1321 cycles (1024 bytes)
test 9 (192 bit key, 8192 byte blocks): 1 operation in 9649 cycles (8192 bytes)
test 10 (256 bit key, 16 byte blocks): 1 operation in 369 cycles (16 bytes)
test 11 (256 bit key, 64 byte blocks): 1 operation in 366 cycles (64 bytes)
test 12 (256 bit key, 256 byte blocks): 1 operation in 595 cycles (256 bytes)
test 13 (256 bit key, 1024 byte blocks): 1 operation in 1531 cycles (1024 bytes)
test 14 (256 bit key, 8192 byte blocks): 1 operation in 10522 cycles (8192 bytes)
testing speed of __ctr-aes-aesni decryption
test 0 (128 bit key, 16 byte blocks): 1 operation in 336 cycles (16 bytes)
test 1 (128 bit key, 64 byte blocks): 1 operation in 350 cycles (64 bytes)
test 2 (128 bit key, 256 byte blocks): 1 operation in 487 cycles (256 bytes)
test 3 (128 bit key, 1024 byte blocks): 1 operation in 1129 cycles (1024 bytes)
test 4 (128 bit key, 8192 byte blocks): 1 operation in 7287 cycles (8192 bytes)
test 5 (192 bit key, 16 byte blocks): 1 operation in 350 cycles (16 bytes)
test 6 (192 bit key, 64 byte blocks): 1 operation in 359 cycles (64 bytes)
test 7 (192 bit key, 256 byte blocks): 1 operation in 635 cycles (256 bytes)
test 8 (192 bit key, 1024 byte blocks): 1 operation in 1324 cycles (1024 bytes)
test 9 (192 bit key, 8192 byte blocks): 1 operation in 9595 cycles (8192 bytes)
test 10 (256 bit key, 16 byte blocks): 1 operation in 364 cycles (16 bytes)
test 11 (256 bit key, 64 byte blocks): 1 operation in 377 cycles (64 bytes)
test 12 (256 bit key, 256 byte blocks): 1 operation in 604 cycles (256 bytes)
test 13 (256 bit key, 1024 byte blocks): 1 operation in 1527 cycles (1024 bytes)
test 14 (256 bit key, 8192 byte blocks): 1 operation in 10549 cycles (8192 bytes)
tcrypt with "by8" AES CTR mode encryption optimization on a Haswell Desktop:
---------------------------------------------------------------------------
testing speed of __ctr-aes-aesni encryption
test 0 (128 bit key, 16 byte blocks): 1 operation in 340 cycles (16 bytes)
test 1 (128 bit key, 64 byte blocks): 1 operation in 330 cycles (64 bytes)
test 2 (128 bit key, 256 byte blocks): 1 operation in 450 cycles (256 bytes)
test 3 (128 bit key, 1024 byte blocks): 1 operation in 1043 cycles (1024 bytes)
test 4 (128 bit key, 8192 byte blocks): 1 operation in 6597 cycles (8192 bytes)
test 5 (192 bit key, 16 byte blocks): 1 operation in 339 cycles (16 bytes)
test 6 (192 bit key, 64 byte blocks): 1 operation in 352 cycles (64 bytes)
test 7 (192 bit key, 256 byte blocks): 1 operation in 539 cycles (256 bytes)
test 8 (192 bit key, 1024 byte blocks): 1 operation in 1153 cycles (1024 bytes)
test 9 (192 bit key, 8192 byte blocks): 1 operation in 8458 cycles (8192 bytes)
test 10 (256 bit key, 16 byte blocks): 1 operation in 353 cycles (16 bytes)
test 11 (256 bit key, 64 byte blocks): 1 operation in 360 cycles (64 bytes)
test 12 (256 bit key, 256 byte blocks): 1 operation in 512 cycles (256 bytes)
test 13 (256 bit key, 1024 byte blocks): 1 operation in 1277 cycles (1024 bytes)
test 14 (256 bit key, 8192 byte blocks): 1 operation in 8745 cycles (8192 bytes)
testing speed of __ctr-aes-aesni decryption
test 0 (128 bit key, 16 byte blocks): 1 operation in 348 cycles (16 bytes)
test 1 (128 bit key, 64 byte blocks): 1 operation in 335 cycles (64 bytes)
test 2 (128 bit key, 256 byte blocks): 1 operation in 451 cycles (256 bytes)
test 3 (128 bit key, 1024 byte blocks): 1 operation in 1030 cycles (1024 bytes)
test 4 (128 bit key, 8192 byte blocks): 1 operation in 6611 cycles (8192 bytes)
test 5 (192 bit key, 16 byte blocks): 1 operation in 354 cycles (16 bytes)
test 6 (192 bit key, 64 byte blocks): 1 operation in 346 cycles (64 bytes)
test 7 (192 bit key, 256 byte blocks): 1 operation in 488 cycles (256 bytes)
test 8 (192 bit key, 1024 byte blocks): 1 operation in 1154 cycles (1024 bytes)
test 9 (192 bit key, 8192 byte blocks): 1 operation in 8390 cycles (8192 bytes)
test 10 (256 bit key, 16 byte blocks): 1 operation in 357 cycles (16 bytes)
test 11 (256 bit key, 64 byte blocks): 1 operation in 362 cycles (64 bytes)
test 12 (256 bit key, 256 byte blocks): 1 operation in 515 cycles (256 bytes)
test 13 (256 bit key, 1024 byte blocks): 1 operation in 1284 cycles (1024 bytes)
test 14 (256 bit key, 8192 byte blocks): 1 operation in 8681 cycles (8192 bytes)
crypto: Incorporate feed back to AES CTR mode optimization patch
Specifically, the following:
a) alignment around main loop in aes_ctrby8_avx_x86_64.S
b) .rodata around data constants used in the assembely code.
c) the use of CONFIG_AVX in the glue code.
d) fix up white space.
e) informational message for "by8" AES CTR mode optimization
f) "by8" AES CTR mode optimization can be simply enabled
if the platform supports both AES and AVX features. The
optimization works superbly on Sandybridge as well.
Testing on Haswell shows no performance change since the last.
Testing on Sandybridge shows that the "by8" AES CTR mode optimization
greatly improves performance.
tcrypt log with "by4" AES CTR mode optimization on Sandybridge
--------------------------------------------------------------
testing speed of __ctr-aes-aesni encryption
test 0 (128 bit key, 16 byte blocks): 1 operation in 383 cycles (16 bytes)
test 1 (128 bit key, 64 byte blocks): 1 operation in 408 cycles (64 bytes)
test 2 (128 bit key, 256 byte blocks): 1 operation in 707 cycles (256 bytes)
test 3 (128 bit key, 1024 byte blocks): 1 operation in 1864 cycles (1024 bytes)
test 4 (128 bit key, 8192 byte blocks): 1 operation in 12813 cycles (8192 bytes)
test 5 (192 bit key, 16 byte blocks): 1 operation in 395 cycles (16 bytes)
test 6 (192 bit key, 64 byte blocks): 1 operation in 432 cycles (64 bytes)
test 7 (192 bit key, 256 byte blocks): 1 operation in 780 cycles (256 bytes)
test 8 (192 bit key, 1024 byte blocks): 1 operation in 2132 cycles (1024 bytes)
test 9 (192 bit key, 8192 byte blocks): 1 operation in 15765 cycles (8192 bytes)
test 10 (256 bit key, 16 byte blocks): 1 operation in 416 cycles (16 bytes)
test 11 (256 bit key, 64 byte blocks): 1 operation in 438 cycles (64 bytes)
test 12 (256 bit key, 256 byte blocks): 1 operation in 842 cycles (256 bytes)
test 13 (256 bit key, 1024 byte blocks): 1 operation in 2383 cycles (1024 bytes)
test 14 (256 bit key, 8192 byte blocks): 1 operation in 16945 cycles (8192 bytes)
testing speed of __ctr-aes-aesni decryption
test 0 (128 bit key, 16 byte blocks): 1 operation in 389 cycles (16 bytes)
test 1 (128 bit key, 64 byte blocks): 1 operation in 409 cycles (64 bytes)
test 2 (128 bit key, 256 byte blocks): 1 operation in 704 cycles (256 bytes)
test 3 (128 bit key, 1024 byte blocks): 1 operation in 1865 cycles (1024 bytes)
test 4 (128 bit key, 8192 byte blocks): 1 operation in 12783 cycles (8192 bytes)
test 5 (192 bit key, 16 byte blocks): 1 operation in 409 cycles (16 bytes)
test 6 (192 bit key, 64 byte blocks): 1 operation in 434 cycles (64 bytes)
test 7 (192 bit key, 256 byte blocks): 1 operation in 792 cycles (256 bytes)
test 8 (192 bit key, 1024 byte blocks): 1 operation in 2151 cycles (1024 bytes)
test 9 (192 bit key, 8192 byte blocks): 1 operation in 15804 cycles (8192 bytes)
test 10 (256 bit key, 16 byte blocks): 1 operation in 421 cycles (16 bytes)
test 11 (256 bit key, 64 byte blocks): 1 operation in 444 cycles (64 bytes)
test 12 (256 bit key, 256 byte blocks): 1 operation in 840 cycles (256 bytes)
test 13 (256 bit key, 1024 byte blocks): 1 operation in 2394 cycles (1024 bytes)
test 14 (256 bit key, 8192 byte blocks): 1 operation in 16928 cycles (8192 bytes)
tcrypt log with "by8" AES CTR mode optimization on Sandybridge
--------------------------------------------------------------
testing speed of __ctr-aes-aesni encryption
test 0 (128 bit key, 16 byte blocks): 1 operation in 383 cycles (16 bytes)
test 1 (128 bit key, 64 byte blocks): 1 operation in 401 cycles (64 bytes)
test 2 (128 bit key, 256 byte blocks): 1 operation in 522 cycles (256 bytes)
test 3 (128 bit key, 1024 byte blocks): 1 operation in 1136 cycles (1024 bytes)
test 4 (128 bit key, 8192 byte blocks): 1 operation in 7046 cycles (8192 bytes)
test 5 (192 bit key, 16 byte blocks): 1 operation in 394 cycles (16 bytes)
test 6 (192 bit key, 64 byte blocks): 1 operation in 418 cycles (64 bytes)
test 7 (192 bit key, 256 byte blocks): 1 operation in 559 cycles (256 bytes)
test 8 (192 bit key, 1024 byte blocks): 1 operation in 1263 cycles (1024 bytes)
test 9 (192 bit key, 8192 byte blocks): 1 operation in 9072 cycles (8192 bytes)
test 10 (256 bit key, 16 byte blocks): 1 operation in 408 cycles (16 bytes)
test 11 (256 bit key, 64 byte blocks): 1 operation in 428 cycles (64 bytes)
test 12 (256 bit key, 256 byte blocks): 1 operation in 595 cycles (256 bytes)
test 13 (256 bit key, 1024 byte blocks): 1 operation in 1385 cycles (1024 bytes)
test 14 (256 bit key, 8192 byte blocks): 1 operation in 9224 cycles (8192 bytes)
testing speed of __ctr-aes-aesni decryption
test 0 (128 bit key, 16 byte blocks): 1 operation in 390 cycles (16 bytes)
test 1 (128 bit key, 64 byte blocks): 1 operation in 402 cycles (64 bytes)
test 2 (128 bit key, 256 byte blocks): 1 operation in 530 cycles (256 bytes)
test 3 (128 bit key, 1024 byte blocks): 1 operation in 1135 cycles (1024 bytes)
test 4 (128 bit key, 8192 byte blocks): 1 operation in 7079 cycles (8192 bytes)
test 5 (192 bit key, 16 byte blocks): 1 operation in 414 cycles (16 bytes)
test 6 (192 bit key, 64 byte blocks): 1 operation in 417 cycles (64 bytes)
test 7 (192 bit key, 256 byte blocks): 1 operation in 572 cycles (256 bytes)
test 8 (192 bit key, 1024 byte blocks): 1 operation in 1312 cycles (1024 bytes)
test 9 (192 bit key, 8192 byte blocks): 1 operation in 9073 cycles (8192 bytes)
test 10 (256 bit key, 16 byte blocks): 1 operation in 415 cycles (16 bytes)
test 11 (256 bit key, 64 byte blocks): 1 operation in 454 cycles (64 bytes)
test 12 (256 bit key, 256 byte blocks): 1 operation in 598 cycles (256 bytes)
test 13 (256 bit key, 1024 byte blocks): 1 operation in 1407 cycles (1024 bytes)
test 14 (256 bit key, 8192 byte blocks): 1 operation in 9288 cycles (8192 bytes)
crypto: Fix redundant checks
a) Fix the redundant check for cpu_has_aes
b) Fix the key length check when invoking the CTR mode "by8"
encryptor/decryptor.
crypto: fix typo in AES ctr mode transform
Signed-off-by: Chandramouli Narayanan <mouli@linux.intel.com>
Reviewed-by: Mathias Krause <minipli@googlemail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The FIFOST_CONT_MASK define is cut and pasted twice so we can delete the
second instance.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Kim Phillips <kim.phillips@freescale.com>
Acked-by: Marek Vasut <marex@denx.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
There's no need for the K_table to be made of 64-bit words. For some
reason, the original authors didn't fully reduce the values modulo the
CRC32C polynomial, and so had some 33-bit values in there. They can
all be reduced to 32 bits.
Doing that cuts the table size in half. Since the code depends on both
pclmulq and crc32, SSE 4.1 is obviously present, so we can use pmovzxdq
to fetch it in the correct format.
This adds (measured on Ivy Bridge) 1 cycle per main loop iteration
(CRC of up to 3K bytes), less than 0.2%. The hope is that the reduced
D-cache footprint will make up the loss in other code.
Two other related fixes:
* K_table is read-only, so belongs in .rodata, and
* There's no need for more than 8-byte alignment
Acked-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: George Spelvin <linux@horizon.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Update to makefiles etc.
Don't update the firmware/Makefile yet since there is no FW binary in
the crypto repo yet. This will be added later.
v3 - removed change to ./firmware/Makefile
Reviewed-by: Bruce W. Allan <bruce.w.allan@intel.com>
Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds DH895xCC hardware specific code.
It hooks to the common infrastructure and provides acceleration for crypto
algorithms.
Acked-by: John Griffin <john.griffin@intel.com>
Reviewed-by: Bruce W. Allan <bruce.w.allan@intel.com>
Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds acceleration engine handler part the firmware loader.
Acked-by: Bo Cui <bo.cui@intel.com>
Reviewed-by: Bruce W. Allan <bruce.w.allan@intel.com>
Signed-off-by: Karen Xiang <karen.xiang@intel.com>
Signed-off-by: Pingchaox Yang <pingchaox.yang@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds microcode part of the firmware loader.
v4 - splits FW loader part into two smaller patches.
Acked-by: Bo Cui <bo.cui@intel.com>
Reviewed-by: Bruce W. Allan <bruce.w.allan@intel.com>
Signed-off-by: Karen Xiang <karen.xiang@intel.com>
Signed-off-by: Pingchaox Yang <pingchaox.yang@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds qat crypto interface.
Acked-by: John Griffin <john.griffin@intel.com>
Reviewed-by: Bruce W. Allan <bruce.w.allan@intel.com>
Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds FW interface structure definitions.
Acked-by: John Griffin <john.griffin@intel.com>
Reviewed-by: Bruce W. Allan <bruce.w.allan@intel.com>
Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds a code that implements communication channel between the
driver and the firmware.
Acked-by: John Griffin <john.griffin@intel.com>
Reviewed-by: Bruce W. Allan <bruce.w.allan@intel.com>
Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds a common infractructure that will be used by all Intel(R)
QuickAssist Technology (QAT) devices.
v2 - added ./drivers/crypto/qat/Kconfig and ./drivers/crypto/qat/Makefile
v4 - splits common part into more, smaller patches
Acked-by: John Griffin <john.griffin@intel.com>
Reviewed-by: Bruce W. Allan <bruce.w.allan@intel.com>
Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add support for the CCP on arm64 as a platform device.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch provides the documentation of the device bindings
for the AMD Cryptographic Coprocessor driver.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Modify the PCI device support in prep for supporting the
CCP as a platform device for arm64.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The DRBG test code implements the CAVS test approach.
As discussed for the test vectors, all DRBG types are covered with
testing. However, not every backend cipher is covered with testing. To
prevent the testmgr from logging missing testing, the NULL test is
registered for all backend ciphers not covered with specific test cases.
All currently implemented DRBG types and backend ciphers are defined
in SP800-90A. Therefore, the fips_allowed flag is set for all.
Signed-off-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
All types of the DRBG (CTR, HMAC, Hash) are covered with test vectors.
In addition, all permutations of use cases of the DRBG are covered:
* with and without predition resistance
* with and without additional information string
* with and without personalization string
As the DRBG implementation is agnositc of the specific backend cipher,
only test vectors for one specific backend cipher is used. For example:
the Hash DRBG uses the same code paths irrespectively of using SHA-256
or SHA-512. Thus, the test vectors for SHA-256 cover the testing of all
DRBG code paths of SHA-512.
Signed-off-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The different DRBG types of CTR, Hash, HMAC can be enabled or disabled
at compile time. At least one DRBG type shall be selected.
The default is the HMAC DRBG as its code base is smallest.
Signed-off-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The header file includes the definition of:
* DRBG data structures with
- struct drbg_state as main structure
- struct drbg_core referencing the backend ciphers
- struct drbg_state_ops callbach handlers for specific code
supporting the Hash, HMAC, CTR DRBG implementations
- struct drbg_conc defining a linked list for input data
- struct drbg_test_data holding the test "entropy" data for CAVS
testing and testmgr.c
- struct drbg_gen allowing test data, additional information
string and personalization string data to be funneled through
the kernel crypto API -- the DRBG requires additional
parameters when invoking the reset and random number
generation requests than intended by the kernel crypto API
* wrapper function to the kernel crypto API functions using struct
drbg_gen to pass through all data needed for DRBG
* wrapper functions to kernel crypto API functions usable for testing
code to inject test_data into the DRBG as needed by CAVS testing and
testmgr.c.
* DRBG flags required for the operation of the DRBG and for selecting
the particular DRBG type and backend cipher
* getter functions for data from struct drbg_core
Signed-off-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This is a clean-room implementation of the DRBG defined in SP800-90A.
All three viable DRBGs defined in the standard are implemented:
* HMAC: This is the leanest DRBG and compiled per default
* Hash: The more complex DRBG can be enabled at compile time
* CTR: The most complex DRBG can also be enabled at compile time
The DRBG implementation offers the following:
* All three DRBG types are implemented with a derivation function.
* All DRBG types are available with and without prediction resistance.
* All SHA types of SHA-1, SHA-256, SHA-384, SHA-512 are available for
the HMAC and Hash DRBGs.
* All AES types of AES-128, AES-192 and AES-256 are available for the
CTR DRBG.
* A self test is implemented with drbg_healthcheck().
* The FIPS 140-2 continuous self test is implemented.
Signed-off-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
References to __exit functions must be wrapped with __exit_p.
Signed-off-by: Jean Delvare <jdelvare@suse.de>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Robert Jennings <rcj@linux.vnet.ibm.com>
Cc: Marcelo Henrique Cerri <mhcerri@linux.vnet.ibm.com>
Cc: Fionnuala Gunter <fin@linux.vnet.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch moves data allocated using kzalloc to managed data allocated
using devm_kzalloc and cleans now unnecessary kfrees in probe and remove
functions. Also, linux/device.h is added to make sure the devm_*()
routine declarations are unambiguously available. Earlier, in the probe
function ctrlpriv was leaked on the failure of ctrl = of_iomap(nprop, 0);
as well as on the failure of ctrlpriv->jrpdev = kzalloc(...); . These
two bugs have been fixed by the patch.
The following Coccinelle semantic patch was used for making the change:
identifier p, probefn, removefn;
@@
struct platform_driver p = {
.probe = probefn,
.remove = removefn,
};
@prb@
identifier platform.probefn, pdev;
expression e, e1, e2;
@@
probefn(struct platform_device *pdev, ...) {
<+...
- e = kzalloc(e1, e2)
+ e = devm_kzalloc(&pdev->dev, e1, e2)
...
?-kfree(e);
...+>
}
@rem depends on prb@
identifier platform.removefn;
expression e;
@@
removefn(...) {
<...
- kfree(e);
...>
}
Signed-off-by: Himangi Saraogi <himangi774@gmail.com>
Acked-by: Julia Lawall <julia.lawall@lip6.fr>
Reviewed-by: Marek Vasut <marex@denx.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
zswap allocates one LZO context per online cpu.
Using vmalloc() for small (16KB) memory areas has drawback of slowing
down /proc/vmallocinfo and /proc/meminfo reads, TLB pressure and poor
NUMA locality, as default NUMA policy at boot time is to interleave
pages :
edumazet:~# grep lzo /proc/vmallocinfo | head -4
0xffffc90006062000-0xffffc90006067000 20480 lzo_init+0x1b/0x30 pages=4 vmalloc N0=2 N1=2
0xffffc90006067000-0xffffc9000606c000 20480 lzo_init+0x1b/0x30 pages=4 vmalloc N0=2 N1=2
0xffffc9000606c000-0xffffc90006071000 20480 lzo_init+0x1b/0x30 pages=4 vmalloc N0=2 N1=2
0xffffc90006071000-0xffffc90006076000 20480 lzo_init+0x1b/0x30 pages=4 vmalloc N0=2 N1=2
This patch tries a regular kmalloc() and fallback to vmalloc in case
memory is too fragmented.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Use skcipher_givcrypt_cast(crypto_dequeue_request(queue)) instead, which
does the same thing in much cleaner way. The skcipher_givcrypt_cast()
actually uses container_of() instead of messing around with offsetof()
too.
Signed-off-by: Marek Vasut <marex@denx.de>
Reported-by: Arnd Bergmann <arnd@arndb.de>
Cc: Pantelis Antoniou <panto@antoniou-consulting.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
It makes no sense for crypto_yield() to be defined in scatterwalk.h ,
move it into algapi.h as it's an internal function to crypto API.
Signed-off-by: Marek Vasut <marex@denx.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Pull networking fixes from David Miller:
1) Fix checksumming regressions, from Tom Herbert.
2) Undo unintentional permissions changes for SCTP rto_alpha and
rto_beta sysfs knobs, from Denial Borkmann.
3) VXLAN, like other IP tunnels, should advertize it's encapsulation
size using dev->needed_headroom instead of dev->hard_header_len.
From Cong Wang.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
net: sctp: fix permissions for rto_alpha and rto_beta knobs
vxlan: Checksum fixes
net: add skb_pop_rcv_encapsulation
udp: call __skb_checksum_complete when doing full checksum
net: Fix save software checksum complete
net: Fix GSO constants to match NETIF flags
udp: ipv4: do not waste time in __udp4_lib_mcast_demux_lookup
vxlan: use dev->needed_headroom instead of dev->hard_header_len
MAINTAINERS: update cxgb4 maintainer
3.16. They are simply fixes and code refactoring for the OMAP clock
drivers. The sunxi clock driver changes include splitting out the one
mega-driver into several smaller pieces and adding support for the A31
SoC clocks.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.14 (GNU/Linux)
iQIcBAABAgAGBQJTnHqfAAoJEDqPOy9afJhJnI0P/1PvRHx7bmwNAD8b09pAVm2u
xTmhiH+zfHcRtKivKCAxFQ4FlkS3v69RB9FC+s6FIgn984K3FjkHRW2zgqe3K2h3
7tj6EoT6XJ6szK4AWDy/GVqekRF9kyADexSiYI4rIRP0rnSswvBKHZ485OR06Fs+
Jls0EMbGOEzMyB/B+pDNnTOznZOSd+lZbBznSh1zG+8QHQEzXwxPRr+G0/jxneO/
rTqUvDRqGC709YIaa+oBCH5ez/wVwrU68u/CpmrLQIPdFfaWl7YhYy/ZicwwJprE
Oi1AlQpRoBe1yYIz6oJ//+4D6b9Y/e6cqG4P37VhF6PiD9yDyN+ycEtGMqxNXjIa
OMGlairEU6V43ZrP/wDWvX6NLP7LCEqOG/PSo8zjuoZ/G1kw2jo6firRI5TVR/bY
uARHkBTUYQGjvwBU3QoLuHf+pOPAeBXfYVsi2n/b+HSueXkPQW+HdH4erktlahPh
2xkVhEDbMfCOeovOGcZhsQ8aDUIDUjZTJE7uU633DjsHY7P96OTRBHF8qirNpuOx
0GkAVOsFBU7wMt8tcO4it00i7z6PEKwqDIZBNQVq2F2DnOS9WTTcop7dmYPz95qp
8qTZIN++ROWaxok0H5SL7ER22GIJlTuGGynwPK5Aa/6v193rUW9pEZPlr7wYSf8u
RwP/J6OfN9t/rKxCsFCj
=9/Iv
-----END PGP SIGNATURE-----
Merge tag 'clk-for-linus-3.16-part2' of git://git.linaro.org/people/mike.turquette/linux
Pull more clock framework updates from Mike Turquette:
"This contains the second half the of the clk changes for 3.16.
They are simply fixes and code refactoring for the OMAP clock drivers.
The sunxi clock driver changes include splitting out the one
mega-driver into several smaller pieces and adding support for the A31
SoC clocks"
* tag 'clk-for-linus-3.16-part2' of git://git.linaro.org/people/mike.turquette/linux: (25 commits)
clk: sunxi: document PRCM clock compatible strings
clk: sunxi: add PRCM (Power/Reset/Clock Management) clks support
clk: sun6i: Protect SDRAM gating bit
clk: sun6i: Protect CPU clock
clk: sunxi: Rework clock protection code
clk: sunxi: Move the GMAC clock to a file of its own
clk: sunxi: Move the 24M oscillator to a file of its own
clk: sunxi: Remove calls to clk_put
clk: sunxi: document new A31 USB clock compatible
clk: sunxi: Implement A31 USB clock
ARM: dts: OMAP5/DRA7: use omap5-mpu-dpll-clock capable of dealing with higher frequencies
CLK: TI: dpll: support OMAP5 MPU DPLL that need special handling for higher frequencies
ARM: OMAP5+: dpll: support Duty Cycle Correction(DCC)
CLK: TI: clk-54xx: Set the rate for dpll_abe_m2x2_ck
CLK: TI: Driver for DRA7 ATL (Audio Tracking Logic)
dt:/bindings: DRA7 ATL (Audio Tracking Logic) clock bindings
ARM: dts: dra7xx-clocks: Correct name for atl clkin3 clock
CLK: TI: gate: add composite interface clock to OMAP2 only build
ARM: OMAP2: clock: add DT boot support for cpufreq_ck
CLK: TI: OMAP2: add clock init support
...
Pull NVMe update from Matthew Wilcox:
"Mostly bugfixes again for the NVMe driver. I'd like to call out the
exported tracepoint in the block layer; I believe Keith has cleared
this with Jens.
We've had a few reports from people who're really pounding on NVMe
devices at scale, hence the timeout changes (and new module
parameters), hotplug cpu deadlock, tracepoints, and minor performance
tweaks"
[ Jens hadn't seen that tracepoint thing, but is ok with it - it will
end up going away when mq conversion happens ]
* git://git.infradead.org/users/willy/linux-nvme: (22 commits)
NVMe: Fix START_STOP_UNIT Scsi->NVMe translation.
NVMe: Use Log Page constants in SCSI emulation
NVMe: Define Log Page constants
NVMe: Fix hot cpu notification dead lock
NVMe: Rename io_timeout to nvme_io_timeout
NVMe: Use last bytes of f/w rev SCSI Inquiry
NVMe: Adhere to request queue block accounting enable/disable
NVMe: Fix nvme get/put queue semantics
NVMe: Delete NVME_GET_FEAT_TEMP_THRESH
NVMe: Make admin timeout a module parameter
NVMe: Make iod bio timeout a parameter
NVMe: Prevent possible NULL pointer dereference
NVMe: Fix the buffer size passed in GetLogPage(CDW10.NUMD)
NVMe: Update data structures for NVMe 1.2
NVMe: Enable BUILD_BUG_ON checks
NVMe: Update namespace and controller identify structures to the 1.1a spec
NVMe: Flush with data support
NVMe: Configure support for block flush
NVMe: Add tracepoints
NVMe: Protect against badly formatted CQEs
...
Commit 3fd091e73b ("[SCTP]: Remove multiple levels of msecs
to jiffies conversions.") has silently changed permissions for
rto_alpha and rto_beta knobs from 0644 to 0444. The purpose of
this was to discourage users from tweaking rto_alpha and
rto_beta knobs in production environments since they are key
to correctly compute rtt/srtt.
RFC4960 under section 6.3.1. RTO Calculation says regarding
rto_alpha and rto_beta under rule C3 and C4:
[...]
C3) When a new RTT measurement R' is made, set
RTTVAR <- (1 - RTO.Beta) * RTTVAR + RTO.Beta * |SRTT - R'|
and
SRTT <- (1 - RTO.Alpha) * SRTT + RTO.Alpha * R'
Note: The value of SRTT used in the update to RTTVAR
is its value before updating SRTT itself using the
second assignment. After the computation, update
RTO <- SRTT + 4 * RTTVAR.
C4) When data is in flight and when allowed by rule C5
below, a new RTT measurement MUST be made each round
trip. Furthermore, new RTT measurements SHOULD be
made no more than once per round trip for a given
destination transport address. There are two reasons
for this recommendation: First, it appears that
measuring more frequently often does not in practice
yield any significant benefit [ALLMAN99]; second,
if measurements are made more often, then the values
of RTO.Alpha and RTO.Beta in rule C3 above should be
adjusted so that SRTT and RTTVAR still adjust to
changes at roughly the same rate (in terms of how many
round trips it takes them to reflect new values) as
they would if making only one measurement per
round-trip and using RTO.Alpha and RTO.Beta as given
in rule C3. However, the exact nature of these
adjustments remains a research issue.
[...]
While it is discouraged to adjust rto_alpha and rto_beta
and not further specified how to adjust them, the RFC also
doesn't explicitly forbid it, but rather gives a RECOMMENDED
default value (rto_alpha=3, rto_beta=2). We have a couple
of users relying on the old permissions before they got
changed. That said, if someone really has the urge to adjust
them, we could allow it with a warning in the log.
Fixes: 3fd091e73b ("[SCTP]: Remove multiple levels of msecs to jiffies conversions.")
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Herbert says:
====================
Fixes related to some recent checksum modifications.
- Fix GSO constants to match NETIF flags
- Fix logic in saving checksum complete in __skb_checksum_complete
- Call __skb_checksum_complete from UDP if we are checksumming over
whole packet in order to save checksum.
- Fixes to VXLAN to work correctly with checksum complete
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Call skb_pop_rcv_encapsulation and postpull_rcsum for the Ethernet
header to work properly with checksum complete.
Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This function is used by UDP encapsulation protocols in RX when
crossing encapsulation boundary. If ip_summed is set to
CHECKSUM_UNNECESSARY and encapsulation is not set, change to
CHECKSUM_NONE since the checksum has not been validated within the
encapsulation. Clears csum_valid by the same rationale.
Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In __udp_lib_checksum_complete check if checksum is being done over all
the data (len is equal to skb->len) and if it is call
__skb_checksum_complete instead of __skb_checksum_complete_head. This
allows checksum to be saved in checksum complete.
Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Geert reported issues regarding checksum complete and UDP.
The logic introduced in commit 7e3cead517
("net: Save software checksum complete") is not correct.
This patch:
1) Restores code in __skb_checksum_complete_header except for setting
CHECKSUM_UNNECESSARY. This function may be calculating checksum on
something less than skb->len.
2) Adds saving checksum to __skb_checksum_complete. The full packet
checksum 0..skb->len is calculated without adding in pseudo header.
This value is saved in skb->csum and then the pseudo header is added
to that to derive the checksum for validation.
3) In both __skb_checksum_complete_header and __skb_checksum_complete,
set skb->csum_valid to whether checksum of zero was computed. This
allows skb_csum_unnecessary to return true without changing to
CHECKSUM_UNNECESSARY which was done previously.
4) Copy new csum related bits in __copy_skb_header.
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Joseph Gasparakis reported that VXLAN GSO offload stopped working with
i40e device after recent UDP changes. The problem is that the
SKB_GSO_* bits are out of sync with the corresponding NETIF flags. This
patch fixes that. Also, we add BUILD_BUG_ONs in net_gso_ok for several
GSO constants that were missing to avoid the problem in the future.
Reported-by: Joseph Gasparakis <joseph.gasparakis@intel.com>
Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is just a couple of drivers (hpsa and lpfc) that got left out for further
testing in linux-next. We also have one fix to a prior submission (qla2xxx
sparse).
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQEcBAABAgAGBQJTm48MAAoJEDeqqVYsXL0M1YEH/iZyEILT4EIZxre/tspqX/LB
dxtGlmlF8AEU8/Eze3k/OB5nSuGcnYZ1hN1CgT2zZEv+sih6FekQOQV06qTwzwbo
DnWA3dOrPVgMzzSVvXFEjryroIUNhZvMy8TGu+DefE9b6FUs6B3VZlMR3A+TcSgV
cgknkG2Q6mWN8rO44pTSVlVDe2JpkvCYsHnqhO8uneQXVHNtsPpV7FfoLMLjBUDX
dgsaDiUjyrj0sdR1yOgRjDH68FPewEiEONdtKi63kkI6zWDFASiKDY9yc1eIyjVd
/1gbBJxwTRl4dWEdsigr/pOBxs6yjXGBSl/6PPDtuvdpWLFWUg4C2XtDLz0KLfU=
=tdDT
-----END PGP SIGNATURE-----
Merge tag 'scsi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull more SCSI updates from James Bottomley:
"This is just a couple of drivers (hpsa and lpfc) that got left out for
further testing in linux-next. We also have one fix to a prior
submission (qla2xxx sparse)"
* tag 'scsi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (36 commits)
qla2xxx: fix sparse warnings introduced by previous target mode t10-dif patch
lpfc: Update lpfc version to driver version 10.2.8001.0
lpfc: Fix ExpressLane priority setup
lpfc: mark old devices as obsolete
lpfc: Fix for initializing RRQ bitmap
lpfc: Fix for cleaning up stale ring flag and sp_queue_event entries
lpfc: Update lpfc version to driver version 10.2.8000.0
lpfc: Update Copyright on changed files from 8.3.45 patches
lpfc: Update Copyright on changed files
lpfc: Fixed locking for scsi task management commands
lpfc: Convert runtime references to old xlane cfg param to fof cfg param
lpfc: Fix FW dump using sysfs
lpfc: Fix SLI4 s abort loop to process all FCP rings and under ring_lock
lpfc: Fixed kernel panic in lpfc_abort_handler
lpfc: Fix locking for postbufq when freeing
lpfc: Fix locking for lpfc_hba_down_post
lpfc: Fix dynamic transitions of FirstBurst from on to off
hpsa: fix handling of hpsa_volume_offline return value
hpsa: return -ENOMEM not -1 on kzalloc failure in hpsa_get_device_id
hpsa: remove messages about volume status VPD inquiry page not supported
...
Pull more btrfs updates from Chris Mason:
"This has a few fixes since our last pull and a new ioctl for doing
btree searches from userland. It's very similar to the existing
ioctl, but lets us return larger items back down to the app"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
btrfs: fix error handling in create_pending_snapshot
btrfs: fix use of uninit "ret" in end_extent_writepage()
btrfs: free ulist in qgroup_shared_accounting() error path
Btrfs: fix qgroups sanity test crash or hang
btrfs: prevent RCU warning when dereferencing radix tree slot
Btrfs: fix unfinished readahead thread for raid5/6 degraded mounting
btrfs: new ioctl TREE_SEARCH_V2
btrfs: tree_search, search_ioctl: direct copy to userspace
btrfs: new function read_extent_buffer_to_user
btrfs: tree_search, copy_to_sk: return needed size on EOVERFLOW
btrfs: tree_search, copy_to_sk: return EOVERFLOW for too small buffer
btrfs: tree_search, search_ioctl: accept varying buffer
btrfs: tree_search: eliminate redundant nr_items check
Pull aio fix and cleanups from Ben LaHaise:
"This consists of a couple of code cleanups plus a minor bug fix"
* git://git.kvack.org/~bcrl/aio-next:
aio: cleanup: flatten kill_ioctx()
aio: report error from io_destroy() when threads race in io_destroy()
fs/aio.c: Remove ctx parameter in kiocb_cancel