Commit graph

3 commits

Author SHA1 Message Date
Eric Biggers
9f65592b7e lib/crypto: x86/poly1305: Fix performance regression on short messages
Restore the len >= 288 condition on using the AVX implementation, which
was incidentally removed by commit 318c53ae02 ("crypto: x86/poly1305 -
Add block-only interface").  This check took into account the overhead
in key power computation, kernel-mode "FPU", and tail handling
associated with the AVX code.  Indeed, restoring this check slightly
improves performance for len < 256 as measured using poly1305_kunit on
an "AMD Ryzen AI 9 365" (Zen 5) CPU:

    Length      Before       After
    ======  ==========  ==========
         1     30 MB/s     36 MB/s
        16    516 MB/s    598 MB/s
        64   1700 MB/s   1882 MB/s
       127   2265 MB/s   2651 MB/s
       128   2457 MB/s   2827 MB/s
       200   2702 MB/s   3238 MB/s
       256   3841 MB/s   3768 MB/s
       511   4580 MB/s   4585 MB/s
       512   5430 MB/s   5398 MB/s
      1024   7268 MB/s   7305 MB/s
      3173   8999 MB/s   8948 MB/s
      4096   9942 MB/s   9921 MB/s
     16384  10557 MB/s  10545 MB/s

While the optimal threshold for this CPU might be slightly lower than
288 (see the len == 256 case), other CPUs would need to be tested too,
and these sorts of benchmarks can underestimate the true cost of
kernel-mode "FPU".  Therefore, for now just restore the 288 threshold.

Fixes: 318c53ae02 ("crypto: x86/poly1305 - Add block-only interface")
Cc: stable@vger.kernel.org
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250706231100.176113-6-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-07-11 14:29:42 -07:00
Eric Biggers
16f2c30e29 lib/crypto: x86/poly1305: Fix register corruption in no-SIMD contexts
Restore the SIMD usability check and base conversion that were removed
by commit 318c53ae02 ("crypto: x86/poly1305 - Add block-only
interface").

This safety check is cheap and is well worth eliminating a footgun.
While the Poly1305 functions should not be called when SIMD registers
are unusable, if they are anyway, they should just do the right thing
instead of corrupting random tasks' registers and/or computing incorrect
MACs.  Fixing this is also needed for poly1305_kunit to pass.

Just use irq_fpu_usable() instead of the original crypto_simd_usable(),
since poly1305_kunit won't rely on crypto_simd_disabled_for_test.

Fixes: 318c53ae02 ("crypto: x86/poly1305 - Add block-only interface")
Cc: stable@vger.kernel.org
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250706231100.176113-5-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-07-11 14:29:42 -07:00
Eric Biggers
74750aa78d lib/crypto: x86: Move arch/x86/lib/crypto/ into lib/crypto/
Move the contents of arch/x86/lib/crypto/ into lib/crypto/x86/.

The new code organization makes a lot more sense for how this code
actually works and is developed.  In particular, it makes it possible to
build each algorithm as a single module, with better inlining and dead
code elimination.  For a more detailed explanation, see the patchset
which did this for the CRC library code:
https://lore.kernel.org/r/20250607200454.73587-1-ebiggers@kernel.org/.
Also see the patchset which did this for SHA-512:
https://lore.kernel.org/linux-crypto/20250616014019.415791-1-ebiggers@kernel.org/

This is just a preparatory commit, which does the move to get the files
into their new location but keeps them building the same way as before.
Later commits will make the actual improvements to the way the
arch-optimized code is integrated for each algorithm.

Add a gitignore entry for the removed directory arch/x86/lib/crypto/ so
that people don't accidentally commit leftover generated files.

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
Link: https://lore.kernel.org/r/20250619191908.134235-9-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-06-30 09:26:20 -07:00
Renamed from arch/x86/lib/crypto/poly1305_glue.c (Browse further)