linux/lib/math
Kuan-Wei Chiu b3d5fd6f82 lib/math/gcd: use static key to select implementation at runtime
Patch series "Optimize GCD performance on RISC-V by selecting
implementation at runtime", v3.

The current implementation of gcd() selects between the binary GCD and the
odd-even GCD algorithm at compile time, depending on whether
CONFIG_CPU_NO_EFFICIENT_FFS is set.  On platforms like RISC-V, however,
this compile-time decision can be wrong: even when the compiler emits
ctz instructions based on the assumption that they are efficient (as is
the case when CONFIG_RISCV_ISA_ZBB is enabled), the actual hardware may
lack support for the Zbb extension.  In such cases, ffs() falls back to a
software implementation at runtime, making the binary GCD algorithm
significantly slower than the odd-even variant.
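
To make the difference between the two algorithms concrete, here is a
userspace sketch of both variants.  It is an illustration under stated
assumptions, not the code in gcd.c: the names gcd_ctz() and gcd_odd_even()
are hypothetical, and __builtin_ctzl() (a GCC/Clang builtin) stands in for
the kernel's ffs helper.

```c
/*
 * Binary GCD: strips trailing zero bits with count-trailing-zeros.
 * Fast when ctz is a single instruction, slow when it is emulated.
 * __builtin_ctzl() is an assumption standing in for the kernel helper.
 */
static unsigned long gcd_ctz(unsigned long a, unsigned long b)
{
	unsigned long r = a | b;
	unsigned long t;
	int shift;

	if (!a || !b)
		return r;

	shift = __builtin_ctzl(r);	/* common power of two */
	a >>= __builtin_ctzl(a);	/* a is odd from here on */
	for (;;) {
		b >>= __builtin_ctzl(b);	/* make b odd */
		if (a > b) {
			t = a; a = b; b = t;
		}
		b -= a;			/* odd - odd = even (or zero) */
		if (!b)
			return a << shift;
	}
}

/*
 * Odd-even GCD: needs no ctz at all.  Both values are kept as odd
 * multiples of r, the lowest set bit of (a | b), and are reduced with
 * single-bit shifts.
 */
static unsigned long gcd_odd_even(unsigned long a, unsigned long b)
{
	unsigned long r = a | b;
	unsigned long t;

	if (!a || !b)
		return r;

	r &= -r;			/* isolate lowest set bit */
	while (!(b & r))
		b >>= 1;
	if (b == r)
		return r;

	for (;;) {
		while (!(a & r))
			a >>= 1;
		if (a == r)
			return r;
		if (a == b)
			return a;
		if (a < b) {
			t = a; a = b; b = t;
		}
		a -= b;			/* even multiple of r */
		a >>= 1;
		if (a & r)		/* odd multiple: adding b makes it even */
			a += b;
		a >>= 1;
	}
}
```

Both functions return the same result for every input; they differ only
in how they strip factors of two, which is where the cost of an emulated
ctz shows up.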

To address this, we introduce a static key to allow runtime selection
between the binary and odd-even GCD implementations.  On RISC-V, the
kernel now checks for Zbb support during boot.  If Zbb is unavailable, the
static key is disabled so that gcd() consistently uses the more efficient
odd-even algorithm in that scenario.  Additionally, to further reduce code
size, we select CONFIG_CPU_NO_EFFICIENT_FFS automatically when
CONFIG_RISCV_ISA_ZBB is not enabled, avoiding compilation of the unused
binary GCD implementation entirely on systems where it would never be
executed.
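
The runtime selection described above can be sketched in userspace as
follows.  This is only an analogy under stated assumptions: a plain bool
stands in for the static key (the kernel patches the branch in place
rather than testing a variable), and gcd_select_impl(), gcd_runtime(),
gcd_with_ctz() and gcd_without_ctz() are hypothetical names, with has_zbb
assumed to come from whatever CPU feature probing the platform provides.

```c
#include <stdbool.h>

/* Stand-in for the static key; defaults to "ffs is efficient". */
static bool efficient_ffs = true;

/* Hypothetical boot-time hook: turn the fast path off if Zbb is absent. */
static void gcd_select_impl(bool has_zbb)
{
	if (!has_zbb)
		efficient_ffs = false;
}

/* Binary GCD using a ctz builtin (GCC/Clang; an assumption here). */
static unsigned long gcd_with_ctz(unsigned long a, unsigned long b)
{
	unsigned long r = a | b, t;
	int shift;

	if (!a || !b)
		return r;
	shift = __builtin_ctzl(r);
	a >>= __builtin_ctzl(a);
	for (;;) {
		b >>= __builtin_ctzl(b);
		if (a > b) {
			t = a; a = b; b = t;
		}
		b -= a;
		if (!b)
			return a << shift;
	}
}

/* Same reduction, but factors of two are stripped one bit at a time. */
static unsigned long gcd_without_ctz(unsigned long a, unsigned long b)
{
	unsigned long r = a | b, t;
	unsigned int shift = 0;

	if (!a || !b)
		return r;
	while (!((a | b) & 1)) {	/* remove common factors of two */
		a >>= 1;
		b >>= 1;
		shift++;
	}
	while (!(a & 1))
		a >>= 1;
	for (;;) {
		while (!(b & 1))
			b >>= 1;
		if (a > b) {
			t = a; a = b; b = t;
		}
		b -= a;
		if (!b)
			return a << shift;
	}
}

/* Single entry point; the flag decides which implementation runs. */
static unsigned long gcd_runtime(unsigned long a, unsigned long b)
{
	return efficient_ffs ? gcd_with_ctz(a, b) : gcd_without_ctz(a, b);
}
```

Calling gcd_select_impl(false) once at startup routes every later
gcd_runtime() call to the variant without ctz; results are identical
either way, only the cost differs.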

This series ensures that the GCD algorithm best suited to the actual
hardware is used in practice, and avoids compiling code that the kernel
configuration makes unreachable.


This patch (of 3):

On platforms like RISC-V, the compiler may generate hardware FFS
instructions even if the underlying CPU does not actually support them.
Currently, the GCD implementation is chosen at compile time based on
CONFIG_CPU_NO_EFFICIENT_FFS, which can result in suboptimal behavior on
such systems.

Introduce a static key, efficient_ffs_key, to enable runtime selection
between the binary GCD (using ffs) and the odd-even GCD implementation.
This allows the kernel to default to the faster binary GCD when FFS is
efficient, while retaining the ability to fall back when needed.

Link: https://lkml.kernel.org/r/20250606134758.1308400-1-visitorckw@gmail.com
Link: https://lkml.kernel.org/r/20250606134758.1308400-2-visitorckw@gmail.com
Co-developed-by: Yu-Chun Lin <eleanor15x@gmail.com>
Signed-off-by: Yu-Chun Lin <eleanor15x@gmail.com>
Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Ching-Chun (Jim) Huang <jserv@ccns.ncku.edu.tw>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-19 19:08:28 -07:00
Name                        Last commit date            Last commit message
tests                       2025-02-12 14:00:11 -08:00  lib/prime_numbers: convert self-test to KUnit
cordic.c
div64.c                     2025-07-09 22:57:53 -07:00  mul_u64_u64_div_u64: fix the division-by-zero behavior
gcd.c                       2025-07-19 19:08:28 -07:00  lib/math/gcd: use static key to select implementation at runtime
int_log.c                   2023-07-09 22:47:50 +01:00  lib/math/int_log: Replace LGPL-2.1-or-later boilerplate with SPDX identifier
int_pow.c                   2020-12-15 22:46:15 -08:00  kernel.h: split out mathematical helpers
int_sqrt.c                  2020-12-15 22:46:15 -08:00  kernel.h: split out mathematical helpers
Kconfig                     2021-09-08 11:50:26 -07:00  math: make RATIONAL tristate
lcm.c
Makefile                    2025-02-10 18:24:57 -08:00  lib: math: Move KUnit tests into tests/ subdir
prime_numbers.c             2025-02-12 14:00:11 -08:00  lib/prime_numbers: convert self-test to KUnit
prime_numbers_private.h     2025-02-12 14:00:11 -08:00  lib/prime_numbers: convert self-test to KUnit
rational.c                  2024-07-04 23:43:11 -07:00  math: rational: add missing MODULE_DESCRIPTION() macro
reciprocal_div.c            2020-12-15 22:46:15 -08:00  kernel.h: split out mathematical helpers
test_div64.c                2024-10-28 21:44:28 +00:00  lib/math/test_div64: add some edge cases relevant to __div64_const32()
test_mul_u64_u64_div_u64.c  2024-09-01 20:43:22 -07:00  mul_u64_u64_div_u64: basic sanity test