The direction test inside the loop was not always completely optimized,
resulting in a larger than necessary function. This change adds a
direction variable that is set out of the loop. Now the function is down
to 48 bytes on x86, 32 on ARM and 68 on mips. It's worth noting that other
approaches were attempted (including relying on the up and down functions)
but they were only slightly beneficial on x86 and cost more on others.
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Till now memcpy() relies on memmove(), but it's always included for libgcc,
so we have a larger than needed function. Let's implement two unidirectional
variants to copy from bottom to top and from top to bottom, and use the
former for memcpy(). The variants are optimized to be compact, and at the
same time the compiler is sometimes able to detect the loop and to replace
it with a "rep movsb". The new function is 24 bytes instead of 52 on x86_64.
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
The string manipulation functions (mem*, str*) are now found in
string.h. The file depends on almost nothing and will be
usable from other includes if needed. Maybe more functions could
be added.
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>