mirror of
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
synced 2025-09-18 22:14:16 +00:00
Optimize integer-to-string conversion in vsprintf.c for base 10. This is by far the most used conversion, and in some use cases it impacts performance. For example, top reads /proc/$PID/stat for every process, and with 4000 processes decimal conversion alone takes noticeable time.

Using code from

http://www.cs.uiowa.edu/~jones/bcd/decimal.html

(with permission from the author, Douglas W. Jones)

binary-to-decimal-string conversion is done in groups of five digits at once, using only additions/subtractions/shifts (with -O2; -Os throws in some multiply instructions).

On i386 arch gcc 4.1.2 -O2 generates ~500 bytes of code.

This patch is run tested. Userspace benchmark/test is also attached. I tested it on PIII and AMD64 and new code is generally ~2.5 times faster. On AMD64:

    # ./vsprintf_verify-O2
    Original decimal conv: .......... 151 ns per iteration
    Patched decimal conv: .......... 62 ns per iteration
    Testing correctness
    12895992590592 ok... [Ctrl-C]
    # ./vsprintf_verify-O2
    Original decimal conv: .......... 151 ns per iteration
    Patched decimal conv: .......... 62 ns per iteration
    Testing correctness
    26025406464 ok... [Ctrl-C]

More realistic test: top from busybox project was modified to report how many us it took to scan /proc (this does not account any processing done after that, like sorting process list), and then I test it with 4000 processes:

    #!/bin/sh
    i=4000
    while test $i != 0; do
        sleep 30 &
        let i--
    done
    busybox top -b -n3 >/dev/null

on unpatched kernel:

    top: 4120 processes took 102864 microseconds to scan
    top: 4120 processes took 91757 microseconds to scan
    top: 4120 processes took 92517 microseconds to scan
    top: 4120 processes took 92581 microseconds to scan

on patched kernel:

    top: 4120 processes took 75460 microseconds to scan
    top: 4120 processes took 66451 microseconds to scan
    top: 4120 processes took 67267 microseconds to scan
    top: 4120 processes took 67618 microseconds to scan

The speedup comes from much faster generation of /proc/PID/stat by sprintf() calls inside the kernel.

Signed-off-by: Douglas W Jones <jones@cs.uiowa.edu>
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

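
To illustrate how a group of five decimal digits can be produced without a general divide, here is a minimal sketch for a single 16-bit value. It is not the code from the patch. The function names `div10_small` and `u16_to_5digits` are hypothetical, and the approach (splitting the value into 4-bit nibbles and using the decimal expansions of 16, 256 and 4096) is an assumption about the general shape of the technique. The multiplications by small constants are ones a compiler may lower to shifts and adds.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: x / 10 via multiply and shift.
 * 0xcd / 2048 is slightly above 1/10; the result is exact for x <= 1028,
 * which covers every intermediate value used below. */
static unsigned div10_small(unsigned x)
{
    return (x * 0xcd) >> 11;
}

/* Hypothetical helper: write exactly five decimal digits of a 16-bit value
 * into out[0..4], most significant digit first, zero padded, no terminator.
 *
 * With n = n3*4096 + n2*256 + n1*16 + n0 and
 *   4096 = 4*1000 + 0*100 + 9*10 + 6
 *    256 =          2*100 + 5*10 + 6
 *     16 =                  1*10 + 6
 * each decimal position is a small weighted sum of nibbles plus a carry. */
static void u16_to_5digits(unsigned n, char out[5])
{
    unsigned n0, n1, n2, n3, d0, d1, d2, d3, d4, q;

    assert(n <= 0xffff);

    n0 = n & 0xf;
    n1 = (n >> 4) & 0xf;
    n2 = (n >> 8) & 0xf;
    n3 = (n >> 12) & 0xf;

    d0 = 6 * (n3 + n2 + n1) + n0;   /* ones, before carry-out */
    q = div10_small(d0);
    d0 -= 10 * q;

    d1 = q + 9 * n3 + 5 * n2 + n1;  /* tens */
    q = div10_small(d1);
    d1 -= 10 * q;

    d2 = q + 2 * n2;                /* hundreds */
    q = div10_small(d2);
    d2 -= 10 * q;

    d3 = q + 4 * n3;                /* thousands */
    q = div10_small(d3);
    d3 -= 10 * q;

    d4 = q;                         /* ten-thousands */

    out[0] = (char)('0' + d4);
    out[1] = (char)('0' + d3);
    out[2] = (char)('0' + d2);
    out[3] = (char)('0' + d1);
    out[4] = (char)('0' + d0);
}
```

A caller would pass a buffer of at least five bytes and add its own terminator. Wider values would need to be split into 16-bit-sized pieces first, a step this sketch leaves out.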
||
|---|---|---|
| lzo | ||
| reed_solomon | ||
| zlib_deflate | ||
| zlib_inflate | ||
| .gitignore | ||
| audit.c | ||
| bitmap.c | ||
| bitrev.c | ||
| bug.c | ||
| bust_spinlocks.c | ||
| check_signature.c | ||
| cmdline.c | ||
| cpumask.c | ||
| crc-ccitt.c | ||
| crc-itu-t.c | ||
| crc16.c | ||
| crc32.c | ||
| crc32defs.h | ||
| ctype.c | ||
| debug_locks.c | ||
| dec_and_lock.c | ||
| devres.c | ||
| div64.c | ||
| dump_stack.c | ||
| extable.c | ||
| fault-inject.c | ||
| find_next_bit.c | ||
| gen_crc32table.c | ||
| genalloc.c | ||
| halfmd4.c | ||
| hexdump.c | ||
| hweight.c | ||
| idr.c | ||
| inflate.c | ||
| int_sqrt.c | ||
| iomap.c | ||
| iomap_copy.c | ||
| ioremap.c | ||
| irq_regs.c | ||
| Kconfig | ||
| Kconfig.debug | ||
| kernel_lock.c | ||
| klist.c | ||
| kobject.c | ||
| kobject_uevent.c | ||
| kref.c | ||
| libcrc32c.c | ||
| list_debug.c | ||
| locking-selftest-hardirq.h | ||
| locking-selftest-mutex.h | ||
| locking-selftest-rlock-hardirq.h | ||
| locking-selftest-rlock-softirq.h | ||
| locking-selftest-rlock.h | ||
| locking-selftest-rsem.h | ||
| locking-selftest-softirq.h | ||
| locking-selftest-spin-hardirq.h | ||
| locking-selftest-spin-softirq.h | ||
| locking-selftest-spin.h | ||
| locking-selftest-wlock-hardirq.h | ||
| locking-selftest-wlock-softirq.h | ||
| locking-selftest-wlock.h | ||
| locking-selftest-wsem.h | ||
| locking-selftest.c | ||
| Makefile | ||
| parser.c | ||
| percpu_counter.c | ||
| plist.c | ||
| prio_tree.c | ||
| radix-tree.c | ||
| random32.c | ||
| rbtree.c | ||
| reciprocal_div.c | ||
| rwsem-spinlock.c | ||
| rwsem.c | ||
| semaphore-sleepers.c | ||
| sha1.c | ||
| smp_processor_id.c | ||
| sort.c | ||
| spinlock_debug.c | ||
| string.c | ||
| swiotlb.c | ||
| textsearch.c | ||
| ts_bm.c | ||
| ts_fsm.c | ||
| ts_kmp.c | ||
| vsprintf.c | ||