kernel-fxtec-pro1x/kernel
Lai Jiangshan 5127bed588 rcu classic: new algorithm for callbacks-processing(v2)
This is v2; it differs slightly from the v1 that I
sent to lkml:
use ACCESS_ONCE
use rcu_batch_after/rcu_batch_before for batch # comparison.
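
Why a dedicated helper for batch # comparison matters can be shown with a minimal sketch. The names batch_before/batch_after below are hypothetical stand-ins, not the kernel's definitions. The assumption is that batch numbers are ever-increasing counters that may wrap, so they are compared by the sign of their difference instead of with a plain '<' or '>'.

```c
#include <assert.h>

/*
 * Hypothetical helpers (not the kernel's definitions): compare two batch
 * numbers by the sign of their difference, so the result stays correct
 * after the counter wraps. The subtraction is done on unsigned values to
 * avoid signed overflow; converting the result back to long assumes the
 * usual two's-complement behaviour.
 */
static int batch_before(long a, long b)
{
	return (long)((unsigned long)a - (unsigned long)b) < 0;
}

static int batch_after(long a, long b)
{
	return (long)((unsigned long)a - (unsigned long)b) > 0;
}

/* Usage example: the batch following the largest long counts as "after". */
static void batch_compare_example(void)
{
	long newest = (long)(~0UL >> 1);        /* largest positive long */
	long wrapped = -newest - 1;             /* value after one more increment */

	assert(batch_after(wrapped, newest));   /* a plain '>' would say otherwise */
	assert(batch_before(newest, wrapped));
}
```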

rcutorture test result:
(hotplugs: do cpu-online/offline once per second)

No CONFIG_NO_HZ:           OK, 12hours
No CONFIG_NO_HZ, hotplugs: OK, 12hours
CONFIG_NO_HZ=y:            OK, 24hours
CONFIG_NO_HZ=y, hotplugs:  Failed.
(Failed also without my patch applied, exactly the same bug occurred,
http://lkml.org/lkml/2008/7/3/24)

v1's email thread:
http://lkml.org/lkml/2008/6/2/539

v1's description:

The current callbacks-processing code is very efficient and
sophisticated. But while studying it I found a disadvantage:

In multi-CPU systems, when a new RCU callback is queued
(call_rcu[_bh]), the current implementation will very likely invoke it
only after the grace period for the batch with batch number =
rcp->cur+2 has completed. In fact, the callback can be invoked as soon
as the grace period for the batch with batch number = rcp->cur+1 has
completed. This delay extends the latency of synchronize_rcu(). More
importantly, callbacks usually free memory, and that work is delayed
too! A reclaimer needs to free memory as soon as possible when little
memory is left.

A very simple way to solve this problem:
add a field (struct rcu_head::batch) to record the batch number of
the RCU callback. When a new RCU callback is queued, we determine its
batch number (head->batch = rcp->cur+1), and when we process callbacks
we move the callback to rdp->donelist if we find that
head->batch <= rcp->completed.
This simple approach reduces the wait time before invocation a lot (from
about 2.5 grace periods to about 1.5 grace periods on average in
multi-CPU systems).
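
The simple approach above can be sketched roughly as follows. This is an illustration only, not code from the patch: struct cb_head, struct cb_ctrl, struct cb_data and the cb_* functions are hypothetical stand-ins for struct rcu_head, rcp and rdp. Locking and counter wraparound are ignored, and the sketch assumes cur never decreases, so queued callbacks are ordered by batch number.

```c
#include <assert.h>
#include <stddef.h>

struct cb_head {                    /* stand-in for struct rcu_head */
	struct cb_head *next;
	void (*func)(struct cb_head *head);
	long batch;                 /* the extra per-callback field */
};

struct cb_ctrl {                    /* stand-in for rcp */
	long cur;                   /* current batch number */
	long completed;             /* last batch whose grace period ended */
};

struct cb_data {                    /* stand-in for rdp */
	struct cb_head *nxtlist, **nxttail;     /* still waiting */
	struct cb_head *donelist, **donetail;   /* ready to invoke */
};

static void cb_data_init(struct cb_data *d)
{
	d->nxtlist = NULL;
	d->nxttail = &d->nxtlist;
	d->donelist = NULL;
	d->donetail = &d->donelist;
}

/* Queue a callback and tag it with the batch it has to wait for. */
static void cb_enqueue(const struct cb_ctrl *c, struct cb_data *d,
		       struct cb_head *head, void (*func)(struct cb_head *))
{
	assert(head != NULL);
	head->func = func;
	head->next = NULL;
	head->batch = c->cur + 1;
	*d->nxttail = head;
	d->nxttail = &head->next;
}

/*
 * Move every callback with head->batch <= completed to donelist.
 * Because batch numbers never decrease along nxtlist, the ready
 * callbacks always form a prefix of the list. Returns how many moved.
 */
static int cb_advance(const struct cb_ctrl *c, struct cb_data *d)
{
	int moved = 0;

	while (d->nxtlist && d->nxtlist->batch <= c->completed) {
		struct cb_head *head = d->nxtlist;

		d->nxtlist = head->next;
		if (!d->nxtlist)
			d->nxttail = &d->nxtlist;
		head->next = NULL;
		*d->donetail = head;
		d->donetail = &head->next;
		moved++;
	}
	return moved;
}
```

In this sketch a callback queued while cur == 5 becomes ready as soon as completed reaches 6, instead of waiting for a later batch.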

This is my algorithm, but my implementation does not add any field to
struct rcu_head. We only need to remember the last 2 batches and
their batch numbers, because these 2 batches include all entries
whose grace period hasn't completed. So we use a special
linked list rather than adding a field.
Please see the comment on struct rcu_data.
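
One possible reading of the "special linked list" is a single list split into segments by tail pointers, with one batch number stored for the newest segment. The sketch below is an assumption-laden illustration, not the patch itself: struct cb_queue and the cbq_* functions are hypothetical names, locking and wraparound are ignored, and it relies on the statement above that anything older than the last 2 batches has already completed its grace period.

```c
#include <assert.h>
#include <stddef.h>

struct cb_head {                    /* no per-callback batch field */
	struct cb_head *next;
};

/*
 * Hypothetical per-CPU queue. The pending list is split by tail[]:
 *   [list, *tail[0])      entries older than the last 2 batches
 *   [*tail[0], *tail[1])  entries of batch - 1
 *   [*tail[1], NULL)      entries of batch (the newest)
 */
struct cb_queue {
	long batch;
	struct cb_head *list;
	struct cb_head **tail[3];
	struct cb_head *done, **donetail;
};

static void cbq_init(struct cb_queue *q)
{
	q->batch = 0;
	q->list = NULL;
	q->tail[0] = q->tail[1] = q->tail[2] = &q->list;
	q->done = NULL;
	q->donetail = &q->done;
}

/* Queue a callback that must wait for batch cur + 1. */
static void cbq_enqueue(struct cb_queue *q, long cur, struct cb_head *head)
{
	long batch = cur + 1;

	assert(head != NULL);
	if (q->list && batch > q->batch) {
		/* A newer batch begins: shift the segments down by one. */
		q->tail[0] = q->tail[1];
		q->tail[1] = q->tail[2];
		if (batch - 1 > q->batch)   /* skipped a batch: all are old */
			q->tail[0] = q->tail[2];
	}
	q->batch = batch;
	head->next = NULL;
	*q->tail[2] = head;
	q->tail[2] = &head->next;
}

/* Move entries whose grace period has completed to the done list. */
static void cbq_advance(struct cb_queue *q, long completed)
{
	if (!q->list)
		return;
	if (completed >= q->batch)
		q->tail[0] = q->tail[1] = q->tail[2];
	else if (completed >= q->batch - 1)
		q->tail[0] = q->tail[1];

	if (q->tail[0] != &q->list) {
		*q->donetail = q->list;         /* splice the ready prefix */
		q->donetail = q->tail[0];
		q->list = *q->tail[0];
		*q->donetail = NULL;
		if (q->tail[1] == q->tail[0])
			q->tail[1] = &q->list;
		if (q->tail[2] == q->tail[0])
			q->tail[2] = &q->list;
		q->tail[0] = &q->list;
	}
}
```

The design choice illustrated here is the trade made in the description: three tail pointers and one batch number per queue replace an extra field in every callback.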

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Dipankar Sarma <dipankar@in.ibm.com>
Cc: Gautham Shenoy <ego@in.ibm.com>
Cc: Dhaval Giani <dhaval@linux.vnet.ibm.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-18 16:07:33 +02:00
irq genirq: remove extraneous checks in manage.c 2008-07-10 07:01:13 +02:00
power Merge branch 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6 2008-07-16 17:25:46 -07:00
time Merge branch 'generic-ipi' into generic-ipi-for-linus 2008-07-15 21:55:59 +02:00
trace Merge branch 'tracing/ftrace' into auto-ftrace-next 2008-07-14 15:58:35 +02:00
.gitignore Update kernel/.gitignore with new auto-generated files 2008-02-09 23:27:01 -08:00
acct.c bsd_acct: using task_struct->tgid is not right in pid-namespaces 2008-03-24 19:22:20 -07:00
audit.c [PATCH] remove useless argument type in audit_filter_user() 2008-06-24 23:36:35 -04:00
audit.h [PATCH 1/2] audit: move extern declarations to audit.h 2008-04-28 06:28:04 -04:00
audit_tree.c [PATCH] list_for_each_rcu must die: audit 2008-05-17 03:30:23 -04:00
auditfilter.c [PATCH] remove useless argument type in audit_filter_user() 2008-06-24 23:36:35 -04:00
auditsc.c [PATCH] new predicate - AUDIT_FILETYPE 2008-04-28 06:28:37 -04:00
backtracetest.c backtrace: replace timer with tasklet + completions 2008-06-27 18:09:16 +02:00
bounds.c Add kbuild.h that contains common definitions for kbuild users 2008-04-29 08:06:29 -07:00
capability.c security: filesystem capabilities: fix fragile setuid fixup code 2008-07-04 10:40:08 -07:00
cgroup.c cgroups: remove node_ prefix_from ns subsystem 2008-05-24 09:56:14 -07:00
cgroup_debug.c CGroup API files: move "releasable" to cgroup_debug subsystem 2008-04-29 08:06:09 -07:00
compat.c ntp: support for TAI 2008-05-01 08:03:59 -07:00
configs.c kernel: use non-racy method for proc entries creation 2008-04-29 08:06:22 -07:00
cpu.c force offline the processor during hot-removal 2008-07-16 23:27:01 +02:00
cpuset.c Merge commit 'v2.6.26' into sched/devel 2008-07-14 12:19:19 +02:00
delayacct.c
dma.c kernel: use non-racy method for proc entries creation 2008-04-29 08:06:22 -07:00
exec_domain.c
exit.c fix dangling zombie when new parent ignores children 2008-07-16 18:02:34 -07:00
extable.c
fork.c ptrace children revamp 2008-07-16 18:02:33 -07:00
futex.c futexes: fix fault handling in futex_lock_pi 2008-06-23 13:31:15 +02:00
futex_compat.c futex_compat __user annotation 2008-03-30 14:18:41 -07:00
hrtimer.c Merge branch 'generic-ipi' into generic-ipi-for-linus 2008-07-15 21:55:59 +02:00
itimer.c ITIMER_REAL: convert to use struct pid 2008-02-08 09:22:29 -08:00
kallsyms.c kernel: use non-racy method for proc entries creation 2008-04-29 08:06:22 -07:00
Kconfig.hz
Kconfig.preempt rcu: move PREEMPT_RCU config option back under PREEMPT 2008-03-10 18:01:20 -07:00
kexec.c kexec: make extended crashkernel= syntax less confusing 2008-05-01 08:04:00 -07:00
kfifo.c
kgdb.c kgdb: sparse fix 2008-06-24 10:52:55 -05:00
kmod.c [PATCH] split linux/file.h 2008-05-01 13:08:16 -04:00
kprobes.c kernel/kprobes.c: Made kprobe_blacklist static. 2008-07-10 10:13:51 -07:00
ksysfs.c
kthread.c Freezer: Introduce PF_FREEZER_NOSIG 2008-07-16 23:27:03 +02:00
latencytop.c kernel: use non-racy method for proc entries creation 2008-04-29 08:06:22 -07:00
lockdep.c Merge branch 'core/locking' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2008-07-14 14:55:13 -07:00
lockdep_internals.h lockdep: add lock_class information to lock_chain and output it 2008-06-24 01:28:20 +02:00
lockdep_proc.c lockdep: add lock_class information to lock_chain and output it 2008-06-24 01:28:20 +02:00
Makefile ftrace: do not trace scheduler functions 2008-07-17 17:40:11 +02:00
marker.c Markers - remove extra format argument 2008-05-23 22:25:27 +02:00
module.c modules: proper cleanup of kobject without CONFIG_SYSFS 2008-05-23 13:09:33 +10:00
mutex-debug.c mutex-debug: check mutex magic before owner 2008-05-16 16:53:35 +02:00
mutex-debug.h
mutex.c __mutex_lock_common: use signal_pending_state() 2008-06-10 11:45:09 +02:00
mutex.h
notifier.c ipc: re-enable msgmni automatic recomputing msgmni if set to negative 2008-04-29 08:06:13 -07:00
ns_cgroup.c cgroups: kernel/ns_cgroup.c should #include <linux/nsproxy.h> 2008-04-29 08:06:07 -07:00
nsproxy.c ipc: sysvsem: refuse clone(CLONE_SYSVSEM|CLONE_NEWIPC) 2008-04-29 08:06:14 -07:00
panic.c Taint kernel after WARN_ON(condition) 2008-04-29 08:05:59 -07:00
params.c Add new string functions strict_strto* and convert kernel params to use them 2008-02-08 09:22:41 -08:00
pid.c rcu: split list.h and move rcu-protected lists into rculist.h 2008-05-19 10:01:37 +02:00
pid_namespace.c pidns: make pid->level and pid_ns->level unsigned 2008-04-30 08:29:49 -07:00
pm_qos_params.c pm_qos_params: BKL pushdown 2008-07-02 15:06:24 -06:00
posix-cpu-timers.c posix-timers: print RT watchdog message 2008-05-24 18:49:22 +02:00
posix-timers.c signals: join send_sigqueue() with send_group_sigqueue() 2008-04-30 08:29:36 -07:00
printk.c Merge branch 'core/printk' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2008-07-14 15:27:43 -07:00
profile.c on_each_cpu(): kill unused 'retry' parameter 2008-06-26 11:24:38 +02:00
ptrace.c ptrace children revamp 2008-07-16 18:02:33 -07:00
rcuclassic.c rcu classic: new algorithm for callbacks-processing(v2) 2008-07-18 16:07:33 +02:00
rcupdate.c Merge branch 'generic-ipi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2008-07-15 14:12:03 -07:00
rcupreempt.c Merge branch 'core/rcu' into core/rcu-for-linus 2008-07-15 21:10:12 +02:00
rcupreempt_trace.c rcu: remove duplicated include in kernel/rcupreempt_trace.c 2008-05-19 10:03:39 +02:00
rcutorture.c rcu: make rcutorture even more vicious: invoke RCU readers from irq handlers (timers) 2008-06-26 09:24:33 +02:00
relay.c splice: fix sendfile() issue with relay 2008-05-28 14:49:27 +02:00
res_counter.c memcgroup: add the max_usage member on the res_counter 2008-04-29 08:06:10 -07:00
resource.c kernel: use non-racy method for proc entries creation 2008-04-29 08:06:22 -07:00
rtmutex-debug.c Don't operate with pid_t in rtmutex tester 2008-02-08 09:22:41 -08:00
rtmutex-debug.h
rtmutex-tester.c
rtmutex.c hrtimer: more hrtimer_init_sleeper() fallout. 2008-02-13 15:45:36 +01:00
rtmutex.h
rtmutex_common.h Don't operate with pid_t in rtmutex tester 2008-02-08 09:22:41 -08:00
rwsem.c
sched.c Merge branch 'core/softirq' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2008-07-14 15:28:42 -07:00
sched_clock.c Merge branch 'sched/clock' into sched/devel 2008-07-14 12:19:13 +02:00
sched_cpupri.c sched: use a 2-d bitmap for searching lowest-pri CPU 2008-06-06 15:19:28 +02:00
sched_cpupri.h sched: fix the cpuprio count really 2008-06-06 15:19:44 +02:00
sched_debug.c sched: add full schedstats to /proc/sched_debug 2008-06-27 14:31:31 +02:00
sched_fair.c sched: add avg-overlap support to RT tasks 2008-07-04 12:50:22 +02:00
sched_features.h sched: bias effective_load() error towards failing wake_affine(). 2008-06-27 14:31:47 +02:00
sched_idletask.c sched: make rt_sched_class, idle_sched_class static 2008-05-05 23:56:17 +02:00
sched_rt.c sched: make sched_{rt,fair}.c ifdefs more readable 2008-06-27 14:32:05 +02:00
sched_stats.h sched: fix accounting in task delay accounting & migration 2008-07-04 12:50:23 +02:00
seccomp.c
semaphore.c mmiotrace broken in linux-next (8-bit writes only) 2008-07-01 10:14:06 +02:00
signal.c posix timers: discard SI_TIMER signals on exec 2008-05-26 10:37:07 -07:00
smp.c generic ipi function calls: wait on alloc failure fallback 2008-07-15 14:12:20 -07:00
softirq.c Merge branch 'generic-ipi' into generic-ipi-for-linus 2008-07-15 21:55:59 +02:00
softlockup.c softlockup: print a module list on being stuck 2008-07-05 08:51:24 +02:00
spinlock.c ftrace: lockdep notrace annotations 2008-05-23 20:39:40 +02:00
srcu.c
stacktrace.c stacktrace: fix modular build, export print_stack_trace and save_stack_trace 2008-06-30 09:20:55 +02:00
stop_machine.c sched: add new API sched_setscheduler_nocheck: add a flag to control access checks 2008-06-23 22:57:56 +02:00
sys.c sys_prctl(): fix return of uninitialized value 2008-05-24 09:56:13 -07:00
sys_ni.c
sysctl.c Merge branch 'core/rcu' into core/rcu-for-linus 2008-07-15 21:10:12 +02:00
sysctl_check.c constify tables in kernel/sysctl_check.c 2008-02-08 09:22:31 -08:00
taskstats.c Use find_task_by_vpid in taskstats 2008-04-30 08:29:48 -07:00
test_kprobes.c kprobes: kretprobe user entry-handler 2008-02-06 10:41:11 -08:00
time.c Make constants in kernel/timeconst.h fixed 64 bits 2008-05-02 16:18:42 -07:00
timeconst.pl Make constants in kernel/timeconst.h fixed 64 bits 2008-05-02 16:18:42 -07:00
timer.c Merge branch 'for-linus' of master.kernel.org:/home/rmk/linux-2.6-arm 2008-07-14 16:06:58 -07:00
tsacct.c
uid16.c asmlinkage_protect replaces prevent_tail_call 2008-04-10 17:28:26 -07:00
user.c alloc_uid: cleanup 2008-04-30 08:29:53 -07:00
user_namespace.c eCryptfs: make key module subsystem respect namespaces 2008-04-29 08:06:07 -07:00
utsname.c kernel: explicitly include required header files under kernel/ 2008-04-29 08:06:04 -07:00
utsname_sysctl.c
wait.c kernel: remove fastcall in kernel/* 2008-02-08 09:22:31 -08:00
workqueue.c Christoph has moved 2008-07-04 10:40:04 -07:00