[PATCH] RCU documentation fixes (January 2006 update)
Updates to in-tree RCU documentation based on comments over the past few months.

Signed-off-by: "Paul E. McKenney" <paulmck@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
parent 53d8be5c14
commit d19720a909
6 changed files with 68 additions and 47 deletions
|
@ -90,16 +90,20 @@ at OLS. The resulting abundance of RCU patches was presented the
|
||||||
following year [McKenney02a], and use of RCU in dcache was first
|
following year [McKenney02a], and use of RCU in dcache was first
|
||||||
described that same year [Linder02a].
|
described that same year [Linder02a].
|
||||||
|
|
||||||
Also in 2002, Michael [Michael02b,Michael02a] presented techniques
|
Also in 2002, Michael [Michael02b,Michael02a] presented "hazard-pointer"
|
||||||
that defer the destruction of data structures to simplify non-blocking
|
techniques that defer the destruction of data structures to simplify
|
||||||
synchronization (wait-free synchronization, lock-free synchronization,
|
non-blocking synchronization (wait-free synchronization, lock-free
|
||||||
and obstruction-free synchronization are all examples of non-blocking
|
synchronization, and obstruction-free synchronization are all examples of
|
||||||
synchronization). In particular, this technique eliminates locking,
|
non-blocking synchronization). In particular, this technique eliminates
|
||||||
reduces contention, reduces memory latency for readers, and parallelizes
|
locking, reduces contention, reduces memory latency for readers, and
|
||||||
pipeline stalls and memory latency for writers. However, these
|
parallelizes pipeline stalls and memory latency for writers. However,
|
||||||
techniques still impose significant read-side overhead in the form of
|
these techniques still impose significant read-side overhead in the
|
||||||
memory barriers. Researchers at Sun worked along similar lines in the
|
form of memory barriers. Researchers at Sun worked along similar lines
|
||||||
same timeframe [HerlihyLM02,HerlihyLMS03].
|
in the same timeframe [HerlihyLM02,HerlihyLMS03]. These techniques
|
||||||
|
can be thought of as inside-out reference counts, where the count is
|
||||||
|
represented by the number of hazard pointers referencing a given data
|
||||||
|
structure (rather than the more conventional counter field within the
|
||||||
|
data structure itself).
|
||||||
|
|
||||||
In 2003, the K42 group described how RCU could be used to create
|
In 2003, the K42 group described how RCU could be used to create
|
||||||
hot-pluggable implementations of operating-system functions. Later that
|
hot-pluggable implementations of operating-system functions. Later that
|
||||||
|
@ -113,7 +117,6 @@ number of operating-system kernels [PaulEdwardMcKenneyPhD], a paper
|
||||||
describing how to make RCU safe for soft-realtime applications [Sarma04c],
|
describing how to make RCU safe for soft-realtime applications [Sarma04c],
|
||||||
and a paper describing SELinux performance with RCU [JamesMorris04b].
|
and a paper describing SELinux performance with RCU [JamesMorris04b].
|
||||||
|
|
||||||
|
|
||||||
2005 has seen further adaptation of RCU to realtime use, permitting
|
2005 has seen further adaptation of RCU to realtime use, permitting
|
||||||
preemption of RCU realtime critical sections [PaulMcKenney05a,
|
preemption of RCU realtime critical sections [PaulMcKenney05a,
|
||||||
PaulMcKenney05b].
|
PaulMcKenney05b].
|
||||||
|
|
|
@ -177,3 +177,9 @@ over a rather long period of time, but improvements are always welcome!
|
||||||
|
|
||||||
If you want to wait for some of these other things, you might
|
If you want to wait for some of these other things, you might
|
||||||
instead need to use synchronize_irq() or synchronize_sched().
|
instead need to use synchronize_irq() or synchronize_sched().
|
||||||
|
|
||||||
|
12. Any lock acquired by an RCU callback must be acquired elsewhere
|
||||||
|
with irq disabled, e.g., via spin_lock_irqsave(). Failing to
|
||||||
|
disable irq on a given acquisition of that lock will result in
|
||||||
|
deadlock as soon as the RCU callback happens to interrupt that
|
||||||
|
acquisition's critical section.
|
||||||
|
|
|
@ -232,7 +232,7 @@ entry does not exist. For this to be helpful, the search function must
|
||||||
return holding the per-entry spinlock, as ipc_lock() does in fact do.
|
return holding the per-entry spinlock, as ipc_lock() does in fact do.
|
||||||
|
|
||||||
Quick Quiz: Why does the search function need to return holding the
|
Quick Quiz: Why does the search function need to return holding the
|
||||||
per-entry lock for this deleted-flag technique to be helpful?
|
per-entry lock for this deleted-flag technique to be helpful?
|
||||||
|
|
||||||
If the system-call audit module were to ever need to reject stale data,
|
If the system-call audit module were to ever need to reject stale data,
|
||||||
one way to accomplish this would be to add a "deleted" flag and a "lock"
|
one way to accomplish this would be to add a "deleted" flag and a "lock"
|
||||||
|
@ -275,8 +275,8 @@ flag under the spinlock as follows:
|
||||||
{
|
{
|
||||||
struct audit_entry *e;
|
struct audit_entry *e;
|
||||||
|
|
||||||
/* Do not use the _rcu iterator here, since this is the only
|
/* Do not need to use the _rcu iterator here, since this
|
||||||
* deletion routine. */
|
* is the only deletion routine. */
|
||||||
list_for_each_entry(e, list, list) {
|
list_for_each_entry(e, list, list) {
|
||||||
if (!audit_compare_rule(rule, &e->rule)) {
|
if (!audit_compare_rule(rule, &e->rule)) {
|
||||||
spin_lock(&e->lock);
|
spin_lock(&e->lock);
|
||||||
|
@ -304,9 +304,12 @@ function to reject newly deleted data.
|
||||||
|
|
||||||
|
|
||||||
Answer to Quick Quiz
|
Answer to Quick Quiz
|
||||||
|
Why does the search function need to return holding the per-entry
|
||||||
|
lock for this deleted-flag technique to be helpful?
|
||||||
|
|
||||||
If the search function drops the per-entry lock before returning, then
|
If the search function drops the per-entry lock before returning,
|
||||||
the caller will be processing stale data in any case. If it is really
|
then the caller will be processing stale data in any case. If it
|
||||||
OK to be processing stale data, then you don't need a "deleted" flag.
|
is really OK to be processing stale data, then you don't need a
|
||||||
If processing stale data really is a problem, then you need to hold the
|
"deleted" flag. If processing stale data really is a problem,
|
||||||
per-entry lock across all of the code that uses the value looked up.
|
then you need to hold the per-entry lock across all of the code
|
||||||
|
that uses the value that was returned.
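
Illustrative sketch, not part of this patch: the quiz answer can be modeled in userspace C. Everything here is an assumption made for illustration: struct entry, entry_search(), entry_delete(), and the toy atomic_flag spinlock are hypothetical stand-ins for the kernel's per-entry spinlock, and RCU itself is omitted (the table is never freed). The point shown is only that the search returns with the per-entry lock held, so the "deleted" check stays valid until the caller unlocks.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical entry; "deleted" and "lock" follow the text's naming. */
struct entry {
	atomic_flag lock;	/* toy spinlock standing in for spinlock_t */
	bool deleted;		/* written only while holding lock */
	int key;		/* immutable after entry_init() */
	int value;
};

static void entry_init(struct entry *e, int key, int value)
{
	atomic_flag_clear(&e->lock);
	e->deleted = false;
	e->key = key;
	e->value = value;
}

static void entry_lock(struct entry *e)
{
	while (atomic_flag_test_and_set_explicit(&e->lock,
						 memory_order_acquire))
		;	/* spin until the flag was previously clear */
}

static void entry_unlock(struct entry *e)
{
	atomic_flag_clear_explicit(&e->lock, memory_order_release);
}

/*
 * Returns the matching entry with its lock HELD, or NULL if there is
 * no live match.  Because the lock is still held on return, the entry
 * cannot be marked deleted until the caller calls entry_unlock().
 */
static struct entry *entry_search(struct entry *tab, size_t n, int key)
{
	for (size_t i = 0; i < n; i++) {
		struct entry *e = &tab[i];

		if (e->key != key)
			continue;
		entry_lock(e);
		if (e->deleted) {
			/* Stale entry: reject it and drop the lock. */
			entry_unlock(e);
			return NULL;
		}
		return e;
	}
	return NULL;
}

/* Marks the entry deleted under its lock. */
static void entry_delete(struct entry *e)
{
	entry_lock(e);
	assert(!e->deleted);	/* double delete is a caller bug here */
	e->deleted = true;
	entry_unlock(e);
}
```

A caller would do all of its work on the returned entry before calling entry_unlock(); dropping the lock inside entry_search() would reopen the window in which the entry could be deleted.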
|
||||||
|
|
|
@ -111,6 +111,11 @@ o What are all these files in this directory?
|
||||||
|
|
||||||
You are reading it!
|
You are reading it!
|
||||||
|
|
||||||
|
rcuref.txt
|
||||||
|
|
||||||
|
Describes how to combine use of reference counts
|
||||||
|
with RCU.
|
||||||
|
|
||||||
whatisRCU.txt
|
whatisRCU.txt
|
||||||
|
|
||||||
Overview of how the RCU implementation works. Along
|
Overview of how the RCU implementation works. Along
|
||||||
|
|
|
@ -1,7 +1,7 @@
|
||||||
Refcounter design for elements of lists/arrays protected by RCU.
|
Reference-count design for elements of lists/arrays protected by RCU.
|
||||||
|
|
||||||
Refcounting on elements of lists which are protected by traditional
|
Reference counting on elements of lists which are protected by traditional
|
||||||
reader/writer spinlocks or semaphores are straight forward as in:
|
reader/writer spinlocks or semaphores are straightforward:
|
||||||
|
|
||||||
1. 2.
|
1. 2.
|
||||||
add() search_and_reference()
|
add() search_and_reference()
|
||||||
|
@ -28,12 +28,12 @@ release_referenced() delete()
|
||||||
...
|
...
|
||||||
}
|
}
|
||||||
|
|
||||||
If this list/array is made lock free using rcu as in changing the
|
If this list/array is made lock free using RCU as in changing the
|
||||||
write_lock in add() and delete() to spin_lock and changing read_lock
|
write_lock() in add() and delete() to spin_lock and changing read_lock
|
||||||
in search_and_reference to rcu_read_lock(), the atomic_get in
|
in search_and_reference to rcu_read_lock(), the atomic_get in
|
||||||
search_and_reference could potentially hold reference to an element which
|
search_and_reference could potentially hold reference to an element which
|
||||||
has already been deleted from the list/array. atomic_inc_not_zero takes
|
has already been deleted from the list/array. Use atomic_inc_not_zero()
|
||||||
care of this scenario. search_and_reference should look as;
|
in this scenario as follows:
|
||||||
|
|
||||||
1. 2.
|
1. 2.
|
||||||
add() search_and_reference()
|
add() search_and_reference()
|
||||||
|
@ -51,17 +51,16 @@ add() search_and_reference()
|
||||||
release_referenced() delete()
|
release_referenced() delete()
|
||||||
{ {
|
{ {
|
||||||
... write_lock(&list_lock);
|
... write_lock(&list_lock);
|
||||||
atomic_dec(&el->rc, relfunc) ...
|
if (atomic_dec_and_test(&el->rc)) ...
|
||||||
... delete_element
|
call_rcu(&el->head, el_free); delete_element
|
||||||
} write_unlock(&list_lock);
|
... write_unlock(&list_lock);
|
||||||
...
|
} ...
|
||||||
if (atomic_dec_and_test(&el->rc))
|
if (atomic_dec_and_test(&el->rc))
|
||||||
call_rcu(&el->head, el_free);
|
call_rcu(&el->head, el_free);
|
||||||
...
|
...
|
||||||
}
|
}
|
||||||
|
|
||||||
Sometimes, reference to the element need to be obtained in the
|
Sometimes, a reference to the element needs to be obtained in the
|
||||||
update (write) stream. In such cases, atomic_inc_not_zero might be an
|
update (write) stream. In such cases, atomic_inc_not_zero() might be
|
||||||
overkill since the spinlock serialising list updates are held. atomic_inc
|
overkill, since we hold the update-side spinlock. One might instead
|
||||||
is to be used in such cases.
|
use atomic_inc() in such cases.
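
Illustrative sketch, not part of this patch: the increment-unless-zero idea can be modeled in userspace with C11 atomics. The helper names inc_not_zero(), el_get(), and el_put() and the struct el layout are assumptions for illustration; they are not the kernel's implementation of atomic_inc_not_zero(), and the RCU grace period that would normally precede freeing is left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical element; "rc" mirrors el->rc in the text. */
struct el {
	atomic_int rc;
	int key;
};

/*
 * Userspace stand-in for atomic_inc_not_zero(): increment *v unless it
 * is zero.  Returns true if the increment happened, false if the count
 * had already dropped to zero.
 */
static bool inc_not_zero(atomic_int *v)
{
	int old = atomic_load(v);

	while (old != 0) {
		/* On failure, "old" is refreshed with the current value. */
		if (atomic_compare_exchange_weak(v, &old, old + 1))
			return true;
	}
	return false;
}

/* Read-side lookup: take a reference only if the element is live. */
static struct el *el_get(struct el *e)
{
	return inc_not_zero(&e->rc) ? e : NULL;
}

/* Drop a reference; returns true when the last reference went away. */
static bool el_put(struct el *e)
{
	int old = atomic_fetch_sub(&e->rc, 1);

	assert(old > 0);	/* underflow would be a caller bug */
	return old == 1;
}
```

In this model, a true return from el_put() marks the point where the text's call_rcu(&el->head, el_free) would go. Once the count has reached zero, el_get() keeps failing, which is what prevents a reader from resurrecting an element that is already on its way to being freed.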
|
||||||
|
|
||||||
|
|
|
@ -200,10 +200,11 @@ rcu_assign_pointer()
|
||||||
the new value, and also executes any memory-barrier instructions
|
the new value, and also executes any memory-barrier instructions
|
||||||
required for a given CPU architecture.
|
required for a given CPU architecture.
|
||||||
|
|
||||||
Perhaps more important, it serves to document which pointers
|
Perhaps just as important, it serves to document (1) which
|
||||||
are protected by RCU. That said, rcu_assign_pointer() is most
|
pointers are protected by RCU and (2) the point at which a
|
||||||
frequently used indirectly, via the _rcu list-manipulation
|
given structure becomes accessible to other CPUs. That said,
|
||||||
primitives such as list_add_rcu().
|
rcu_assign_pointer() is most frequently used indirectly, via
|
||||||
|
the _rcu list-manipulation primitives such as list_add_rcu().
|
||||||
|
|
||||||
rcu_dereference()
|
rcu_dereference()
|
||||||
|
|
||||||
|
@ -258,9 +259,11 @@ rcu_dereference()
|
||||||
locking.
|
locking.
|
||||||
|
|
||||||
As with rcu_assign_pointer(), an important function of
|
As with rcu_assign_pointer(), an important function of
|
||||||
rcu_dereference() is to document which pointers are protected
|
rcu_dereference() is to document which pointers are protected by
|
||||||
by RCU. And, again like rcu_assign_pointer(), rcu_dereference()
|
RCU, in particular, flagging a pointer that is subject to changing
|
||||||
is typically used indirectly, via the _rcu list-manipulation
|
at any time, including immediately after the rcu_dereference().
|
||||||
|
And, again like rcu_assign_pointer(), rcu_dereference() is
|
||||||
|
typically used indirectly, via the _rcu list-manipulation
|
||||||
primitives, such as list_for_each_entry_rcu().
|
primitives, such as list_for_each_entry_rcu().
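
Illustrative sketch, not part of this patch: the publish/subscribe pairing of rcu_assign_pointer() and rcu_dereference() can be approximated in userspace with a C11 release store and acquire load. The names gp, publish(), and read_a() are hypothetical, acquire ordering is stronger than the dependency ordering the kernel primitive relies on, and there is no grace period or reclamation here, so this models only the ordering and the "document the RCU-protected pointer" aspects.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct foo {
	int a;
	int b;
};

/* Hypothetical RCU-protected global pointer; starts out NULL. */
static _Atomic(struct foo *) gp;

/*
 * Updater side, in the role of rcu_assign_pointer(): initialize the
 * structure first, then publish it.  The release store orders the
 * field writes before the pointer becomes visible to readers.
 */
static void publish(struct foo *p, int a, int b)
{
	p->a = a;
	p->b = b;
	atomic_store_explicit(&gp, p, memory_order_release);
}

/*
 * Reader side, in the role of rcu_dereference(): fetch the pointer
 * once and use only that local copy, since gp may change at any time,
 * including immediately after the load.
 * Returns 1 and stores p->a in *out if a structure is published,
 * otherwise returns 0 and leaves *out untouched.
 */
static int read_a(int *out)
{
	struct foo *p = atomic_load_explicit(&gp, memory_order_acquire);

	if (p == NULL)
		return 0;
	*out = p->a;
	return 1;
}
```

Loading gp a second time inside read_a() instead of reusing the local p would be a bug in this model, for the reason the text gives: the pointer is subject to change at any time.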
|
||||||
|
|
||||||
The following diagram shows how each API communicates among the
|
The following diagram shows how each API communicates among the
|
||||||
|
@ -327,7 +330,7 @@ for specialized uses, but are relatively uncommon.
|
||||||
3. WHAT ARE SOME EXAMPLE USES OF CORE RCU API?
|
3. WHAT ARE SOME EXAMPLE USES OF CORE RCU API?
|
||||||
|
|
||||||
This section shows a simple use of the core RCU API to protect a
|
This section shows a simple use of the core RCU API to protect a
|
||||||
global pointer to a dynamically allocated structure. More typical
|
global pointer to a dynamically allocated structure. More-typical
|
||||||
uses of RCU may be found in listRCU.txt, arrayRCU.txt, and NMI-RCU.txt.
|
uses of RCU may be found in listRCU.txt, arrayRCU.txt, and NMI-RCU.txt.
|
||||||
|
|
||||||
struct foo {
|
struct foo {
|
||||||
|
@ -410,6 +413,8 @@ o Use synchronize_rcu() -after- removing a data element from an
|
||||||
data item.
|
data item.
|
||||||
|
|
||||||
See checklist.txt for additional rules to follow when using RCU.
|
See checklist.txt for additional rules to follow when using RCU.
|
||||||
|
And again, more-typical uses of RCU may be found in listRCU.txt,
|
||||||
|
arrayRCU.txt, and NMI-RCU.txt.
|
||||||
|
|
||||||
|
|
||||||
4. WHAT IF MY UPDATING THREAD CANNOT BLOCK?
|
4. WHAT IF MY UPDATING THREAD CANNOT BLOCK?
|
||||||
|
@ -513,7 +518,7 @@ production-quality implementation, and see:
|
||||||
|
|
||||||
for papers describing the Linux kernel RCU implementation. The OLS'01
|
for papers describing the Linux kernel RCU implementation. The OLS'01
|
||||||
and OLS'02 papers are a good introduction, and the dissertation provides
|
and OLS'02 papers are a good introduction, and the dissertation provides
|
||||||
more details on the current implementation.
|
more details on the current implementation as of early 2004.
|
||||||
|
|
||||||
|
|
||||||
5A. "TOY" IMPLEMENTATION #1: LOCKING
|
5A. "TOY" IMPLEMENTATION #1: LOCKING
|
||||||
|
@ -768,7 +773,6 @@ RCU pointer/list traversal:
|
||||||
rcu_dereference
|
rcu_dereference
|
||||||
list_for_each_rcu (to be deprecated in favor of
|
list_for_each_rcu (to be deprecated in favor of
|
||||||
list_for_each_entry_rcu)
|
list_for_each_entry_rcu)
|
||||||
list_for_each_safe_rcu (deprecated, not used)
|
|
||||||
list_for_each_entry_rcu
|
list_for_each_entry_rcu
|
||||||
list_for_each_continue_rcu (to be deprecated in favor of new
|
list_for_each_continue_rcu (to be deprecated in favor of new
|
||||||
list_for_each_entry_continue_rcu)
|
list_for_each_entry_continue_rcu)
|
||||||
|
@ -807,7 +811,8 @@ Quick Quiz #1: Why is this argument naive? How could a deadlock
|
||||||
Answer: Consider the following sequence of events:
|
Answer: Consider the following sequence of events:
|
||||||
|
|
||||||
1. CPU 0 acquires some unrelated lock, call it
|
1. CPU 0 acquires some unrelated lock, call it
|
||||||
"problematic_lock".
|
"problematic_lock", disabling irq via
|
||||||
|
spin_lock_irqsave().
|
||||||
|
|
||||||
2. CPU 1 enters synchronize_rcu(), write-acquiring
|
2. CPU 1 enters synchronize_rcu(), write-acquiring
|
||||||
rcu_gp_mutex.
|
rcu_gp_mutex.
|
||||||
|
@ -894,7 +899,7 @@ Answer: Just as PREEMPT_RT permits preemption of spinlock
|
||||||
ACKNOWLEDGEMENTS
|
ACKNOWLEDGEMENTS
|
||||||
|
|
||||||
My thanks to the people who helped make this human-readable, including
|
My thanks to the people who helped make this human-readable, including
|
||||||
Jon Walpole, Josh Triplett, Serge Hallyn, and Suzanne Wood.
|
Jon Walpole, Josh Triplett, Serge Hallyn, Suzanne Wood, and Alan Stern.
|
||||||
|
|
||||||
|
|
||||||
For more information, see http://www.rdrop.com/users/paulmck/RCU.
|
For more information, see http://www.rdrop.com/users/paulmck/RCU.
|
||||||
|
|