kernel-fxtec-pro1x

Author	SHA1	Message	Date
Eric Dumazet	00b13445f9	rcu: Avoid data-race in rcu_gp_fqs_check_wake() commit 6935c3983b246d5fbfebd3b891c825e65c118f2d upstream. The rcu_gp_fqs_check_wake() function uses rcu_preempt_blocked_readers_cgp() to read ->gp_tasks while other cpus might overwrite this field. We need READ_ONCE()/WRITE_ONCE() pairs to avoid compiler tricks and KCSAN splats like the following : BUG: KCSAN: data-race in rcu_gp_fqs_check_wake / rcu_preempt_deferred_qs_irqrestore write to 0xffffffff85a7f190 of 8 bytes by task 7317 on cpu 0: rcu_preempt_deferred_qs_irqrestore+0x43d/0x580 kernel/rcu/tree_plugin.h:507 rcu_read_unlock_special+0xec/0x370 kernel/rcu/tree_plugin.h:659 __rcu_read_unlock+0xcf/0xe0 kernel/rcu/tree_plugin.h:394 rcu_read_unlock include/linux/rcupdate.h:645 [inline] __ip_queue_xmit+0x3b0/0xa40 net/ipv4/ip_output.c:533 ip_queue_xmit+0x45/0x60 include/net/ip.h:236 __tcp_transmit_skb+0xdeb/0x1cd0 net/ipv4/tcp_output.c:1158 __tcp_send_ack+0x246/0x300 net/ipv4/tcp_output.c:3685 tcp_send_ack+0x34/0x40 net/ipv4/tcp_output.c:3691 tcp_cleanup_rbuf+0x130/0x360 net/ipv4/tcp.c:1575 tcp_recvmsg+0x633/0x1a30 net/ipv4/tcp.c:2179 inet_recvmsg+0xbb/0x250 net/ipv4/af_inet.c:838 sock_recvmsg_nosec net/socket.c:871 [inline] sock_recvmsg net/socket.c:889 [inline] sock_recvmsg+0x92/0xb0 net/socket.c:885 sock_read_iter+0x15f/0x1e0 net/socket.c:967 call_read_iter include/linux/fs.h:1864 [inline] new_sync_read+0x389/0x4f0 fs/read_write.c:414 read to 0xffffffff85a7f190 of 8 bytes by task 10 on cpu 1: rcu_gp_fqs_check_wake kernel/rcu/tree.c:1556 [inline] rcu_gp_fqs_check_wake+0x93/0xd0 kernel/rcu/tree.c:1546 rcu_gp_fqs_loop+0x36c/0x580 kernel/rcu/tree.c:1611 rcu_gp_kthread+0x143/0x220 kernel/rcu/tree.c:1768 kthread+0x1d4/0x200 drivers/block/aoe/aoecmd.c:1253 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:352 Reported by Kernel Concurrency Sanitizer on: CPU: 1 PID: 10 Comm: rcu_preempt Not tainted 5.3.0+ #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> [ paulmck: Added another READ_ONCE() for RCU CPU stall warnings. ] Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-02-11 04:33:55 -08:00
Paul E. McKenney	506b28fb99	rcuperf: Fix cleanup path for invalid perf_type strings [ Upstream commit ad092c027713a68a34168942a5ef422e42e039f4 ] If the specified rcuperf.perf_type is not in the rcu_perf_init() function's perf_ops[] array, rcuperf prints some console messages and then invokes rcu_perf_cleanup() to set state so that a future torture test can run. However, rcu_perf_cleanup() also attempts to end the test that didn't actually start, and in doing so relies on the value of cur_ops, a value that is not particularly relevant in this case. This can result in confusing output or even follow-on failures due to attempts to use facilities that have not been properly initialized. This commit therefore sets the value of cur_ops to NULL in this case and inserts a check near the beginning of rcu_perf_cleanup(), thus avoiding relying on an irrelevant cur_ops value. Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-05-31 06:46:30 -07:00
Paul E. McKenney	aa7919e37f	rcutorture: Fix cleanup path for invalid torture_type strings [ Upstream commit b813afae7ab6a5e91b4e16cc567331d9c2ae1f04 ] If the specified rcutorture.torture_type is not in the rcu_torture_init() function's torture_ops[] array, rcutorture prints some console messages and then invokes rcu_torture_cleanup() to set state so that a future torture test can run. However, rcu_torture_cleanup() also attempts to end the test that didn't actually start, and in doing so relies on the value of cur_ops, a value that is not particularly relevant in this case. This can result in confusing output or even follow-on failures due to attempts to use facilities that have not been properly initialized. This commit therefore sets the value of cur_ops to NULL in this case and inserts a check near the beginning of rcu_torture_cleanup(), thus avoiding relying on an irrelevant cur_ops value. Reported-by: kernel test robot <rong.a.chen@intel.com> Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-05-31 06:46:30 -07:00
Masami Hiramatsu	d53b295f78	kprobes: Prohibit probing on RCU debug routine [ Upstream commit a39f15b9644fac3f950f522c39e667c3af25c588 ] Since kprobe itself depends on RCU, probing on RCU debug routine can cause recursive breakpoint bugs. Prohibit probing on RCU debug routines. int3 ->do_int3() ->ist_enter() ->RCU_LOCKDEP_WARN() ->debug_lockdep_rcu_enabled() -> int3 Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andrea Righi <righi.andrea@gmail.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/154998807741.31052.11229157537816341591.stgit@devbox Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-04-05 22:33:08 +02:00
Zhang, Jun	e97a32a5a3	rcu: Do RCU GP kthread self-wakeup from softirq and interrupt commit 1d1f898df6586c5ea9aeaf349f13089c6fa37903 upstream. The rcu_gp_kthread_wake() function is invoked when it might be necessary to wake the RCU grace-period kthread. Because self-wakeups are normally a useless waste of CPU cycles, if rcu_gp_kthread_wake() is invoked from this kthread, it naturally refuses to do the wakeup. Unfortunately, natural though it might be, this heuristic fails when rcu_gp_kthread_wake() is invoked from an interrupt or softirq handler that interrupted the grace-period kthread just after the final check of the wait-event condition but just before the schedule() call. In this case, a wakeup is required, even though the call to rcu_gp_kthread_wake() is within the RCU grace-period kthread's context. Failing to provide this wakeup can result in grace periods failing to start, which in turn results in out-of-memory conditions. This race window is quite narrow, but it actually did happen during real testing. It would of course need to be fixed even if it was strictly theoretical in nature. This patch does not Cc stable because it does not apply cleanly to earlier kernel versions. Fixes: `48a7639ce8` ("rcu: Make callers awaken grace-period kthread") Reported-by: "He, Bo" <bo.he@intel.com> Co-developed-by: "Zhang, Jun" <jun.zhang@intel.com> Co-developed-by: "He, Bo" <bo.he@intel.com> Co-developed-by: "xiao, jin" <jin.xiao@intel.com> Co-developed-by: Bai, Jie A <jie.a.bai@intel.com> Signed-off: "Zhang, Jun" <jun.zhang@intel.com> Signed-off: "He, Bo" <bo.he@intel.com> Signed-off: "xiao, jin" <jin.xiao@intel.com> Signed-off: Bai, Jie A <jie.a.bai@intel.com> Signed-off-by: "Zhang, Jun" <jun.zhang@intel.com> [ paulmck: Switch from !in_softirq() to "!in_interrupt() && !in_serving_softirq() to avoid redundant wakeups and to also handle the interrupt-handler scenario as well as the softirq-handler scenario that actually occurred in testing. ] Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com> Link: https://lkml.kernel.org/r/CD6925E8781EFD4D8E11882D20FC406D52A11F61@SHSMSX104.ccr.corp.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-03-23 20:10:12 +01:00
Dennis Krein	b57b3b0082	srcu: Lock srcu_data structure in srcu_gp_start() commit eb4c2382272ae7ae5d81fdfa5b7a6c86146eaaa4 upstream. The srcu_gp_start() function is called with the srcu_struct structure's ->lock held, but not with the srcu_data structure's ->lock. This is problematic because this function accesses and updates the srcu_data structure's ->srcu_cblist, which is protected by that lock. Failing to hold this lock can result in corruption of the SRCU callback lists, which in turn can result in arbitrarily bad results. This commit therefore makes srcu_gp_start() acquire the srcu_data structure's ->lock across the calls to rcu_segcblist_advance() and rcu_segcblist_accelerate(), thus preventing this corruption. Reported-by: Bart Van Assche <bvanassche@acm.org> Reported-by: Christoph Hellwig <hch@infradead.org> Reported-by: Sebastian Kuzminsky <seb.kuzminsky@gmail.com> Signed-off-by: Dennis Krein <Dennis.Krein@netapp.com> Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com> Tested-by: Dennis Krein <Dennis.Krein@netapp.com> Cc: <stable@vger.kernel.org> # 4.16.x Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-01-13 09:51:06 +01:00
Paul E. McKenney	016a8fc59d	rcu: Make need_resched() respond to urgent RCU-QS needs commit 92aa39e9dc77481b90cbef25e547d66cab901496 upstream. The per-CPU rcu_dynticks.rcu_urgent_qs variable communicates an urgent need for an RCU quiescent state from the force-quiescent-state processing within the grace-period kthread to context switches and to cond_resched(). Unfortunately, such urgent needs are not communicated to need_resched(), which is sometimes used to decide when to invoke cond_resched(), for but one example, within the KVM vcpu_run() function. As of v4.15, this can result in synchronize_sched() being delayed by up to ten seconds, which can be problematic, to say nothing of annoying. This commit therefore checks rcu_dynticks.rcu_urgent_qs from within rcu_check_callbacks(), which is invoked from the scheduling-clock interrupt handler. If the current task is not an idle task and is not executing in usermode, a context switch is forced, and either way, the rcu_dynticks.rcu_urgent_qs variable is set to false. If the current task is an idle task, then RCU's dyntick-idle code will detect the quiescent state, so no further action is required. Similarly, if the task is executing in usermode, other code in rcu_check_callbacks() and its called functions will report the corresponding quiescent state. Reported-by: Marius Hillenbrand <mhillenb@amazon.de> Reported-by: David Woodhouse <dwmw2@infradead.org> Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> [ paulmck: Backported to make patch apply cleanly on older versions. ] Tested-by: Marius Hillenbrand <mhillenb@amazon.de> Cc: <stable@vger.kernel.org> # 4.12.x - 4.19.x Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-12-01 09:37:34 +01:00
Linus Torvalds	f7951c33f0	Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Thomas Gleixner: - Cleanup and improvement of NUMA balancing - Refactoring and improvements to the PELT (Per Entity Load Tracking) code - Watchdog simplification and related cleanups - The usual pile of small incremental fixes and improvements * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (41 commits) watchdog: Reduce message verbosity stop_machine: Reflow cpu_stop_queue_two_works() sched/numa: Move task_numa_placement() closer to numa_migrate_preferred() sched/numa: Use group_weights to identify if migration degrades locality sched/numa: Update the scan period without holding the numa_group lock sched/numa: Remove numa_has_capacity() sched/numa: Modify migrate_swap() to accept additional parameters sched/numa: Remove unused task_capacity from 'struct numa_stats' sched/numa: Skip nodes that are at 'hoplimit' sched/debug: Reverse the order of printing faults sched/numa: Use task faults only if numa_group is not yet set up sched/numa: Set preferred_node based on best_cpu sched/numa: Simplify load_too_imbalanced() sched/numa: Evaluate move once per node sched/numa: Remove redundant field sched/debug: Show the sum wait time of a task group sched/fair: Remove #ifdefs from scale_rt_capacity() sched/core: Remove get_cpu() from sched_fork() sched/cpufreq: Clarify sugov_get_util() sched/sysctl: Remove unused sched_time_avg_ms sysctl ...	2018-08-13 11:25:07 -07:00
Paul E. McKenney	18952651da	Merge branches 'fixes1.2018.07.12b' and 'torture1.2018.07.12b' into HEAD fixes1.2018.07.12b: Post-gp_seq miscellaneous fixes torture1.2018.07.12b: Post-gp_seq torture-test updates	2018-07-12 15:42:41 -07:00
Joel Fernandes (Google)	bf5b64355a	rcutorture: Fix rcu_barrier successes counter The rcutorture test module currently increments both successes and error for the barrier test upon error, which results in misleading statistics being printed. This commit therefore changes the code to increment the success counter only when the test actually passes. This change was tested by by returning from the barrier callback without incrementing the callback counter, thus introducing what appeared to rcutorture to be rcu_barrier() failures. Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2018-07-12 15:42:08 -07:00
Joel Fernandes (Google)	4babd855fd	rcutorture: Add support to detect if boost kthread prio is too low When rcutorture is built in to the kernel, an earlier patch detects that and raises the priority of RCU's kthreads to allow rcutorture's RCU priority boosting tests to succeed. However, if rcutorture is built as a module, those priorities must be raised manually via the rcutree.kthread_prio kernel boot parameter. If this manual step is not taken, rcutorture's RCU priority boosting tests will fail due to kthread starvation. One approach would be to raise the default priority, but that risks breaking existing users. Another approach would be to allow runtime adjustment of RCU's kthread priorities, but that introduces numerous "interesting" race conditions. This patch therefore instead detects too-low priorities, and prints a message and disables the RCU priority boosting tests in that case. Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2018-07-12 15:42:08 -07:00
Arnd Bergmann	622be33fcb	rcutorture: Use monotonic timestamp for stall detection The get_seconds() call is deprecated because it overflows on 32-bit architectures. The algorithm in rcu_torture_stall() can deal with the overflow, but another problem here is that using a CLOCK_REALTIME stamp can lead to a false-positive stall warning when a settimeofday() happens concurrently. Using ktime_get_seconds() instead avoids those issues and will never overflow. The added cast to 'unsigned long' however is necessary to make ULONG_CMP_LT() work correctly. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2018-07-12 15:42:07 -07:00
Joel Fernandes (Google)	3b745c8969	rcutorture: Make boost test more robust Currently, with RCU_BOOST disabled, I get no failures when forcing rcutorture to test RCU boost priority inversion. The reason seems to be that we don't check for failures if the callback never ran at all for the duration of the boost-test loop. Further, the 'rtb' and 'rtbf' counters seem to be used inconsistently. 'rtb' is incremented at the start of each test and 'rtbf' is incremented per-cpu on each failure of call_rcu. So its possible 'rtbf' > 'rtb'. To test the boost with rcutorture, I did following on a 4-CPU x86 machine: modprobe rcutorture test_boost=2 sleep 20 rmmod rcutorture With patch: rtbf: 8 rtb: 12 Without patch: rtbf: 0 rtb: 2 In summary this patch: - Increments failed and total test counters once per boost-test. - Checks for failure cases correctly. Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2018-07-12 15:42:06 -07:00
Joel Fernandes (Google)	450efca718	rcutorture: Disable RT throttling for boost tests Currently rcutorture is not able to torture RCU boosting properly. This is because the rcutorture's boost threads which are doing the torturing may be throttled due to RT throttling. This patch makes rcutorture use the right torture technique (unthrottled rcutorture boost tasks) for torturing RCU so that the test fails correctly when no boost is available. Currently this requires accessing sysctl_sched_rt_runtime directly, but that should be Ok since rcutorture is test code. Such direct access is also only possible if rcutorture is used as a built-in so make it conditional on that. Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2018-07-12 15:42:06 -07:00
Paul E. McKenney	bf1bef50be	rcutorture: Emphasize testing of single reader protection type For RCU implementations supporting multiple types of reader protection, rcutorture currently randomly selects the combinations of types of protection for each phase of each reader. The problem with this, for example, given the four kinds of protection for RCU-sched (local_irq_disable(), local_bh_disable(), preempt_disable(), and rcu_read_lock_sched()), the reader will be protected by a single mechanism only 25% of the time. We really heavier testing of single read-side mechanisms. This commit therefore uses only a single mechanism about 60% of the time, half of the time explicitly and one-eighth of the time by chance. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2018-07-12 15:42:05 -07:00
Paul E. McKenney	2397d072f7	rcutorture: Handle extended read-side critical sections This commit enables rcutorture to test whether RCU properly aggregates different types of read-side critical sections into a larger section covering the set. It does this by extending an initial read-side critical section randomly for a random number of extensions. There is a new rcu_torture_ops field ->extendable that specifies what extensions are permitted for a given flavor of RCU (for example, SRCU does not permit any extensions, while RCU-sched permits all types). Note that if a given operation (for example, local_bh_disable()) extends an RCU read-side critical section, then rcutorture feels free to also start and end the critical section with that operation's type of disabling. Disabling operations include local_bh_disable(), local_irq_disable(), and preempt_disable(). This commit also adds a new "busted_srcud" torture type, which verifies rcutorture's ability to detect extensions of RCU read-side critical sections that are not handled. Gotta test the test, after all! Note that it is not legal to invoke local_bh_disable() with interrupts disabled, and this transition is avoided by overriding the random-number generator when it wants to call local_bh_disable() while interrupts are disabled. The code instead leaves both interrupts and bh/softirq disabled in this case. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2018-07-12 15:42:05 -07:00
Paul E. McKenney	241b42522a	rcutorture: Make rcu_torture_timer() use rcu_torture_one_read() This commit saves a few lines of code by making rcu_torture_timer() invoke rcu_torture_one_read(), thus completing the consolidation of code between rcu_torture_timer() and rcu_torture_reader(). Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2018-07-12 15:42:04 -07:00
Paul E. McKenney	3025520ec4	rcutorture: Use per-CPU random state for rcu_torture_timer() Currently, the rcu_torture_timer() function uses a single global torture_random_state structure protected by a single global lock. This conflicts to some extent with performance and scalability, but even more with the goal of consolidating read-side testing with rcu_torture_reader(). This commit therefore creates a per-CPU torture_random_state structure for use by rcu_torture_timer() and eliminates the lock. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> [ paulmck: Make rcu_torture_timer_rand static, per 0day Test Robot report. ]	2018-07-12 15:42:04 -07:00
Paul E. McKenney	8da9a59523	rcutorture: Use atomic increment for n_rcu_torture_timers Currently, rcu_torture_timer() relies on a lock to guard updates to n_rcu_torture_timers. Unfortunately, consolidating code with rcu_torture_reader() will dispense with this lock. This commit therefore makes n_rcu_torture_timers be an atomic_long_t and uses atomic_long_inc() to carry out the update. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2018-07-12 15:42:03 -07:00
Paul E. McKenney	6b06aa723e	rcutorture: Extract common code from rcu_torture_reader() This commit extracts the code executed on each pass through the loop in rcu_torture_reader() into a new rcu_torture_one_read() function. This new function will also be used by rcu_torture_timer(). Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2018-07-12 15:42:02 -07:00
Paul E. McKenney	2d3625841d	rcuperf: Remove unused torturing_tasks() function The torturing_tasks() function in rcuperf.c is not used, so this commit removes it. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2018-07-12 15:42:02 -07:00
Paul E. McKenney	6bea2cc5a9	rcu: Remove rcutorture test version and sequence number Back when RCU had a debugfs interface, there was a test version and sequence number that allowed associating debugfs data with a particular test run, where the test run started with modprobe and ended with rmmod, which was how tests were run back on the old ABAT system within IBM. But rcutorture testing no longer runs on ABAT, and there is no longer an RCU debugfs interface, so there is no longer any need for test versions and sequence numbers. This commit therefore removes the rcutorture_record_test_transition() and rcutorture_record_progress() functions, and along with them the rcutorture_testseq and rcutorture_vernum variables that they update. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2018-07-12 15:42:01 -07:00
Paul E. McKenney	028be12b29	rcutorture: Change units of onoff_interval to jiffies Some RCU bugs have been sensitive to the frequency of CPU-hotplug operations, which have been gradually increased over time. But this frequency is now at the one-second lower limit that can be specified using the rcutorture.onoff_interval kernel parameter. This commit therefore changes the units of rcutorture.onoff_interval from seconds to jiffies, and also sets the value specified for this kernel parameter in the TREE03 rcutorture scenario to 200, which is 200 milliseconds for HZ=1000. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2018-07-12 15:42:01 -07:00
Joel Fernandes (Google)	c7cd161ecb	rcu: Assign higher prio to RCU threads if rcutorture is built-in The rcutorture RCU priority boosting tests fail even with CONFIG_RCU_BOOST set because rcutorture's threads run at the same priority as the default RCU kthreads (RT class with priority of 1). This patch checks if RCU torture is built into the kernel and if so, assigns RT priority 1 to the RCU threads, allowing the rcutorture boost tests to pass. Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2018-07-12 15:39:26 -07:00
Paul E. McKenney	52e17ba1d0	srcu: Add grace-period number to rcutorture statistics printout This commit adds the SRCU grace-period number to the rcutorture statistics printout, which allows it to be compared to the rcutorture "Writer stall state" message. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2018-07-12 15:39:25 -07:00
Paul E. McKenney	89b4cd4b9e	rcu: Print stall-warning NMI dyntick state in hexadecimal The ->dynticks_nmi_nesting field records the nesting depth of both interrupt and NMI handlers. Because the kernel can enter interrupts and never leave them (and vice versa) and because NMIs can interrupt manipulation of the ->dynticks_nmi_nesting field, the values in this field must be both chosen and maniupated very carefully. As a result, although the value is zero when the corresponding CPU is executing neither an interrupt nor an NMI handler, it is 4,611,686,018,427,387,906 on 64-bit systems when there is a single level of interrupt/NMI handling in progress. This number is difficult to remember and interpret, so this commit switches the output to hexadecimal, resulting in the much nicer 0x4000000000000002. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2018-07-12 15:39:24 -07:00
Paul E. McKenney	2ee5aca546	rcu: Make rcu_seq_diff() more exact The current implementatation of rcu_seq_diff() follows tradition in providing a rough-and-ready approximation of the number of elapsed grace periods between the two rcu_seq values. However, this difference is used to flag RCU-failure "near misses", which can be a valuable debugging aid, so more exactitude would be an improvement. This commit therefore improves the accuracy of rcu_seq_diff(). Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2018-07-12 15:39:23 -07:00
Byungchul Park	67abb96cbf	rcu: Check the range of jiffies_till_{first,next}_fqs when setting them Currently, the range of jiffies_till_{first,next}_fqs are checked and adjusted on and on in the loop of rcu_gp_kthread on runtime. However, it's enough to check them only when setting them, not every time in the loop. So make them handled on a setting time via sysfs. Signed-off-by: Byungchul Park <byungchul.park@lge.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2018-07-12 15:39:22 -07:00
Paul E. McKenney	47199a0812	rcu: Add diagnostics for rcutorture writer stall warning This commit adds any in-the-future ->gp_seq_needed fields to the diagnostics for an rcutorture writer stall warning message. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2018-07-12 15:39:22 -07:00
Steven Rostedt (VMware)	cd23ac8ddb	rcu: Add comment to the last sleep in the rcu tasks loop At the end of rcu_tasks_kthread() there's a lonely schedule_timeout_uninterruptible() call with no apparent rationale for its existence. But there is. It is to keep the thread from going into a tight loop if there's some anomaly. That really needs a comment. Link: http://lkml.kernel.org/r/20180524223839.GU3803@linux.vnet.ibm.com Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2018-07-12 15:39:21 -07:00
Steven Rostedt (VMware)	c03be752d3	rcu: Speed up calling of RCU tasks callbacks Joel Fernandes found that the synchronize_rcu_tasks() was taking a significant amount of time. He demonstrated it with the following test: # cd /sys/kernel/tracing # while [ 1 ]; do x=1; done & # echo '__schedule_bug:traceon' > set_ftrace_filter # time echo '!__schedule_bug:traceon' > set_ftrace_filter; real 0m1.064s user 0m0.000s sys 0m0.004s Where it takes a little over a second to perform the synchronize, because there's a loop that waits 1 second at a time for tasks to get through their quiescent points when there's a task that must be waited for. After discussion we came up with a simple way to wait for holdouts but increase the time for each iteration of the loop but no more than a full second. With the new patch we have: # time echo '!__schedule_bug:traceon' > set_ftrace_filter; real 0m0.131s user 0m0.000s sys 0m0.004s Which drops it down to 13% of what the original wait time was. Link: http://lkml.kernel.org/r/20180523063815.198302-2-joel@joelfernandes.org Reported-by: Joel Fernandes (Google) <joel@joelfernandes.org> Suggested-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2018-07-12 15:39:21 -07:00
Joel Fernandes (Google)	0d805a70a6	rcu: Add comment documenting how rcu_seq_snap works rcu_seq_snap may be tricky to decipher. Lets document how it works with an example to make it easier. Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> [ paulmck: Shrink comment as suggested by Peter Zijlstra. ]	2018-07-12 15:39:20 -07:00
Paul E. McKenney	b06ae25a1e	rcu: Use RCU CPU stall timeout for rcu_check_gp_start_stall() Currently, rcu_check_gp_start_stall() waits for one second after the first request before complaining that a grace period has not yet started. This was desirable while testing the conversion from ->future_gp_needed[] to ->gp_seq_needed, but it is a bit on the hair-trigger side for production use under heavy load. This commit therefore makes this wait time be exactly that of the RCU CPU stall warning, allowing easy adjustment of both timeouts to suit the distribution or installation at hand. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2018-07-12 15:39:20 -07:00
Paul E. McKenney	51fbb910f5	rcu: Remove __maybe_unused from rcu_cpu_has_callbacks() The rcu_cpu_has_callbacks() function is now used in all configurations, so this commit removes the __maybe_unused. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2018-07-12 15:39:19 -07:00
Paul E. McKenney	9622179519	rcu: Remove "inline" from rcu_perf_print_module_parms() This function is in rcuperf.c, which is not an include file, so there is no problem dropping the "inline", especially given that this function is invoked only twice per rcuperf run. This commit therefore delegates the inlining decision to the compiler by dropping the "inline". Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2018-07-12 15:39:19 -07:00
Paul E. McKenney	eac45e586c	rcu: Remove "inline" from rcu_torture_print_module_parms() This function is in rcutorture.c, which is not an include file, so there is no problem dropping the "inline", especially given that this function is invoked only twice per rcutorture run. This commit therefore delegates the inlining decision to the compiler by dropping the "inline". Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2018-07-12 15:39:18 -07:00
Paul E. McKenney	95394e69c4	rcu: Remove "inline" from panic_on_rcu_stall() and rcu_blocking_is_gp() These functions are in kernel/rcu/tree.c, which is not an include file, so there is no problem dropping the "inline", especially given that these functions are nowhere near a fastpath. This commit therefore delegates the inlining decision to the compiler by dropping the "inline". Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2018-07-12 15:39:18 -07:00
Paul E. McKenney	ab6b82147f	rcu: Remove unused local variable "cpu" One danger of using __maybe_unused is that the compiler doesn't yell at you when you remove the last reference, witness rcu_bind_gp_kthread() and its local variable "cpu". This commit removes this local variable. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2018-07-12 15:39:17 -07:00
Paul E. McKenney	164ba3fc48	rcu: Remove unused rcu_kick_nohz_cpu() function The rcu_kick_nohz_cpu() function is no longer used, and the functionality it used to provide is now provided by a call to resched_cpu() in the force-quiescent-state function rcu_implicit_dynticks_qs(). This commit therefore removes rcu_kick_nohz_cpu(). Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2018-07-12 15:39:17 -07:00
Paul E. McKenney	c7037ff524	rcu: Clarify and correct the rcu_preempt_qs() header comment The rcu_preempt_qs() function only applies to the CPU, not the task. A task really is allowed to invoke this function while in an RCU-preempt read-side critical section, but only if it has first added itself to some leaf rcu_node structure's ->blkd_tasks list. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2018-07-12 15:39:16 -07:00
Paul E. McKenney	3b57a3994f	rcu: Inline rcu_dynticks_momentary_idle() into its sole caller The rcu_dynticks_momentary_idle() function is invoked only from rcu_momentary_dyntick_idle(), and neither function is particularly large. This commit therefore saves a few lines by inlining rcu_dynticks_momentary_idle() into rcu_momentary_dyntick_idle(). Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2018-07-12 15:39:16 -07:00
Paul E. McKenney	15651201fa	rcu: Mark task as .need_qs less aggressively If any scheduling-clock interrupt interrupts an RCU-preempt read-side critical section, the interrupted task's ->rcu_read_unlock_special.b.need_qs field is set. This causes the outermost rcu_read_unlock() to incur the extra overhead of calling into rcu_read_unlock_special(). This commit reduces that overhead by setting ->rcu_read_unlock_special.b.need_qs only if the grace period has been in effect for more than one second. Why one second? Because this is comfortably smaller than the minimum RCU CPU stall-warning timeout of three seconds, but long enough that the .need_qs marking should happen quite rarely. And if your RCU read-side critical section has run on-CPU for a full second, it is not unreasonable to invest some CPU time in ending the grace period quickly. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2018-07-12 15:39:15 -07:00
Paul E. McKenney	6f56f714db	rcu: Improve RCU-tasks naming and comments The naming and comments associated with some RCU-tasks code make the faulty assumption that context switches due to cond_resched() are voluntary. As several people pointed out, this is not the case. This commit therefore updates function names and comments to better reflect current reality. Reported-by: Byungchul Park <byungchul.park@lge.com> Reported-by: Joel Fernandes <joel@joelfernandes.org> Reported-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2018-07-12 15:39:15 -07:00
Joe Perches	a7538352da	rcu: Use pr_fmt to prefix "rcu: " to logging output This commit also adjusts some whitespace while in the area. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> [ paulmck: Revert string-breaking %s as requested by Andy Shevchenko. ]	2018-07-12 15:39:13 -07:00
Byungchul Park	07f27570dc	rcu: Improve rcu_note_voluntary_context_switch() reporting We expect a quiescent state of TASKS_RCU when cond_resched_tasks_rcu_qs() is called, no matter whether it actually be scheduled or not. However, it currently doesn't report the quiescent state when the task enters into __schedule() as it's called with preempt = true. So make it report the quiescent state unconditionally when cond_resched_tasks_rcu_qs() is called. And in TINY_RCU, even though the quiescent state of rcu_bh also should be reported when the tick interrupt comes from user, it doesn't. So make it reported. Lastly in TREE_RCU, rcu_note_voluntary_context_switch() should be reported when the tick interrupt comes from not only user but also idle, as an extended quiescent state. Signed-off-by: Byungchul Park <byungchul.park@lge.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> [ paulmck: Simplify rcutiny portion given no RCU-tasks for !PREEMPT. ]	2018-07-12 15:39:12 -07:00
Paul E. McKenney	3949fa9bac	rcu: Make rcu_read_unlock_special() static Because rcu_read_unlock_special() is no longer used outside of kernel/rcu/tree_plugin.h, this commit makes it static. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2018-07-12 15:39:11 -07:00
Paul E. McKenney	f2e2df5978	rcu: Add diagnostics for offline CPUs failing to report QS CPUs are expected to report quiescent states when coming online and when going offline, and grace-period initialization is supposed to handle any race conditions where a CPU's ->qsmask bit is set just after it goes offline. This commit adds diagnostics for the case where an offline CPU nevertheless has a grace period waiting on it. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2018-07-12 15:39:10 -07:00
Paul E. McKenney	fea3f222d3	rcu: Record ->gp_state for both phases of grace-period initialization Grace-period initialization first processes any recent CPU-hotplug operations, and then initializes state for the new grace period. These two phases of initialization are currently not distinguished in debug prints, but the distinction is valuable in a number of debug situations. This commit therefore introduces two new values for ->gp_state, RCU_GP_ONOFF and RCU_GP_INIT, in order to make this distinction. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2018-07-12 15:39:09 -07:00
Paul E. McKenney	5773894231	rcu: Add CPU online/offline state to dump_blkd_tasks() Interactions between CPU-hotplug operations and grace-period initialization can result in dump_blkd_tasks(). One of the first debugging actions in this case is to search back in dmesg to work out which of the affected rcu_node structure's CPUs are online and to determine the last CPU-hotplug operation affecting any of those CPUs. This can be laborious and error-prone, especially when console output is lost. This commit therefore causes dump_blkd_tasks() to dump the state of the affected rcu_node structure's CPUs and the last grace period during which the last offline and online operation affected each of these CPUs. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2018-07-12 15:39:09 -07:00
Paul E. McKenney	ff3cee3908	rcu: Add up-tree information to dump_blkd_tasks() diagnostics This commit updates dump_blkd_tasks() to print out quiescent-state bitmasks for the rcu_node structures further up the tree. This information helps debugging of interactions between CPU-hotplug operations and RCU grace-period initialization. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2018-07-12 15:39:08 -07:00

1 2 3 4 5 ...

943 commits