sched: prevent divide by zero error in cpu_avg_load_per_task

Impact: fix divide by zero crash in scheduler rebalance irq

While testing the branch profiler, I hit this crash:

 divide error: 0000 [#1] PREEMPT SMP
 [...]
 RIP: 0010:[<ffffffff8024a008>]  [<ffffffff8024a008>] cpu_avg_load_per_task+0x50/0x7f
 [...]
 Call Trace:
  <IRQ> <0> [<ffffffff8024fd43>] find_busiest_group+0x3e5/0xcaa
  [<ffffffff8025da75>] rebalance_domains+0x2da/0xa21
  [<ffffffff80478769>] ? find_next_bit+0x1b2/0x1e6
  [<ffffffff8025e2ce>] run_rebalance_domains+0x112/0x19f
  [<ffffffff8026d7c2>] __do_softirq+0xa8/0x232
  [<ffffffff8020ea7c>] call_softirq+0x1c/0x3e
  [<ffffffff8021047a>] do_softirq+0x94/0x1cd
  [<ffffffff8026d5eb>] irq_exit+0x6b/0x10e
  [<ffffffff8022e6ec>] smp_apic_timer_interrupt+0xd3/0xff
  [<ffffffff8020e4b3>] apic_timer_interrupt+0x13/0x20

The code for cpu_avg_load_per_task has:

	if (rq->nr_running)
		rq->avg_load_per_task = rq->load.weight / rq->nr_running;

The runqueue lock is not held here, and there is nothing that prevents
the rq->nr_running from going to zero after it passes the if condition.

The branch profiler simply made the race window bigger.

This patch saves off the rq->nr_running to a local variable and uses that
for both the condition and the division.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
parent ee2f6cc7f9
commit 4cd4262034
1 changed file with 3 additions and 2 deletions
@@ -1453,9 +1453,10 @@ static int task_hot(struct task_struct *p, u64 now, struct sched_domain *sd);
 static unsigned long cpu_avg_load_per_task(int cpu)
 {
 	struct rq *rq = cpu_rq(cpu);
+	unsigned long nr_running = rq->nr_running;
 
-	if (rq->nr_running)
-		rq->avg_load_per_task = rq->load.weight / rq->nr_running;
+	if (nr_running)
+		rq->avg_load_per_task = rq->load.weight / nr_running;
 	else
 		rq->avg_load_per_task = 0;
 
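The snapshot pattern this patch applies can be tried outside the kernel with a minimal user-space sketch. The names demo_rq and demo_avg_load_per_task are hypothetical stand-ins, not kernel symbols. The volatile read is an assumption of the sketch, added so the compiler performs exactly one load; the patch itself uses a plain local variable.

```c
/* Hypothetical, simplified stand-in for the scheduler's struct rq. */
struct demo_rq {
	unsigned long load_weight;       /* stands in for rq->load.weight */
	unsigned long nr_running;        /* may be changed by another CPU */
	unsigned long avg_load_per_task;
};

/*
 * Read nr_running exactly once, then use the local copy for both the
 * zero check and the division, so the divisor cannot become zero
 * between the two uses. The volatile cast forces a single load and is
 * an addition of this sketch, not part of the patch.
 */
static unsigned long demo_avg_load_per_task(struct demo_rq *rq)
{
	unsigned long nr_running = *(volatile unsigned long *)&rq->nr_running;

	if (nr_running)
		rq->avg_load_per_task = rq->load_weight / nr_running;
	else
		rq->avg_load_per_task = 0;

	return rq->avg_load_per_task;
}
```

The result may still be computed from a slightly stale count if another CPU changes nr_running right after the read. The local copy only guarantees that the value checked against zero is the same value used as the divisor.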