update_rlimit_cpu() tries to optimize out set_process_cpu_timer() in case
when we already have CPUCLOCK_PROF timer which should expire first. But it
uses cputime_lt() instead of cputime_gt().
Test case:
int main(void)
{
struct itimerval it = {
.it_value = { .tv_sec = 1000 },
};
assert(!setitimer(ITIMER_PROF, &it, NULL));
struct rlimit rl = {
.rlim_cur = 1,
.rlim_max = 1,
};
assert(!setrlimit(RLIMIT_CPU, &rl));
for (;;)
;
return 0;
}
Without this patch, the task is not killed as RLIMIT_CPU demands.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Peter Lojkin <ia6432@inbox.ru>
Cc: Roland McGrath <roland@redhat.com>
Cc: stable@kernel.org
LKML-Reference: <20090327000610.GA10108@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
See http://bugzilla.kernel.org/show_bug.cgi?id=12911
copy_signal() copies signal->rlim, but RLIMIT_CPU is "lost". Because
posix_cpu_timers_init_group() sets cputime_expires.prof_exp = 0 and thus
fastpath_timer_check() returns false unless we have other expired cpu timers.
Change copy_signal() to set cputime_expires.prof_exp if we have RLIMIT_CPU.
Also, set cputimer.running = 1 in that case. This is not strictly necessary,
but imho makes sense.
Reported-by: Peter Lojkin <ia6432@inbox.ru>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Peter Lojkin <ia6432@inbox.ru>
Cc: Roland McGrath <roland@redhat.com>
Cc: stable@kernel.org
LKML-Reference: <20090327000607.GA10104@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Add missing kernel-doc parameter notation and change function
name to its new name:
Warning(kernel/timer.c:543): No description found for parameter 'name'
Warning(kernel/timer.c:543): No description found for parameter 'key'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: akpm <akpm@linux-foundation.org>
Cc: Johannes Berg <johannes@sipsolutions.net>
LKML-Reference: <20090401174723.f0bea0eb.randy.dunlap@oracle.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
See http://bugzilla.kernel.org/show_bug.cgi?id=12911
copy_signal() copies signal->rlim, but RLIMIT_CPU is "lost". Because
posix_cpu_timers_init_group() sets cputime_expires.prof_exp = 0 and thus
fastpath_timer_check() returns false unless we have other cpu timers.
This is the minimal fix for 2.6.29 (tested) and 2.6.28. The patch is not
optimal, we need further cleanups here. With this patch update_rlimit_cpu()
is not really needed, but I don't think it should be removed.
The proper fix (I think) is:
- set_process_cpu_timer() should just start the cputimer->running
logic (it does), no need to change cputime_expires.xxx_exp
- posix_cpu_timers_init_group() should set ->running when needed
- fastpath_timer_check() can check ->running instead of
task_cputime_zero(signal->cputime_expires)
Reported-by: Peter Lojkin <ia6432@inbox.ru>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Roland McGrath <roland@redhat.com>
Cc: <stable@kernel.org> [for 2.6.29.x]
LKML-Reference: <20090323193411.GA17514@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch fixes bug #12208:
Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=12208
Subject : uml is very slow on 2.6.28 host
This turned out to be not a scheduler regression, but an already
existing problem in ptrace being triggered by subtle scheduler
changes.
The problem is this:
- task A is ptracing task B
- task B stops on a trace event
- task A is woken up and preempts task B
- task A calls ptrace on task B, which does ptrace_check_attach()
- this calls wait_task_inactive(), which sees that task B is still on the runq
- task A goes to sleep for a jiffy
- ...
Since UML does lots of the above sequences, those jiffies quickly add
up to make it slow as hell.
This patch solves this by not rescheduling in read_unlock() after
ptrace_stop() has woken up the tracer.
Thanks to Oleg Nesterov and Ingo Molnar for the feedback.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
CC: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Impact: fix ref-after-free crash on failed module load
Fix refptr bug: Change refptr allocation and release order not to access a module
data structure pointed by 'mod' after freeing mod->module_core.
This bug will cause kernel panic(e.g. failed to find undefined symbols).
This bug was reported on systemtap bugzilla.
http://sources.redhat.com/bugzilla/show_bug.cgi?id=9927
Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We were returning early in the sysfs directory cleanup function if the
user belonged to a non init usernamespace. Due to this a lot of the
cleanup was not done and we were left with a leak. Fix the leak.
Reported-by: Serge Hallyn <serue@linux.vnet.ibm.com>
Signed-off-by: Dhaval Giani <dhaval@linux.vnet.ibm.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Tested-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
CLONE_PARENT can fool the ->self_exec_id/parent_exec_id logic. If we
re-use the old parent, we must also re-use ->parent_exec_id to make
sure exit_notify() sees the right ->xxx_exec_id's when the CLONE_PARENT'ed
task exits.
Also, move down the "p->parent_exec_id = p->self_exec_id" thing, to place
two different cases together.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Roland McGrath <roland@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Serge E. Hallyn <serge@hallyn.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Frans Pop reported the crash below when running an s390 kernel under Hercules:
Kernel BUG at 000738b4 verbose debug info unavailable!
fixpoint divide exception: 0009 #1! SMP
Modules linked in: nfs lockd nfs_acl sunrpc ctcm fsm tape_34xx
cu3088 tape ccwgroup tape_class ext3 jbd mbcache dm_mirror dm_log dm_snapshot
dm_mod dasd_eckd_mod dasd_mod
CPU: 0 Not tainted 2.6.27.19 #13
Process awk (pid: 2069, task: 0f9ed9b8, ksp: 0f4f7d18)
Krnl PSW : 070c1000 800738b4 (acct_update_integrals+0x4c/0x118)
R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:0 CC:1 PM:0
Krnl GPRS: 00000000 000007d0 7fffffff fffff830
00000000 ffffffff 00000002 0f9ed9b8
00000000 00008ca0 00000000 0f9ed9b8
0f9edda4 8007386e 0f4f7ec8 0f4f7e98
Krnl Code: 800738aa: a71807d0 lhi %r1,2000
800738ae: 8c200001 srdl %r2,1
800738b2: 1d21 dr %r2,%r1
>800738b4: 5810d10e l %r1,270(%r13)
800738b8: 1823 lr %r2,%r3
800738ba: 4130f060 la %r3,96(%r15)
800738be: 0de1 basr %r14,%r1
800738c0: 5800f060 l %r0,96(%r15)
Call Trace:
( <000000000004fdea>! blocking_notifier_call_chain+0x1e/0x2c)
<0000000000038502>! do_exit+0x106/0x7c0
<0000000000038c36>! do_group_exit+0x7a/0xb4
<0000000000038c8e>! SyS_exit_group+0x1e/0x30
<0000000000021c28>! sysc_do_restart+0x12/0x16
<0000000077e7e924>! 0x77e7e924
Reason for this is that cpu time accounting usually only happens from
interrupt context, but acct_update_integrals gets also called from
process context with interrupts enabled.
So in acct_update_integrals we may end up with the following scenario:
Between reading tsk->stime/tsk->utime and tsk->acct_timexpd an interrupt
happens which updates accouting values. This causes acct_timexpd to be
greater than the former stime + utime. The subsequent calculation of
dtime = cputime_sub(time, tsk->acct_timexpd);
will be negative and the division performed by
cputime_to_jiffies(dtime)
will generate an exception since the result won't fit into a 32 bit
register.
In order to fix this just always disable interrupts while accessing any
of the accounting values.
Reported by: Frans Pop <elendil@planet.nl>
Tested by: Frans Pop <elendil@planet.nl>
Cc: stable@kernel.org
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If a machine is flooded by network frames, a cpu can loop
100% of its time inside ksoftirqd() without calling schedule().
This can delay RCU grace period to insane values.
Adding rcu_qsctr_inc() call in ksoftirqd() solves this problem.
Paul: "This regression was a result of the recent change from
"schedule()" to "cond_resched()", which got rid of that quiescent
state in the common case where a reschedule is not needed".
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
sched: don't allow setuid to succeed if the user does not have rt bandwidth
sched_rt: don't start timer when rt bandwidth disabled
* 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
rcu: Teach RCU that idle task is not quiscent state at boot
On x86-64, a 32-bit process (TIF_IA32) can switch to 64-bit mode with
ljmp, and then use the "syscall" instruction to make a 64-bit system
call. A 64-bit process make a 32-bit system call with int $0x80.
In both these cases under CONFIG_SECCOMP=y, secure_computing() will use
the wrong system call number table. The fix is simple: test TS_COMPAT
instead of TIF_IA32. Here is an example exploit:
/* test case for seccomp circumvention on x86-64
There are two failure modes: compile with -m64 or compile with -m32.
The -m64 case is the worst one, because it does "chmod 777 ." (could
be any chmod call). The -m32 case demonstrates it was able to do
stat(), which can glean information but not harm anything directly.
A buggy kernel will let the test do something, print, and exit 1; a
fixed kernel will make it exit with SIGKILL before it does anything.
*/
#define _GNU_SOURCE
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>
#include <linux/prctl.h>
#include <sys/stat.h>
#include <unistd.h>
#include <asm/unistd.h>
int
main (int argc, char **argv)
{
char buf[100];
static const char dot[] = ".";
long ret;
unsigned st[24];
if (prctl (PR_SET_SECCOMP, 1, 0, 0, 0) != 0)
perror ("prctl(PR_SET_SECCOMP) -- not compiled into kernel?");
#ifdef __x86_64__
assert ((uintptr_t) dot < (1UL << 32));
asm ("int $0x80 # %0 <- %1(%2 %3)"
: "=a" (ret) : "0" (15), "b" (dot), "c" (0777));
ret = snprintf (buf, sizeof buf,
"result %ld (check mode on .!)\n", ret);
#elif defined __i386__
asm (".code32\n"
"pushl %%cs\n"
"pushl $2f\n"
"ljmpl $0x33, $1f\n"
".code64\n"
"1: syscall # %0 <- %1(%2 %3)\n"
"lretl\n"
".code32\n"
"2:"
: "=a" (ret) : "0" (4), "D" (dot), "S" (&st));
if (ret == 0)
ret = snprintf (buf, sizeof buf,
"stat . -> st_uid=%u\n", st[7]);
else
ret = snprintf (buf, sizeof buf, "result %ld\n", ret);
#else
# error "not this one"
#endif
write (1, buf, ret);
syscall (__NR_exit, 1);
return 2;
}
Signed-off-by: Roland McGrath <roland@redhat.com>
[ I don't know if anybody actually uses seccomp, but it's enabled in
at least both Fedora and SuSE kernels, so maybe somebody is. - Linus ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
free_uid() and free_user_ns() are corecursive when CONFIG_USER_SCHED=n,
but free_user_ns() is called from free_uid() by way of uid_hash_remove(),
which requires uidhash_lock to be held. free_user_ns() then calls
free_uid() to complete the destruction.
Fix this by deferring the destruction of the user_namespace.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Impact: fix hung task with certain (non-default) rt-limit settings
Corey Hickey reported that on using setuid to change the uid of a
rt process, the process would be unkillable and not be running.
This is because there was no rt runtime for that user group. Add
in a check to see if a user can attach an rt task to its task group.
On failure, return EINVAL, which is also returned in
CONFIG_CGROUP_SCHED.
Reported-by: Corey Hickey <bugfood-ml@fatooh.org>
Signed-off-by: Dhaval Giani <dhaval@linux.vnet.ibm.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The time_status conditional was accidentally placed right after we clear
the checked time_status bits, which causes us to take the conditional
every time through. This fixes it by moving the conditional to before we
clear the time_status bits.
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Cc: Clark Williams <williams@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: fix incorrect condition check
No need to start rt bandwidth timer when rt bandwidth is disabled.
If this timer starts, it may stop at sched_rt_period_timer() on the first time.
Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch fixes a bug located by Vegard Nossum with the aid of
kmemcheck, updated based on review comments from Nick Piggin,
Ingo Molnar, and Andrew Morton. And cleans up the variable-name
and function-name language. ;-)
The boot CPU runs in the context of its idle thread during boot-up.
During this time, idle_cpu(0) will always return nonzero, which will
fool Classic and Hierarchical RCU into deciding that a large chunk of
the boot-up sequence is a big long quiescent state. This in turn causes
RCU to prematurely end grace periods during this time.
This patch changes the rcutree.c and rcuclassic.c rcu_check_callbacks()
function to ignore the idle task as a quiescent state until the
system has started up the scheduler in rest_init(), introducing a
new non-API function rcu_idle_now_means_idle() to inform RCU of this
transition. RCU maintains an internal rcu_idle_cpu_truthful variable
to track this state, which is then used by rcu_check_callback() to
determine if it should believe idle_cpu().
Because this patch has the effect of disallowing RCU grace periods
during long stretches of the boot-up sequence, this patch also introduces
Josh Triplett's UP-only optimization that makes synchronize_rcu() be a
no-op if num_online_cpus() returns 1. This allows boot-time code that
calls synchronize_rcu() to proceed normally. Note, however, that RCU
callbacks registered by call_rcu() will likely queue up until later in
the boot sequence. Although rcuclassic and rcutree can also use this
same optimization after boot completes, rcupreempt must restrict its
use of this optimization to the portion of the boot sequence before the
scheduler starts up, given that an rcupreempt RCU read-side critical
section may be preeempted.
In addition, this patch takes Nick Piggin's suggestion to make the
system_state global variable be __read_mostly.
Changes since v4:
o Changes the name of the introduced function and variable to
be less emotional. ;-)
Changes since v3:
o WARN_ON(nr_context_switches() > 0) to verify that RCU
switches out of boot-time mode before the first context
switch, as suggested by Nick Piggin.
Changes since v2:
o Created rcu_blocking_is_gp() internal-to-RCU API that
determines whether a call to synchronize_rcu() is itself
a grace period.
o The definition of rcu_blocking_is_gp() for rcuclassic and
rcutree checks to see if but a single CPU is online.
o The definition of rcu_blocking_is_gp() for rcupreempt
checks to see both if but a single CPU is online and if
the system is still in early boot.
This allows rcupreempt to again work correctly if running
on a single CPU after booting is complete.
o Added check to rcupreempt's synchronize_sched() for there
being but one online CPU.
Tested all three variants both SMP and !SMP, booted fine, passed a short
rcutorture test on both x86 and Power.
Located-by: Vegard Nossum <vegard.nossum@gmail.com>
Tested-by: Vegard Nossum <vegard.nossum@gmail.com>
Tested-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: cleanup, no functionality changed
The 'time_adj' local variable is named in a very confusing
way because it almost shadows the 'time_adjust' global
variable - which is used in this same function.
Rename it to 'delta' - to make them stand apart more clearly.
kernel/time/ntp.o:
text data bss dec hex filename
2545 114 144 2803 af3 ntp.o.before
2545 114 144 2803 af3 ntp.o.after
md5:
1bf0b3be564512279ba7cee299d1d2be ntp.o.before.asm
1bf0b3be564512279ba7cee299d1d2be ntp.o.after.asm
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: micro-optimization
Convert the (internal) ntp_tick_adj value we store from unscaled
units to scaled units. This is a constant that we never modify,
so scaling it up once during bootup is enough - we dont have to
do it for every adjustment step.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: cleanup, no functionality changed
Further simplify do_adjtimex():
- introduce the ntp_start_leap_timer() helper function
- eliminate the goto adj_done complication
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: cleanup, no functionality changed
do_adjtimex() is currently a monster function with a maze of
branches. Refactor the txc->modes setting aspects of it into
two new helper functions:
process_adj_status()
process_adjtimex_modes()
kernel/time/ntp.o:
text data bss dec hex filename
2512 114 136 2762 aca ntp.o.before
2512 114 136 2762 aca ntp.o.after
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: change (fix) the way the NTP PLL seconds offset is initialized/tracked
Fix a bug and do a micro-optimization:
When PLL is enabled we do not reset time_reftime. If the PLL
was off for a long time (for example after bootup), this is
arguably the wrong thing to do.
We already had a hack for the common boot-time case in
ntp_update_offset(), in form of:
if (unlikely(time_status & STA_FREQHOLD || time_reftime == 0))
secs = 0;
But the update delta should be reset later on too - not just when
the PLL is enabled for the first time after bootup.
So do it on !STA_PLL -> STA_PLL transitions.
This changes behavior, as previously if ntpd was disabled for
a long time and we restarted it, we'd run from that last update,
with a very large delta.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: cleanup, no functionality changed
The time_reftime update in ntp_update_offset() to xtime.tv_sec
is a convoluted way of saying that we want to freeze the frequency
and want the 'secs' delta to be 0. Also make this branch unlikely.
This shaves off 8 bytes from the code size:
text data bss dec hex filename
2504 114 136 2754 ac2 ntp.o.before
2496 114 136 2746 aba ntp.o.after
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: cleanup, no functionality changed
Change ntp_update_offset_fll() to delta logic instead of
absolute value logic. This eliminates 'freq_adj' from the
function.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: cleanup, no functionality changed
Change ntp_update_frequency() from a hard to follow code
flow that uses global variables as temporaries, to a clean
input+output flow.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: cleanup, no functionality changed
There's an ugly u64 typecase in the MAX_TICKADJ_SCALED definition,
this can be eliminated by making the MAX_TICKADJ constant's type
64-bit (signed).
kernel/time/ntp.o:
text data bss dec hex filename
2504 114 136 2754 ac2 ntp.o.before
2504 114 136 2754 ac2 ntp.o.after
md5:
41f3009debc9b397d7394dd77d912f0a ntp.o.before.asm
41f3009debc9b397d7394dd77d912f0a ntp.o.after.asm
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: cleanup, no functionality changed
Instead of a hierarchy of conditions, transform them to clean
gradual conditions and return's.
This makes the flow easier to read and makes the purpose of
the function easier to understand.
kernel/time/ntp.o:
text data bss dec hex filename
2552 170 168 2890 b4a ntp.o.before
2552 170 168 2890 b4a ntp.o.after
md5:
eae1275df0b7d6290c13f6f6f8f05c8c ntp.o.before.asm
eae1275df0b7d6290c13f6f6f8f05c8c ntp.o.after.asm
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: cleanup, no functionality changed
Make this file a bit more readable by applying a consistent coding style.
No code changed:
kernel/time/ntp.o:
text data bss dec hex filename
2552 170 168 2890 b4a ntp.o.before
2552 170 168 2890 b4a ntp.o.after
md5:
eae1275df0b7d6290c13f6f6f8f05c8c ntp.o.before.asm
eae1275df0b7d6290c13f6f6f8f05c8c ntp.o.after.asm
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Move the sysdev_suspend/resume from the callee to the callers, with
no real change in semantics, so that we can rework the disabling of
interrupts during suspend/hibernation.
This is based on an earlier patch from Linus.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* hibernate:
PM: Fix suspend_console and resume_console to use only one semaphore
PM: Wait for console in resume
PM: Fix pm_notifiers during user mode hibernation
swsusp: clean up shrink_all_zones()
swsusp: dont fiddle with swappiness
PM: fix build for CONFIG_PM unset
PM/hibernate: fix "swap breaks after hibernation failures"
PM/resume: wait for device probing to finish
Consolidate driver_probe_done() loops into one place
This fixes a race where a thread acquires the console while the
console is suspended, and the console is resumed before this
thread releases it. In this case, the secondary console
semaphore would be left locked, and the primary semaphore would
be released twice. This in turn would cause the console switch
on suspend or resume to hang forever.
Note that suspend_console does not actually lock the console
for clients that use acquire_console_sem, it only locks it for
clients that use try_acquire_console_sem. If we change
suspend_console to fully lock the console, then the kernel
may deadlock on suspend. One client of try_acquire_console_sem
is acquire_console_semaphore_for_printk, which uses it to
prevent printk from using the console while it is suspended.
Signed-off-by: Arve Hjønnevåg <arve@android.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Len Brown <lenb@kernel.org>
Cc: Greg KH <gregkh@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Avoids later waking up to a blinking cursor if the device woke up and
returned to sleep before the console switch happened.
Signed-off-by: Brian Swetland <swetland@google.com>
Signed-off-by: Arve Hjønnevåg <arve@android.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Len Brown <lenb@kernel.org>
Cc: Greg KH <gregkh@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Snapshot device is opened with O_RDONLY during suspend and O_WRONLY durig
resume. Make sure we also call notifiers with correct parameter telling
them what we are really doing.
Signed-off-by: Andrey Borzenkov <arvidjaar@mail.ru>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Len Brown <lenb@kernel.org>
Cc: Greg KH <gregkh@suse.de>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Compilation of kprobes.c with CONFIG_PM unset is broken due to some broken
config dependncies. Fix that.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Len Brown <lenb@kernel.org>
Cc: Greg KH <gregkh@suse.de>
Reported-by: Ingo Molnar <mingo@elte.hu>
Tested-by: Masami Hiramatsu <mhiramat@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
the resume code does not currently wait for device probing to finish.
Even without async function calls this is dicey and not correct,
but with async function calls during the boot sequence this is going
to get hit more...
This patch adds the synchronization using the newly introduced helper.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Len Brown <lenb@kernel.org>
Acked-by: Greg KH <gregkh@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
tracing: limit the number of loops the ring buffer self test can make
tracing: have function trace select kallsyms
tracing: disable tracing while testing ring buffer
tracing/function-graph-tracer: trace the idle tasks
Since the GENERIC_TIME changes landed, the adjtimex behavior changed
for struct timex.tick and .freq changed. When the tick or freq value
is set, we adjust the tick_length_base in ntp_update_frequency().
However, this new value doesn't get applied to tick_length until the
next second (via second_overflow).
This means some applications that do quick time tweaking do not see the
requested change made as quickly as expected.
I've run a few tests with this change, and ntpd still functions fine.
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: prevent deadlock if ring buffer gets corrupted
This patch adds a paranoid check to make sure the ring buffer consumer
does not go into an infinite loop. Since the ring buffer has been set
to read only, the consumer should not loop for more than the ring buffer
size. A check is added to make sure the consumer does not loop more than
the ring buffer size.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Impact: fix output of function tracer to be useful
The function tracer is pretty useless if KALLSYMS is not configured.
Unless you are good at reading hex values, the function tracer should
select the KALLSYMS configuration.
Also, the dynamic function tracer will fail its self test if KALLSYMS
is not selected.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Impact: fix to prevent hard lockup on self tests
If one of the tracers are broken and is constantly filling the ring
buffer while the test of the ring buffer is running, it will hang
the box. The reason is that the test is a consumer that will not
stop till the ring buffer is empty. But if the tracer is broken and
is constantly producing input to the buffer, this test will never
end. The result is a lockup of the box.
This happened when KALLSYMS was not defined and the dynamic ftrace
test constantly filled the ring buffer, because the filter failed
and all functions were being traced. Something was being called
that constantly filled the buffer.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
block: fix deadlock in blk_abort_queue() for drivers that readd to timeout list
block: fix booting from partitioned md array
block: revert part of 18ce3751cc
cciss: PCI power management reset for kexec
paride/pg.c: xs(): &&/|| confusion
fs/bio: bio_alloc_bioset: pass right object ptr to mempool_free
block: fix bad definition of BIO_RW_SYNC
bsg: Fix sense buffer bug in SG_IO
Compilation of kprobes.c with CONFIG_PM unset is broken due to some broken
config dependncies. Fix that.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Reported-by: Ingo Molnar <mingo@elte.hu>
Tested-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Len Brown <lenb@kernel.org>
Acked-by: Pavel Machek <pavel@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In cgroup_kill_sb(), root is freed before sb is detached from the list, so
another sget() may find this sb and call cgroup_test_super(), which will
access the root that has been freed.
Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Paul Menage <menage@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Impact: new timer API
Based on an idea from Martin Josefsson with the help of
Patrick McHardy and Stephen Hemminger:
introduce the mod_timer_pending() API which is a mod_timer()
offspring that is an invariant on already removed timers.
(regular mod_timer() re-activates non-pending timers.)
This is useful for the networking code in that it can
allow unserialized mod_timer_pending() timer-forwarding
calls, but a single del_timer*() will stop the timer
from being reactivated again.
Also while at it:
- optimize the regular mod_timer() path some more, the
timer-stat and a debug check was needlessly duplicated
in __mod_timer().
- make the exports come straight after the function, as
most other exports in timer.c already did.
- eliminate __mod_timer() as an external API, change the
users to mod_timer().
The regular mod_timer() code path is not impacted
significantly, due to inlining optimizations and due to
the simplifications.
Based-on-patch-from: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Patrick McHardy <kaber@trash.net>
Cc: netdev@vger.kernel.org
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>