PM / reboot: Eliminate race between reboot and suspend

At present, "systemctl suspend" and "shutdown" can run in parallel. A
system can suspend after devices_shutdown() and then resume, after which
the shutdown task goes on to power off. As a result, many devices are
never really shut off. Hence replace reboot_mutex with
system_transition_mutex (renamed from pm_mutex) to make the two paths
mutually exclusive. Renaming pm_mutex to system_transition_mutex also
better reflects the purpose of the mutex.

Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Author: Pingfan Liu, 2018-07-31 16:51:32 +08:00; committed by Rafael J. Wysocki
commit 55f2503c3b
parent 82837ad5bd
10 changed files with 40 additions and 36 deletions
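Before this change, the suspend path serialized on pm_mutex while the reboot
path serialized on reboot_mutex, so nothing stopped the two from interleaving.
The diffs below make both paths take one shared lock. As a rough userspace
analogy (a sketch in which pthread mutexes stand in for kernel mutexes; none
of this is kernel code), two threads modeling the two paths serialize on a
single mutex, with the "suspend" thread using a trylock just as enter_state()
does:

/* Sketch: one shared lock serializes the "reboot" and "suspend" paths. */
#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t system_transition_mutex = PTHREAD_MUTEX_INITIALIZER;

static void *suspend_path(void *arg)
{
	(void)arg;
	/* enter_state() uses mutex_trylock() and returns -EBUSY on contention */
	if (pthread_mutex_trylock(&system_transition_mutex) != 0) {
		puts("suspend: -EBUSY, another transition is in progress");
		return NULL;
	}
	puts("suspend: devices suspended ... resumed");
	pthread_mutex_unlock(&system_transition_mutex);
	return NULL;
}

static void *reboot_path(void *arg)
{
	(void)arg;
	/* sys_reboot() blocks until it owns the transition */
	pthread_mutex_lock(&system_transition_mutex);
	puts("reboot: devices shut down, powering off");
	pthread_mutex_unlock(&system_transition_mutex);
	return NULL;
}

int main(void)
{
	pthread_t a, b;

	pthread_create(&a, NULL, reboot_path, NULL);
	pthread_create(&b, NULL, suspend_path, NULL);
	pthread_join(a, NULL);
	pthread_join(b, NULL);
	return 0;
}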

Documentation/power/freezing-of-tasks.txt

@@ -204,26 +204,26 @@ VI. Are there any precautions to be taken to prevent freezing failures?
 Yes, there are.
-First of all, grabbing the 'pm_mutex' lock to mutually exclude a piece of code
+First of all, grabbing the 'system_transition_mutex' lock to mutually exclude a piece of code
 from system-wide sleep such as suspend/hibernation is not encouraged.
 If possible, that piece of code must instead hook onto the suspend/hibernation
 notifiers to achieve mutual exclusion. Look at the CPU-Hotplug code
 (kernel/cpu.c) for an example.
 
-However, if that is not feasible, and grabbing 'pm_mutex' is deemed necessary,
-it is strongly discouraged to directly call mutex_[un]lock(&pm_mutex) since
+However, if that is not feasible, and grabbing 'system_transition_mutex' is deemed necessary,
+it is strongly discouraged to directly call mutex_[un]lock(&system_transition_mutex) since
 that could lead to freezing failures, because if the suspend/hibernate code
-successfully acquired the 'pm_mutex' lock, and hence that other entity failed
+successfully acquired the 'system_transition_mutex' lock, and hence that other entity failed
 to acquire the lock, then that task would get blocked in TASK_UNINTERRUPTIBLE
 state. As a consequence, the freezer would not be able to freeze that task,
 leading to freezing failure.
 
 However, the [un]lock_system_sleep() APIs are safe to use in this scenario,
 since they ask the freezer to skip freezing this task, since it is anyway
-"frozen enough" as it is blocked on 'pm_mutex', which will be released
+"frozen enough" as it is blocked on 'system_transition_mutex', which will be released
 only after the entire suspend/hibernation sequence is complete.
 
 So, to summarize, use [un]lock_system_sleep() instead of directly using
-mutex_[un]lock(&pm_mutex). That would prevent freezing failures.
+mutex_[un]lock(&system_transition_mutex). That would prevent freezing failures.
 
 V. Miscellaneous
 /sys/power/pm_freeze_timeout controls how long it will cost at most to freeze
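As a concrete illustration of the advice above, a hypothetical driver helper
(foo_reconfigure() is invented for this example; only lock_system_sleep() and
unlock_system_sleep() are the real API) could look like:

#include <linux/suspend.h>

/* Serialize a reconfiguration step against system-wide sleep without
 * risking the freezing failure described above.
 */
static void foo_reconfigure(void)
{
	lock_system_sleep();	/* sets PF_FREEZER_SKIP, then takes system_transition_mutex */

	/* ... touch state that must stay stable across suspend/resume ... */

	unlock_system_sleep();	/* clears PF_FREEZER_SKIP, releases the mutex */
}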

Documentation/power/suspend-and-cpuhotplug.txt

@@ -32,7 +32,7 @@ More details follow:
                         sysfs file
                             |
                             v
-                   Acquire pm_mutex lock
+          Acquire system_transition_mutex lock
                             |
                             v
                  Send PM_SUSPEND_PREPARE
@@ -96,10 +96,10 @@ execution during resume):
     * thaw tasks
     * send PM_POST_SUSPEND notifications
-    * Release pm_mutex lock.
+    * Release system_transition_mutex lock.
 
-It is to be noted here that the pm_mutex lock is acquired at the very
+It is to be noted here that the system_transition_mutex lock is acquired at the very
 beginning, when we are just starting out to suspend, and then released only
 after the entire cycle is complete (i.e., suspend + resume).

include/linux/suspend.h

@@ -414,7 +414,7 @@ static inline bool hibernation_available(void) { return false; }
 #define PM_RESTORE_PREPARE	0x0005 /* Going to restore a saved image */
 #define PM_POST_RESTORE		0x0006 /* Restore failed */
 
-extern struct mutex pm_mutex;
+extern struct mutex system_transition_mutex;
 
 #ifdef CONFIG_PM_SLEEP
 void save_processor_state(void);

kernel/freezer.c

@@ -15,7 +15,9 @@
 atomic_t system_freezing_cnt = ATOMIC_INIT(0);
 EXPORT_SYMBOL(system_freezing_cnt);
 
-/* indicate whether PM freezing is in effect, protected by pm_mutex */
+/* indicate whether PM freezing is in effect, protected by
+ * system_transition_mutex
+ */
 bool pm_freezing;
 bool pm_nosig_freezing;

kernel/power/hibernate.c

@@ -338,7 +338,7 @@ static int create_image(int platform_mode)
  * hibernation_snapshot - Quiesce devices and create a hibernation image.
  * @platform_mode: If set, use platform driver to prepare for the transition.
  *
- * This routine must be called with pm_mutex held.
+ * This routine must be called with system_transition_mutex held.
  */
 int hibernation_snapshot(int platform_mode)
 {
@@ -500,8 +500,9 @@ static int resume_target_kernel(bool platform_mode)
  * hibernation_restore - Quiesce devices and restore from a hibernation image.
  * @platform_mode: If set, use platform driver to prepare for the transition.
  *
- * This routine must be called with pm_mutex held. If it is successful, control
- * reappears in the restored target kernel in hibernation_snapshot().
+ * This routine must be called with system_transition_mutex held.  If it is
+ * successful, control reappears in the restored target kernel in
+ * hibernation_snapshot().
  */
 int hibernation_restore(int platform_mode)
 {
@@ -806,13 +807,13 @@ static int software_resume(void)
 	 * name_to_dev_t() below takes a sysfs buffer mutex when sysfs
 	 * is configured into the kernel. Since the regular hibernate
 	 * trigger path is via sysfs which takes a buffer mutex before
-	 * calling hibernate functions (which take pm_mutex) this can
-	 * cause lockdep to complain about a possible ABBA deadlock
+	 * calling hibernate functions (which take system_transition_mutex)
+	 * this can cause lockdep to complain about a possible ABBA deadlock
 	 * which cannot happen since we're in the boot code here and
 	 * sysfs can't be invoked yet. Therefore, we use a subclass
 	 * here to avoid lockdep complaining.
 	 */
-	mutex_lock_nested(&pm_mutex, SINGLE_DEPTH_NESTING);
+	mutex_lock_nested(&system_transition_mutex, SINGLE_DEPTH_NESTING);
 
 	if (swsusp_resume_device)
 		goto Check_image;
@@ -900,7 +901,7 @@ static int software_resume(void)
 	atomic_inc(&snapshot_device_available);
 	/* For success case, the suspend path will release the lock */
  Unlock:
-	mutex_unlock(&pm_mutex);
+	mutex_unlock(&system_transition_mutex);
 	pm_pr_dbg("Hibernation image not present or could not be loaded.\n");
 	return error;
  Close_Finish:

kernel/power/main.c

@@ -15,17 +15,16 @@
 #include <linux/workqueue.h>
 #include <linux/debugfs.h>
 #include <linux/seq_file.h>
+#include <linux/suspend.h>
 
 #include "power.h"
 
-DEFINE_MUTEX(pm_mutex);
-
 #ifdef CONFIG_PM_SLEEP
 
 void lock_system_sleep(void)
 {
 	current->flags |= PF_FREEZER_SKIP;
-	mutex_lock(&pm_mutex);
+	mutex_lock(&system_transition_mutex);
 }
 EXPORT_SYMBOL_GPL(lock_system_sleep);
@@ -37,8 +36,9 @@ void unlock_system_sleep(void)
 	 *
 	 * Reason:
 	 * Fundamentally, we just don't need it, because freezing condition
-	 * doesn't come into effect until we release the pm_mutex lock,
-	 * since the freezer always works with pm_mutex held.
+	 * doesn't come into effect until we release the
+	 * system_transition_mutex lock, since the freezer always works with
+	 * system_transition_mutex held.
 	 *
 	 * More importantly, in the case of hibernation,
 	 * unlock_system_sleep() gets called in snapshot_read() and
@@ -47,7 +47,7 @@ void unlock_system_sleep(void)
 	 * enter the refrigerator, thus causing hibernation to lockup.
 	 */
 	current->flags &= ~PF_FREEZER_SKIP;
-	mutex_unlock(&pm_mutex);
+	mutex_unlock(&system_transition_mutex);
 }
 EXPORT_SYMBOL_GPL(unlock_system_sleep);

kernel/power/suspend.c

@@ -556,7 +556,7 @@ static int enter_state(suspend_state_t state)
 	} else if (!valid_state(state)) {
 		return -EINVAL;
 	}
-	if (!mutex_trylock(&pm_mutex))
+	if (!mutex_trylock(&system_transition_mutex))
 		return -EBUSY;
 
 	if (state == PM_SUSPEND_TO_IDLE)
@@ -590,7 +590,7 @@ static int enter_state(suspend_state_t state)
 	pm_pr_dbg("Finishing wakeup.\n");
 	suspend_finish();
  Unlock:
-	mutex_unlock(&pm_mutex);
+	mutex_unlock(&system_transition_mutex);
 	return error;
 }

kernel/power/user.c

@@ -216,7 +216,7 @@ static long snapshot_ioctl(struct file *filp, unsigned int cmd,
 	if (!capable(CAP_SYS_ADMIN))
 		return -EPERM;
 
-	if (!mutex_trylock(&pm_mutex))
+	if (!mutex_trylock(&system_transition_mutex))
 		return -EBUSY;
 
 	lock_device_hotplug();
@@ -394,7 +394,7 @@ static long snapshot_ioctl(struct file *filp, unsigned int cmd,
 	}
 
 	unlock_device_hotplug();
-	mutex_unlock(&pm_mutex);
+	mutex_unlock(&system_transition_mutex);
 
 	return error;
 }

kernel/reboot.c

@@ -294,7 +294,7 @@ void kernel_power_off(void)
 }
 EXPORT_SYMBOL_GPL(kernel_power_off);
 
-static DEFINE_MUTEX(reboot_mutex);
+DEFINE_MUTEX(system_transition_mutex);
 
 /*
  * Reboot system call: for obvious reasons only root may call it,
@@ -338,7 +338,7 @@ SYSCALL_DEFINE4(reboot, int, magic1, int, magic2, unsigned int, cmd,
 	if ((cmd == LINUX_REBOOT_CMD_POWER_OFF) && !pm_power_off)
 		cmd = LINUX_REBOOT_CMD_HALT;
 
-	mutex_lock(&reboot_mutex);
+	mutex_lock(&system_transition_mutex);
 	switch (cmd) {
 	case LINUX_REBOOT_CMD_RESTART:
 		kernel_restart(NULL);
@@ -389,7 +389,7 @@ SYSCALL_DEFINE4(reboot, int, magic1, int, magic2, unsigned int, cmd,
 		ret = -EINVAL;
 		break;
 	}
-	mutex_unlock(&reboot_mutex);
+	mutex_unlock(&system_transition_mutex);
 
 	return ret;
 }

mm/page_alloc.c

@@ -155,16 +155,17 @@ static inline void set_pcppage_migratetype(struct page *page, int migratetype)
  * The following functions are used by the suspend/hibernate code to temporarily
  * change gfp_allowed_mask in order to avoid using I/O during memory allocations
  * while devices are suspended. To avoid races with the suspend/hibernate code,
- * they should always be called with pm_mutex held (gfp_allowed_mask also should
- * only be modified with pm_mutex held, unless the suspend/hibernate code is
- * guaranteed not to run in parallel with that modification).
+ * they should always be called with system_transition_mutex held
+ * (gfp_allowed_mask also should only be modified with system_transition_mutex
+ * held, unless the suspend/hibernate code is guaranteed not to run in parallel
+ * with that modification).
  */
 
 static gfp_t saved_gfp_mask;
 
 void pm_restore_gfp_mask(void)
 {
-	WARN_ON(!mutex_is_locked(&pm_mutex));
+	WARN_ON(!mutex_is_locked(&system_transition_mutex));
 
 	if (saved_gfp_mask) {
 		gfp_allowed_mask = saved_gfp_mask;
 		saved_gfp_mask = 0;
@@ -173,7 +174,7 @@ void pm_restore_gfp_mask(void)
 
 void pm_restrict_gfp_mask(void)
 {
-	WARN_ON(!mutex_is_locked(&pm_mutex));
+	WARN_ON(!mutex_is_locked(&system_transition_mutex));
 	WARN_ON(saved_gfp_mask);
 	saved_gfp_mask = gfp_allowed_mask;
 	gfp_allowed_mask &= ~(__GFP_IO | __GFP_FS);
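The WARN_ON() checks above assume the caller owns the transition lock for the
entire suspend cycle. The expected call pattern, sketched from the surrounding
comment rather than copied verbatim from the suspend core, is:

	mutex_lock(&system_transition_mutex);	/* held for the whole suspend cycle */
	pm_restrict_gfp_mask();		/* mask out __GFP_IO and __GFP_FS */
	/* ... suspend devices, enter the sleep state, resume devices ... */
	pm_restore_gfp_mask();		/* restore the saved gfp_allowed_mask */
	mutex_unlock(&system_transition_mutex);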