a80a6b85b4
Revert commit 03a7beb55b
("epoll: support for disabling items, and a
self-test app") pending resolution of the issues identified by Michael
Kerrisk, copied below.
We'll revisit this for 3.8.
: I've taken a look at this patch as it currently stands in 3.7-rc1, and
: done a bit of testing. (By the way, the test program
: tools/testing/selftests/epoll/test_epoll.c does not compile...)
:
: There are one or two places where the behavior seems a little strange,
: so I have a question or two at the end of this mail. But other than
: that, I want to check my understanding so that the interface can be
: correctly documented.
:
: Just to go though my understanding, the problem is the following
: scenario in a multithreaded application:
:
: 1. Multiple threads are performing epoll_wait() operations,
: and maintaining a user-space cache that contains information
: corresponding to each file descriptor being monitored by
: epoll_wait().
:
: 2. At some point, a thread wants to delete (EPOLL_CTL_DEL)
: a file descriptor from the epoll interest list, and
: delete the corresponding record from the user-space cache.
:
: 3. The problem with (2) is that some other thread may have
: previously done an epoll_wait() that retrieved information
: about the fd in question, and may be in the middle of using
: information in the cache that relates to that fd. Thus,
: there is a potential race.
:
: 4. The race can't solved purely in user space, because doing
: so would require applying a mutex across the epoll_wait()
: call, which would of course blow thread concurrency.
:
: Right?
:
: Your solution is the EPOLL_CTL_DISABLE operation. I want to
: confirm my understanding about how to use this flag, since
: the description that has accompanied the patches so far
: has been a bit sparse
:
: 0. In the scenario you're concerned about, deleting a file
: descriptor means (safely) doing the following:
: (a) Deleting the file descriptor from the epoll interest list
: using EPOLL_CTL_DEL
: (b) Deleting the corresponding record in the user-space cache
:
: 1. It's only meaningful to use this EPOLL_CTL_DISABLE in
: conjunction with EPOLLONESHOT.
:
: 2. Using EPOLL_CTL_DISABLE without using EPOLLONESHOT in
: conjunction is a logical error.
:
: 3. The correct way to code multithreaded applications using
: EPOLL_CTL_DISABLE and EPOLLONESHOT is as follows:
:
: a. All EPOLL_CTL_ADD and EPOLL_CTL_MOD operations should
: should EPOLLONESHOT.
:
: b. When a thread wants to delete a file descriptor, it
: should do the following:
:
: [1] Call epoll_ctl(EPOLL_CTL_DISABLE)
: [2] If the return status from epoll_ctl(EPOLL_CTL_DISABLE)
: was zero, then the file descriptor can be safely
: deleted by the thread that made this call.
: [3] If the epoll_ctl(EPOLL_CTL_DISABLE) fails with EBUSY,
: then the descriptor is in use. In this case, the calling
: thread should set a flag in the user-space cache to
: indicate that the thread that is using the descriptor
: should perform the deletion operation.
:
: Is all of the above correct?
:
: The implementation depends on checking on whether
: (events & ~EP_PRIVATE_BITS) == 0
: This replies on the fact that EPOLL_CTL_AD and EPOLL_CTL_MOD always
: set EPOLLHUP and EPOLLERR in the 'events' mask, and EPOLLONESHOT
: causes those flags (as well as all others in ~EP_PRIVATE_BITS) to be
: cleared.
:
: A corollary to the previous paragraph is that using EPOLL_CTL_DISABLE
: is only useful in conjunction with EPOLLONESHOT. However, as things
: stand, one can use EPOLL_CTL_DISABLE on a file descriptor that does
: not have EPOLLONESHOT set in 'events' This results in the following
: (slightly surprising) behavior:
:
: (a) The first call to epoll_ctl(EPOLL_CTL_DISABLE) returns 0
: (the indicator that the file descriptor can be safely deleted).
: (b) The next call to epoll_ctl(EPOLL_CTL_DISABLE) fails with EBUSY.
:
: This doesn't seem particularly useful, and in fact is probably an
: indication that the user made a logic error: they should only be using
: epoll_ctl(EPOLL_CTL_DISABLE) on a file descriptor for which
: EPOLLONESHOT was set in 'events'. If that is correct, then would it
: not make sense to return an error to user space for this case?
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: "Paton J. Lewis" <palewis@adobe.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
65 lines
1.7 KiB
C
65 lines
1.7 KiB
C
/*
|
|
* include/linux/eventpoll.h ( Efficient event polling implementation )
|
|
* Copyright (C) 2001,...,2006 Davide Libenzi
|
|
*
|
|
* This program is free software; you can redistribute it and/or modify
|
|
* it under the terms of the GNU General Public License as published by
|
|
* the Free Software Foundation; either version 2 of the License, or
|
|
* (at your option) any later version.
|
|
*
|
|
* Davide Libenzi <davidel@xmailserver.org>
|
|
*
|
|
*/
|
|
|
|
#ifndef _UAPI_LINUX_EVENTPOLL_H
|
|
#define _UAPI_LINUX_EVENTPOLL_H
|
|
|
|
/* For O_CLOEXEC */
|
|
#include <linux/fcntl.h>
|
|
#include <linux/types.h>
|
|
|
|
/* Flags for epoll_create1. */
|
|
#define EPOLL_CLOEXEC O_CLOEXEC
|
|
|
|
/* Valid opcodes to issue to sys_epoll_ctl() */
|
|
#define EPOLL_CTL_ADD 1
|
|
#define EPOLL_CTL_DEL 2
|
|
#define EPOLL_CTL_MOD 3
|
|
|
|
/*
|
|
* Request the handling of system wakeup events so as to prevent system suspends
|
|
* from happening while those events are being processed.
|
|
*
|
|
* Assuming neither EPOLLET nor EPOLLONESHOT is set, system suspends will not be
|
|
* re-allowed until epoll_wait is called again after consuming the wakeup
|
|
* event(s).
|
|
*
|
|
* Requires CAP_BLOCK_SUSPEND
|
|
*/
|
|
#define EPOLLWAKEUP (1 << 29)
|
|
|
|
/* Set the One Shot behaviour for the target file descriptor */
|
|
#define EPOLLONESHOT (1 << 30)
|
|
|
|
/* Set the Edge Triggered behaviour for the target file descriptor */
|
|
#define EPOLLET (1 << 31)
|
|
|
|
/*
|
|
* On x86-64 make the 64bit structure have the same alignment as the
|
|
* 32bit structure. This makes 32bit emulation easier.
|
|
*
|
|
* UML/x86_64 needs the same packing as x86_64
|
|
*/
|
|
#ifdef __x86_64__
|
|
#define EPOLL_PACKED __attribute__((packed))
|
|
#else
|
|
#define EPOLL_PACKED
|
|
#endif
|
|
|
|
struct epoll_event {
|
|
__u32 events;
|
|
__u64 data;
|
|
} EPOLL_PACKED;
|
|
|
|
|
|
#endif /* _UAPI_LINUX_EVENTPOLL_H */
|