kernel-fxtec-pro1x

David Gibson 90481622d7 hugepages: fix use after free bug in "quota" handling

hugetlbfs_{get,put}_quota() are badly named. They don't interact with the general quota handling code, and they don't much resemble its behaviour. Rather than being about maintaining limits on on-disk block usage by particular users, they are instead about maintaining limits on in-memory page usage (including anonymous MAP_PRIVATE copied-on-write pages) associated with a particular hugetlbfs filesystem instance.

Worse, they work by having callbacks to the hugetlbfs filesystem code from the low-level page handling code, in particular from free_huge_page(). This is a layering violation of itself, but more importantly, if the kernel does a get_user_pages() on hugepages (which can happen from KVM amongst others), then the free_huge_page() can be delayed until after the associated inode has already been freed. If an unmount occurs at the wrong time, even the hugetlbfs superblock where the "quota" limits are stored may have been freed.

Andrew Barry proposed a patch to fix this by having hugepages, instead of storing a pointer to their address_space and reaching the superblock from there, had the hugepages store pointers directly to the superblock, bumping the reference count as appropriate to avoid it being freed. Andrew Morton rejected that version, however, on the grounds that it made the existing layering violation worse.

This is a reworked version of Andrew's patch, which removes the extra, and some of the existing, layering violation. It works by introducing the concept of a hugepage "subpool" at the lower hugepage mm layer - that is a finite logical pool of hugepages to allocate from. hugetlbfs now creates a subpool for each filesystem instance with a page limit set, and a pointer to the subpool gets added to each allocated hugepage, instead of the address_space pointer used now. The subpool has its own lifetime and is only freed once all pages in it _and_ all other references to it (i.e. superblocks) are gone.

subpools are optional - a NULL subpool pointer is taken by the code to mean that no subpool limits are in effect.

Previous discussion of this bug found in: "Fix refcounting in hugetlbfs quota handling.". See: https://lkml.org/lkml/2011/8/11/28 or http://marc.info/?l=linux-mm&m=126928970510627&w=1

v2: Fixed a bug spotted by Hillf Danton, and removed the extra parameter to alloc_huge_page() - since it already takes the vma, it is not necessary.

Signed-off-by: Andrew Barry <abarry@cray.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Cc: Hugh Dickins <hughd@google.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: Hillf Danton <dhillf@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

2012-03-21 17:54:59 -07:00
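The lifetime rule described in the commit message (a subpool is freed only once all of its pages and all other references to it are gone, and a NULL subpool means no limit) can be illustrated with a small userspace model. This is a sketch only, not the kernel's implementation: the struct, its fields, every function name, and the `subpools_freed` counter are hypothetical, and the locking a real implementation would need is omitted.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical userspace model; names are illustrative, not kernel API. */
struct subpool {
    long refs;       /* references from owners, e.g. a mounted superblock */
    long max_pages;  /* page limit for this filesystem instance */
    long used_pages; /* pages currently charged to the pool */
};

static int subpools_freed; /* demo counter so callers can observe release */

/* Create a pool holding one owner reference and no charged pages. */
static struct subpool *subpool_new(long max_pages)
{
    struct subpool *sp = malloc(sizeof(*sp));
    if (!sp)
        return NULL;
    sp->refs = 1;
    sp->max_pages = max_pages;
    sp->used_pages = 0;
    return sp;
}

/* Free only when no owner references AND no charged pages remain. */
static void subpool_release_if_unused(struct subpool *sp)
{
    if (sp->refs == 0 && sp->used_pages == 0) {
        free(sp);
        subpools_freed++;
    }
}

/* Drop an owner reference, e.g. at unmount. */
static void subpool_put(struct subpool *sp)
{
    assert(sp->refs > 0);
    sp->refs--;
    subpool_release_if_unused(sp);
}

/* Charge n pages; a NULL subpool means no limit. Returns 0, or -1 if over limit. */
static int subpool_get_pages(struct subpool *sp, long n)
{
    if (!sp)
        return 0;
    if (sp->used_pages + n > sp->max_pages)
        return -1;
    sp->used_pages += n;
    return 0;
}

/* Uncharge n pages, e.g. when a page is finally freed after a late release. */
static void subpool_put_pages(struct subpool *sp, long n)
{
    if (!sp)
        return;
    assert(n <= sp->used_pages);
    sp->used_pages -= n;
    subpool_release_if_unused(sp);
}
```

In this model, calling `subpool_put()` while pages are still charged leaves the pool allocated; the pool is released only by the later `subpool_put_pages()` that returns the last page. That ordering corresponds to the delayed free_huge_page() case the commit message describes.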
..
9p	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs	2012-01-10 15:09:01 -08:00
adfs	vfs: switch ->show_options() to struct dentry *	2012-01-06 23:19:54 -05:00
affs	affs: propagate umode_t	2012-01-03 22:55:04 -05:00
afs	Merge branch 'kmap_atomic' of git://github.com/congwang/linux	2012-03-21 09:40:26 -07:00
autofs4	autofs: work around unhappy compat problem on x86-64	2012-02-25 12:10:27 -08:00
befs	vfs: fix the stupidity with i_dentry in inode destructors	2012-01-03 22:52:40 -05:00
bfs	switch ->create() to umode_t	2012-01-03 22:54:53 -05:00
btrfs	Merge branch 'kmap_atomic' of git://github.com/congwang/linux	2012-03-21 09:40:26 -07:00
cachefiles	fs: move code out of buffer.c	2012-01-03 22:54:07 -05:00
ceph	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client	2012-02-02 15:47:33 -08:00
cifs	CIFS: Do not kmalloc under the flocks spinlock	2012-03-06 21:50:15 -06:00
coda	coda: switch coda_cnode_make() to sane API as well, clean coda_lookup()	2012-01-10 11:13:16 -05:00
configfs	configfs: convert to umode_t	2012-01-03 22:54:57 -05:00
cramfs	cramfs: Fix typo in inode.c	2012-02-21 11:40:35 +01:00
debugfs	Merge 3.3-rc2 into the driver-core-next branch.	2012-02-02 11:24:44 -08:00
devpts	tty: rework pty count limiting	2012-01-24 14:01:01 -08:00
dlm	dlm: Do not allocate a fd for peeloff	2012-03-08 13:52:09 -08:00
ecryptfs	ecryptfs: fix printk format warning for size_t	2012-02-28 16:55:30 -08:00
efs	vfs: fix the stupidity with i_dentry in inode destructors	2012-01-03 22:52:40 -05:00
exofs	exofs: remove the second argument of k[un]map_atomic()	2012-03-20 21:48:22 +08:00
exportfs
ext2	ext2: remove the second argument of k[un]map_atomic()	2012-03-20 21:48:22 +08:00
ext3	Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs	2012-01-09 12:51:21 -08:00
ext4	Merge branch 'for_linus' into for_linus_merged	2012-01-10 11:54:07 -05:00
fat	Merge branch 'usb-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb	2012-01-09 12:09:47 -08:00
freevxfs	fs: propagate umode_t, misc bits	2012-01-03 22:55:10 -05:00
fscache
fuse	fuse: remove the second argument of k[un]map_atomic()	2012-03-20 21:48:22 +08:00
gfs2	gfs2: remove the second argument of k[un]map_atomic()	2012-03-20 21:48:23 +08:00
hfs	vfs: switch ->show_options() to struct dentry *	2012-01-06 23:19:54 -05:00
hfsplus	hfsplus: creation of hidden dir on mount can fail	2012-01-10 17:48:52 -05:00
hostfs	vfs: switch ->show_options() to struct dentry *	2012-01-06 23:19:54 -05:00
hpfs	switch ->mknod() to umode_t	2012-01-03 22:54:54 -05:00
hppfs	vfs: for usbfs, etc. internal vfsmounts ->mnt_sb->s_root == ->mnt_root	2012-01-03 22:52:41 -05:00
hugetlbfs	hugepages: fix use after free bug in "quota" handling	2012-03-21 17:54:59 -07:00
isofs	isofs: inode leak on mount failure	2012-01-09 10:48:11 -05:00
jbd	Power management updates for 3.4	2012-03-21 10:15:51 -07:00
jbd2	Power management updates for 3.4	2012-03-21 10:15:51 -07:00
jffs2	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial	2012-03-20 21:12:50 -07:00
jfs	Merge branch 'pm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm	2012-01-08 13:10:57 -08:00
lockd	module_param: make bool parameters really bool (drivers & misc)	2012-01-13 09:32:20 +10:30
logfs	logfs: remove the second argument of k[un]map_atomic()	2012-03-20 21:48:24 +08:00
minix	minix: remove the second argument of k[un]map_atomic()	2012-03-20 21:48:24 +08:00
ncpfs	vfs: switch ->show_options() to struct dentry *	2012-01-06 23:19:54 -05:00
nfs	nfs: remove the second argument of k[un]map_atomic()	2012-03-20 21:48:24 +08:00
nfs_common
nfsd	Merge branch 'for-3.3' of git://linux-nfs.org/~bfields/linux	2012-01-14 12:26:41 -08:00
nilfs2	nilfs2: remove the second argument of k[un]map_atomic()	2012-03-20 21:48:24 +08:00
nls	NLS: raname "maxlen" to "maxout" in UTF conversion routines	2011-11-26 19:58:47 -08:00
notify	fsnotify: don't BUG in fsnotify_destroy_mark()	2012-01-14 18:01:42 -08:00
ntfs	Merge branch 'kmap_atomic' of git://github.com/congwang/linux	2012-03-21 09:40:26 -07:00
ocfs2	ocfs2: remove the second argument of k[un]map_atomic()	2012-03-20 21:48:25 +08:00
omfs	omfs: propagate umode_t	2012-01-03 22:55:01 -05:00
openpromfs	vfs: fix the stupidity with i_dentry in inode destructors	2012-01-03 22:52:40 -05:00
proc	procfs: mark thread stack correctly in proc/<pid>/maps	2012-03-21 17:54:58 -07:00
pstore	pstore: gracefully handle NULL pstore_info functions	2011-11-18 13:49:00 -08:00
qnx4	qnx4: don't leak ->BitMap on late failure exits	2012-01-19 13:54:36 -05:00
quota	quota: Fix deadlock with suspend and quotas	2012-02-13 20:45:39 -05:00
ramfs	pohmelfs: propagate umode_t	2012-01-03 22:55:07 -05:00
reiserfs	Merge branch 'kmap_atomic' of git://github.com/congwang/linux	2012-03-21 09:40:26 -07:00
romfs	MTD pull for 3.3	2012-01-10 13:45:22 -08:00
squashfs	squashfs: remove the second argument of k[un]map_atomic()	2012-03-20 21:48:25 +08:00
sysfs	Revert "sysfs: Kill nlink counting."	2012-03-08 13:03:10 -08:00
sysv	vfs: prefer ->dentry->d_sb to ->mnt->mnt_sb	2012-01-06 23:16:53 -05:00
ubifs	ubifs: remove the second argument of k[un]map_atomic()	2012-03-20 21:48:26 +08:00
udf	udf: remove the second argument of k[un]map_atomic()	2012-03-20 21:48:26 +08:00
ufs	vfs: switch ->show_options() to struct dentry *	2012-01-06 23:19:54 -05:00
xfs	xfs: make inode quota check more general	2012-02-21 10:12:43 -06:00
aio.c	fs: remove the second argument of k[un]map_atomic()	2012-03-20 21:48:21 +08:00
anon_inodes.c
attr.c	switch is_sxid() to umode_t	2012-01-03 22:55:11 -05:00
bad_inode.c	switch ->mknod() to umode_t	2012-01-03 22:54:54 -05:00
binfmt_aout.c	aout: move setup_arg_pages() prior to reading/mapping the binary	2012-03-05 13:51:32 -08:00
binfmt_elf.c	regset: Prevent null pointer reference on readonly regsets	2012-03-02 11:38:15 -08:00
binfmt_elf_fdpic.c
binfmt_em86.c
binfmt_flat.c
binfmt_misc.c	vfs: prefer ->dentry->d_sb to ->mnt->mnt_sb	2012-01-06 23:16:53 -05:00
binfmt_script.c
binfmt_som.c
bio-integrity.c	fs: remove the second argument of k[un]map_atomic()	2012-03-20 21:48:21 +08:00
bio.c	bio: don't overflow in bio_get_nr_vecs()	2012-02-08 22:07:18 +01:00
block_dev.c	block: Fix NULL pointer dereference in sd_revalidate_disk	2012-03-02 10:38:33 +01:00
buffer.c	fs: move code out of buffer.c	2012-01-03 22:54:07 -05:00
char_dev.c	char_dev.c: fix up some whitespace errors	2011-12-13 11:18:17 -08:00
compat.c	vfs: fix compat_sys_stat() handling of overflows in st_nlink	2012-02-13 20:45:39 -05:00
compat_binfmt_elf.c
compat_ioctl.c	ppp: Replace uses of <linux/if_ppp.h> with <linux/ppp-ioctl.h>	2012-03-04 20:41:38 -05:00
dcache.c	Merge branch 'dcache-word-accesses'	2012-03-19 16:37:28 -07:00
dcookies.c
direct-io.c	Restore direct_io / truncate locking API	2012-02-23 15:56:21 -08:00
drop_caches.c
eventfd.c
eventpoll.c	Don't limit non-nested epoll paths	2012-03-18 12:25:04 -07:00
exec.c	Merge branch 'kmap_atomic' of git://github.com/congwang/linux	2012-03-21 09:40:26 -07:00
fcntl.c
fhandle.c	vfs: prefer ->dentry->d_sb to ->mnt->mnt_sb	2012-01-06 23:16:53 -05:00
fifo.c
file.c
file_table.c	vfs: prevent remount read-only if pending removes	2012-01-06 23:20:13 -05:00
filesystems.c	vfs: convert fs_supers to hlist	2012-01-03 22:52:39 -05:00
fs-writeback.c	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial	2012-03-20 21:12:50 -07:00
fs_struct.c
generic_acl.c
inode.c	restore smp_mb() in unlock_new_inode()	2012-03-10 17:07:28 -05:00
internal.h	vfs: protect remounting superblock read-only	2012-01-06 23:20:12 -05:00
ioctl.c	vfs: fix up ENOIOCTLCMD error handling	2012-01-05 15:40:12 -08:00
ioprio.c	block: strip out locking optimization in put_io_context()	2012-02-07 07:51:30 +01:00
Kconfig	vfs: use 'unsigned long' accesses for dcache name comparison and hashing	2012-03-08 18:08:44 -08:00
Kconfig.binfmt	fs: binfmt_elf: create Kconfig variable for PIE randomization	2012-01-10 16:30:51 -08:00
libfs.c	fs: move code out of buffer.c	2012-01-03 22:54:07 -05:00
locks.c	vfs: fix handling of lock allocation failure in lease-break case	2011-12-26 10:25:26 -08:00
Makefile	Merge branches 'vfsmount-guts', 'umode_t' and 'partitions' into Z	2012-01-06 23:15:54 -05:00
mbcache.c
mount.h	vfs: keep list of mounts for each superblock	2012-01-06 23:20:12 -05:00
mpage.c	fs: remove unneeded plug in mpage_readpages()	2012-01-12 09:19:54 +01:00
namei.c	fs/namei.c: fix warnings on 32-bit	2012-03-21 17:54:54 -07:00
namespace.c	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial	2012-01-08 13:21:22 -08:00
no-block.c
open.c	switch security_path_chmod() to struct path *	2012-01-06 23:16:53 -05:00
pipe.c	fs: remove the second argument of k[un]map_atomic()	2012-03-20 21:48:21 +08:00
pnode.c	vfs: switch pnode.h macros to struct mount *	2012-01-03 22:57:11 -05:00
pnode.h	vfs: switch pnode.h macros to struct mount *	2012-01-03 22:57:11 -05:00
posix_acl.c	vfs: pass all mask flags check_acl and posix_acl_permission	2011-10-28 14:58:54 +02:00
proc_namespace.c	vfs: switch ->show_options() to struct dentry *	2012-01-06 23:19:54 -05:00
read_write.c	Cross Memory Attach	2011-10-31 17:30:44 -07:00
read_write.h
readdir.c
select.c	sys_poll: fix incorrect type for 'timeout' parameter	2012-02-21 17:24:20 -08:00
seq_file.c	seq_file: fix mishandling of consecutive pread() invocations.	2012-03-21 17:54:54 -07:00
signalfd.c	epoll: ep_unregister_pollwait() can use the freed pwq->whead	2012-02-24 11:42:50 -08:00
splice.c	fs: remove the second argument of k[un]map_atomic()	2012-03-20 21:48:21 +08:00
stack.c	filesystems: add set_nlink()	2011-11-02 12:53:43 +01:00
stat.c	readlinkat: ensure we return ENOENT for the empty pathname for normal lookups	2011-11-02 12:53:42 +01:00
statfs.c	vfs: new helper - vfs_ustat()	2012-01-03 22:53:07 -05:00
super.c	vfs: Provide function to get superblock and wait for it to thaw	2012-02-13 20:45:38 -05:00
sync.c	fs: move code out of buffer.c	2012-01-03 22:54:07 -05:00
timerfd.c
utimes.c
xattr.c	vfs: mnt_drop_write_file()	2012-01-03 22:52:40 -05:00
xattr_acl.c