kernel-fxtec-pro1x

Author	SHA1	Message	Date
Josef Bacik	6a91221304	Btrfs: use dget_parent where we can UPDATED There are lots of places where we do dentry->d_parent->d_inode without holding the dentry->d_lock. This could cause problems with rename. So instead we need to use dget_parent() and hold the reference to the parent as long as we are going to use it's inode and then dput it at the end. Signed-off-by: Josef Bacik <josef@redhat.com> Cc: raven@themaw.net Signed-off-by: Chris Mason <chris.mason@oracle.com>	2010-11-21 22:26:09 -05:00
Josef Bacik	7619585390	Btrfs: fix more ESTALE problems with NFS When creating new inodes we don't setup inode->i_generation. So if we generate an fh with a newly created inode we save the generation of 0, but if we flush the inode to disk and have to read it back when getting the inode on the server we'll have the right i_generation, so gens wont match and we get ESTALE. This patch properly sets inode->i_generation when we create the new inode and now I'm no longer getting ESTALE. Thanks, Signed-off-by: Josef Bacik <josef@redhat.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>	2010-11-21 22:26:08 -05:00
Josef Bacik	2ede0daf01	Btrfs: handle NFS lookups properly People kept reporting NFS issues, specifically getting ESTALE alot. I figured out how to reproduce the problem SERVER mkfs.btrfs /dev/sda1 mount /dev/sda1 /mnt/btrfs-test <add /mnt/btrfs-test to /etc/exports> btrfs subvol create /mnt/btrfs-test/foo service nfs start CLIENT mount server:/mnt/btrfs /mnt/test cd /mnt/test/foo ls SERVER echo 3 > /proc/sys/vm/drop_caches CLIENT ls <-- get an ESTALE here This is because the standard way to lookup a name in nfsd is to use readdir, and what it does is do a readdir on the parent directory looking for the inode of the child. So in this case the parent being / and the child being foo. Well subvols all have the same inode number, so doing a readdir of / looking for inode 256 will return '.', which obviously doesn't match foo. So instead we need to have our own .get_name so that we can find the right name. Our .get_name will either lookup the inode backref or the root backref, whichever we're looking for, and return the name we find. Running the above reproducer with this patch results in everything acting the way its supposed to. Thanks, Signed-off-by: Josef Bacik <josef@redhat.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>	2010-11-21 22:26:08 -05:00
Mariusz Kozlowski	0410c94aff	btrfs: make 1-bit signed fileds unsigned Fixes these sparse warnings: fs/btrfs/ctree.h:811:17: error: dubious one-bit signed bitfield fs/btrfs/ctree.h:812:20: error: dubious one-bit signed bitfield fs/btrfs/ctree.h:813:19: error: dubious one-bit signed bitfield Signed-off-by: Mariusz Kozlowski <mk@lab.zgora.pl> Signed-off-by: Chris Mason <chris.mason@oracle.com>	2010-11-21 22:26:07 -05:00
Li Zefan	f209561ad8	btrfs: Show device attr correctly for symlinks Symlinks and files of other types show different device numbers, though they are on the same partition: $ touch tmp; ln -s tmp tmp2; stat tmp tmp2 File: `tmp' Size: 0 Blocks: 0 IO Block: 4096 regular empty file Device: 15h/21d Inode: 984027 Links: 1 --- snip --- File: `tmp2' -> `tmp' Size: 3 Blocks: 0 IO Block: 4096 symbolic link Device: 13h/19d Inode: 984028 Links: 1 Reported-by: Toke Høiland-Jørgensen <toke@toke.dk> Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>	2010-11-21 22:26:07 -05:00
Li Zefan	5f3888ff6f	btrfs: Set file size correctly in file clone Set src_offset = 0, src_length = 20K, dest_offset = 20K. And the original filesize of the dest file 'file2' is 30K: # ls -l /mnt/file2 -rw-r--r-- 1 root root 30720 Nov 18 16:42 /mnt/file2 Now clone file1 to file2, the dest file should be 40K, but it still shows 30K: # ls -l /mnt/file2 -rw-r--r-- 1 root root 30720 Nov 18 16:42 /mnt/file2 Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>	2010-11-21 22:26:06 -05:00
Li Zefan	2a6b8daeda	btrfs: Check if dest_offset is block-size aligned before cloning file We've done the check for src_offset and src_length, and We should also check dest_offset, otherwise we'll corrupt the destination file: (After cloning file1 to file2 with unaligned dest_offset) # cat /mnt/file2 cat: /mnt/file2: Input/output error Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>	2010-11-21 22:26:05 -05:00
Josef Bacik	0de90876c6	Btrfs: handle the space_cache option properly When I added the clear_cache option I screwed up and took the break out of the space_cache case statement, so whenever you mount with space_cache you also get clear_cache, which does you no good if you say set space_cache in fstab so it always gets set. This patch adds the break back in properly. Signed-off-by: Josef Bacik <josef@redhat.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>	2010-11-21 22:26:05 -05:00
Arne Jansen	6f33434850	btrfs: Fix early enospc because 'unused' calculated with wrong sign. 'unused' calculated with wrong sign in reserve_metadata_bytes(). This might have lead to unwanted over-reservations. Signed-off-by: Arne Jansen <sensille@gmx.net> Reviewed-by: Josef Bacik <josef@redhat.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>	2010-11-21 22:26:04 -05:00
Miao Xie	e65e153554	btrfs: fix panic caused by direct IO btrfs paniced when we write >64KB data by direct IO at one time. Reproduce steps: # mkfs.btrfs /dev/sda5 /dev/sda6 # mount /dev/sda5 /mnt # dd if=/dev/zero of=/mnt/tmpfile bs=100K count=1 oflag=direct Then btrfs paniced: mapping failed logical 1103155200 bio len 69632 len 12288 ------------[ cut here ]------------ kernel BUG at fs/btrfs/volumes.c:3010! [SNIP] Pid: 1992, comm: btrfs-worker-0 Not tainted 2.6.37-rc1 #1 D2399/PRIMERGY RIP: 0010:[<ffffffffa03d1462>] [<ffffffffa03d1462>] btrfs_map_bio+0x202/0x210 [btrfs] [SNIP] Call Trace: [<ffffffffa03ab3eb>] __btrfs_submit_bio_done+0x1b/0x20 [btrfs] [<ffffffffa03a35ff>] run_one_async_done+0x9f/0xb0 [btrfs] [<ffffffffa03d3d20>] run_ordered_completions+0x80/0xc0 [btrfs] [<ffffffffa03d45a4>] worker_loop+0x154/0x5f0 [btrfs] [<ffffffffa03d4450>] ? worker_loop+0x0/0x5f0 [btrfs] [<ffffffffa03d4450>] ? worker_loop+0x0/0x5f0 [btrfs] [<ffffffff81083216>] kthread+0x96/0xa0 [<ffffffff8100cec4>] kernel_thread_helper+0x4/0x10 [<ffffffff81083180>] ? kthread+0x0/0xa0 [<ffffffff8100cec0>] ? kernel_thread_helper+0x0/0x10 We fix this problem by splitting bios when we submit bios. Reported-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com> Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Tested-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>	2010-11-21 22:26:04 -05:00
Miao Xie	88f794ede7	btrfs: cleanup duplicate bio allocating functions extent_bio_alloc() and compressed_bio_alloc() are similar, cleanup similar source code. Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>	2010-11-21 22:26:03 -05:00
Miao Xie	0c56fa9662	btrfs: fix free dip and dip->csums twice bio_endio() will free dip and dip->csums, so dip and dip->csums twice will be freed twice. Fix it. Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>	2010-11-21 22:26:02 -05:00
Chris Mason	784b4e29a2	Btrfs: add migrate page for metadata inode Migrate page will directly call the btrfs btree writepage function, which isn't actually allowed. Our writepage assumes that you have locked the extent_buffer and flagged the block as written. Without doing these steps, we can corrupt metadata blocks. A later commit will remove the btree writepage function since it is really only safely used internally by btrfs. We use writepages for everything else. Signed-off-by: Chris Mason <chris.mason@oracle.com>	2010-11-21 22:26:02 -05:00
Linus Torvalds	b86db47442	Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: Add EXT4_IOC_TRIM ioctl to handle batched discard fs: Do not dispatch FITRIM through separate super_operation ext4: ext4_fill_super shouldn't return 0 on corruption jbd2: fix /proc/fs/jbd2/<dev> when using an external journal ext4: missing unlock in ext4_clear_request_list() ext4: fix setting random pages PageUptodate	2010-11-19 19:46:45 -08:00
Lukas Czerner	e681c047e4	ext4: Add EXT4_IOC_TRIM ioctl to handle batched discard Filesystem independent ioctl was rejected as not common enough to be in core vfs ioctl. Since we still need to access to this functionality this commit adds ext4 specific ioctl EXT4_IOC_TRIM to dispatch ext4_trim_fs(). It takes fstrim_range structure as an argument. fstrim_range is definec in the include/linux/fs.h and its definition is as follows. struct fstrim_range { __u64 start; __u64 len; __u64 minlen; } start - first Byte to trim len - number of Bytes to trim from start minlen - minimum extent length to trim, free extents shorter than this number of Bytes will be ignored. This will be rounded up to fs block size. After the FITRIM is done, the number of actually discarded Bytes is stored in fstrim_range.len to give the user better insight on how much storage space has been really released for wear-leveling. Signed-off-by: Lukas Czerner <lczerner@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2010-11-19 21:47:07 -05:00
Lukas Czerner	93bb41f4f8	fs: Do not dispatch FITRIM through separate super_operation There was concern that FITRIM ioctl is not common enough to be included in core vfs ioctl, as Christoph Hellwig pointed out there's no real point in dispatching this out to a separate vector instead of just through ->ioctl. So this commit removes ioctl_fstrim() from vfs ioctl and trim_fs from super_operation structure. Signed-off-by: Lukas Czerner <lczerner@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2010-11-19 21:18:35 -05:00
Linus Torvalds	76db8ac45f	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: ceph: fix readdir EOVERFLOW on 32-bit archs ceph: fix frag offset for non-leftmost frags ceph: fix dangling pointer ceph: explicitly specify page alignment in network messages ceph: make page alignment explicit in osd interface ceph: fix comment, remove extraneous args ceph: fix update of ctime from MDS ceph: fix version check on racing inode updates ceph: fix uid/gid on resent mds requests ceph: fix rdcache_gen usage and invalidate ceph: re-request max_size if cap auth changes ceph: only let auth caps update max_size ceph: fix open for write on clustered mds ceph: fix bad pointer dereference in ceph_fill_trace ceph: fix small seq message skipping Revert "ceph: update issue_seq on cap grant"	2010-11-19 15:32:22 -08:00
Darrick J. Wong	5a9ae68a34	ext4: ext4_fill_super shouldn't return 0 on corruption At the start of ext4_fill_super, ret is set to -EINVAL, and any failure path out of that function returns ret. However, the generic_check_addressable clause sets ret = 0 (if it passes), which means that a subsequent failure (e.g. a group checksum error) returns 0 even though the mount should fail. This causes vfs_kern_mount in turn to think that the mount succeeded, leading to an oops. A simple fix is to avoid using ret for the generic_check_addressable check, which was last changed in commit `30ca22c70e`. Signed-off-by: Darrick J. Wong <djwong@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2010-11-19 09:56:44 -05:00
Abhijith Das	14870b4575	GFS2: Userland expects quota limit/warn/usage in 512b blocks Userland programs using the quotactl() syscall assume limit/warn/usage block counts in 512b basic blocks which were instead being read/written in fs blocksize in gfs2. With this patch, gfs2 correctly interacts with the syscall using 512b blocks. Signed-off-by: Abhi Das <adas@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2010-11-19 11:20:29 +00:00
Sage Weil	3105c19c45	ceph: fix readdir EOVERFLOW on 32-bit archs One of the readdir filldir_t callers was passing the raw ceph 64-bit ino instead of the hashed 32-bit one, producing an EOVERFLOW in the filler callback. Fix this by calling the ceph_vino_to_ino() helper to do the conversion. Reported-by: Jan Smets <jan.smets@alcatel-lucent.com> Tested-by: Jan Smets <jan.smets@alcatel-lucent.com> Signed-off-by: Sage Weil <sage@newdream.net>	2010-11-18 09:15:07 -08:00
yangsheng	0587aa3d11	jbd2: fix /proc/fs/jbd2/<dev> when using an external journal In jbd2_journal_init_dev(), we need make sure the journal structure is fully initialzied before calling jbd2_stats_proc_init(). Reviewed-by: Andreas Dilger <andreas.dilger@oracle.com> Signed-off-by: yangsheng <sheng.yang@oracle.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2010-11-17 21:46:26 -05:00
Dan Carpenter	f4c8cc652d	ext4: missing unlock in ext4_clear_request_list() If the the li_request_list was empty then it returned with the lock held. Instead of adding a "goto unlock" I just removed that special case and let it go past the empty list_for_each_safe(). Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2010-11-17 21:46:25 -05:00
Markus Trippelsdorf	08da1193d2	ext4: fix setting random pages PageUptodate ext4_end_bio calls put_page and kmem_cache_free before calling SetPageUpdate(). This can result in setting the PageUptodate bit on random pages and causes the following BUG: BUG: Bad page state in process rm pfn:52e54 page:ffffea0001222260 count:0 mapcount:0 mapping: (null) index:0x0 arch kernel: page flags: 0x4000000000000008(uptodate) Fix the problem by moving put_io_page() after the SetPageUpdate() call. Thanks to Hugh Dickins for analyzing this problem. Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de> Tested-by: Markus Trippelsdorf <markus@trippelsdorf.de> Signed-off-by: Markus Trippelsdorf <markus@trippelsdorf.de> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2010-11-17 21:46:06 -05:00
Arnd Bergmann	460781b542	BKL: remove references to lock_kernel from comments Lock_kernel is gone from the code, so the comments should be updated, too. nfsd now uses lock_flocks instead of lock_kernel to protect against posix file locks. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: J. Bruce Fields <bfields@redhat.com> Cc: linux-nfs@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-11-17 08:59:32 -08:00
Arnd Bergmann	451a3c24b0	BKL: remove extraneous #include <smp_lock.h> The big kernel lock has been removed from all these files at some point, leaving only the #include. Remove this too as a cleanup. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-11-17 08:59:32 -08:00
Catalin Marinas	04e4bd1c67	nfs: Ignore kmemleak false positive in nfs_readdir_make_qstr Strings allocated via kmemdup() in nfs_readdir_make_qstr() are referenced from the nfs_cache_array which is stored in a page cache page. Kmemleak does not scan such pages and it reports several false positives. This patch annotates the string->name pointer so that kmemleak does not consider it a real leak. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Bryan Schumaker <bjschuma@netapp.com> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2010-11-16 12:03:14 -05:00
Trond Myklebust	ac39612824	NFS: readdir shouldn't read beyond the reply returned by the server Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2010-11-15 20:44:29 -05:00
Trond Myklebust	8cd51a0ccd	NFS: Fix a couple of regressions in readdir. Fix up the issue that array->eof_index needs to be able to be set even if array->size == 0. Ensure that we catch all important memory allocation error conditions and/or kmap() failures. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2010-11-15 20:44:28 -05:00
Trond Myklebust	23ebbd9acf	Revert "NFSv4: Fall back to ordinary lookup if nfs4_atomic_open() returns EISDIR" This reverts commit `80e60639f1`. This change requires further fixes to ensure that the open doesn't succeed if the lookup later results in a regular file being created. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2010-11-15 20:44:27 -05:00
Paulius Zaleckas	1e657bd51f	Regression: fix mounting NFS when NFSv3 support is not compiled Trying to mount NFS (root partition in my case) fails if CONFIG_NFS_V3 is not selected. nfs_validate_mount_data() returns EPROTONOSUPPORT, because of this check: #ifndef CONFIG_NFS_V3 if (args->version == 3) goto out_v3_not_compiled; #endif /* !CONFIG_NFS_V3 */ and args->version was always initialized to 3. It was working in 2.6.36 Signed-off-by: Paulius Zaleckas <paulius.zaleckas@gmail.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2010-11-15 20:44:27 -05:00
Trond Myklebust	8e35f8e7c6	NLM: Fix a regression in lockd Nick Bowler reports: There are no unusual messages on the client... but I just logged into the server and I see lots of messages of the following form: nfsd: request from insecure port (192.168.8.199:35766)! nfsd: request from insecure port (192.168.8.199:35766)! nfsd: request from insecure port (192.168.8.199:35766)! nfsd: request from insecure port (192.168.8.199:35766)! nfsd: request from insecure port (192.168.8.199:35766)! Bisected to commit `9247685088` (SUNRPC: Properly initialize sock_xprt.srcaddr in all cases) Apparently, removing the 'transport->srcaddr.ss_family = family' from xs_create_sock() triggers this due to nlmclnt_lookup_host() incorrectly initialising the srcaddr family to AF_UNSPEC. Reported-by: Nick Bowler <nbowler@elliptictech.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2010-11-15 20:44:26 -05:00
Linus Torvalds	620751a259	Merge git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-fixes * git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-fixes: GFS2: Fix inode deallocation race	2010-11-15 08:43:29 -08:00
Steven Whitehouse	044b9414c7	GFS2: Fix inode deallocation race This area of the code has always been a bit delicate due to the subtleties of lock ordering. The problem is that for "normal" alloc/dealloc, we always grab the inode locks first and the rgrp lock later. In order to ensure no races in looking up the unlinked, but still allocated inodes, we need to hold the rgrp lock when we do the lookup, which means that we can't take the inode glock. The solution is to borrow the technique already used by NFS to solve what is essentially the same problem (given an inode number, look up the inode carefully, checking that it really is in the expected state). We cannot do that directly from the allocation code (lock ordering again) so we give the job to the pre-existing delete workqueue and carry on with the allocation as normal. If we find there is no space, we do a journal flush (required anyway if space from a deallocation is to be released) which should block against the pending deallocations, so we should always get the space back. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2010-11-15 12:44:42 +00:00
Greg Thelen	d69b78ba1d	ioprio: grab rcu_read_lock in sys_ioprio_{set,get}() Using: - CONFIG_LOCKUP_DETECTOR=y - CONFIG_PREEMPT=y - CONFIG_LOCKDEP=y - CONFIG_PROVE_LOCKING=y - CONFIG_PROVE_RCU=y found a missing rcu lock during boot on a 512 MiB x86_64 ubuntu vm: =================================================== [ INFO: suspicious rcu_dereference_check() usage. ] --------------------------------------------------- kernel/pid.c:419 invoked rcu_dereference_check() without protection! other info that might help us debug this: rcu_scheduler_active = 1, debug_locks = 0 1 lock held by ureadahead/1355: #0: (tasklist_lock){.+.+..}, at: [<ffffffff8115bc09>] sys_ioprio_set+0x7f/0x29e stack backtrace: Pid: 1355, comm: ureadahead Not tainted 2.6.37-dbg-DEV #1 Call Trace: [<ffffffff8109c10c>] lockdep_rcu_dereference+0xaa/0xb3 [<ffffffff81088cbf>] find_task_by_pid_ns+0x44/0x5d [<ffffffff81088cfa>] find_task_by_vpid+0x22/0x24 [<ffffffff8115bc3e>] sys_ioprio_set+0xb4/0x29e [<ffffffff8147cf21>] ? trace_hardirqs_off_thunk+0x3a/0x3c [<ffffffff8105c409>] sysenter_dispatch+0x7/0x2c [<ffffffff8147cee2>] ? trace_hardirqs_on_thunk+0x3a/0x3f The fix is to: a) grab rcu lock in sys_ioprio_{set,get}() and b) avoid grabbing tasklist_lock. Discussion in: http://marc.info/?l=linux-kernel&m=128951324702889 Signed-off-by: Greg Thelen <gthelen@google.com> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Modified by Jens to remove the now redundant inner rcu lock and unlock since they are now protected by the outer lock. Signed-off-by: Jens Axboe <jaxboe@fusionio.com>	2010-11-15 10:23:31 +01:00
Linus Torvalds	1ca7318cac	Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2 * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2: ocfs2: Change some lock status member in ocfs2_lock_res to char.	2010-11-14 13:04:53 -08:00
Steve French	362d31297f	[CIFS] fs/cifs/Kconfig: CIFS depends on CRYPTO_HMAC linux-2.6.37-rc1: I compiled a kernel with CIFS which subsequently failed with an error indicating it couldn't initialize crypto module "hmacmd5". CONFIG_CRYPTO_HMAC=y fixed the problem. This patch makes CIFS depend on CRYPTO_HMAC in kconfig. Signed-off-by: Jody Bruchon<jody@nctritech.com> CC: Shirish Pargaonkar <shirishp@us.ibm.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2010-11-14 03:34:30 +00:00
Tao Ma	1c66b360fe	ocfs2: Change some lock status member in ocfs2_lock_res to char. Commit `83fd9c7` changes l_level, l_requested and l_blocking of ocfs2_lock_res from int to unsigned char. But actually it is initially as -1(ocfs2_lock_res_init_common) which correspoding to 255 for unsigned char. So the whole dlm lock mechanism doesn't work now which means a disaster to ocfs2. Cc: Goldwyn Rodrigues <rgoldwyn@suse.de> Signed-off-by: Tao Ma <tao.ma@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>	2010-11-13 03:15:08 -08:00
Jeff Layton	59c55ba1fb	cifs: don't take extra tlink reference in initiate_cifs_search It's possible for initiate_cifs_search to be called on a filp that already has private_data attached. If this happens, we'll end up calling cifs_sb_tlink, taking an extra reference to the tlink and attaching that to the cifsFileInfo. This leads to refcount leaks that manifest as a "stuck" cifsd at umount time. Fix this by only looking up the tlink for the cifsFile on the filp's first pass through this function. When called on a filp that already has cifsFileInfo associated with it, just use the tlink reference that it already owns. This patch fixes samba.org bug 7792: https://bugzilla.samba.org/show_bug.cgi?id=7792 Signed-off-by: Jeff Layton <jlayton@redhat.com> Reviewed-and-Tested-by: Suresh Jayaraman <sjayaraman@suse.de> Signed-off-by: Steve French <sfrench@us.ibm.com>	2010-11-13 03:26:17 +00:00
Linus Torvalds	8a9f772c14	Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block * 'for-linus' of git://git.kernel.dk/linux-2.6-block: (27 commits) block: remove unused copy_io_context() Documentation: remove anticipatory scheduler info block: remove REQ_HARDBARRIER ioprio: rcu_read_lock/unlock protect find_task_by_vpid call (V2) ioprio: fix RCU locking around task dereference block: ioctl: fix information leak to userland block: read i_size with i_size_read() cciss: fix proc warning on attempt to remove non-existant directory bio: take care not overflow page count when mapping/copying user data block: limit vec count in bio_kmalloc() and bio_alloc_map_data() block: take care not to overflow when calculating total iov length block: check for proper length of iov entries in blk_rq_map_user_iov() cciss: remove controllers supported by hpsa cciss: use usleep_range not msleep for small sleeps cciss: limit commands allocated on reset_devices cciss: Use kernel provided PCI state save and restore functions cciss: fix board status waiting code drbd: Removed checks for REQ_HARDBARRIER on incomming BIOs drbd: REQ_HARDBARRIER -> REQ_FUA transition for meta data accesses drbd: Removed the BIO_RW_BARRIER support form the receiver/epoch code ...	2010-11-12 08:52:47 -08:00
Linus Torvalds	fb1cb7b27b	Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfs * 'for-linus' of git://oss.sgi.com/xfs/xfs: xfs: remove incorrect assert in xfs_vm_writepage xfs: use hlist_add_fake xfs: fix a few compiler warnings with CONFIG_XFS_QUOTA=n xfs: tell lockdep about parent iolock usage in filestreams xfs: move delayed write buffer trace xfs: fix per-ag reference counting in inode reclaim tree walking xfs: xfs_ioctl: fix information leak to userland xfs: remove experimental tag from the delaylog option	2010-11-12 08:11:03 -08:00
Linus Torvalds	0f90933c47	Merge branch 'for-2.6.37' of git://linux-nfs.org/~bfields/linux * 'for-2.6.37' of git://linux-nfs.org/~bfields/linux: locks: remove dead lease error-handling code locks: fix leak on merging leases nfsd4: fix 4.1 connection registration race	2010-11-12 07:59:41 -08:00
Dave Jones	52ca0e84b0	hugetlbfs: lessen the impact of a deprecation warning WARN_ONCE is a bit strong for a deprecation warning, given that it spews a huge backtrace. Signed-off-by: Dave Jones <davej@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-11-12 07:55:32 -08:00
Sage Weil	7b88dadc13	ceph: fix frag offset for non-leftmost frags We start at offset 2 for the leftmost frag, and 0 for subsequent frags. When we reach the end (rightmost), we go back to 2. This fixes readdir on fragmented (large) directories. Signed-off-by: Sage Weil <sage@newdream.net>	2010-11-11 16:48:59 -08:00
Sage Weil	a1629c3b24	ceph: fix dangling pointer Clear fi->last_name when it's freed. The only caller is rewinddir() (or equivalent lseek). Signed-off-by: Sage Weil <sage@newdream.net>	2010-11-11 15:24:06 -08:00
Shirish Pargaonkar	987b21d7d9	cifs: Percolate error up to the caller during get/set acls [try #4 ] Modify get/set_cifs_acl* calls to reutrn error code and percolate the error code up to the caller. Signed-off-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2010-11-11 03:54:36 +00:00
Oskar Schirmer	a7851ce73b	cifs: fix another memleak, in cifs_root_iget cifs_root_iget allocates full_path through cifs_build_path_to_root, but fails to kfree it upon cifs_get_inode_info* failure. Make all failure exit paths traverse clean up handling at the end of the function. Signed-off-by: Oskar Schirmer <oskar@scara.com> Reviewed-by: Jesper Juhl <jj@chaosbits.net> Cc: stable@kernel.org Signed-off-by: Steve French <sfrench@us.ibm.com>	2010-11-11 03:40:13 +00:00
Christoph Hellwig	ece413f59f	xfs: remove incorrect assert in xfs_vm_writepage In commit `20cb52ebd1`, titled "xfs: simplify xfs_vm_writepage" I added an assert that any !mapped and uptodate buffers are not dirty. That asserts turns out to trigger a lot when running fsx on filesystems with small block sizes. The reason for that is that the assert is simply incorrect. !mapped and uptodate just mean this buffer covers a hole, and whenever we do a set_page_dirty we mark all blocks in the page dirty, no matter if they have data or not. So remove the assert, and update the comment above the condition to match reality. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>	2010-11-10 15:51:10 -06:00
J. Bruce Fields	8896b93f42	locks: remove dead lease error-handling code A minor oversight from `f7347ce4ee`, "fasync: re-organize fasync entry insertion to allow it under a spinlock": this cleanup-on-error was only needed to handle -ENOMEM. Now that we're preallocating it's unneeded. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-11-10 14:31:29 -05:00
J. Bruce Fields	3df057ac9a	locks: fix leak on merging leases We must also free the passed-in lease in the case it wasn't used because an existing lease was upgrade/downgraded or already existed. Note the nfsd caller doesn't care because it's fl_change callback returns an error in those cases. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-11-10 14:31:23 -05:00
Christoph Hellwig	c6f6cd0608	xfs: use hlist_add_fake XFS does not need it's inodes to actuall be hashed in the VFS inode cache, but we require the inode to be marked hashed for the writeback code to work. Insted of using insert_inode_hash, which requires a second inode_lock roundtrip after the partial merge of the inode scalability patches in 2.6.37-rc simply use the new hlist_add_fake helper to mark it hashed without requiring a lock or touching a global cache line. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>	2010-11-10 12:00:48 -06:00
Christoph Hellwig	5d2bf8a55e	xfs: fix a few compiler warnings with CONFIG_XFS_QUOTA=n Andi Kleen reported that gcc-4.5 gives lots of warnings for him inside the XFS code. It turned out most of them are due to the quota stubs beeing macros, and gcc now complaining about macros evaluating to 0 that are not assigned to variables. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>	2010-11-10 12:00:48 -06:00
Christoph Hellwig	785ce41805	xfs: tell lockdep about parent iolock usage in filestreams The filestreams code may take the iolock on the parent inode while holding it on a child. This is the only place in XFS where we take both the child and parent iolock, so just telling lockdep about it is enough. The lock flag required for that was already added as part of the ilock lockdep annotations and unused so far. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>	2010-11-10 12:00:48 -06:00
Dave Chinner	bfe2741967	xfs: move delayed write buffer trace The delayed write buffer split trace currently issues a trace for every buffer it scans. These buffers are not necessarily queued for delayed write. Indeed, when buffers are pinned, there can be thousands of traces of buffers that aren't actually queued for delayed write and the ones that are are lost in the noise. Move the trace point to record only buffers that are split out for IO to be issued on. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>	2010-11-10 12:00:48 -06:00
Dave Chinner	f83282a8ef	xfs: fix per-ag reference counting in inode reclaim tree walking The walk fails to decrement the per-ag reference count when the non-blocking walk fails to obtain the per-ag reclaim lock, leading to an assert failure on debug kernels when unmounting a filesystem. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>	2010-11-10 12:00:48 -06:00
Kulikov Vasiliy	6762b938ea	xfs: xfs_ioctl: fix information leak to userland al_hreq is copied from userland. If al_hreq.buflen is not properly aligned then xfs_attr_list will ignore the last bytes of kbuf. These bytes are unitialized. It leads to leaking of contents of kernel stack memory. Signed-off-by: Vasiliy Kulikov <segooon@gmail.com> Signed-off-by: Alex Elder <aelder@sgi.com>	2010-11-10 12:00:47 -06:00
Christoph Hellwig	5d0af85cd0	xfs: remove experimental tag from the delaylog option We promised to do this for 2.6.37, and the code looks stable enough to keep that promise. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Alex Elder <aelder@sgi.com>	2010-11-10 12:00:47 -06:00
Jeff Layton	ebe2e91e00	cifs: fix potential use-after-free in cifs_oplock_break_put cfile may very well be freed after the cifsFileInfo_put. Make sure we have a valid pointer to the superblock for cifs_sb_deactive. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2010-11-10 15:37:17 +00:00
Sergey Senozhatsky	f85acd81aa	ioprio: rcu_read_lock/unlock protect find_task_by_vpid call (V2) Commit `4221a9918e` "Add RCU check for find_task_by_vpid()" introduced rcu_lockdep_assert to find_task_by_pid_ns= Assertion failed in sys_ioprio_get. The patch is fixing assertion failure in ioprio_set as well. kernel/pid.c:419 invoked rcu_dereference_check() without protection! stack backtrace: Pid: 4254, comm: iotop Not tainted Call Trace: [<ffffffff810656f2>] lockdep_rcu_dereference+0xaa/0xb2 [<ffffffff81053c67>] find_task_by_pid_ns+0x4f/0x68 [<ffffffff81053c9d>] find_task_by_vpid+0x1d/0x1f [<ffffffff811104e2>] sys_ioprio_get+0x50/0x2da [<ffffffff81002182>] system_call_fastpath+0x16/0x1b V2: rcu critical section expanded according to comment by Paul E. McKenney Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>	2010-11-10 14:40:53 +01:00
Daniel J Blueman	1447399b3e	ioprio: fix RCU locking around task dereference With 2.6.37-rc1, I observe sys_ioprio_set not taking the RCU lock [1] across access to the task credentials. Inspecting the code in fs/ioprio.c, the tasklist_lock is held for read across the __task_cred call, which is presumably sufficient to prevent the task credentials becoming stale. =================================================== [ INFO: suspicious rcu_dereference_check() usage. ] --------------------------------------------------- kernel/pid.c:419 invoked rcu_dereference_check() without protection! other info that might help us debug this: rcu_scheduler_active = 1, debug_locks = 1 1 lock held by start-stop-daem/2246: #0: (tasklist_lock){.?.?..}, at: [<ffffffff811a2dfa>] sys_ioprio_set+0x8a/0x400 stack backtrace: Pid: 2246, comm: start-stop-daem Not tainted 2.6.37-rc1-330cd+ #2 Call Trace: [<ffffffff8109f5f4>] lockdep_rcu_dereference+0xa4/0xc0 [<ffffffff81085651>] find_task_by_pid_ns+0x81/0x90 [<ffffffff8108567d>] find_task_by_vpid+0x1d/0x20 [<ffffffff811a3160>] sys_ioprio_set+0x3f0/0x400 [<ffffffff816efa79>] ? trace_hardirqs_on_thunk+0x3a/0x3f [<ffffffff81003482>] system_call_fastpath+0x16/0x1b Take the RCU lock for read across acquiring the pointer to the task credentials and dereferencing it. Signed-off-by: Daniel J Blueman <daniel.blueman@gmail.com> Fixed up by Jens to fix missing rcu_read_unlock() on mismatches. Signed-off-by: Jens Axboe <jaxboe@fusionio.com>	2010-11-10 14:40:53 +01:00
Jens Axboe	cb4644cac4	bio: take care not overflow page count when mapping/copying user data If the iovec is being set up in a way that causes uaddr + PAGE_SIZE to overflow, we could end up attempting to map a huge number of pages. Check for this invalid input type. Reported-by: Dan Rosenberg <drosenberg@vsecurity.com> Cc: stable@kernel.org Signed-off-by: Jens Axboe <jaxboe@fusionio.com>	2010-11-10 14:40:43 +01:00
Jens Axboe	f3f63c1c28	block: limit vec count in bio_kmalloc() and bio_alloc_map_data() Reported-by: Dan Rosenberg <drosenberg@vsecurity.com> Cc: stable@kernel.org Signed-off-by: Jens Axboe <jaxboe@fusionio.com>	2010-11-10 14:40:42 +01:00
Sage Weil	b7495fc2ff	ceph: make page alignment explicit in osd interface We used to infer alignment of IOs within a page based on the file offset, which assumed they matched. This broke with direct IO that was not aligned to pages (e.g., 512-byte aligned IO). We were also trusting the alignment specified in the OSD reply, which could have been adjusted by the server. Explicitly specify the page alignment when setting up OSD IO requests. Signed-off-by: Sage Weil <sage@newdream.net>	2010-11-09 12:43:12 -08:00
Sage Weil	e98b6fed84	ceph: fix comment, remove extraneous args The offset/length arguments aren't used. Signed-off-by: Sage Weil <sage@newdream.net>	2010-11-09 12:24:53 -08:00
Linus Torvalds	f6614b7bb4	Merge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6: cifs: fix a memleak in cifs_setattr_nounix() cifs: make cifs_ioctl handle NULL filp->private_data correctly	2010-11-09 10:34:48 -08:00
Suresh Jayaraman	3565bd46b1	cifs: fix a memleak in cifs_setattr_nounix() Andrew Hendry reported a kmemleak warning in 2.6.37-rc1 while editing a text file with gedit over cifs. unreferenced object 0xffff88022ee08b40 (size 32): comm "gedit", pid 2524, jiffies 4300160388 (age 2633.655s) hex dump (first 32 bytes): 5c 2e 67 6f 75 74 70 75 74 73 74 72 65 61 6d 2d \.goutputstream- 35 42 41 53 4c 56 00 de 09 00 00 00 2c 26 78 ee 5BASLV......,&x. backtrace: [<ffffffff81504a4d>] kmemleak_alloc+0x2d/0x60 [<ffffffff81136e13>] __kmalloc+0xe3/0x1d0 [<ffffffffa0313db0>] build_path_from_dentry+0xf0/0x230 [cifs] [<ffffffffa031ae1e>] cifs_setattr+0x9e/0x770 [cifs] [<ffffffff8115fe90>] notify_change+0x170/0x2e0 [<ffffffff81145ceb>] sys_fchmod+0x10b/0x140 [<ffffffff8100c172>] system_call_fastpath+0x16/0x1b [<ffffffffffffffff>] 0xffffffffffffffff The commit `1025774c` that removed inode_setattr() seems to have introduced this memleak by returning early without freeing 'full_path'. Reported-by: Andrew Hendry <andrew.hendry@gmail.com> Cc: Christoph Hellwig <hch@lst.de> Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Suresh Jayaraman <sjayaraman@suse.de> Signed-off-by: Steve French <sfrench@us.ibm.com>	2010-11-09 15:17:53 +00:00
Meelis Roos	0e15482566	sparc: fix openpromfs compile Fix openpromfs compilation by adding a missing semicolon in fs/openpromfs/inode.c openprom_mount(). Signed-off-by: Meelis Roos <mroos@linux.ee> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-11-08 14:29:39 -08:00
Linus Torvalds	a7bcf21e60	Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: Add new ext4 inode tracepoints ext4: Don't call sb_issue_discard() in ext4_free_blocks() ext4: do not try to grab the s_umount semaphore in ext4_quota_off ext4: fix potential race when freeing ext4_io_page structures ext4: handle writeback of inodes which are being freed ext4: initialize the percpu counters before replaying the journal ext4: "ret" may be used uninitialized in ext4_lazyinit_thread() ext4: fix lazyinit hang after removing request	2010-11-08 11:54:53 -08:00
Jeff Layton	618763958b	cifs: make cifs_ioctl handle NULL filp->private_data correctly Commit `13cfb7334e` made cifs_ioctl use the tlink attached to the cifsFileInfo for a filp. This ignores the case of an open directory however, which in CIFS can have a NULL private_data until a readdir is done on it. This patch re-adds the NULL pointer checks that were removed in commit `50ae28f01` and moves the setting of tcon and "caps" variables lower. Long term, a better fix would be to establish a f_op->open routine for directories that populates that field at open time, but that requires some other changes to how readdir calls are handled. Reported-by: Kjell Rune Skaaraas <kjella79@yahoo.no> Reviewed-and-Tested-by: Suresh Jayaraman <sjayaraman@suse.de> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2010-11-08 18:56:36 +00:00
Theodore Ts'o	7ff9c073dd	ext4: Add new ext4 inode tracepoints Add ext4_evict_inode, ext4_drop_inode, ext4_mark_inode_dirty, and ext4_begin_ordered_truncate() Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2010-11-08 13:51:33 -05:00
Theodore Ts'o	b56ff9d397	ext4: Don't call sb_issue_discard() in ext4_free_blocks() Commit `5c521830cf` (ext4: Support discard requests when running in no-journal mode) attempts to add sb_issue_discard() for data blocks (in data=writeback mode) and in no-journal mode. Unfortunately, this no longer works, because in commit `dd3932eddf` (block: remove BLKDEV_IFL_WAIT), sb_issue_discard() only presents a synchronous interface, and there are times when we call ext4_free_blocks() when we are are holding a spinlock, or are otherwise in an atomic context. For now, I've removed the call to sb_issue_discard() to prevent a deadlock or (if spinlock debugging is enabled) failures like this: BUG: scheduling while atomic: rc.sysinit/1376/0x00000002 Pid: 1376, comm: rc.sysinit Not tainted 2.6.36-ARCH #1 Call Trace: [<ffffffff810397ce>] __schedule_bug+0x5e/0x70 [<ffffffff81403110>] schedule+0x950/0xa70 [<ffffffff81060bad>] ? insert_work+0x7d/0x90 [<ffffffff81060fbd>] ? queue_work_on+0x1d/0x30 [<ffffffff81061127>] ? queue_work+0x37/0x60 [<ffffffff8140377d>] schedule_timeout+0x21d/0x360 [<ffffffff812031c3>] ? generic_make_request+0x2c3/0x540 [<ffffffff81402680>] wait_for_common+0xc0/0x150 [<ffffffff81041490>] ? default_wake_function+0x0/0x10 [<ffffffff812034bc>] ? submit_bio+0x7c/0x100 [<ffffffff810680a0>] ? wake_bit_function+0x0/0x40 [<ffffffff814027b8>] wait_for_completion+0x18/0x20 [<ffffffff8120a969>] blkdev_issue_discard+0x1b9/0x210 [<ffffffff811ba03e>] ext4_free_blocks+0x68e/0xb60 [<ffffffff811b1650>] ? __ext4_handle_dirty_metadata+0x110/0x120 [<ffffffff811b098c>] ext4_ext_truncate+0x8cc/0xa70 [<ffffffff810d713e>] ? pagevec_lookup+0x1e/0x30 [<ffffffff81191618>] ext4_truncate+0x178/0x5d0 [<ffffffff810eacbb>] ? unmap_mapping_range+0xab/0x280 [<ffffffff810d8976>] vmtruncate+0x56/0x70 [<ffffffff811925cb>] ext4_setattr+0x14b/0x460 [<ffffffff811319e4>] notify_change+0x194/0x380 [<ffffffff81117f80>] do_truncate+0x60/0x90 [<ffffffff811e08fa>] ? security_inode_permission+0x1a/0x20 [<ffffffff811eaec1>] ? tomoyo_path_truncate+0x11/0x20 [<ffffffff81127539>] do_last+0x5d9/0x770 [<ffffffff811278bd>] do_filp_open+0x1ed/0x680 [<ffffffff8140644f>] ? page_fault+0x1f/0x30 [<ffffffff81132bfc>] ? alloc_fd+0xec/0x140 [<ffffffff81118db1>] do_sys_open+0x61/0x120 [<ffffffff81118e8b>] sys_open+0x1b/0x20 [<ffffffff81002e6b>] system_call_fastpath+0x16/0x1b https://bugzilla.kernel.org/show_bug.cgi?id=22302 Reported-by: Mathias Burén <mathias.buren@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: jiayingz@google.com	2010-11-08 13:49:33 -05:00
Dmitry Monakhov	87009d86dc	ext4: do not try to grab the s_umount semaphore in ext4_quota_off It's not needed to sync the filesystem, and it fixes a lock_dep complaint. Signed-off-by: Dmitry Monakhov <dmonakhov@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reviewed-by: Jan Kara <jack@suse.cz>	2010-11-08 13:47:33 -05:00
Theodore Ts'o	83668e7141	ext4: fix potential race when freeing ext4_io_page structures Use an atomic_t and make sure we don't free the structure while we might still be submitting I/O for that page. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2010-11-08 13:45:33 -05:00
Theodore Ts'o	f7ad6d2e92	ext4: handle writeback of inodes which are being freed The following BUG can occur when an inode which is getting freed when it still has dirty pages outstanding, and it gets deleted (in this because it was the target of a rename). In ordered mode, we need to make sure the data pages are written just in case we crash before the rename (or unlink) is committed. If the inode is being freed then when we try to igrab the inode, we end up tripping the BUG_ON at fs/ext4/page-io.c:146. To solve this problem, we need to keep track of the number of io callbacks which are pending, and avoid destroying the inode until they have all been completed. That way we don't have to bump the inode count to keep the inode from being destroyed; an approach which doesn't work because the count could have already been dropped down to zero before the inode writeback has started (at which point we're not allowed to bump the count back up to 1, since it's already started getting freed). Thanks to Dave Chinner for suggesting this approach, which is also used by XFS. kernel BUG at /scratch_space/linux-2.6/fs/ext4/page-io.c:146! Call Trace: [<ffffffff811075b1>] ext4_bio_write_page+0x172/0x307 [<ffffffff811033a7>] mpage_da_submit_io+0x2f9/0x37b [<ffffffff811068d7>] mpage_da_map_and_submit+0x2cc/0x2e2 [<ffffffff811069b3>] mpage_add_bh_to_extent+0xc6/0xd5 [<ffffffff81106c66>] write_cache_pages_da+0x2a4/0x3ac [<ffffffff81107044>] ext4_da_writepages+0x2d6/0x44d [<ffffffff81087910>] do_writepages+0x1c/0x25 [<ffffffff810810a4>] __filemap_fdatawrite_range+0x4b/0x4d [<ffffffff810815f5>] filemap_fdatawrite_range+0xe/0x10 [<ffffffff81122a2e>] jbd2_journal_begin_ordered_truncate+0x7b/0xa2 [<ffffffff8110615d>] ext4_evict_inode+0x57/0x24c [<ffffffff810c14a3>] evict+0x22/0x92 [<ffffffff810c1a3d>] iput+0x212/0x249 [<ffffffff810bdf16>] dentry_iput+0xa1/0xb9 [<ffffffff810bdf6b>] d_kill+0x3d/0x5d [<ffffffff810be613>] dput+0x13a/0x147 [<ffffffff810b990d>] sys_renameat+0x1b5/0x258 [<ffffffff81145f71>] ? _atomic_dec_and_lock+0x2d/0x4c [<ffffffff810b2950>] ? cp_new_stat+0xde/0xea [<ffffffff810b29c1>] ? sys_newlstat+0x2d/0x38 [<ffffffff810b99c6>] sys_rename+0x16/0x18 [<ffffffff81002a2b>] system_call_fastpath+0x16/0x1b Reported-by: Nick Bowler <nbowler@elliptictech.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Tested-by: Nick Bowler <nbowler@elliptictech.com>	2010-11-08 13:43:33 -05:00
Sage Weil	d8672d64b8	ceph: fix update of ctime from MDS The client can have a newer ctime than the MDS due to AUTH_EXCL and XATTR_EXCL caps as well; update the check in ceph_fill_file_time appropriately. This fixes cases where ctime/mtime goes backward under the right sequence of local updates (e.g. chmod) and mds replies (e.g. subsequent stat that goes to the MDS). Signed-off-by: Sage Weil <sage@newdream.net>	2010-11-08 09:24:34 -08:00
Sage Weil	8bd59e0188	ceph: fix version check on racing inode updates We may get updates on the same inode from multiple MDSs; generally we only pay attention if the update is newer than what we already have. The exception is when an MDS sense unstable information, in which case we always update. The old > check got this wrong when our version was odd (e.g. 3) and the reply version was even (e.g. 2): the older stale (v2) info would be applied. Fixed and clarified the comment. Signed-off-by: Sage Weil <sage@newdream.net>	2010-11-08 09:23:12 -08:00
Sage Weil	cb4276cca4	ceph: fix uid/gid on resent mds requests MDS requests can be rebuilt and resent in non-process context, but were filling in uid/gid from current_fsuid/gid. Put that information in the request struct on request setup. This fixes incorrect (and root) uid/gid getting set for requests that are forwarded between MDSs, usually due to metadata migrations. Signed-off-by: Sage Weil <sage@newdream.net>	2010-11-08 07:29:05 -08:00
Sage Weil	cd045cb42a	ceph: fix rdcache_gen usage and invalidate We used to use rdcache_gen to indicate whether we "might" have cached pages. Now we just look at the mapping to determine that. However, some old behavior remains from that transition. First, rdcache_gen == 0 no longer means we have no pages. That can happen at any time (presumably when we carry FILE_CACHE). We should not reset it to zero, and we should not check that it is zero. That means that the only purpose for rdcache_revoking is to resolve races between new issues of FILE_CACHE and an async invalidate. If they are equal, we should invalidate. On success, we decrement rdcache_revoking, so that it is no longer equal to rdcache_gen. Similarly, if we success in doing a sync invalidate, set revoking = gen - 1. (This is a small optimization to avoid doing unnecessary invalidate work and does not affect correctness.) Signed-off-by: Sage Weil <sage@newdream.net>	2010-11-08 07:29:05 -08:00
Sage Weil	feb4cc9bb4	ceph: re-request max_size if cap auth changes If the auth cap migrates to another MDS, clear requested_max_size so that we resend any pending max_size increase requests. This fixes potential hangs on writes that extend a file and race with an cap migration between MDSs. Signed-off-by: Sage Weil <sage@newdream.net>	2010-11-07 09:39:23 -08:00
Sage Weil	912a9b0319	ceph: only let auth caps update max_size Only the auth MDS has a meaningful max_size value for us, so only update it in fill_inode if we're being issued an auth cap. Otherwise, a random stat result from a non-auth MDS can clobber a meaningful max_size, get the client<->mds cap state out of sync, and make writes hang. Specifically, even if the client re-requests a larger max_size (which it will), the MDS won't respond because as far as it knows we already have a sufficiently large value. Signed-off-by: Sage Weil <sage@newdream.net>	2010-11-07 09:39:21 -08:00
Sage Weil	7421ab8041	ceph: fix open for write on clustered mds Normally when we open a file we already have a cap, and simply update the wanted set. However, if we open a file for write, but don't have an auth cap, that doesn't work; we need to open a new cap with the auth MDS. Only reuse existing caps if we are opening for read or the existing cap is auth. Signed-off-by: Sage Weil <sage@newdream.net>	2010-11-07 09:07:15 -08:00
Sage Weil	d8b16b3d1c	ceph: fix bad pointer dereference in ceph_fill_trace We dereference *in a few lines down, but only set it on rename. It is apparently pretty rare for this to trigger, but I have been hitting it with a clustered MDSs. Signed-off-by: Sage Weil <sage@newdream.net>	2010-11-07 08:40:43 -08:00
Linus Torvalds	2e5c36722d	Merge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6: cifs: make cifs_set_oplock_level() take a cifsInodeInfo pointer cifs: dereferencing first then checking cifs: trivial comment fix: tlink_tree is now a rbtree [CIFS] Cleanup unused variable build warning cifs: convert tlink_tree to a rbtree cifs: store pointer to master tlink in superblock (try #2) cifs: trivial doc fix: note setlease implemented CIFS: Add cifs_set_oplock_level FS: cifs, remove unneeded NULL tests	2010-11-05 14:17:01 -07:00
Pavel Shilovsky	c67236281c	cifs: make cifs_set_oplock_level() take a cifsInodeInfo pointer All the callers already have a pointer to struct cifsInodeInfo. Use it. Signed-off-by: Suresh Jayaraman <sjayaraman@suse.de> Signed-off-by: Pavel Shilovsky <piastryyy@gmail.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2010-11-05 17:39:01 +00:00
Jeff Layton	d38922949d	cifs: dereferencing first then checking This patch is based on Dan's original patch. His original description is below: Smatch complained about a couple checking for NULL after dereferencing bugs. I'm not super familiar with the code so I did the conservative thing and move the dereferences after the checks. The dereferences in cifs_lock() and cifs_fsync() were added in `ba00ba64cf` "cifs: make various routines use the cifsFileInfo->tcon pointer". The dereference in find_writable_file() was added in `6508d904e6` "cifs: have find_readable/writable_file filter by fsuid". The comments there say it's possible to trigger the NULL dereference under stress. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2010-11-04 19:39:07 +00:00
Suresh Jayaraman	6ef933a38a	cifs: trivial comment fix: tlink_tree is now a rbtree Noticed while reviewing (late) the rbtree conversion patchset (which has been merged already). Cc: Jeff Layton <jlayton@redhat.com> Signed-off-by: Suresh Jayaraman <sjayaraman@suse.de> Signed-off-by: Steve French <sfrench@us.ibm.com>	2010-11-04 19:35:30 +00:00
Theodore Ts'o	ce7e010aef	ext4: initialize the percpu counters before replaying the journal We now initialize the percpu counters before replaying the journal, but after the journal, we recalculate the global counters, to deal with the possibility of the per-blockgroup counts getting updated by the journal replay. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2010-11-03 12:03:21 -04:00
J. Bruce Fields	21b75b0199	nfsd4: fix 4.1 connection registration race If a connection is closed just after a sequence or create_session is sent over it, we could end up trying to register a callback that will never get called since the xprt is already marked dead. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-11-02 17:13:52 -04:00
Steve French	54eeafe1e4	[CIFS] Cleanup unused variable build warning Signed-off-by: Steve French <sfrench@us.ibm.com>	2010-11-02 19:22:45 +00:00
Jeff Layton	b647c35f77	cifs: convert tlink_tree to a rbtree Radix trees are ideal when you want to track a bunch of pointers and can't embed a tracking structure within the target of those pointers. The tradeoff is an increase in memory, particularly if the tree is sparse. In CIFS, we use the tlink_tree to track tcon_link structs. A tcon_link can never be in more than one tlink_tree, so there's no impediment to using a rb_tree here instead of a radix tree. Convert the new multiuser mount code to use a rb_tree instead. This should reduce the memory required to manage the tlink_tree. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2010-11-02 19:20:23 +00:00
Jeff Layton	413e661c13	cifs: store pointer to master tlink in superblock (try #2 ) This is the second version of this patch, the only difference between it and the first one is that this explicitly makes cifs_sb_master_tlink a static inline. Instead of keeping a tag on the master tlink in the tree, just keep a pointer to the master in the superblock. That eliminates the need for using the radix tree to look up a tagged entry. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2010-11-02 19:15:09 +00:00
J. Bruce Fields	df098db12a	cifs: trivial doc fix: note setlease implemented Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2010-11-02 18:48:14 +00:00
Pavel Shilovsky	e66673e39a	CIFS: Add cifs_set_oplock_level Simplify many places when we need to set oplock level on an inode. Signed-off-by: Pavel Shilovsky <piastryyy@gmail.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2010-11-02 18:40:54 +00:00
Theodore Ts'o	b2c78cd09b	ext4: "ret" may be used uninitialized in ext4_lazyinit_thread() Newer GCC's reported the following build warning: fs/ext4/super.c: In function 'ext4_lazyinit_thread': fs/ext4/super.c:2702: warning: 'ret' may be used uninitialized in this function Fix it by removing the need for the ret variable in the first place. Signed-off-by: "Lukas Czerner" <lczerner@redhat.com> Reported-by: "Stefan Richter" <stefanr@s5r6.in-berlin.de> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2010-11-02 14:19:30 -04:00
Lukas Czerner	f4245bd4eb	ext4: fix lazyinit hang after removing request When the request has been removed from the list and no other request has been issued, we will end up with next wakeup scheduled to MAX_JIFFY_OFFSET which is bad. So check for that. Signed-off-by: Lukas Czerner <lczerner@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2010-11-02 14:07:17 -04:00
Theodore Ts'o	eb8abb927a	ext4: Remove useless spinlock in ext4_getattr() Linus noted, and complained to me, that doing while lots of "git diff"'s of kernel sources, these spinlocks were responsible for 27% of the spinlock cost on his two-processor system as reported by perf. Git was doing lots of parallel stats, and this was putting a lot of pressure on ext4_getattr(). A spinlock to protect a single memory-to-memory copy is pointless, so remove it. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-11-02 10:38:30 -04:00
Steve French	ce2f6fb8bd	Merge branch 'master' of /pub/scm/linux/kernel/git/torvalds/linux-2.6	2010-11-02 03:48:02 +00:00
Jiri Slaby	50ae28f014	FS: cifs, remove unneeded NULL tests Stanse found that pSMBFile in cifs_ioctl and file->f_path.dentry in cifs_user_write are dereferenced prior their test to NULL. The alternative is not to dereference them before the tests. The patch is to point out the problem, you have to decide. While at it we cache the inode in cifs_user_write to a local variable and use all over the function. Signed-off-by: Jiri Slaby <jslaby@suse.cz> Cc: Steve French <sfrench@samba.org> Cc: linux-cifs@vger.kernel.org Cc: Jeff Layton <jlayton@redhat.com> Cc: Christoph Hellwig <hch@infradead.org> Signed-off-by: Steve French <sfrench@us.ibm.com>	2010-11-02 03:47:21 +00:00
Paul Mundt	e99d11d199	fs: logfs: Fix up MTD=y build. Commit `7d945a3aa7` ("logfs get_sb, part 3") broke the logfs build when CONFIG_MTD is set due to a mangled logfs_get_sb_mtd() definition. Signed-off-by: Paul Mundt <lethal@linux-sh.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-11-01 16:34:56 -04:00
Linus Torvalds	82279e6bd7	Merge branches 'irq-core-for-linus' and 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: genirq: Fix up irq_node() for irq_data changes. genirq: Add single IRQ reservation helper genirq: Warn if enable_irq is called before irq is set up * 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: semaphore: Remove mutex emulation staging: Final semaphore cleanup jbd2: Convert jbd2_slab_create_sem to mutex hpfs: Convert sbi->hpfs_creation_de to mutex Fix up trivial change/delete conflicts with deleted 'dream' drivers (drivers/staging/dream/camera/{mt9d112.c,mt9p012_fox.c,mt9t013.c,s5k3e2fx.c})	2010-10-31 20:40:24 -04:00
Christoph Hellwig	bb8430a2c8	locks: remove fl_copy_lock lock_manager operation This one was only used for a nasty hack in nfsd, which has recently been removed. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-10-31 06:35:15 -07:00

1 2 3 4 5 ...

20431 commits