kernel-fxtec-pro1x/fs
Chuck Lever 6c9dc42551 lockd: Update NSM state from SM_MON replies
When rpc.statd starts up in user space at boot time, it attempts to
write the latest NSM local state number into
/proc/sys/fs/nfs/nsm_local_state.

If lockd.ko isn't loaded yet (as is the case in most configurations),
that file doesn't exist, thus the kernel's NSM state remains set to
its initial value of zero during lockd operation.

This is a problem because rpc.statd and lockd use the NSM state number
to prevent repeated lock recovery on rebooted hosts.  If lockd sends
a zero NSM state, but then a delayed SM_NOTIFY with a real NSM state
number is received, there is no way for lockd or rpc.statd to
distinguish that stale SM_NOTIFY from an actual reboot.  Thus lock
recovery could be performed after the rebooted host has already
started reclaiming locks, and those locks will be lost.
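
The ambiguity described above can be modeled with a very small sketch. It
assumes a simplified rule in which a host treats an SM_NOTIFY as a reboot
only when the notified state differs from the state it last recorded for
that peer. The struct and function names here are hypothetical and are not
taken from lockd or rpc.statd.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical record a host keeps about one remote peer. */
struct peer_record {
	uint32_t last_state;	/* NSM state the peer most recently reported */
};

/*
 * Simplified rule (an assumption for illustration only): a notification
 * is taken as a reboot when its state differs from the recorded one.
 */
static bool notify_looks_like_reboot(const struct peer_record *peer,
				     uint32_t notified_state)
{
	return notified_state != peer->last_state;
}
```

Under this model, a peer that reported state zero makes every delayed
SM_NOTIFY carrying a real state number look like a fresh reboot, while a
peer that reported its real state lets the stale notification be ignored.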

We could change /etc/init.d/nfslock so it always modprobes lockd.ko
before starting rpc.statd.  However, if lockd.ko is ever unloaded
and reloaded, we are back at square one, since the NSM state is not
preserved across an unload/reload cycle.  This may happen frequently
on clients that use the automounter: a period of NFS inactivity causes
lockd.ko to be unloaded, and the kernel loses its NSM state setting.

Instead, let's use the fact that rpc.statd plants the local system's
NSM state in every SM_MON (and SM_UNMON) reply.  lockd performs a
synchronous SM_MON upcall to the local rpc.statd _before_ sending its
first NLM request to a new remote.  This would permit rpc.statd to
provide the current NSM state to lockd, even after lockd.ko had been
unloaded and reloaded.
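
A minimal user-space sketch of that idea follows. It is not the kernel
code: the reply struct, the stubbed upcall, and the function names are all
hypothetical stand-ins, and the only point shown is that the local state is
refreshed from the SM_MON reply before it is used in an outgoing request.

```c
#include <stdint.h>

/* Hypothetical shape of the reply rpc.statd returns for SM_MON. */
struct sm_mon_reply {
	uint32_t status;	/* 0 means the monitor request succeeded */
	uint32_t state;		/* rpc.statd's current NSM state number */
};

/* Local copy of the NSM state; zero right after a (re)load. */
static uint32_t local_state;

/* Stub standing in for the synchronous upcall to the local rpc.statd. */
static struct sm_mon_reply fake_sm_mon_upcall(uint32_t statd_state)
{
	struct sm_mon_reply reply = { .status = 0, .state = statd_state };
	return reply;
}

/*
 * Monitor the peer first, adopt the state found in the reply, and only
 * then hand out the state to place in the first request to that peer.
 * Returns 0 on success, -1 if the monitor request failed.
 */
static int monitor_then_get_state(uint32_t statd_state,
				  uint32_t *state_for_request)
{
	struct sm_mon_reply reply = fake_sm_mon_upcall(statd_state);

	if (reply.status != 0)
		return -1;

	local_state = reply.state;
	*state_for_request = local_state;
	return 0;
}
```

Because the state is read after the upcall returns, a freshly reloaded
module that starts at zero still sends the real state number in its first
request.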

Note that NLMPROC_LOCK arguments are constructed before the
nsm_monitor() call, so we have to rearrange argument construction very
slightly to make this all work out.

And, the kernel appears to treat NSM state as a u32 (see struct
nlm_args and nsm_res).  Make nsm_local_state a u32 as well, to ensure
we don't get bogus comparison results.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17 18:02:10 -07:00
9p 9P doesn't need BKL in ->umount_begin() 2009-06-17 00:36:36 -04:00
adfs Cleanup of adfs headers 2009-06-17 00:36:36 -04:00
affs affs: add ->sync_fs 2009-06-11 21:36:14 -04:00
afs AFS: Correctly translate auth error aborts and don't failover in such cases 2009-06-16 21:20:14 -07:00
autofs switch follow_down() 2009-06-11 21:36:01 -04:00
autofs4 switch follow_down() 2009-06-11 21:36:01 -04:00
befs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 2009-06-17 08:46:57 -07:00
bfs bfs: add ->sync_fs 2009-06-11 21:36:14 -04:00
btrfs Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block 2009-06-16 11:46:45 -07:00
cachefiles enforce ->sync_fs is only called for rw superblock 2009-06-11 21:36:06 -04:00
cifs push BKL down into ->put_super 2009-06-11 21:36:07 -04:00
coda splice: implement default splice_read method 2009-05-11 14:13:10 +02:00
configfs configfs: Rework configfs_depend_item() locking and make lockdep happy 2009-04-30 10:48:26 -07:00
cramfs fs/cramfs: return f_fsid for statfs(2) 2009-04-02 19:05:08 -07:00
debugfs debugfs: use specified mode to possibly mark files read/write only 2009-06-15 21:30:28 -07:00
devpts devpts: unregister the file system on error 2009-06-11 08:51:06 -07:00
dlm dlm: use more NOFS allocation 2009-05-15 11:24:59 -05:00
ecryptfs push BKL down into ->put_super 2009-06-11 21:36:07 -04:00
efs get rid of BKL in fs/efs 2009-06-17 00:36:36 -04:00
exofs [SCSI] Merge branch 'linus' 2009-06-12 10:02:03 -05:00
exportfs Merge branch 'next' into for-linus 2008-12-25 11:40:09 +11:00
ext2 trivial: ext2: fix a typo in comment in ext2.h 2009-06-12 18:01:44 +02:00
ext3 ext3: avoid unnecessary spinlock in critical POSIX ACL path 2009-06-17 00:36:35 -04:00
ext4 ext4: avoid unnecessary spinlock in critical POSIX ACL path 2009-06-17 00:36:35 -04:00
fat Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6 2009-06-16 13:06:10 -07:00
freevxfs push BKL down into ->put_super 2009-06-11 21:36:07 -04:00
fscache FS-Cache: Fixup renamed filenames in comments in internal.h 2009-05-27 10:20:13 -07:00
fuse fuse doesn't need BKL in ->umount_begin() 2009-06-17 00:36:36 -04:00
gfs2 GFS2: Remove lock_kernel from gfs2_put_super() 2009-06-12 13:40:47 +01:00
hfs hfs: add ->sync_fs 2009-06-11 21:36:15 -04:00
hfsplus hfsplus: add ->sync_fs 2009-06-11 21:36:16 -04:00
hostfs constify dentry_operations: misc filesystems 2009-03-27 14:44:00 -04:00
hpfs Push BKL down into ->remount_fs() 2009-06-11 21:36:11 -04:00
hppfs hppfs: hppfs_read_file() may return -ERROR 2009-04-02 19:04:53 -07:00
hugetlbfs Merge branch 'master' into next 2009-05-22 18:40:59 +10:00
isofs NLS: update handling of Unicode 2009-06-15 21:44:43 -07:00
jbd jbd: fix race in buffer processing in commit code 2009-06-09 16:59:03 -07:00
jbd2 jbd2: Fix minor typos in comments in fs/jbd2/journal.c 2009-06-09 00:06:20 -04:00
jffs2 jffs2: call jffs2_write_super from jffs2_sync_fs 2009-06-11 21:36:16 -04:00
jfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/shaggy/jfs-2.6 2009-06-16 12:23:52 -07:00
lockd lockd: Update NSM state from SM_MON replies 2009-06-17 18:02:10 -07:00
minix get rid of BKL in fs/minix 2009-06-17 00:36:37 -04:00
ncpfs NLS: update handling of Unicode 2009-06-15 21:44:43 -07:00
nfs NFS: Fix false error return from nfs_callback_up() if ipv6.ko is not available 2009-06-17 18:02:10 -07:00
nfs_common SUNRPC: nfsacl_encode/nfsacl_decode should be exported as GPL-only 2008-12-23 15:21:32 -05:00
nfsd switch follow_down() 2009-06-11 21:36:01 -04:00
nilfs2 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2 2009-06-15 09:13:49 -07:00
nls NLS: update handling of Unicode 2009-06-15 21:44:43 -07:00
notify fsnotify: allow groups to set freeing_mark to null 2009-06-11 14:57:55 -04:00
ntfs ntfs: use is_power_of_2() function for clarity. 2009-06-16 19:47:48 -07:00
ocfs2 Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2 2009-06-16 12:11:57 -07:00
omfs switch omfs to simple_fsync() 2009-06-11 21:36:13 -04:00
openpromfs zero i_uid/i_gid on inode allocation 2009-01-05 11:54:28 -05:00
partitions Merge branch 'for-2.6.31' of git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6 2009-06-12 09:29:42 -07:00
proc oom: move oom_adj value from task_struct to mm_struct 2009-06-16 19:47:43 -07:00
qnx4 fs/qnx4: sanitize includes 2009-06-11 21:36:12 -04:00
quota quota: cleanup dquota sync functions (version 4) 2009-06-11 21:36:04 -04:00
ramfs ramfs: ignore unknown mount options 2009-06-14 17:58:25 -07:00
reiserfs Push BKL down into ->remount_fs() 2009-06-11 21:36:11 -04:00
romfs ROMFS: romfs_dev_read() error ignored 2009-05-09 10:49:41 -04:00
smbfs push BKL down into ->put_super 2009-06-11 21:36:07 -04:00
squashfs push BKL down into ->put_super 2009-06-11 21:36:07 -04:00
sysfs Sysfs: fix possible memleak in sysfs_follow_link 2009-06-15 21:30:23 -07:00
sysv get rid of BKL in fs/sysv 2009-06-17 00:36:37 -04:00
ubifs Merge branch 'linux-next' of git://git.infradead.org/ubifs-2.6 2009-06-17 09:46:33 -07:00
udf switch udf to simple_fsync() 2009-06-11 21:36:13 -04:00
ufs ufs: add ->sync_fs 2009-06-11 21:36:16 -04:00
xfs Merge branch 'master' of git://oss.sgi.com/xfs/xfs into for-linus 2009-06-12 21:28:59 -05:00
aio.c aio: lookup_ioctx can return the wrong value when looking up a bogus context 2009-03-19 15:57:18 -07:00
anon_inodes.c constify dentry_operations: rest 2009-03-27 14:44:03 -04:00
attr.c vfs: Use lowercase names of quota functions 2009-03-26 02:18:35 +01:00
bad_inode.c kill ->dir_notify() 2008-12-31 18:07:43 -05:00
binfmt_aout.c sanitize ifdefs in binfmt_aout 2009-01-03 11:45:54 -08:00
binfmt_elf.c Trim includes in binfmt_elf 2009-03-31 23:00:27 -04:00
binfmt_elf_fdpic.c ptrace: s/parent/real_parent/ in binfmt_elf_fdpic.c 2009-05-02 15:36:10 -07:00
binfmt_em86.c
binfmt_flat.c flat: fix data sections alignment 2009-05-29 08:40:02 -07:00
binfmt_misc.c fs/binfmt_misc.c: add terminating newline to /proc/sys/fs/binfmt_misc/status 2009-01-06 15:59:19 -08:00
binfmt_script.c
binfmt_som.c Don't crap into descriptor table in binfmt_som 2009-03-31 23:00:28 -04:00
bio-integrity.c block: add private bio_set for bio integrity allocations 2009-03-24 12:35:17 +01:00
bio.c block: remove some includings of blktrace_api.h 2009-06-16 11:19:36 +02:00
block_dev.c vfs: Rename fsync_super() to sync_filesystem() (version 4) 2009-06-11 21:36:04 -04:00
buffer.c Merge branch 'for-2.6.31' of git://git.kernel.dk/linux-2.6-block 2009-06-11 11:10:35 -07:00
char_dev.c fs: Remove i_cindex from struct inode 2009-06-11 21:36:09 -04:00
compat.c trivial: fix comment typo in fs/compat.c 2009-06-12 18:01:44 +02:00
compat_binfmt_elf.c
compat_ioctl.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6 2009-06-17 09:50:44 -07:00
dcache.c dcache: extract and use d_unlinked() 2009-06-11 21:36:06 -04:00
dcookies.c [CVE-2009-0029] System call wrapper special cases 2009-01-14 14:15:18 +01:00
direct-io.c block: Do away with the notion of hardsect_size 2009-05-22 23:22:54 +02:00
drop_caches.c mm: remove __invalidate_mapping_pages variant 2009-06-16 19:47:43 -07:00
eventfd.c eventfd: export eventfd_signal and eventfd_fget for lguest 2009-06-12 22:27:09 +09:30
eventpoll.c epoll: fix size check in epoll_create() 2009-05-12 14:11:35 -07:00
exec.c Merge branch 'perfcounters-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2009-06-11 14:01:07 -07:00
fcntl.c send_sigio_to_task: sanitize the usage of fown->signum 2009-06-16 15:36:17 -07:00
fifo.c
file.c
file_table.c fs: move mark_files_ro into file_table.c 2009-06-11 21:36:02 -04:00
filesystems.c fs: Mark get_filesystem_list() as __init function. 2009-04-20 23:02:52 -04:00
fs-writeback.c writeback: skip new or to-be-freed inodes 2009-06-16 19:47:45 -07:00
fs_struct.c Get rid of indirect include of fs_struct.h 2009-03-31 23:00:27 -04:00
generic_acl.c New helper - current_umask() 2009-03-31 23:00:26 -04:00
inode.c trivial: fs/inode: Fix typo in file_update_time nanodoc 2009-06-12 18:01:45 +02:00
internal.h Trim a bit of crap from fs.h 2009-06-11 21:36:07 -04:00
ioctl.c No instance of ->bmap() needs BKL 2009-06-17 00:36:35 -04:00
ioprio.c [CVE-2009-0029] System call wrappers part 28 2009-01-14 14:15:30 +01:00
Kconfig Hugetlbfs: Enable hugetlbfs for more systems in Kconfig. 2009-06-17 11:06:31 +01:00
Kconfig.binfmt CORE_DUMP_DEFAULT_ELF_HEADERS depends on ELF_CORE 2009-01-09 16:54:41 -08:00
libfs.c New helper - simple_fsync() 2009-06-11 21:36:11 -04:00
locks.c [CVE-2009-0029] System call wrappers part 16 2009-01-14 14:15:25 +01:00
Makefile nilfs2: update makefile and Kconfig 2009-04-07 08:31:16 -07:00
mbcache.c
mpage.c ext4: Properly initialize the buffer_head state 2009-05-13 15:13:42 -04:00
namei.c switch lookup_mnt() 2009-06-11 21:36:01 -04:00
namespace.c Push BKL down into do_remount_sb() 2009-06-11 21:36:08 -04:00
nfsctl.c [CVE-2009-0029] System call wrappers part 27 2009-01-14 14:15:29 +01:00
no-block.c
open.c fs: introduce mnt_clone_write 2009-06-11 21:36:02 -04:00
pipe.c splice: implement default splice_read method 2009-05-11 14:13:10 +02:00
pnode.c
pnode.h
posix_acl.c
read_write.c splice: implement default splice_read method 2009-05-11 14:13:10 +02:00
read_write.h
readdir.c [CVE-2009-0029] System call wrappers part 32 2009-01-14 14:15:31 +01:00
select.c poll: avoid extra wakeups in select/poll 2009-06-16 19:47:48 -07:00
seq_file.c cpumask: fix seq_bitmap_*() functions. 2009-03-30 22:05:11 +10:30
signalfd.c [CVE-2009-0029] System call wrappers part 31 2009-01-14 14:15:31 +01:00
splice.c splice: fix kmaps in default_file_splice_write() 2009-05-19 11:37:46 +02:00
stack.c
stat.c kill vfs_stat_fd / vfs_lstat_fd 2009-04-20 23:02:52 -04:00
super.c remove unlock_kernel() left accidentally 2009-06-17 00:36:35 -04:00
sync.c remove the call to ->write_super in __sync_filesystem 2009-06-11 21:36:17 -04:00
timerfd.c timerfd: add flags check 2009-02-18 15:37:53 -08:00
utimes.c [CVE-2009-0029] System call wrappers part 30 2009-01-14 14:15:30 +01:00
xattr.c fs: introduce mnt_clone_write 2009-06-11 21:36:02 -04:00
xattr_acl.c