prune back iprune_sem

iprune_sem is continuously giving us lockdep warnings because we take it in
read mode in the reclaim path, but we also do non-NOFS allocations while
holding it in write mode.

Looking at it a bit more closely, I think it is trivially fixable:

 - for invalidate_inodes we do not need iprune_sem at all.  We have an active
   reference on the superblock, so the filesystem is not going away until it
   has finished.
 - for evict_inodes we do need it, to make sure prune_icache has done its
   work before we tear down the superblock.  But there is no reason to
   hold it over the actual reclaim operation - it's enough to cycle through
   it after the actual reclaim to make sure we wait for any pending
   prune_icache to complete.  We just have to remove the WARN_ON for
   otherwise busy inodes as they can actually happen now.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Christoph Hellwig 2011-03-15 21:51:24 +01:00 committed by Al Viro
parent 5229645bdc
commit bab1d9444d

@@ -84,16 +84,13 @@ static struct hlist_head *inode_hashtable __read_mostly;
 DEFINE_SPINLOCK(inode_lock);
 /*
- * iprune_sem provides exclusion between the kswapd or try_to_free_pages
- * icache shrinking path, and the umount path. Without this exclusion,
- * by the time prune_icache calls iput for the inode whose pages it has
- * been invalidating, or by the time it calls clear_inode & destroy_inode
- * from its final dispose_list, the struct super_block they refer to
- * (for inode->i_sb->s_op) may already have been freed and reused.
+ * iprune_sem provides exclusion between the icache shrinking and the
+ * umount path.
  *
- * We make this an rwsem because the fastpath is icache shrinking. In
- * some cases a filesystem may be doing a significant amount of work in
- * its inode reclaim code, so this should improve parallelism.
+ * We don't actually need it to protect anything in the umount path,
+ * but only need to cycle through it to make sure any inode that
+ * prune_icache took off the LRU list has been fully torn down by the
+ * time we are past evict_inodes.
  */
 static DECLARE_RWSEM(iprune_sem);
@@ -516,17 +513,12 @@ void evict_inodes(struct super_block *sb)
 	struct inode *inode, *next;
 	LIST_HEAD(dispose);
-	down_write(&iprune_sem);
 	spin_lock(&inode_lock);
 	list_for_each_entry_safe(inode, next, &sb->s_inodes, i_sb_list) {
 		if (atomic_read(&inode->i_count))
 			continue;
-		if (inode->i_state & (I_NEW | I_FREEING | I_WILL_FREE)) {
-			WARN_ON(1);
+		if (inode->i_state & (I_NEW | I_FREEING | I_WILL_FREE))
 			continue;
-		}
 		inode->i_state |= I_FREEING;
@@ -542,6 +534,13 @@ void evict_inodes(struct super_block *sb)
 	spin_unlock(&inode_lock);
 	dispose_list(&dispose);
+	/*
+	 * Cycle through iprune_sem to make sure any inode that prune_icache
+	 * moved off the list before we took the lock has been fully torn
+	 * down.
+	 */
+	down_write(&iprune_sem);
 	up_write(&iprune_sem);
 }
@@ -561,8 +560,6 @@ int invalidate_inodes(struct super_block *sb, bool kill_dirty)
 	struct inode *inode, *next;
 	LIST_HEAD(dispose);
-	down_write(&iprune_sem);
 	spin_lock(&inode_lock);
 	list_for_each_entry_safe(inode, next, &sb->s_inodes, i_sb_list) {
 		if (inode->i_state & (I_NEW | I_FREEING | I_WILL_FREE))
@@ -590,7 +587,6 @@ int invalidate_inodes(struct super_block *sb, bool kill_dirty)
 	spin_unlock(&inode_lock);
 	dispose_list(&dispose);
-	up_write(&iprune_sem);
 	return busy;
 }