According to the report titled "problem with nilfs_cleanerd" from
Łukasz Wójcicki, nilfs_btree_lookup_dirty_buffers or
nilfs_btree_add_dirty_buffer got memory violation during garbage
collection.
This could happen if a level field of given btree node buffer is
incorrect, which is a crucial internal bug.
This inserts a sanity check to figure out the problem.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
This adds is_remount argument to the parse_options() function that
obtains mount options from strings.
Previously, parse_options did not distinguish context whether it's
called for a new mount or remount, so the caller needed additional
verifications outside the function.
This allows parse_options to verify options and print messages
depending on the context.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
This replaces seq_printf() with seq_puts() in nilfs_show_options for
mount options which have no argument.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Nilfs has "discard" mount option which issues discard/TRIM commands to
underlying block device, but it lacks a complementary option and has
no way to disable the feature through remount.
This adds "nodiscard" option to resolve this imbalance.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Nilfs enables write barriers by default and has "nobarrier" mount
option to disable this feature. But it lacks the complementary option
and has no way to re-enable the feature on remount.
This adds "barrier" option to resolve this imbalance.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Super blocks of nilfs are periodically overwritten in order to record
the recent log position. This shortens recovery time after unclean
unmount, but the current implementation performs the update even for a
few blocks of change. If the filesystem gets small changes slowly and
continually, super blocks may be updated excessively.
This moderates the issue by skipping update of log cursor if it does
not cross a segment boundary.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Although nilfs redundantly uses two super blocks and each may point to
different position on log, the current version of nilfs does not try
fallback to the spare super block when it doesn't find any valid log
at the position that the primary super block points to.
This has been a cause of mount failures due to write order reversals
on barrier less block devices.
This inserts fallback code in error path of nilfs_search_super_root
routine to resolve the mount failure problem.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
nilfs_search_super_root can return -ENOMEM, but this error code is not
described in its kernel-doc comment. This fixes the discrepancy.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
This separates a setup routine of log cursor from init_nilfs(). The
routine, nilfs_store_log_cursor, reads the last position of the log
containing a super root, and initializes relevant state on the nilfs
object.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
This will sync super blocks in turns instead of syncing duplicate
super blocks at the time. This will help searching valid super root
when super block is written into disk before log is written, which is
happen when barrier-less block devices are unmounted uncleanly. In
the situation, old super block likely points to valid log.
This patch introduces ns_sbwcount member to the nilfs object and adds
nilfs_sb_will_flip() function; ns_sbwcount counts how many times super
blocks write back to the disk. And, nilfs_sb_will_flip() decides
whether flipping required or not based on the count of ns_sbwcount to
sync super blocks asymmetrically.
The following functions are also changed:
- nilfs_prepare_super(): flips super blocks according to the
argument. The argument is calculated by nilfs_sb_will_flip()
function.
- nilfs_cleanup_super(): sets "clean" flag to both super blocks if
they point to the same checkpoint.
To update both of super block information, caller of
nilfs_commit_super must set the information on both super blocks.
Signed-off-by: Jiro SEKIBA <jir@unicus.jp>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
This function checks validity of super block pointers.
If first super block is invalid, it will swap the super blocks.
The function should be called before any super block information updates.
Caller must obtain nilfs->ns_sem.
Signed-off-by: Jiro SEKIBA <jir@unicus.jp>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
This moves out section that updates information of the recent log
position stored in super blocks from nilfs_commit_super to a new
routine named nilfs_set_log_cursor.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
This function marks error state and write it on super blocks. This is
a preparation for making super block writeback alternately.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
This function write out filesystem state to super blocks in order to
share the same cleanup work. This is a preparation for making super
block writeback alternately.
Cc: Jiro SEKIBA <jir@unicus.jp>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Mount time field in super block is wrongly updated when nilfs remounts
the partition from read-write to read-only. This fixes the issue.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
This removes macros to test segment summary flags and redefines a few
relevant macros with inline functions.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
load_segment_summary function has two distinct roles: getting summary
header of a log, and verifying consistencies of the log.
This divide it into two corresponding functions, nilfs_read_log_header
and nilfs_validate_log to clarify the meaning.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
The function name of nilfs_recover_logical_segments makes no sense.
This changes the name into nilfs_salvage_orphan_logs to clarify the
role of the function.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Most functions in recovery code take an argument of a super block
instance or a nilfs_sb_info struct for convenience sake.
This replaces them aggressively with a nilfs object by applying
__bread and __breadahead against routines using sb_bread and
sb_breadahead.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
This stores blocksize in nilfs objects for the successive refactoring
of recovery logic.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
The commit 41c88bd7 ("nilfs2: cleanup multi
kmem_cache_{create,destroy} code") consolidated slab constructors and
destructors used in nilfs, but it left some declarations in header
files.
This gets rid of the obsolete declarations.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
This gets rid of unwanted space chars in front of conditional
sentences of nilfs_destroy_cachep().
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (69 commits)
fix handling of offsets in cris eeprom.c, get rid of fake on-stack files
get rid of home-grown mutex in cris eeprom.c
switch ecryptfs_write() to struct inode *, kill on-stack fake files
switch ecryptfs_get_locked_page() to struct inode *
simplify access to ecryptfs inodes in ->readpage() and friends
AFS: Don't put struct file on the stack
Ban ecryptfs over ecryptfs
logfs: replace inode uid,gid,mode initialization with helper function
ufs: replace inode uid,gid,mode initialization with helper function
udf: replace inode uid,gid,mode init with helper
ubifs: replace inode uid,gid,mode initialization with helper function
sysv: replace inode uid,gid,mode initialization with helper function
reiserfs: replace inode uid,gid,mode initialization with helper function
ramfs: replace inode uid,gid,mode initialization with helper function
omfs: replace inode uid,gid,mode initialization with helper function
bfs: replace inode uid,gid,mode initialization with helper function
ocfs2: replace inode uid,gid,mode initialization with helper function
nilfs2: replace inode uid,gid,mode initialization with helper function
minix: replace inode uid,gid,mode init with helper
ext4: replace inode uid,gid,mode init with helper
...
Trivial conflict in fs/fs-writeback.c (mark bitfields unsigned)
Snapshots and regular ro/rw mounts are essentially-different within
the meaning whether the checkpoint is static or not and is marked with
a snapshot flag or not.
The current implemenation, however, allows to remount a snapshot to a
regular rw-mount if the checkpoint number equals the latest one.
This transition is actually impossible since changing a checkpoint to
a snapshot makes another checkpoint, thus the condition is never
satisfied.
This fixes the weird state of affairs, and specifically separates
snapshots and regular rw/ro-mounts.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
This replaces uses of new_encode_dev/new_decode_dev with their 64-bit
counterparts, huge_encode_dev/huge_decode_dev respectively.
This is just for clarification and has no impact on the disk format.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
deactivate_super was replaced with deactivate_locked_super, but the
comment of nilfs_get_sb remain unchanged. This renews the comment.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
MS_VERBOSE is deprecated. This replaces it with MS_SILENT in
reference to get_sb_bdev function.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
An fmode_t argument is passed to kill_block_super() through s_mode
member of the super_block structure. This is used to release the
block device with the same mode, however, nilfs does not set s_mode
anywhere.
This modifies nilfs_get_sb function to properly initialize the s_mode
member.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
The second argument of open_bdev_exclusive/close_bdev_exclusive takes
fmode_t flags instead of mount flags. This fixes the misuse.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Nilfs maintains two super blocks, and selects the new one on mount
time if they both have valid checksums and their timestamps differ.
However, this has potential for mis-selection since the system clock
may be rewinded and the resolution of the timestamps is not high.
Usually this doesn't become an issue because both super blocks are
updated at the same time when the file system is unmounted. Even if
the file system wasn't unmounted cleanly, the roll-forward recovery
will find the proper log which stores the latest super root. Thus,
the issue can appear only if update of one super block fails and the
clock happens to be rewinded.
This fixes the issue by using checkpoint numbers instead of timestamps
to pick the super block storing the location of the latest log.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
This adds missing endian conversions in comparision of the magic
number of super blocks. It was coincidence that prior versions didn't
incur problems; the upper byte of the magic number happened to be
equal to the lower byte. But, semantically it's wrong to depend on
this.
This won't change anything else nor suffer any compatibility issues.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
This kills the following sparse warnings:
fs/nilfs2/segment.c:567:28: warning: symbol 'nilfs_sc_file_ops' was not declared. Should it be static?
fs/nilfs2/segment.c:617:28: warning: symbol 'nilfs_sc_dat_ops' was not declared. Should it be static?
fs/nilfs2/segment.c:625:28: warning: symbol 'nilfs_sc_dsync_ops' was not declared. Should it be static?
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
The implementation of persistent object allocator (alloc.c) is poorly
documented. This adds kernel doc style comments on that functions.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
In nilfs_segctor_thread(), timer is a local variable allocated on stack. Its
address can't be set to sci->sc_timer and passed in several procedures.
It works now by chance, just because other procedures are called by
nilfs_segctor_thread() directly or indirectly and the stack hasn't been
deallocated yet.
Signed-off-by: Li Hong <lihong.hi@gmail.com>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
There are only two lines of code in nilfs_segctor_init(). From a logic
design view, the first line 'sci->sc_seq_done = sci->sc_seq_request;'
should be put in nilfs_segctor_new(). Even in nilfs_segctor_new(),
this initialization is needless because sci is kzalloc-ed. So
nilfs_segctor_init() is only a wrap call to
nilfs_segctor_start_thread().
Signed-off-by: Li Hong <lihong.hi@gmail.com>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
This adds a field to record the latest checkpoint number in the
nilfs_segment_summary structure. This will help to recover the latest
checkpoint number from logs on disk. This field is intended for
crucial cases in which super blocks have lost pointer to the latest
log.
Even though this will change the disk format, both backward and
forward compatibility is preserved by a size field prepared in the
segment summary header.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Printing a message after loading a file system is a practice. Add this to
provide a better user-friendly experience.
Signed-off-by: Li Hong <lihong.hi@gmail.com>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
This cleanup patch gives several improvements:
- Moving all kmem_cache_{create_destroy} calls into one place, which removes
some small function calls, cleans up error check code and clarify the logic.
- Mark all initial code in __init section.
- Remove some very obvious comments.
- Adjust some declarations.
- Fix some space-tab issues.
Signed-off-by: Li Hong <lihong.hi@gmail.com>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
This moves a pointer to buffer storing super root block to each log
buffer from nilfs_sc_info struct for simplicity.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Like ext3, nilfs has 'errors' mount option to allow specifying desired
behavior on severe errors.
Currently, the default action is 'errors=continue' and has potential
to advance filesystem corruption for severe errors.
This will change the action to 'errors=remount-ro' to avoid the issue.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
nilfs_btree_release_path() and nilfs_btree_free_path() are bound into each other
tightly. Make them into one procedure to clearify the logic and avoid some
misusages.
Signed-off-by: Li Hong <lihong.hi@gmail.com>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
nilfs_btree_alloc_path() and nilfs_btree_init_path() are bound into each other
tightly. Make them into one procedure to clearify the logic and avoid some
misusages.
Signed-off-by: Li Hong <lihong.hi@gmail.com>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
As of 32a88aa1, __sync_filesystem() will return 0 if s_bdi is not set.
And nilfs does not set s_bdi anywhere. I noticed this problem by the
warning introduced by the recent commit 5129a469 ("Catch filesystem
lacking s_bdi").
WARNING: at fs/super.c:959 vfs_kern_mount+0xc5/0x14e()
Hardware name: PowerEdge 2850
Modules linked in: nilfs2 loop tpm_tis tpm tpm_bios video shpchp pci_hotplug output dcdbas
Pid: 3773, comm: mount.nilfs2 Not tainted 2.6.34-rc6-debug #38
Call Trace:
[<c1028422>] warn_slowpath_common+0x60/0x90
[<c102845f>] warn_slowpath_null+0xd/0x10
[<c1095936>] vfs_kern_mount+0xc5/0x14e
[<c1095a03>] do_kern_mount+0x32/0xbd
[<c10a811e>] do_mount+0x671/0x6d0
[<c1073794>] ? __get_free_pages+0x1f/0x21
[<c10a684f>] ? copy_mount_options+0x2b/0xe2
[<c107b634>] ? strndup_user+0x48/0x67
[<c10a81de>] sys_mount+0x61/0x8f
[<c100280c>] sysenter_do_call+0x12/0x32
This ensures to set s_bdi for nilfs and fixes the sync silent failure.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Acked-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
After merging the block tree, today's linux-next build (powerpc ppc64_defconfig)
failed like this:
fs/nilfs2/the_nilfs.c: In function 'nilfs_discard_segments':
fs/nilfs2/the_nilfs.c:673: error: 'DISCARD_FL_BARRIER' undeclared (first use in this function)
Caused by commit fbd9b09a17 ("blkdev:
generalize flags for blkdev_issue_fn functions") interacting with commit
e902ec9906 ("nilfs2: issue discard request
after cleaning segments") (which netered Linus' tree on about March 4 -
before v2.6.34-rc1).
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2:
nilfs2: fix typo "numer" -> "number" in alloc.c
nilfs2: Remove an uninitialization warning in nilfs_btree_propagate_v()
nilfs2: fix a wrong type conversion in nilfs_ioctl()
`make CONFIG_NILFS2_FS=m M=fs/nilfs2/` will give the following warnings:
fs/nilfs2/btree.c: In function 'nilfs_btree_propagate':
fs/nilfs2/btree.c:1882: warning: 'maxlevel' may be used uninitialized in this function
fs/nilfs2/btree.c:1882: note: 'maxlevel' was declared here
Set maxlevel = 0 to fix it.
Signed-off-by: Li Hong <lihong.hi@gmail.com>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
(void * __user *) should be (void __user *)
Signed-off-by: Li Hong <lihong.hi@gmail.com>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.
percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability. As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.
http://userweb.kernel.org/~tj/misc/slabh-sweep.py
The script does the followings.
* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.
* When the script inserts a new include, it looks at the include
blocks and try to put the new include such that its order conforms
to its surrounding. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.
* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.
The conversion was done in the following steps.
1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.
2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.
3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.
4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.
5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
wildly available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.
6. percpu.h was updated not to include slab.h.
7. Build test were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).
* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig
8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.
Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.
Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
nilfs_wait_on_logs has a potential to slip out before completion of
all bio requests when it met an error. This synchronization fault may
cause unexpected results, for instance, violative access to freed
segment buffers from an end-bio callback routine.
This fixes the issue by ensuring that nilfs_wait_on_logs waits all
given logs.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
According to the report from Andreas Beckmann (Message-ID:
<4BA54677.3090902@abeckmann.de>), nilfs in 2.6.33 kernel got stuck
after a disk full error.
This turned out to be a regression by log writer updates merged at
kernel 2.6.33. nilfs_segctor_abort_construction, which is a cleanup
function for erroneous cases, was skipping writeback completion for
some logs.
This fixes the bug and would resolve the hang issue.
Reported-by: Andreas Beckmann <debian@abeckmann.de>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Tested-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Cc: stable <stable@kernel.org> [2.6.33.x]
Andreas Beckmann gave me a report that nilfs logged the following
warnings when it got a disk full:
nilfs_sufile_do_cancel_free: segment 0 must be clean
nilfs_sufile_do_cancel_free: segment 1 must be clean
These arise from a duplicate call to nilfs_segctor_cancel_freev in an
error path of log writer. This will fix the issue.
Reported-by: Andreas Beckmann <debian@abeckmann.de>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
This kills the following checkpatch warnings:
WARNING: unnecessary whitespace before a quoted newline
#869: FILE: super.c:869:
+ "remount to a different snapshot. \n",
WARNING: unnecessary whitespace before a quoted newline
#389: FILE: the_nilfs.c:389:
+ printk(KERN_ERR "NILFS: too short segment. \n");
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
This kills the following checkpatch warnings:
WARNING: please, no space before tabs
#74: FILE: segment.h:74:
+^Iunsigned ^I^Iflags;$
WARNING: please, no space before tabs
#35: FILE: segbuf.c:35:
+^Iint ^I^I^Istart, end; /* The region to be submitted */$
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Two segbuf functions, nilfs_segbuf_write and nilfs_segbuf_wait, are
declared with the static storage class specifier, but their
implementations are not.
This fixes the discrepancy.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (52 commits)
init: Open /dev/console from rootfs
mqueue: fix typo "failues" -> "failures"
mqueue: only set error codes if they are really necessary
mqueue: simplify do_open() error handling
mqueue: apply mathematics distributivity on mq_bytes calculation
mqueue: remove unneeded info->messages initialization
mqueue: fix mq_open() file descriptor leak on user-space processes
fix race in d_splice_alias()
set S_DEAD on unlink() and non-directory rename() victims
vfs: add NOFOLLOW flag to umount(2)
get rid of ->mnt_parent in tomoyo/realpath
hppfs can use existing proc_mnt, no need for do_kern_mount() in there
Mirror MS_KERNMOUNT in ->mnt_flags
get rid of useless vfsmount_lock use in put_mnt_ns()
Take vfsmount_lock to fs/internal.h
get rid of insanity with namespace roots in tomoyo
take check for new events in namespace (guts of mounts_poll()) to namespace.c
Don't mess with generic_permission() under ->d_lock in hpfs
sanitize const/signedness for udf
nilfs: sanitize const/signedness in dealing with ->d_name.name
...
Fix up fairly trivial (famous last words...) conflicts in
drivers/infiniband/core/uverbs_main.c and security/tomoyo/realpath.c
This adds reader's lock for the_nilfs->cno in nilfs_ioctl_sync,
for the_nilfs->cno should be proctected by segctor_sem when reading.
Signed-off-by: Jiro SEKIBA <jir@unicus.jp>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
This is a trivial patch to remove unnecessary condition.
load_segment_summary() checks crc of segment_summary OR crc of whole
log data blocks based on boolean argument full_check. However,
callers of the function pass only 1 as full_check, which means only
whole log data blocks checking code is running all the time.
This patch deletes the condition and full_check argument and also
deletes enum 'NILFS_SEG_FAIL_CHECKSUM_SEGSUM' and corresponding case
clause, for it is nolonger used anymore.
Signed-off-by: Jiro SEKIBA <jir@unicus.jp>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
This moves iterator to submit write requests for a series of logs into
segbuf.c, and hides nilfs_segbuf_write() and nilfs_segbuf_wait() in
the file.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
This replaces s_dirt flag use in nilfs with a new flag added on the
nilfs object. The s_dirt flag was used to indicate if
sop->write_super() should be called, however the current version of
nilfs does not use the callback. Thus, it can be replaced with the
own flag.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Cc: Jiro SEKIBA <jir@unicus.jp>
This will clean up nilfs_segctor_req struct and the obscure request
argument passed among private methods of segment constructor.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
This is a trivial patch to delete unnecessary condition in nilfs_dat_translate.
nilfs_dat_translate() will asign translated address to *blocknrp if blocknrp
is not NULL. However the condition is unneeded, because all callers of
nilfs_dat_translate() pass blocknrp properly.
Signed-off-by: Jiro SEKIBA <jir@unicus.jp>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
nilfs_error() calls nilfs_detach_segment_constructor() if
errors=remount-ro option is specified, and this may lead to a hang due
to recursive locking of, for instance, nilfs->ns_segctor_sem and
others.
In this case, detaching segment constructor is not necessary because
read-only flag is set to the filesystem and further writes are
blocked.
This fixes the potential hang issue by removing the
nilfs_detach_segment_constructor() call from nilfs_error.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
A few nilfs2 ioctls need to ask for and then later release write
access to the mount in order to avoid potential write to read-only
mounts.
This adds the missing mnt_want_write and mnt_drop_write in
nilfs_ioctl_change_cpmode, nilfs_ioctl_delete_checkpoint, and
nilfs_ioctl_clean_segments.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
This adds a function to send discard requests for given array of
segment numbers, and calls the function when garbage collection
succeeded.
Signed-off-by: Jiro SEKIBA <jir@unicus.jp>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
This fixes incorrect usage of nilfs_segctor_confirm() test function in
nilfs_segctor_destroy(); nilfs_segctor_confirm() returns zero if the
filesystem is not clean, so its use in nilfs_segctor_destroy() needs
inversion.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
The C99 specification states in section 6.11.5:
The placement of a storage-class specifier other than at the beginning
of the declaration specifiers in a declaration is an obsolescent
feature.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
This is a trivial style fix patch to mend errors/warnings
reported by "checkpatch.pl --file".
Signed-off-by: Jiro SEKIBA <jir@unicus.jp>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
This reverts commit e4c570c4cb, as
requested by Alexey:
"I think I gave a good enough arguments to not merge it.
To iterate:
* patch makes impossible to start using ext3 on EXT3_FS=n kernels
without reboot.
* this is done only for one pointer on task_struct"
None of config options which define task_struct are tristate directly
or effectively."
Requested-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
journal_info in task_struct is used in journaling file system only. So
introduce CONFIG_FS_JOURNAL_INFO and make it conditional.
Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: KONISHI Ryusuke <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This separates wait function for submitted logs from the write
function nilfs_segctor_write(). A new list of segment buffers
"sc_write_logs" is added to hold logs under writing, and double
buffering is partially applied to hide io latency.
At this point, the double buffering is disabled for blocksize <
pagesize because page dirty flag is turned off during write and dirty
buffers are not properly collected for pages crossing over segments.
To receive full benefit of the double buffering, further refinement is
needed to move the io wait outside the lock section of log writer.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
This adds a few iterator functions for segment buffers to make it easy
to handle multiple series of logs.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Hides nilfs_write_info struct and nilfs_segbuf_prepare_write function
in segbuf.c to simplify the interface of nilfs_segbuf_write function.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
This moves io status variables in nilfs_write_info struct to
nilfs_segment_buffer struct.
This is a preparation to hide nilfs_write_info in segment buffer code.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Previously, log writer had possibility to set an io error flag on
segments even in case of memory allocation failure.
This fixes the issue.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
This applies list_splice_tail (or list_splice_tail_init) operation
instead of list_splice (or list_splice_init, respectively) to append a
new list to tail of an existing list.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Replace mark_inode_dirty() as nilfs_mark_inode_dirty()
to reduce deep function calls.
Signed-off-by: Jiro SEKIBA <jir@unicus.jp>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Delete mark_inode_dirty() in nilfs_delete_entry() to reduce duplicate
mark_inode_dirty() calls both in nilfs_rename() and nilfs_delete_entry().
Signed-off-by: Jiro SEKIBA <jir@unicus.jp>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Delete mark_inode_dirty() in nilfs_commit_chunk(), for callers of
nilfs_commit_chunk() will call equivalent mark_inode_dirty()
after calling nilfs_commit_chunk().
Signed-off-by: Jiro SEKIBA <jir@unicus.jp>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
change return type of nilfs_commit_chunk() as void from int,
for nilfs_set_file_dirty() usually does not return error.
This is an intermediate patch to reduce mark_inode_dirty() in
nilfs_commit_chunk().
Signed-off-by: Jiro SEKIBA <jir@unicus.jp>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Split nilfs_unlink() to reduce nested transaction and duplicate
mark_inode_dirty() calls when calling nilfs_unlink() from nilfs_rmdir().
nilfs_do_unlink() is an actual unlink functionality which is not
in transaction and does not call mark_inode_dirty() for dentry argument.
nilfs_unlink() is a wrapper function for do_nilfs_unlink() with
transaction and mark_inode_dirty() for dentry argument.
Signed-off-by: Jiro SEKIBA <jir@unicus.jp>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
This is an intermidiate patch to reduce redandunt mark_inode_dirty() calls
by calling inode_inc_link_count() and inode_dec_link_count() functions.
Signed-off-by: Jiro SEKIBA <jir@unicus.jp>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
It is redundant to call mark_inode_dirty() in nilfs_new_inode() because
all caller of nilfs_new_inode() will call mark_inode_dirty()
after calling nilfs_new_inode() directly or indirectly in transaction.
Signed-off-by: Jiro SEKIBA <jir@unicus.jp>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
This adds "norecovery" mount option which disables temporal write
access to read-only mounts or snapshots during mount/recovery.
Without this option, write access will be even performed for those
types of mounts; the temporal write access is needed to mount root
file system read-only after an unclean shutdown.
This option will be helpful when user wants to prevent any write
access to the device.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Cc: Eric Sandeen <sandeen@redhat.com>
This adds a helper function, nilfs_valid_fs() which returns if nilfs
is in a valid state or not.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Although mount recovery of nilfs is integrated in load_nilfs()
procedure, the completion of recovery was isolated from the procedure
and performed at the end of the fill_super routine.
This was somewhat confusing since the recovery is needed for the nilfs
object, not for a super block instance.
To resolve the inconsistency, this will integrate the recovery
completion into load_nilfs().
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>