Implement barrier support for single device DM devices
This patch implements barrier support in DM for the common case of dm linear
just remapping a single underlying device. In this case we can safely
pass the barrier through because there can be no reordering between
devices.
NB. Any DM device might cease to support barriers if it gets
reconfigured so code must continue to allow for a possible
-EOPNOTSUPP on every barrier bio submitted. - agk
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
This patch adds the following target interfaces for request-based dm.
map_rq : for mapping a request
rq_end_io : for finishing a request
busy : for avoiding performance regression from bio-based dm.
Target can tell dm core not to map requests now, and
that may help requests in the block layer queue to be
bigger by I/O merging.
In bio-based dm, this behavior is done by device
drivers managing the block layer queue.
But in request-based dm, dm core has to do that
since dm core manages the block layer queue.
Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Allow NULL buffer in dm_copy_name_and_uuid if you only want to return one of
the fields.
(Required by a following patch that adds these fields to sysfs.)
Signed-off-by: Milan Broz <mbroz@redhat.com>
Reviewed-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Check that the log bitmap will fit within the log device.
Signed-off-by: Milan Broz <mbroz@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Move log size validation from mirror target to log constructor.
Removed PAGE_SIZE restriction we no longer think necessary.
Signed-off-by: Milan Broz <mbroz@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
rw_header function updates three members of io_req data every time
when I/O is processed. bi_rw and notify.fn are never modified once
they get initialized, and so they can be set in advance.
header_to_disk() can also be pulled out of write_header() since only one
caller needs it and write_header() can be replaced by rw_header()
directly.
Signed-off-by: Takahiro Yasui <tyasui@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Change dm_unregister_target to return void and use BUG() for error
reporting.
dm_unregister_target can only fail because of programming bug in the
target driver. It can't fail because of user's behavior or disk errors.
This patch changes unregister_target to return void and use BUG if
someone tries to unregister non-registered target or unregister target
that is in use.
This patch removes code duplication (testing of error codes in all dm
targets) and reports bugs in just one place, in dm_unregister_target. In
some target drivers, these return codes were ignored, which could lead
to a situation where bugs could be missed.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Always increase the error count when I/O on a leg of a mirror fails.
The error count is used to decide whether to select an alternative
mirror leg. If the target doesn't use the "handle_errors" feature, the
error count is not updated and the bio can get requeued forever by the
read callback.
Fix it by increasing error_count before the handle_errors feature
checking.
Cc: stable@kernel.org
Signed-off-by: Milan Broz <mbroz@redhat.com>
Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
In create_log_context function, dm_io_client_destroy function needs
to be called, when memory allocation of disk_header, sync_bits and
recovering_bits failed, but dm_io_client_destroy is not called.
Cc: stable@kernel.org
Signed-off-by: Takahiro Yasui <tyasui@redhat.com>
Acked-by: Jonathan Brassow <jbrassow@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Change yield() to msleep(1). If the thread had realtime priority,
yield() doesn't really yield, so the yielding process would loop
indefinitely and cause machine lockup.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Move one dm_table_put() so that the last reference in the thread
gets dropped in __unbind().
This is required for a following patch,
dm-table-rework-reference-counting.patch, which will change the logic in
such a way that table destructor is called only at specific points in
the code.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
* 'audit.b61' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current:
audit: validate comparison operations, store them in sane form
clean up audit_rule_{add,del} a bit
make sure that filterkey of task,always rules is reported
audit rules ordering, part 2
fixing audit rule ordering mess, part 1
audit_update_lsm_rules() misses the audit_inode_hash[] ones
sanitize audit_log_capset()
sanitize audit_fd_pair()
sanitize audit_mq_open()
sanitize AUDIT_MQ_SENDRECV
sanitize audit_mq_notify()
sanitize audit_mq_getsetattr()
sanitize audit_ipc_set_perm()
sanitize audit_ipc_obj()
sanitize audit_socketcall
don't reallocate buffer in every audit_sockaddr()
Add standard interfaces for alarm/update irqs enabling. Drivers are no
more required to implement equivalent ioctl code as rtc-dev will provide
it.
UIE emulation should now be handled correctly and will work even for those
RTC drivers who cannot be configured to do both UIE and AIE.
Signed-off-by: Alessandro Zummo <a.zummo@towertech.it>
Cc: David Brownell <david-b@pacbell.net>
Cc: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Cc: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
With the write_begin/write_end aops, page_symlink was broken because it
could no longer pass a GFP_NOFS type mask into the point where the
allocations happened. They are done in write_begin, which would always
assume that the filesystem can be entered from reclaim. This bug could
cause filesystem deadlocks.
The funny thing with having a gfp_t mask there is that it doesn't really
allow the caller to arbitrarily tinker with the context in which it can be
called. It couldn't ever be GFP_ATOMIC, for example, because it needs to
take the page lock. The only thing any callers care about is __GFP_FS
anyway, so turn that into a single flag.
Add a new flag for write_begin, AOP_FLAG_NOFS. Filesystems can now act on
this flag in their write_begin function. Change __grab_cache_page to
accept a nofs argument as well, to honour that flag (while we're there,
change the name to grab_cache_page_write_begin which is more instructive
and does away with random leading underscores).
This is really a more flexible way to go in the end anyway -- if a
filesystem happens to want any extra allocations aside from the pagecache
ones in ints write_begin function, it may now use GFP_KERNEL (rather than
GFP_NOFS) for common case allocations (eg. ocfs2_alloc_write_ctxt, for a
random example).
[kosaki.motohiro@jp.fujitsu.com: fix ubifs]
[kosaki.motohiro@jp.fujitsu.com: fix fuse]
Signed-off-by: Nick Piggin <npiggin@suse.de>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: <stable@kernel.org> [2.6.28.x]
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
[ Cleaned up the calling convention: just pass in the AOP flags
untouched to the grab_cache_page_write_begin() function. That
just simplifies everybody, and may even allow future expansion of the
logic. - Linus ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The function viafb_cursor() uses 2 stack-variables of CURSOR_SIZE bits;
CURSOR_SIZE is defined as (8 * 1024). Using up twice 1k on stack is too
much for 4k-stack (though it works with 8k-stacks). Make those two
variables kzalloc'ed to preserve stack space.
Also merge the whole lot of local struct's in viafb_ioctl into a union so
the stack usage gets minimized here as well. (struct's are only accessed
in their indicidual IOCTL case) This second part is only compile-tested as
I know of no userspace app using the IOCTLs.
Signed-off-by: Bruno Prémont <bonbons@linux-vserver.org>
Cc: <JosephChan@via.com.tw>
Cc: Krzysztof Helt <krzysztof.h1@poczta.fm>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
As suggested by Andreas Dilger, introduce a bgl_lock_ptr() helper in
<linux/blockgroup_lock.h> and add separate sb_bgl_lock() helpers to
filesystem specific header files to break the hidden dependency to
struct ext[234]_sb_info.
Also, while at it, convert the macros to static inlines to try make up
for all the times I broke Andrew Morton's tree.
Acked-by: Andreas Dilger <adilger@sun.com>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Include header files as used/needed:
In file included from drivers/leds/leds-dac124s085.c:16:
include/linux/spi/spi.h:66: error: field 'dev' has incomplete type
include/linux/spi/spi.h: In function 'to_spi_device':
include/linux/spi/spi.h💯 warning: type defaults to 'int' in declaration of '__mptr'
...
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The flush_cache_vmap in vmap_page_range() is called with the end of the
range twice. The following patch fixes this for me.
Signed-off-by: Adam Lackorzynski <adam@os.inf.tu-dresden.de>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The race is calling cgroup_clone() while umounting the ns cgroup subsys,
and thus cgroup_clone() might access invalid cgroup_fs, or kill_sb() is
called after cgroup_clone() created a new dir in it.
The BUG I triggered is BUG_ON(root->number_of_cgroups != 1);
------------[ cut here ]------------
kernel BUG at kernel/cgroup.c:1093!
invalid opcode: 0000 [#1] SMP
...
Process umount (pid: 5177, ti=e411e000 task=e40c4670 task.ti=e411e000)
...
Call Trace:
[<c0493df7>] ? deactivate_super+0x3f/0x51
[<c04a3600>] ? mntput_no_expire+0xb3/0xdd
[<c04a3ab2>] ? sys_umount+0x265/0x2ac
[<c04a3b06>] ? sys_oldumount+0xd/0xf
[<c0403911>] ? sysenter_do_call+0x12/0x31
...
EIP: [<c0456e76>] cgroup_kill_sb+0x23/0xe0 SS:ESP 0068:e411ef2c
---[ end trace c766c1be3bf944ac ]---
Cc: Serge E. Hallyn <serue@us.ibm.com>
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Cc: Paul Menage <menage@google.com>
Cc: "Serge E. Hallyn" <serue@us.ibm.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Don't store the field->op in the messy (and very inconvenient for e.g.
audit_comparator()) form; translate to dense set of values and do full
validation of userland-submitted value while we are at it.
->audit_init_rule() and ->audit_match_rule() get new values now; in-tree
instances updated.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Fix the actual rule listing; add per-type lists _not_ used for matching,
with all exit,... sitting on one such list. Simplifies "do something
for all rules" logics, while we are at it...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Problem: ordering between the rules on exit chain is currently lost;
all watch and inode rules are listed after everything else _and_
exit,never on one kind doesn't stop exit,always on another from
being matched.
Solution: assign priorities to rules, keep track of the current
highest-priority matching rule and its result (always/never).
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* don't bother with allocations
* don't do double copy_from_user()
* don't duplicate parts of check for audit_dummy_context()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* logging the original value of *msg_prio in mq_timedreceive(2)
is insane - the argument is write-only (i.e. syscall always
ignores the original value and only overwrites it).
* merge __audit_mq_timed{send,receive}
* don't do copy_from_user() twice
* don't mess with allocations in auditsc part
* ... and don't bother checking !audit_enabled and !context in there -
we'd already checked for audit_dummy_context().
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* don't copy_from_user() twice
* don't bother with allocations
* don't duplicate parts of audit_dummy_context()
* make it return void
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
No need to do that more than once per process lifetime; allocating/freeing
on each sendto/accept/etc. is bloody pointless.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* 'cpus4096-for-linus-3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (77 commits)
x86: setup_per_cpu_areas() cleanup
cpumask: fix compile error when CONFIG_NR_CPUS is not defined
cpumask: use alloc_cpumask_var_node where appropriate
cpumask: convert shared_cpu_map in acpi_processor* structs to cpumask_var_t
x86: use cpumask_var_t in acpi/boot.c
x86: cleanup some remaining usages of NR_CPUS where s/b nr_cpu_ids
sched: put back some stack hog changes that were undone in kernel/sched.c
x86: enable cpus display of kernel_max and offlined cpus
ia64: cpumask fix for is_affinity_mask_valid()
cpumask: convert RCU implementations, fix
xtensa: define __fls
mn10300: define __fls
m32r: define __fls
h8300: define __fls
frv: define __fls
cris: define __fls
cpumask: CONFIG_DISABLE_OBSOLETE_CPUMASK_FUNCTIONS
cpumask: zero extra bits in alloc_cpumask_var_node
cpumask: replace for_each_cpu_mask_nr with for_each_cpu in kernel/time/
cpumask: convert mm/
...
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6: (34 commits)
V4L/DVB (10173): Missing v4l2_prio_close in radio_release
V4L/DVB (10172): add DVB_DEVICE_TYPE= to uevent
V4L/DVB (10171): Use usb_set_intfdata
V4L/DVB (10170): tuner-simple: prevent possible OOPS caused by divide by zero error
V4L/DVB (10168): sms1xxx: fix inverted gpio for lna control on tiger r2
V4L/DVB (10167): sms1xxx: add support for inverted gpio
V4L/DVB (10166): dvb frontend: stop using non-C99 compliant comments
V4L/DVB (10165): Add FE_CAN_2G_MODULATION flag to frontends that support DVB-S2
V4L/DVB (10164): Add missing S2 caps flag to S2API
V4L/DVB (10163): em28xx: allocate adev together with struct em28xx dev
V4L/DVB (10162): tuner-simple: Fix tuner type set message
V4L/DVB (10161): saa7134: fix autodetection for AVer TV GO 007 FM Plus
V4L/DVB (10160): em28xx: update chip id for em2710
V4L/DVB (10157): Add USB ID for the Sil4701 radio from DealExtreme
V4L/DVB (10156): saa7134: Add support for Avermedia AVer TV GO 007 FM Plus
V4L/DVB (10155): Add TEA5764 radio driver
V4L/DVB (10154): saa7134: fix a merge conflict on Behold H6 board
V4L/DVB (10153): Add the Beholder H6 card to DVB-T part of sources.
V4L/DVB (10152): Change configuration of the Beholder H6 card
V4L/DVB (10151): Fix I2C bridge error in zl10353
...
those two functions only used in that C file
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/drzeus/mmc:
mmc: warn about voltage mismatches
mmc_spi: Add support for OpenFirmware bindings
pxamci: fix dma_unmap_sg length
mmc_block: ensure all sectors that do not have errors are read
drivers/mmc: Move a dereference below a NULL test
sdhci: handle built-in sdhci with modular leds class
mmc: balanc pci_iomap with pci_iounmap
mmc_block: print better error messages
mmc: Add mmc_vddrange_to_ocrmask() helper function
ricoh_mmc: Handle newer models of Ricoh controllers
mmc: Add 8-bit bus width support
sdhci: activate led support also when module
mmc: trivial annotation of 'blocks'
pci: use pci_ioremap_bar() in drivers/mmc
sdricoh_cs: Add support for Bay Controller devices
mmc: at91_mci: reorder timer setup and mmc_add_host() call
Before, when we only ever printed out the pointer value itself, a NULL
pointer would never cause issues and might as well be printed out as
just its numeric value.
However, with the extended %p formats, especially %pR, we might validly
want to print out resources for debugging. And sometimes they don't
even exist, and the resource pointer is just NULL. Print it out as
such, rather than oopsing.
This is a more generic version of a patch done by Trent Piepho (catching
all %p cases rather than just %pR, and using "(null)" instead of
"[NULL]" to match glibc).
Requested-by: Trent Piepho <xyzzy@speakeasy.org>
Acked-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
... just make it a binfmt handler like #! one.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
They are actually alpha vs. i386/arm/m68k i.e. ecoff vs. aout.
In the only place where we actually tried to handle arm and i386/m68k in
different ways (START_DATA() in coredump handling), the arm variant
works for all of them (i386 and m68k have u.start_code set to 0).
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This updates some personal info in the CREDITS file.
I'm no longer actively involved in Keyspan driver work so shouldn't
really be listed as a Maintainer here.
I do however field the occasional question on them and as I'm dropping
the misc.nu domain, want to ensure people can find me should they need
to.
Signed-off-by: Hugh Blemings <hugh@blemings.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Impact: cleanup
__alloc_bootmem and __alloc_bootmem_node do panic
for us in case of fail so no need for additional
checks here.
Also lets use pr_*() macros for printing.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
CONFIG_NR_CPUS will be defined for all arch's whether SMP or not, but
it may not have made it into all arches yet.
Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>