kernel-fxtec-pro1x

Author	SHA1	Message	Date
Christian Borntraeger	c29834584e	virtio_console: support console resizing this patch uses the new hvc callback hvc_resize to set the window size which allows to change the tty size of hvc_console via a hvc_resize function. I have added a new feature bit VIRTIO_CONSOLE_F_SIZE. The driver will change the window size on tty open and via the config_changed callback of the transport. Currently lguest and kvm_s390 have not implemented this callback, but the callback can be implemented at a later point in time. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2008-12-30 09:26:10 +10:30
Hollis Blanchard	1b4aa2faec	virtio: avoid implicit use of Linux page size in balloon interface Make the balloon interface always use 4K pages, and convert Linux pfns if necessary. This patch assumes that Linux's PAGE_SHIFT will never be less than 12. Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (modified)	2008-12-30 09:26:04 +10:30
Rusty Russell	87c7d57c17	virtio: hand virtio ring alignment as argument to vring_new_virtqueue This allows each virtio user to hand in the alignment appropriate to their virtio_ring structures. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>	2008-12-30 09:26:03 +10:30
Rusty Russell	2966af73e7	virtio: use LGUEST_VRING_ALIGN instead of relying on pagesize This doesn't really matter, since lguest is i386 only at the moment, but we could actually choose a different value. (lguest doesn't have a guarenteed ABI). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2008-12-30 09:26:02 +10:30
Rusty Russell	498af14783	virtio: Don't use PAGE_SIZE for vring alignment in virtio_pci. That doesn't work for non-4k guests which are now appearing. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2008-12-30 09:25:58 +10:30
Rusty Russell	5f0d1d7f22	virtio: rename 'pagesize' arg to vring_init/vring_size It's really the alignment desired for consumer/producer separation; historically this x86 pagesize, but with PowerPC it'll still be x86 pagesize. And in theory lguest could choose a different value. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2008-12-30 09:25:57 +10:30
Rusty Russell	480daab42c	virtio: Don't use PAGE_SIZE in virtio_pci.c The virtio PCI devices don't depend on the guest page size. This matters now PowerPC virtio is gaining ground (they like 64k pages). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2008-12-30 09:25:57 +10:30
Rusty Russell	e12f0102ac	cpumask: Use nr_cpu_ids in seq_cpumask Impact: cleanup, futureproof nr_cpu_ids is the (badly named) runtime limit on possible CPU numbers; ie. the variable version of NR_CPUS. With the new cpumask operators, only bits less than this are defined. So we should use it everywhere, rather than NR_CPUS. Eventually this will make it possible to allocate cpumasks of the minimal length at runtime. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Mike Travis <travis@sgi.com> Acked-by: Ingo Molnar <mingo@elte.hu>	2008-12-30 09:05:19 +10:30
Rusty Russell	54b11e6d57	cpumask: smp_call_function_many() Impact: Implementation change to remove cpumask_t from stack. Actually change smp_call_function_mask() to smp_call_function_many(). We avoid cpumasks on the stack in this version. (S390 has its own version, but that's going away apparently). We have to do some dancing to figure out if 0 or 1 other cpus are in the mask supplied and the online mask without allocating a tmp cpumask. It's still fairly cheap. We allocate the cpumask at the end of the call_function_data structure: if allocation fails we fallback to smp_call_function_single rather than using the baroque quiescing code (which needs a cpumask on stack). (Thanks to Hiroshi Shimamoto for spotting several bugs in previous versions!) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Mike Travis <travis@sgi.com> Cc: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com> Cc: npiggin@suse.de Cc: axboe@kernel.dk	2008-12-30 09:05:16 +10:30
Rusty Russell	3fa4152069	cpumask: make set_cpu_/init_cpu_ out-of-line They're only for use in boot/cpu hotplug code anyway, and this avoids the use of deprecated cpu_*_map. Stephen Rothwell points out that gcc 4.2.4 (on powerpc at least) didn't like the cast away of const anyway: include/linux/cpumask.h: In function 'set_cpu_possible': include/linux/cpumask.h:1052: warning: passing argument 2 of 'cpumask_set_cpu' discards qualifiers from pointer target type So this kills two birds with one stone. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2008-12-30 09:05:16 +10:30
Rusty Russell	ae7a47e72e	cpumask: make cpumask.h eat its own dogfood. Changes: 1) cpumask_t to struct cpumask, 2) cpus_weight_nr to cpumask_weight, 3) cpu_isset to cpumask_test_cpu, 4) ->bits to cpumask_bits() 5) cpu__map to cpu__mask. 6) for_each_cpu_mask_nr to for_each_cpu Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2008-12-30 09:05:15 +10:30
Rusty Russell	b3199c025d	cpumask: switch over to cpu_online/possible/active/present_mask: core Impact: cleanup This implements the obsolescent cpu_online_map in terms of cpu_online_mask, rather than the other way around. Same for the other maps. The documentation comments are also updated to refer to _mask rather than _map. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Mike Travis <travis@sgi.com>	2008-12-30 09:05:14 +10:30
Rusty Russell	cb78a0ce69	bitmap: fix seq_bitmap and seq_cpumask to take const pointer Impact: cleanup seq_bitmap just calls bitmap_scnprintf on the bits: that arg can be const. Similarly, seq_cpumask just calls seq_bitmap. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2008-12-30 09:05:14 +10:30
Rusty Russell	4b0bc0bca8	bitmap: test for constant as well as small size for inline versions Impact: reduce text size bitmap_zero et al have a fastpath for nbits <= BITS_PER_LONG, but this should really only apply where the nbits is known at compile time. This only saves about 1200 bytes on an allyesconfig kernel, but with cpumasks going variable that number will increase. text data bss dec hex filename 35327852 5035607 6782976 47146435 2cf65c3 vmlinux-before 35326640 5035607 6782976 47145223 2cf6107 vmlinux-after Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2008-12-30 09:05:13 +10:30
Rusty Russell	278d1ed65e	cpumask: make CONFIG_NR_CPUS always valid. Impact: cleanup Currently we have NR_CPUS, which is 1 on UP, and CONFIG_NR_CPUS on SMP. If we make CONFIG_NR_CPUS always valid (and always 1 on !SMP), we can skip the middleman. This also allows us to find and check all the unaudited NR_CPUS usage as we prepare for v. large NR_CPUS. To avoid breaking every arch, we cheat and do this for the moment in the header if the arch doesn't. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Mike Travis <travis@sgi.com>	2008-12-30 09:05:12 +10:30
Rusty Russell	33edcf133b	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6	2008-12-30 08:02:35 +10:30
Robert Jarzmik	69acdf1e5a	V4L/DVB (9530): Add new pixel format VYUY 16 bits wide. There were already 3 YUV formats defined : - YUYV - YVYU - UYVY The only left combination is VYUY, which is added in this patch. Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2008-12-29 17:53:27 -02:00
Bartlomiej Zolnierkiewicz	e2984c628c	ide: move Power Management support to ide-pm.c There should be no functional changes caused by this patch. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-12-29 20:27:37 +01:00
Bartlomiej Zolnierkiewicz	702c026be8	ide: rework handling of serialized ports (v2) * hpt366: set IDE_HFLAG_SERIALIZE in ->host_flags if needed in init_hwif_hpt366(). Remove HPT_SERIALIZE_IO while at it. * Set IDE_HFLAG_SERIALIZE in ->host_flags if needed in ide_init_port(). * Convert init_irq() to use IDE_HFLAG_SERIALIZE together with hwif->host to find out ports which need to be serialized. * Remove no longer needed save_match() and ide_hwif_t.serialized. v2: * Set host's ->host_flags field instead of port's copy. This patch should fix the incorrect grouping of port(s) from host(s) that need serialization with port(s) that happen to use the same IRQ(s) but are from the host(s) that don't need it. Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-12-29 20:27:36 +01:00
Bartlomiej Zolnierkiewicz	b7876a6fb6	cy82c693: remove superfluous ide_cy82c693 chipset type Since CY82C693 doesn't require serialization we may as well use the default ide_pci chipset type. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-12-29 20:27:34 +01:00
Bartlomiej Zolnierkiewicz	1f66019bdf	trm290: add IDE_HFLAG_TRM290 host flag * Add IDE_HFLAG_TRM290 host flag and use it in ide_build_dmatable(). * Remove no longer needed ide_trm290 chipset type. Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-12-29 20:27:34 +01:00
Bartlomiej Zolnierkiewicz	6b4924962c	ide: add ->max_sectors field to struct ide_port_info * Add ->max_sectors field to struct ide_port_info to allow host drivers to specify value used for hwif->rqsize (if smaller than the default). * Convert pdc202xx_old to use ->max_sectors and remove no longer needed IDE_HFLAG_RQSIZE_256 flag. There should be no functional changes caused by this patch. Acked-by: Sergei Shtyltov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-12-29 20:27:34 +01:00
Bartlomiej Zolnierkiewicz	7f1ac8c4b9	rz1000: apply chipset quirks early (v2) * Use pci_name(dev) instead of hwif->name in init_hwif_rz1000(). * init_hwif_rz1000() -> rz1000_init_chipset(). Update rz1000_init_one() to use rz1000_init_chipset() and add now required rz1000_remove(). * Remove superfluous ide_rz1000 chipset type. v2: * unsigned int rz1000_init_chipset() -> int rz1000_disable_readahead() per Sergei's suggestion. Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-12-29 20:27:33 +01:00
Bartlomiej Zolnierkiewicz	f58c1ab8de	ide: always set nIEN on idle devices * Set nIEN for previous port/device in ide_do_request() also if port uses a non-shared IRQ. * Remove no longer needed ide_hwif_t.sharing_irq. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-12-29 20:27:33 +01:00
Bartlomiej Zolnierkiewicz	6b5cde3629	cmd64x: set IDE_HFLAG_SERIALIZE explictly for CMD646 * Set IDE_HFLAG_SERIALIZE explictly for CMD646. * Remove no longer needed ide_cmd646 chipset type (which has a nice side-effect of fixing handling of unexpected IRQs). Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-12-29 20:27:32 +01:00
Bartlomiej Zolnierkiewicz	27c01c2db0	ide-cd: remove obsolete seek optimization It doesn't make much sense nowadays and is problematic on some drives. Cc: Borislav Petkov <petkovbb@googlemail.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-12-29 20:27:32 +01:00
Bartlomiej Zolnierkiewicz	2a2ca6a961	ide: replace the global ide_lock spinlock by per-hwgroup spinlocks (v2) Now that (almost) all host drivers have been fixed not to abuse ide_lock and core code usage of ide_lock has been sanitized we may safely replace ide_lock by per-hwgroup locks. This patch is partially based on earlier patch from Ravikiran G Thirumalai. While at it: - don't use deprecated HWIF() and HWGROUP() macros - update locking documentation in ide.h v2: Add missing spin_lock_init(&hwgroup->lock). (Noticed by Elias Oltmanns) Cc: Vaibhav V. Nivargi <vaibhav.nivargi@gmail.com> Cc: Alok N. Kataria <alokk@calsoftinc.com> Cc: Shai Fultheim <shai@scalex86.org> Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org> Cc: Elias Oltmanns <eo@nebensachen.de> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-12-29 20:27:31 +01:00
Jiri Slaby	d61c72e52b	DMI: add dmi_match Add a wrapper for testing system_info which will handle also NULL system infos. This will be used by the ata PIIX driver. Signed-off-by: Jiri Slaby <jirislaby@gmail.com> Cc: Alexandru Romanescu <a_romanescu@yahoo.co.uk> Cc: Tejun Heo <tj@kernel.org> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2008-12-29 07:39:34 -05:00
Yinghai Lu	43a256322a	sparseirq: move __weak symbols into separate compilation unit GCC has a bug with __weak alias functions: if the functions are in the same compilation unit as their call site, GCC can decide to inline them - and thus rob the linker of the opportunity to override the weak alias with the real thing. So move all the IRQ handling related __weak symbols to kernel/irq/chip.c. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-29 12:15:49 +01:00
Pekka Enberg	3c506efd7e	Merge branch 'topic/failslab' into for-linus Conflicts: mm/slub.c Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>	2008-12-29 11:47:05 +02:00
Pekka Enberg	fd37617e69	Merge branches 'topic/fixes', 'topic/cleanups' and 'topic/documentation' into for-linus	2008-12-29 11:45:47 +02:00
Pascal Terjan	dfcd361028	slab: Fix comment on #endif This #endif in slab.h is described as closing the inner block while it's for the big CONFIG_NUMA one. That makes reading the code a bit harder. This trivial patch fixes the comment. Signed-off-by: Pascal Terjan <pterjan@mandriva.com> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>	2008-12-29 11:40:56 +02:00
Akinobu Mita	773ff60e84	SLUB: failslab support Currently fault-injection capability for SLAB allocator is only available to SLAB. This patch makes it available to SLUB, too. [penberg@cs.helsinki.fi: unify slab and slub implementations] Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Matt Mackall <mpm@selenic.com> Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>	2008-12-29 11:27:46 +02:00
Dave Airlie	f453ba0460	DRM: add mode setting support Add mode setting support to the DRM layer. This is a fairly big chunk of work that allows DRM drivers to provide full output control and configuration capabilities to userspace. It was motivated by several factors: - the fb layer's APIs aren't suited for anything but simple configurations - coordination between the fb layer, DRM layer, and various userspace drivers is poor to non-existent (radeonfb excepted) - user level mode setting drivers makes displaying panic & oops messages more difficult - suspend/resume of graphics state is possible in many more configurations with kernel level support This commit just adds the core DRM part of the mode setting APIs. Driver specific commits using these new structure and APIs will follow. Co-authors: Jesse Barnes <jbarnes@virtuousgeek.org>, Jakob Bornecrantz <jakob@tungstengraphics.com> Contributors: Alan Hourihane <alanh@tungstengraphics.com>, Maarten Maathuis <madman2003@gmail.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2008-12-29 17:47:23 +10:00
Jens Axboe	b3a6ffe16b	Get rid of CONFIG_LSF We have two seperate config entries for large devices/files. One is CONFIG_LBD that guards just the devices, the other is CONFIG_LSF that handles large files. This doesn't make a lot of sense, you typically want both or none. So get rid of CONFIG_LSF and change CONFIG_LBD wording to indicate that it covers both. Acked-by: Jean Delvare <khali@linux-fr.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-12-29 08:29:51 +01:00
Jens Axboe	a6f23657d3	block: add one-hit cache for disk partition lookup disk_map_sector_rcu() returns a partition from a sector offset, which we use for IO statistics on a per-partition basis. The lookup itself is an O(N) list lookup, where N is the number of partitions. This actually hurts performance quite a bit, even on the lower end partitions. On higher numbered partitions, it can get pretty bad. Solve this by adding a one-hit cache for partition lookup. This makes the lookup O(1) for the case where we do most IO to one partition. Even for mixed partition workloads, amortized cost is pretty close to O(1) since the natural IO batching makes the one-hit cache last for lots of IOs. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-12-29 08:29:51 +01:00
Jens Axboe	b374d18a4b	block: get rid of elevator_t typedef Just use struct elevator_queue everywhere instead. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-12-29 08:29:50 +01:00
Jens Axboe	abf137dd77	aio: make the lookup_ioctx() lockless The mm->ioctx_list is currently protected by a reader-writer lock, so we always grab that lock on the read side for doing ioctx lookups. As the workload is extremely reader biased, turn this into an rcu hlist so we can make lookup_ioctx() lockless. Get rid of the rwlock and use a spinlock for providing update side exclusion. There's usually only 1 entry on this list, so it doesn't make sense to look into fancier data structures. Reviewed-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-12-29 08:29:50 +01:00
Jens Axboe	392ddc3298	bio: add support for inlining a number of bio_vecs inside the bio When we go and allocate a bio for IO, we actually do two allocations. One for the bio itself, and one for the bi_io_vec that holds the actual pages we are interested in. This feature inlines a definable amount of io vecs inside the bio itself, so we eliminate the bio_vec array allocation for IO's up to a certain size. It defaults to 4 vecs, which is typically 16k of IO. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-12-29 08:29:50 +01:00
Jens Axboe	bb799ca020	bio: allow individual slabs in the bio_set Instead of having a global bio slab cache, add a reference to one in each bio_set that is created. This allows for personalized slabs in each bio_set, so that they can have bios of different sizes. This means we can personalize the bios we return. File systems may want to embed the bio inside another structure, to avoid allocation more items (and stuffing them in ->bi_private) after the get a bio. Or we may want to embed a number of bio_vecs directly at the end of a bio, to avoid doing two allocations to return a bio. This is now possible. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-12-29 08:29:23 +01:00
Jens Axboe	1b43449869	bio: move the slab pointer inside the bio_set In preparation for adding differently sized bios. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-12-29 08:28:46 +01:00
Jens Axboe	7ff9345ffa	bio: only mempool back the largest bio_vec slab cache We only very rarely need the mempool backing, so it makes sense to get rid of all but one of the mempool in a bio_set. So keep the largest bio_vec count mempool so we can always honor the largest allocation, and "upgrade" callers that fail. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-12-29 08:28:46 +01:00
Tejun Heo	58eea927d2	block: simplify empty barrier implementation Empty barrier required special handling in __elv_next_request() to complete it without letting the low level driver see it. With previous changes, barrier code is now flexible enough to skip the BAR step using the same barrier sequence selection mechanism. Drop the special handling and mask off q->ordered from start_ordered(). Remove blk_empty_barrier() test which now has no user. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-12-29 08:28:45 +01:00
Tejun Heo	8f11b3e99a	block: make barrier completion more robust Barrier completion had the following assumptions. * start_ordered() couldn't finish the whole sequence properly. If all actions are to be skipped, q->ordseq is set correctly but the actual completion was never triggered thus hanging the barrier request. * Drain completion in elv_complete_request() assumed that there's always at least one request in the queue when drain completes. Both assumptions are true but these assumptions need to be removed to improve empty barrier implementation. This patch makes the following changes. * Make start_ordered() use blk_ordered_complete_seq() to mark skipped steps complete and notify __elv_next_request() that it should fetch the next request if the whole barrier has completed inside start_ordered(). * Make drain completion path in elv_complete_request() check whether the queue is empty. Empty queue also indicates drain completion. * While at it, convert 0/1 return from blk_do_ordered() to false/true. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-12-29 08:28:45 +01:00
Tejun Heo	f671620e7d	block: make every barrier action optional In all barrier sequences, the barrier write itself was always assumed to be issued and thus didn't have corresponding control flag. This patch adds QUEUE_ORDERED_DO_BAR and unify action mask handling in start_ordered() such that any barrier action can be skipped. This patch doesn't introduce any visible behavior changes. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-12-29 08:28:45 +01:00
Tejun Heo	313e42999d	block: reorganize QUEUE_ORDERED_* constants Separate out ordering type (drain,) and action masks (preflush, postflush, fua) from visible ordering mode selectors (QUEUE_ORDERED_). Ordering types are now named QUEUE_ORDERED_BY_ while action masks are named QUEUE_ORDERED_DO_*. This change is necessary to add QUEUE_ORDERED_DO_BAR and make it optional to improve empty barrier implementation. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-12-29 08:28:44 +01:00
Richard Kennedy	ba744d5e29	block: reorder struct bio to remove padding on 64bit Remove 8 bytes of padding from struct bio which also removes 16 bytes from struct bio_pair to make it 248 bytes. bio_pair then fits into one fewer cache lines & into a smaller slab. Signed-off-by: Richard Kennedy <richard@rsk.demon.co.uk> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-12-29 08:28:44 +01:00
Cheng Renquan	64d01dc9e1	block: use cancel_work_sync() instead of kblockd_flush_work() After many improvements on kblockd_flush_work, it is now identical to cancel_work_sync, so a direct call to cancel_work_sync is suggested. The only difference is that cancel_work_sync is a GPL symbol, so no non-GPL modules anymore. Signed-off-by: Cheng Renquan <crquan@gmail.com> Cc: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-12-29 08:28:44 +01:00
Keith Mannthey	08bafc0341	block: Supress Buffer I/O errors when SCSI REQ_QUIET flag set Allow the scsi request REQ_QUIET flag to be propagated to the buffer file system layer. The basic ideas is to pass the flag from the scsi request to the bio (block IO) and then to the buffer layer. The buffer layer can then suppress needless printks. This patch declutters the kernel log by removed the 40-50 (per lun) buffer io error messages seen during a boot in my multipath setup . It is a good chance any real errors will be missed in the "noise" it the logs without this patch. During boot I see blocks of messages like " __ratelimit: 211 callbacks suppressed Buffer I/O error on device sdm, logical block 5242879 Buffer I/O error on device sdm, logical block 5242879 Buffer I/O error on device sdm, logical block 5242847 Buffer I/O error on device sdm, logical block 1 Buffer I/O error on device sdm, logical block 5242878 Buffer I/O error on device sdm, logical block 5242879 Buffer I/O error on device sdm, logical block 5242879 Buffer I/O error on device sdm, logical block 5242879 Buffer I/O error on device sdm, logical block 5242879 Buffer I/O error on device sdm, logical block `5242872` " in my logs. My disk environment is multipath fiber channel using the SCSI_DH_RDAC code and multipathd. This topology includes an "active" and "ghost" path for each lun. IO's to the "ghost" path will never complete and the SCSI layer, via the scsi device handler rdac code, quick returns the IOs to theses paths and sets the REQ_QUIET scsi flag to suppress the scsi layer messages. I am wanting to extend the QUIET behavior to include the buffer file system layer to deal with these errors as well. I have been running this patch for a while now on several boxes without issue. A few runs of bonnie++ show no noticeable difference in performance in my setup. Thanks for John Stultz for the quiet_error finalization. Submitted-by: Keith Mannthey <kmannth@us.ibm.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-12-29 08:28:44 +01:00
Fernando Luis Vázquez Cao	88e740f165	block: add queue flag for paravirt frontend drivers As is the case with SSD devices, we do not want to idle in AS/CFQ when the block device is a paravirt front-end driver. This patch adds a flag (QUEUE_FLAG_VIRT) which should be used by front-end drivers such as virtio_blk and xen-blkfront to indicate a paravirtualized device. Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-12-29 08:28:41 +01:00
Lachlan McIlroy	0a8c5395f9	[XFS] Fix merge failures Merge git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 Conflicts: fs/xfs/linux-2.6/xfs_cred.h fs/xfs/linux-2.6/xfs_globals.h fs/xfs/linux-2.6/xfs_ioctl.c fs/xfs/xfs_vnodeops.h Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-12-29 16:47:18 +11:00
David S. Miller	e3c6d4ee54	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux-2.6 Conflicts: arch/sparc64/kernel/idprom.c	2008-12-28 20:19:47 -08:00
Tejun Heo	ece180d1cf	libata: perform port detach in EH ata_port_detach() first made sure EH saw ATA_PFLAG_UNLOADING and then assumed EH context belongs to it and performed detach operation itself. However, UNLOADING doesn't disable all of EH and this could lead to problems including triggering WARN_ON()'s in EH path. This patch makes port detach behave more like other EH actions such that ata_port_detach() requests EH to detach and waits for completion. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2008-12-28 22:43:21 -05:00
Tejun Heo	1eca4365be	libata: beef up iterators There currently are the following looping constructs. * __ata_port_for_each_link() for all available links * ata_port_for_each_link() for edge links * ata_link_for_each_dev() for all devices * ata_link_for_each_dev_reverse() for all devices in reverse order Now there's a need for looping construct which is similar to __ata_port_for_each_link() but iterates over PMP links before the host link. Instead of adding another one with long name, do the following cleanup. * Implement and export ata_link_next() and ata_dev_next() which take @mode parameter and can be used to build custom loop. * Implement ata_for_each_link() and ata_for_each_dev() which take looping mode explicitly. The following iteration modes are implemented. * ATA_LITER_EDGE : loop over edge links * ATA_LITER_HOST_FIRST : loop over all links, host link first * ATA_LITER_PMP_FIRST : loop over all links, PMP links first * ATA_DITER_ENABLED : loop over enabled devices * ATA_DITER_ENABLED_REVERSE : loop over enabled devices in reverse order * ATA_DITER_ALL : loop over all devices * ATA_DITER_ALL_REVERSE : loop over all devices in reverse order This change removes exlicit device enabledness checks from many loops and makes it clear which ones are iterated over in which direction. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2008-12-28 22:43:20 -05:00
Linus Torvalds	3c92ec8ae9	Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: (144 commits) powerpc/44x: Support 16K/64K base page sizes on 44x powerpc: Force memory size to be a multiple of PAGE_SIZE powerpc/32: Wire up the trampoline code for kdump powerpc/32: Add the ability for a classic ppc kernel to be loaded at 32M powerpc/32: Allow __ioremap on RAM addresses for kdump kernel powerpc/32: Setup OF properties for kdump powerpc/32/kdump: Implement crash_setup_regs() using ppc_save_regs() powerpc: Prepare xmon_save_regs for use with kdump powerpc: Remove default kexec/crash_kernel ops assignments powerpc: Make default kexec/crash_kernel ops implicit powerpc: Setup OF properties for ppc32 kexec powerpc/pseries: Fix cpu hotplug powerpc: Fix KVM build on ppc440 powerpc/cell: add QPACE as a separate Cell platform powerpc/cell: fix build breakage with CONFIG_SPUFS disabled powerpc/mpc5200: fix error paths in PSC UART probe function powerpc/mpc5200: add rts/cts handling in PSC UART driver powerpc/mpc5200: Make PSC UART driver update serial errors counters powerpc/mpc5200: Remove obsolete code from mpc5200 MDIO driver powerpc/mpc5200: Add MDMA/UDMA support to MPC5200 ATA driver ... Fix trivial conflict in drivers/char/Makefile as per Paul's directions	2008-12-28 16:54:33 -08:00
Linus Torvalds	0191b625ca	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1429 commits) net: Allow dependancies of FDDI & Tokenring to be modular. igb: Fix build warning when DCA is disabled. net: Fix warning fallout from recent NAPI interface changes. gro: Fix potential use after free sfc: If AN is enabled, always read speed/duplex from the AN advertising bits sfc: When disabling the NIC, close the device rather than unregistering it sfc: SFT9001: Add cable diagnostics sfc: Add support for multiple PHY self-tests sfc: Merge top-level functions for self-tests sfc: Clean up PHY mode management in loopback self-test sfc: Fix unreliable link detection in some loopback modes sfc: Generate unique names for per-NIC workqueues 802.3ad: use standard ethhdr instead of ad_header 802.3ad: generalize out mac address initializer 802.3ad: initialize ports LACPDU from const initializer 802.3ad: remove typedef around ad_system 802.3ad: turn ports is_individual into a bool 802.3ad: turn ports is_enabled into a bool 802.3ad: make ntt bool ixgbe: Fix set_ringparam in ixgbe to use the same memory pools. ... Fixed trivial IPv4/6 address printing conflicts in fs/cifs/connect.c due to the conversion to %pI (in this networking merge) and the addition of doing IPv6 addresses (from the earlier merge of CIFS).	2008-12-28 12:49:40 -08:00
Linus Torvalds	1d248b2593	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (26 commits) IB/mlx4: Set ownership bit correctly when copying CQEs during CQ resize RDMA/nes: Remove tx_free_list RDMA/cma: Add IPv6 support RDMA/addr: Add support for translating IPv6 addresses mlx4_core: Delete incorrect comment mlx4_core: Add support for multiple completion event vectors IB/iser: Avoid recv buffer exhaustion caused by unexpected PDUs IB/ehca: Remove redundant test of vpage IB/ehca: Replace modulus operations in flush error completion path IB/ipath: Add locking for interrupt use of ipath_pd contexts vs free IB/ipath: Fix spi_pioindex value IB/ipath: Only do 1X workaround on rev1 chips IB/ipath: Don't count IB symbol and link errors unless link is UP IB/ipath: Check return value of dma_map_single() IB/ipath: Fix PSN of send WQEs after an RDMA read resend RDMA/nes: Cleanup warnings RDMA/nes: Add loopback check to make_cm_node() RDMA/nes: Check cqp_avail_reqs is empty after locking the list RDMA/nes: Fix TCP compliance test failures RDMA/nes: Forward packets for a new connection with stale APBVT entry ...	2008-12-28 12:33:59 -08:00
Linus Torvalds	a39b863342	Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (31 commits) sched: fix warning in fs/proc/base.c schedstat: consolidate per-task cpu runtime stats sched: use RCU variant of list traversal in for_each_leaf_rt_rq() sched, cpuacct: export percpu cpuacct cgroup stats sched, cpuacct: refactoring cpuusage_read / cpuusage_write sched: optimize update_curr() sched: fix wakeup preemption clock sched: add missing arch_update_cpu_topology() call sched: let arch_update_cpu_topology indicate if topology changed sched: idle_balance() does not call load_balance_newidle() sched: fix sd_parent_degenerate on non-numa smp machine sched: add uid information to sched_debug for CONFIG_USER_SCHED sched: move double_unlock_balance() higher sched: update comment for move_task_off_dead_cpu sched: fix inconsistency when redistribute per-cpu tg->cfs_rq shares sched/rt: removed unneeded defintion sched: add hierarchical accounting to cpu accounting controller sched: include group statistics in /proc/sched_debug sched: rename SCHED_NO_NO_OMIT_FRAME_POINTER => SCHED_OMIT_FRAME_POINTER sched: clean up SCHED_CPUMASK_ALLOC ...	2008-12-28 12:27:58 -08:00
Linus Torvalds	b0f4b285d7	Merge branch 'tracing-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'tracing-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (241 commits) sched, trace: update trace_sched_wakeup() tracing/ftrace: don't trace on early stage of a secondary cpu boot, v3 Revert "x86: disable X86_PTRACE_BTS" ring-buffer: prevent false positive warning ring-buffer: fix dangling commit race ftrace: enable format arguments checking x86, bts: memory accounting x86, bts: add fork and exit handling ftrace: introduce tracing_reset_online_cpus() helper tracing: fix warnings in kernel/trace/trace_sched_switch.c tracing: fix warning in kernel/trace/trace.c tracing/ring-buffer: remove unused ring_buffer size trace: fix task state printout ftrace: add not to regex on filtering functions trace: better use of stack_trace_enabled for boot up code trace: add a way to enable or disable the stack tracer x86: entry_64 - introduce FTRACE_ frame macro v2 tracing/ftrace: add the printk-msg-only option tracing/ftrace: use preempt_enable_no_resched_notrace in ring_buffer_time_stamp() x86, bts: correctly report invalid bts records ... Fixed up trivial conflict in scripts/recordmcount.pl due to SH bits being already partly merged by the SH merge.	2008-12-28 12:21:10 -08:00
Linus Torvalds	be9c5ae4ee	Merge branch 'x86-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (246 commits) x86: traps.c replace #if CONFIG_X86_32 with #ifdef CONFIG_X86_32 x86: PAT: fix address types in track_pfn_vma_new() x86: prioritize the FPU traps for the error code x86: PAT: pfnmap documentation update changes x86: PAT: move track untrack pfnmap stubs to asm-generic x86: PAT: remove follow_pfnmap_pte in favor of follow_phys x86: PAT: modify follow_phys to return phys_addr prot and return value x86: PAT: clarify is_linear_pfn_mapping() interface x86: ia32_signal: remove unnecessary declaration x86: common.c boot_cpu_stack and boot_exception_stacks should be static x86: fix intel x86_64 llc_shared_map/cpu_llc_id anomolies x86: fix warning in arch/x86/kernel/microcode_amd.c x86: ia32.h: remove unused struct sigfram32 and rt_sigframe32 x86: asm-offset_64: use rt_sigframe_ia32 x86: sigframe.h: include headers for dependency x86: traps.c declare functions before they get used x86: PAT: update documentation to cover pgprot and remap_pfn related changes - v3 x86: PAT: add pgprot_writecombine() interface for drivers - v3 x86: PAT: change pgprot_noncached to uc_minus instead of strong uc - v3 x86: PAT: implement track/untrack of pfnmap regions for x86 - v3 ...	2008-12-28 12:07:57 -08:00
Linus Torvalds	bb26c6c29b	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6: (105 commits) SELinux: don't check permissions for kernel mounts security: pass mount flags to security_sb_kern_mount() SELinux: correctly detect proc filesystems of the form "proc/foo" Audit: Log TIOCSTI user namespaces: document CFS behavior user namespaces: require cap_set{ug}id for CLONE_NEWUSER user namespaces: let user_ns be cloned with fairsched CRED: fix sparse warnings User namespaces: use the current_user_ns() macro User namespaces: set of cleanups (v2) nfsctl: add headers for credentials coda: fix creds reference capabilities: define get_vfs_caps_from_disk when file caps are not enabled CRED: Allow kernel services to override LSM settings for task actions CRED: Add a kernel_service object class to SELinux CRED: Differentiate objective and effective subjective credentials on a task CRED: Documentation CRED: Use creds in file structs CRED: Prettify commoncap.c CRED: Make execve() take advantage of copy-on-write credentials ...	2008-12-28 11:43:54 -08:00
Linus Torvalds	e14e61e967	Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (57 commits) crypto: aes - Precompute tables crypto: talitos - Ack done interrupt in isr instead of tasklet crypto: testmgr - Correct comment about deflate parameters crypto: salsa20 - Remove private wrappers around various operations crypto: des3_ede - permit weak keys unless REQ_WEAK_KEY set crypto: sha512 - Switch to shash crypto: sha512 - Move message schedule W[80] to static percpu area crypto: michael_mic - Switch to shash crypto: wp512 - Switch to shash crypto: tgr192 - Switch to shash crypto: sha256 - Switch to shash crypto: md5 - Switch to shash crypto: md4 - Switch to shash crypto: sha1 - Switch to shash crypto: rmd320 - Switch to shash crypto: rmd256 - Switch to shash crypto: rmd160 - Switch to shash crypto: rmd128 - Switch to shash crypto: null - Switch to shash crypto: hash - Make setkey optional ...	2008-12-28 11:43:22 -08:00
Linus Torvalds	cb10ea549f	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6: (367 commits) ALSA: ASoC: fix a typo in omp-pcm.c ASoC: Fix DSP formats in SSM2602 audio codec ASoC: Fix incorrect DSP format in OMAP McBSP DAI and affected drivers ALSA: hda: fix incorrect mixer index values for 92hd83xx ALSA: hda: dinput_mux check ALSA: hda - Add quirk for another HP dv7 ALSA: ASoC - Add missing __devexit annotation to wm8350.c ALSA: ASoc: DaVinci: davinci-evm use dsp_b mode ALSA: ASoC: DaVinci: i2s, evm, pass same value to codec and cpu_dai ALSA: ASoC: tlv320aic3x add dsp_a ALSA: ASoC: DaVinci: document I2S limitations ALSA: ASoC: DaVinci: davinci-i2s clean up ALSA: ASoC: DaVinci: davinci-i2s clean up ALSA: ASoC: DaVinci: davinci-i2s add comments to explain polarity ALSA: ASoC: DaVinci: davinvi-evm, make requests explicit ALSA: ca0106 - disable 44.1kHz capture ALSA: ca0106 - Add missing card->private_data initialization ALSA: ca0106 - Check ac97 availability at PM ALSA: hda - Power up always when no jack detection is available ALSA: hda - Fix unused variable warnings in patch_sigmatel.c ...	2008-12-28 11:41:32 -08:00
Yinghai Lu	13a0c3c269	sparseirq: work around compiler optimizing away __weak functions Impact: fix panic on null pointer with sparseirq Some GCC versions seem to inline the weak global function, when that function is empty. Work it around, by making the functions return a (dummy) integer. Signed-off-by: Yinghai <yinghai@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-27 13:24:00 +01:00
KOSAKI Motohiro	18eefedfe8	irq: simplify for_each_irq_desc() usage Impact: cleanup all for_each_irq_desc() usage point have !desc check. then its check can move into for_each_irq_desc() macro. Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Acked-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-26 09:48:18 +01:00
KOSAKI Motohiro	f9af0e7091	irq: for_each_irq_desc() move to irqnr.h Impact: cleanup before CONFIG_SPARSE_IRQ age, for_each_irq_desc() sat in irqnr.h and could be called from generic code. CONFIG_SPARSE_IRQ breaks this assumption, but SPARSE_IRQ version for_each_irq_desc() also can move into irqnr.h easily. Also, this patch unifies CONFIG_SPARSE_IRQ and !CONFIG_SPARSE_IRQ for_each_irq_desc(). Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-26 09:48:17 +01:00
Ingo Molnar	32e8d18683	Merge branches 'timers/clocksource', 'timers/hpet', 'timers/hrtimers', 'timers/nohz', 'timers/ntp', 'timers/posixtimers' and 'timers/rtc' into timers/core	2008-12-25 18:02:25 +01:00
Ingo Molnar	860cf8894b	Merge branches 'irq/sparseirq', 'irq/genirq' and 'irq/urgent'; commit 'v2.6.28' into irq/core	2008-12-25 16:27:54 +01:00
Ingo Molnar	6638101c11	Merge branches 'core/debugobjects', 'core/iommu', 'core/locking', 'core/printk', 'core/rcu', 'core/resources', 'core/softirq' and 'core/stacktrace' into core/core	2008-12-25 14:06:29 +01:00
Ingo Molnar	cc37d3d206	Merge branch 'core/futexes' into core/core	2008-12-25 13:54:14 +01:00
Ingo Molnar	0b271ef452	Merge commit 'v2.6.28' into core/core	2008-12-25 13:51:46 +01:00
Ingo Molnar	4e202284e6	Merge branch 'sched/urgent'; commit 'v2.6.28' into sched/core	2008-12-25 13:42:23 +01:00
Ingo Molnar	5250d329e3	Merge branches 'tracing/ftrace', 'tracing/hw-branch-tracing' and 'tracing/ring-buffer'; commit 'v2.6.28' into tracing/core	2008-12-25 13:11:00 +01:00
Takashi Iwai	a802269781	Merge branch 'topic/jack-mechanical' into to-push	2008-12-25 11:40:29 +01:00
Takashi Iwai	a65056205c	Merge branch 'topic/hda' into to-push	2008-12-25 11:40:28 +01:00
Takashi Iwai	5c8261e44e	Merge branch 'topic/asoc' into to-push	2008-12-25 11:40:25 +01:00
James Morris	cbacc2c7f0	Merge branch 'next' into for-linus	2008-12-25 11:40:09 +11:00
Herbert Xu	0426c16642	libcrc32c: Add crc32c_le macro The bnx2x driver actually uses the crc32c_le name so this patch restores the crc32c_le symbol through a macro. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2008-12-25 11:01:43 +11:00
Herbert Xu	69c35efcf1	libcrc32c: Move implementation to crypto crc32c This patch swaps the role of libcrc32c and crc32c. Previously the implementation was in libcrc32c and crc32c was a wrapper. Now the code is in crc32c and libcrc32c just calls the crypto layer. The reason for the change is to tap into the algorithm selection capability of the crypto API so that optimised implementations such as the one utilising Intel's CRC32C instruction can be used where available. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2008-12-25 11:01:40 +11:00
Herbert Xu	5f7082ed4f	crypto: hash - Export shash through hash This patch allows shash algorithms to be used through the old hash interface. This is a transitional measure so we can convert the underlying algorithms to shash before converting the users across. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2008-12-25 11:01:33 +11:00
Herbert Xu	dec8b78606	crypto: hash - Add import/export interface It is often useful to save the partial state of a hash function so that it can be used as a base for two or more computations. The most prominent example is HMAC where all hashes start from a base determined by the key. Having an import/export interface means that we only have to compute that base once rather than for each message. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2008-12-25 11:01:30 +11:00
Herbert Xu	3b2f6df082	crypto: hash - Export shash through ahash This patch allows shash algorithms to be used through the ahash interface. This is required before we can convert digest algorithms over to shash. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2008-12-25 11:01:28 +11:00
Herbert Xu	7b5a080b3c	crypto: hash - Add shash interface The shash interface replaces the current synchronous hash interface. It improves over hash in two ways. Firstly shash is reentrant, meaning that the same tfm may be used by two threads simultaneously as all hashing state is stored in a local descriptor. The other enhancement is that shash no longer takes scatter list entries. This is because shash is specifically designed for synchronous algorithms and as such scatter lists are unnecessary. All existing hash users will be converted to shash once the algorithms have been completely converted. There is also a new finup function that combines update with final. This will be extended to ahash once the algorithm conversion is done. This is also the first time that an algorithm type has their own registration function. Existing algorithm types will be converted to this way in due course. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2008-12-25 11:01:26 +11:00
Herbert Xu	7b0bac64cd	crypto: api - Rebirth of crypto_alloc_tfm This patch reintroduces a completely revamped crypto_alloc_tfm. The biggest change is that we now take two crypto_type objects when allocating a tfm, a frontend and a backend. In fact this simply formalises what we've been doing behind the API's back. For example, as it stands crypto_alloc_ahash may use an actual ahash algorithm or a crypto_hash algorithm. Putting this in the API allows us to do this much more cleanly. The existing types will be converted across gradually. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2008-12-25 11:01:24 +11:00
Herbert Xu	4a7794860b	crypto: api - Move type exit function into crypto_tfm The type exit function needs to undo any allocations done by the type init function. However, the type init function may differ depending on the upper-level type of the transform (e.g., a crypto_blkcipher instantiated as a crypto_ablkcipher). So we need to move the exit function out of the lower-level structure and into crypto_tfm itself. As it stands this is a no-op since nobody uses exit functions at all. However, all cases where a lower-level type is instantiated as a different upper-level type (such as blkcipher as ablkcipher) will be converted such that they allocate the underlying transform and use that instead of casting (e.g., crypto_ablkcipher casted into crypto_blkcipher). That will need to use a different exit function depending on the upper-level type. This patch also allows the type init/exit functions to call (or not) cra_init/cra_exit instead of always calling them from the top level. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2008-12-25 11:01:23 +11:00
Ingo Molnar	db8862eafe	Merge branch 'linus' into tracing/hw-branch-tracing	2008-12-24 21:08:26 +01:00
Takashi Iwai	7645c4bfbb	Merge branch 'fix/hda' into topic/hda	2008-12-24 11:04:08 +01:00
Olga Kornievskaia	61054b14d5	nfsd: support callbacks with gss flavors This patch adds server-side support for callbacks other than AUTH_SYS. Signed-off-by: Olga Kornievskaia <aglo@citi.umich.edu> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 16:19:00 -05:00
Olga Kornievskaia	608207e888	rpc: pass target name down to rpc level on callbacks The rpc client needs to know the principal that the setclientid was done as, so it can tell gssd who to authenticate to. Signed-off-by: Olga Kornievskaia <aglo@citi.umich.edu> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 16:17:40 -05:00
Olga Kornievskaia	68e76ad0ba	nfsd: pass client principal name in rsc downcall Two principals are involved in krb5 authentication: the target, who we authenticate to (normally the name of the server, like nfs/server.citi.umich.edu@CITI.UMICH.EDU), and the source, we we authenticate as (normally a user, like bfields@UMICH.EDU) In the case of NFSv4 callbacks, the target of the callback should be the source of the client's setclientid call, and the source should be the nfs server's own principal. Therefore we allow svcgssd to pass down the name of the principal that just authenticated, so that on setclientid we can store that principal name with the new client, to be used later on callbacks. Signed-off-by: Olga Kornievskaia <aglo@citi.umich.edu> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 16:17:15 -05:00
$\"J. Bruce Fields\$ \"J. Bruce Fields\	c381060869	rpc: add an rpc_pipe_open method We want to transition to a new gssd upcall which is text-based and more easily extensible. To simplify upgrades, as well as testing and debugging, it will help if we can upgrade gssd (to a version which understands the new upcall) without having to choose at boot (or module-load) time whether we want the new or the old upcall. We will do this by providing two different pipes: one named, as currently, after the mechanism (normally "krb5"), and supporting the old upcall. One named "gssd" and supporting the new upcall version. We allow gssd to indicate which version it supports by its choice of which pipe to open. As we have no interest in supporting simultaneous use of both versions, we'll forbid opening both pipes at the same time. So, add a new pipe_open callback to the rpc_pipefs api, which the gss code can use to track which pipes have been open, and to refuse opens of incompatible pipes. We only need this to be called on the first open of a given pipe. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 16:08:32 -05:00
Benny Halevy	c977a2ef40	sunrpc: get rid of rpc_rqst.rq_bufsize rq_bufsize is not used. Signed-off-by: Mike Sager <Mike.Sager@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 16:06:13 -05:00
Peter Staubach	64672d55d9	optimize attribute timeouts for "noac" and "actimeo=0" Hi. I've been looking at a bugzilla which describes a problem where a customer was advised to use either the "noac" or "actimeo=0" mount options to solve a consistency problem that they were seeing in the file attributes. It turned out that this solution did not work reliably for them because sometimes, the local attribute cache was believed to be valid and not timed out. (With an attribute cache timeout of 0, the cache should always appear to be timed out.) In looking at this situation, it appears to me that the problem is that the attribute cache timeout code has an off-by-one error in it. It is assuming that the cache is valid in the region, [read_cache_jiffies, read_cache_jiffies + attrtimeo]. The cache should be considered valid only in the region, [read_cache_jiffies, read_cache_jiffies + attrtimeo). With this change, the options, "noac" and "actimeo=0", work as originally expected. This problem was previously addressed by special casing the attrtimeo == 0 case. However, since the problem is only an off- by-one error, the cleaner solution is address the off-by-one error and thus, not require the special case. Thanx... ps Signed-off-by: Peter Staubach <staubach@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 15:21:56 -05:00
Trond Myklebust	dc0b027dfa	NFSv4: Convert the open and close ops to use fmode Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 15:21:56 -05:00
Trond Myklebust	bd7bf9d540	NFSv4: Convert delegation->type field to fmode_t Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 15:21:53 -05:00
Trond Myklebust	95d35cb4c4	NFSv4: Remove nfs_client->cl_sem Now that we're using the flags to indicate state that needs to be recovered, as well as having implemented proper refcounting and spinlocking on the state and open_owners, we can get rid of nfs_client->cl_sem. The only remaining case that was dubious was the file locking, and that case is now covered by the nfsi->rwsem. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 15:21:45 -05:00
Chuck Lever	0cb2659b81	NLM: allow lockd requests from an unprivileged port If the admin has specified the "noresvport" option for an NFS mount point, the kernel's NFS client uses an unprivileged source port for the main NFS transport. The kernel's lockd client should use an unprivileged port in this case as well. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 15:21:38 -05:00
Chuck Lever	d740351bf0	NFS: add "[no]resvport" mount option The standard default security setting for NFS is AUTH_SYS. An NFS client connects to NFS servers via a privileged source port and a fixed standard destination port (2049). The client sends raw uid and gid numbers to identify users making NFS requests, and the server assumes an appropriate authority on the client has vetted these values because the source port is privileged. On Linux, by default in-kernel RPC services use a privileged port in the range between 650 and 1023 to avoid using source ports of well- known IP services. Using such a small range limits the number of NFS mount points and the number of unique NFS servers to which a client can connect concurrently. An NFS client can use unprivileged source ports to expand the range of source port numbers, allowing more concurrent server connections and more NFS mount points. Servers must explicitly allow NFS connections from unprivileged ports for this to work. In the past, bumping the value of the sunrpc.max_resvport sysctl on the client would permit the NFS client to use unprivileged ports. Bumping this setting also changes the maximum port number used by other in-kernel RPC services, some of which still required a port number less than 1023. This is exacerbated by the way source port numbers are chosen by the Linux RPC client, which starts at the top of the range and works downwards. It means that bumping the maximum means all RPC services requesting a source port will likely get an unprivileged port instead of a privileged one. Changing this setting effects all NFS mount points on a client. A sysadmin could not selectively choose which mount points would use non-privileged ports and which could not. Lastly, this mechanism of expanding the limit on the number of NFS mount points was entirely undocumented. To address the need for the NFS client to use a large range of source ports without interfering with the activity of other in-kernel RPC services, we introduce a new NFS mount option. This option explicitly tells only the NFS client to use a non-privileged source port when communicating with the NFS server for one specific mount point. This new mount option is called "resvport," like the similar NFS mount option on FreeBSD and Mac OS X. A sister patch for nfs-utils will be submitted that documents this new option in nfs(5). The default setting for this new mount option requires the NFS client to use a privileged port, as before. Explicitly specifying the "noresvport" mount option allows the NFS client to use an unprivileged source port for this mount point when connecting to the NFS server port. This mount option is supported only for text-based NFS mounts. [ Sidebar: it is widely known that security mechanisms based on the use of privileged source ports are ineffective. However, the NFS client can combine the use of unprivileged ports with the use of secure authentication mechanisms, such as Kerberos. This allows a large number of connections and mount points while ensuring a useful level of security. Eventually we may change the default setting for this option depending on the security flavor used for the mount. For example, if the mount is using only AUTH_SYS, then the default setting will be "resvport;" if the mount is using a strong security flavor such as krb5, the default setting will be "noresvport." ] Signed-off-by: Chuck Lever <chuck.lever@oracle.com> [Trond.Myklebust@netapp.com: Fixed a bug whereby nfs4_init_client() was being called with incorrect arguments.] Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 15:21:37 -05:00
Chuck Lever	146ec944bb	NFS: Move declaration of nfs_mount() to fs/nfs/internal.h Clean up: The nfs_mount() function is not to be used outside of the NFS client. Move its public declaration to fs/nfs/internal.h. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 15:21:34 -05:00
Trond Myklebust	88a9fe8cae	SUNRPC: Remove the last remnant of the BKL... Somehow, this escaped the previous purge. There should be no need to keep any extra locks in the XDR callbacks. The NFS client XDR code only writes into private objects, whereas all reads of shared objects are confined to fields that do not change, such as filehandles... Ditto for lockd, the NFSv2/v3 client mount code, and rpcbind. The nfsd XDR code may require the BKL, but since it does a synchronous RPC call from a thread that already holds the lock, that issue is moot. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 15:21:31 -05:00
Ingo Molnar	bed4f13065	Merge branch 'x86/irq' into x86/core	2008-12-23 16:30:31 +01:00
Ingo Molnar	fa623d1b02	Merge branches 'x86/apic', 'x86/cleanups', 'x86/cpufeature', 'x86/crashdump', 'x86/debug', 'x86/defconfig', 'x86/detect-hyper', 'x86/doc', 'x86/dumpstack', 'x86/early-printk', 'x86/fpu', 'x86/idle', 'x86/io', 'x86/memory-corruption-check', 'x86/microcode', 'x86/mm', 'x86/mtrr', 'x86/nmi-watchdog', 'x86/pat2', 'x86/pci-ioapic-boot-irq-quirks', 'x86/ptrace', 'x86/quirks', 'x86/reboot', 'x86/setup-memory', 'x86/signal', 'x86/sparse-fixes', 'x86/time', 'x86/uv' and 'x86/xen' into x86/core	2008-12-23 16:27:23 +01:00
Ingo Molnar	bf8bd66d05	Merge branch 'x86/apic' into x86/irq Conflicts: arch/x86/kernel/apic.c	2008-12-23 16:24:15 +01:00
Neil Horman	908a7a16b8	net: Remove unused netdev arg from some NAPI interfaces. When the napi api was changed to separate its 1:1 binding to the net_device struct, the netif_rx_[prep\|schedule\|complete] api failed to remove the now vestigual net_device structure parameter. This patch cleans up that api by properly removing it.. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-12-22 20:43:12 -08:00
David Vrabel	a01777ecf2	uwb: remove unused include/linux/uwb/debug.h Signed-off-by: David Vrabel <david.vrabel@csr.com>	2008-12-22 18:30:29 +00:00
Yevgeny Petrilin	b8dd786f94	mlx4_core: Add support for multiple completion event vectors When using MSI-X mode, create a completion event queue for each CPU. Report the number of completion EQs in a new struct mlx4_caps member, num_comp_vectors, and extend the mlx4_cq_alloc() interface with a vector parameter so that consumers can specify which completion EQ should be used to report events for the CQ being created. Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-12-22 07:15:03 -08:00
Paul Mundt	209aa4fdc3	fb: SH-5 uses __raw I/O accessors now also, drop the special casing. Signed-off-by: Paul Mundt <lethal@linux-sh.org>	2008-12-22 18:44:05 +09:00
Lachlan McIlroy	27a0464a6c	[XFS] Fix merge conflict in fs/xfs/xfs_rename.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 Conflicts: fs/xfs/xfs_rename.c Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-12-22 17:34:26 +11:00
Don Skidmore	f4314e815e	net: add DCNA attribute to the BCN interface for DCB Adds the Backward Congestion Notification Address (BCNA) attribute to the Backward Congestion Notification (BCN) interface for Data Center Bridging (DCB), which was missing. Receive the BCNA attribute in the ixgbe driver. The BCNA attribute is for a switch to inform the endstation about the physical port identification in order to support BCN on aggregated links. Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com> Signed-off-by: Eric W Multanen <eric.w.multanen@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2008-12-21 20:10:29 -08:00
Lai Jiangshan	3ddeb912f4	ftrace: enable format arguments checking Impact: broaden gcc printf format checks for ftrace_printk() format arguments checking for ftrace_printk() is __printf(1, 2), not __printf(1, 0). Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-21 09:46:45 +01:00
Anton Vorontsov	749820928a	of/gpio: Implement of_gpio_count() This function is used to count how many GPIOs are specified for a device node. Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com> Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-12-21 14:21:14 +11:00
Markus Metzger	c5dee6177f	x86, bts: memory accounting Impact: move the BTS buffer accounting to the mlock bucket Add alloc_locked_buffer() and free_locked_buffer() functions to mm/mlock.c to kalloc a buffer and account the locked memory to current. Account the memory for the BTS buffer to the tracer. Signed-off-by: Markus Metzger <markus.t.metzger@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-20 09:15:47 +01:00
Markus Metzger	bf53de907d	x86, bts: add fork and exit handling Impact: introduce new ptrace facility Add arch_ptrace_untrace() function that is called when the tracer detaches (either voluntarily or when the tracing task dies); ptrace_disable() is only called on a voluntary detach. Add ptrace_fork() and arch_ptrace_fork(). They are called when a traced task is forked. Clear DS and BTS related fields on fork. Release DS resources and reclaim memory in ptrace_untrace(). This releases resources already when the tracing task dies. We used to do that when the traced task dies. Signed-off-by: Markus Metzger <markus.t.metzger@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-20 09:15:46 +01:00
venkatesh.pallipadi@intel.com	34801ba9bf	x86: PAT: move track untrack pfnmap stubs to asm-generic Impact: Cleanup and branch hints only. Move the track and untrack pfn stub routines from memory.c to asm-generic. Also add unlikely to pfnmap related calls in fork and exit path. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2008-12-19 15:40:30 -08:00
venkatesh.pallipadi@intel.com	982d789ab7	x86: PAT: remove follow_pfnmap_pte in favor of follow_phys Impact: Cleanup - removes a new function in favor of a recently modified older one. Replace follow_pfnmap_pte in pat code with follow_phys. follow_phys lso returns protection eliminating the need of pte_pgprot call. Using follow_phys also eliminates the need for pte_pa. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2008-12-19 15:40:30 -08:00
venkatesh.pallipadi@intel.com	d87fe6607c	x86: PAT: modify follow_phys to return phys_addr prot and return value Impact: Changes and globalizes an existing static interface. Follow_phys does similar things as follow_pfnmap_pte. Make a minor change to follow_phys so that it can be used in place of follow_pfnmap_pte. Physical address return value with 0 as error return does not work in follow_phys as the actual physical address 0 mapping may exist in pte. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2008-12-19 15:40:30 -08:00
venkatesh.pallipadi@intel.com	6bd9cd50c8	x86: PAT: clarify is_linear_pfn_mapping() interface Impact: Documentation only Incremental patches to address the review comments from Nick Piggin for v3 version of x86 PAT pfnmap changes patchset here http://lkml.indiana.edu/hypermail/linux/kernel/0812.2/01330.html This patch: Clarify is_linear_pfn_mapping() and its usage. It is used by x86 PAT code for performance reasons. Identifying pfnmap as linear over entire vma helps speedup reserve and free of memtype for the region. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2008-12-19 15:40:30 -08:00
James Morris	12204e24b1	security: pass mount flags to security_sb_kern_mount() Pass mount flags to security_sb_kern_mount(), so security modules can determine if a mount operation is being performed by the kernel. Signed-off-by: James Morris <jmorris@namei.org> Acked-by: Stephen Smalley <sds@tycho.nsa.gov>	2008-12-20 09:02:39 +11:00
Sujith	094d05dc32	mac80211: Fix HT channel selection HT management is done differently for AP and STA modes, unify to just the ->config() callback since HT is fundamentally a PHY property and cannot be per-BSS. Rename enum nl80211_sec_chan_offset as nl80211_channel_type to denote the channel type ( NO_HT, HT20, HT40+, HT40- ). Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Sujith <Sujith.Manoharan@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2008-12-19 15:22:54 -05:00
Henning Rogge	420e7fabd9	nl80211: Add signal strength and bandwith to nl80211station info This patch adds signal strength and transmission bitrate to the station_info of nl80211. Signed-off-by: Henning Rogge <rogge@fgan.de> Acked-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2008-12-19 15:04:54 -05:00
Ingo Molnar	30cd324e97	Merge branches 'tracing/ftrace', 'tracing/ring-buffer' and 'tracing/urgent' into tracing/core Conflicts: include/linux/ftrace.h	2008-12-19 09:42:40 +01:00
Ingo Molnar	06aaf76a7e	sched: move test_sd_parent() to an SMP section of sched.h Impact: build fix Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-19 09:21:56 +01:00
Vaidyanathan Srinivasan	100fdaee70	sched: add SD_BALANCE_NEWIDLE at MC and CPU level for sched_mc>0 Impact: change task balancing to save power more agressively Add SD_BALANCE_NEWIDLE flag at MC level and CPU level if sched_mc is set. This helps power savings and will not affect performance when sched_mc=0 Ingo and Mike Galbraith have optimised the SD flags by removing SD_BALANCE_NEWIDLE at MC and CPU level. This helps performance but hurts power savings since this slows down task consolidation by reducing the number of times load_balance is run. sched: fine-tune SD_MC_INIT commit `1480098470` Author: Mike Galbraith <efault@gmx.de> Date: Fri Nov 7 15:26:50 2008 +0100 sched: re-tune balancing -- revert commit `9fcd18c9e6` Author: Ingo Molnar <mingo@elte.hu> Date: Wed Nov 5 16:52:08 2008 +0100 This patch selectively enables SD_BALANCE_NEWIDLE flag only when sched_mc is set to 1 or 2. This helps power savings by task consolidation and also does not hurt performance at sched_mc=0 where all power saving optimisations are turned off. Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-19 09:21:55 +01:00
Gautham R Shenoy	afb8a9b70b	sched: framework for sched_mc/smt_power_savings=N Impact: extend range of /sys/devices/system/cpu/sched_mc_power_savings Currently the sched_mc/smt_power_savings variable is a boolean, which either enables or disables topology based power savings. This patch extends the behaviour of the variable from boolean to multivalued, such that based on the value, we decide how aggressively do we want to perform powersavings balance at appropriate sched domain based on topology. Variable levels of power saving tunable would benefit end user to match the required level of power savings vs performance trade-off depending on the system configuration and workloads. This version makes the sched_mc_power_savings global variable to take more values (0,1,2). Later versions can have a single tunable called sched_power_savings instead of sched_{mc,smt}_power_savings. Signed-off-by: Gautham R Shenoy <ego@in.ibm.com> Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-19 09:21:46 +01:00
Vaidyanathan Srinivasan	716707b299	sched: convert BALANCE_FOR_xx_POWER to inline functions Impact: cleanup BALANCE_FOR_MC_POWER and similar macros defined in sched.h are not constants and have various condition checks and significant amount of code that is not suitable to be contain in a macro. Also there could be side effects on the expressions passed to some of them like test_sd_parent(). This patch converts all complex macros related to power savings balance to inline functions. Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-19 09:21:45 +01:00
Takashi Iwai	0ff555192a	Merge branch 'fix/hda' into topic/hda	2008-12-19 08:22:57 +01:00
Mike Travis	e057d7aea9	cpumask: add sysfs displays for configured and disabled cpu maps Impact: add new sysfs files. Add sysfs files "kernel_max" and "offline" to display the max CPU index allowed (NR_CPUS-1), and the map of cpus that are offline. Cpus can be offlined via HOTPLUG, disabled by the BIOS ACPI tables, or if they exceed the number of cpus allowed by the NR_CPUS config option, or the "maxcpus=NUM" kernel start parameter. The "possible_cpus=NUM" parameter can also extend the number of possible cpus allowed, in which case the cpus not present at startup will be in the offline state. (These cpus can be HOTPLUGGED ON after system startup [pending a follow-on patch to provide the capability via the /sys/devices/sys/cpu/cpuN/online mechanism to bring them online.]) By design, the "offlined cpus > possible cpus" display will always use the following formats: * all possible cpus online: "x$" or "x-y$" * some possible cpus offline: ".,x$" or ".,x-y$" where: x == number of possible cpus (nr_cpu_ids); and y == number of cpus >= NR_CPUS or maxcpus (if y > x). One use of this feature is for distros to select (or configure) the appropriate kernel to install for the resident system. Notes: * cpus offlined <= possible cpus will be printed for all architectures. * cpus offlined > possible cpus will only be printed for arches that set 'total_cpus' [X86 only in this patch]. Based on tip/cpus4096 + .../rusty/linux-2.6-for-ingo.git/master + x86-only-patches sent 12/15. Signed-off-by: Mike Travis <travis@sgi.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2008-12-19 17:47:04 +10:30
Mike Travis	7b4967c532	cpumask: Add alloc_cpumask_var_node() Impact: New API This will be needed in x86 code to allocate the domain and old_domain cpumasks on the same node as where the containing irq_cfg struct is allocated. (Also fixes double-dump_stack on rare CONFIG_DEBUG_PER_CPU_MAPS case) Signed-off-by: Mike Travis <travis@sgi.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (re-impl alloc_cpumask_var)	2008-12-19 16:56:37 +10:30
Yinghai Lu	078a55db07	sparseirq: add kernel-doc notation for new member in irq_desc, -v2 Signed-off-by: Yinghai Lu <yinghai@kernel.org> Acked-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-19 02:06:53 +01:00
Russell King	fdb0a1a67e	Merge branch 'next-merged' of git://aeryn.fluff.org.uk/bjdooks/linux into devel	2008-12-18 22:15:30 +00:00
venkatesh.pallipadi@intel.com	2ab640379a	x86: PAT: hooks in generic vm code to help archs to track pfnmap regions - v3 Impact: Introduces new hooks, which are currently null. Introduce generic hooks in remap_pfn_range and vm_insert_pfn and corresponding copy and free routines with reserve and free tracking. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2008-12-18 13:30:15 -08:00
venkatesh.pallipadi@intel.com	e121e41844	x86: PAT: add follow_pfnmp_pte routine to help tracking pfnmap pages - v3 Impact: New currently unused interface. Add a generic interface to follow pfn in a pfnmap vma range. This is used by one of the subsequent x86 PAT related patch to keep track of memory types for vma regions across vma copy and free. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2008-12-18 13:30:15 -08:00
venkatesh.pallipadi@intel.com	3c8bb73ace	x86: PAT: store vm_pgoff for all linear_over_vma_region mappings - v3 Impact: Code transformation, new functions added should have no effect. Drivers use mmap followed by pgprot_* and remap_pfn_range or vm_insert_pfn, in order to export reserved memory to userspace. Currently, such mappings are not tracked and hence not kept consistent with other mappings (/dev/mem, pci resource, ioremap) for the sme memory, that may exist in the system. The following patchset adds x86 PAT attribute tracking and untracking for pfnmap related APIs. First three patches in the patchset are changing the generic mm code to fit in this tracking. Last four patches are x86 specific to make things work with x86 PAT code. The patchset aso introduces pgprot_writecombine interface, which gives writecombine mapping when enabled, falling back to pgprot_noncached otherwise. This patch: While working on x86 PAT, we faced some hurdles with trackking remap_pfn_range() regions, as we do not have any information to say whether that PFNMAP mapping is linear for the entire vma range or it is smaller granularity regions within the vma. A simple solution to this is to use vm_pgoff as an indicator for linear mapping over the vma region. Currently, remap_pfn_range only sets vm_pgoff for COW mappings. Below patch changes the logic and sets the vm_pgoff irrespective of COW. This will still not be enough for the case where pfn is zero (vma region mapped to physical address zero). But, for all the other cases, we can look at pfnmap VMAs and say whether the mappng is for the entire vma region or not. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2008-12-18 13:30:15 -08:00
Paul E. McKenney	64db4cfff9	"Tree RCU": scalable classic RCU implementation This patch fixes a long-standing performance bug in classic RCU that results in massive internal-to-RCU lock contention on systems with more than a few hundred CPUs. Although this patch creates a separate flavor of RCU for ease of review and patch maintenance, it is intended to replace classic RCU. This patch still handles stress better than does mainline, so I am still calling it ready for inclusion. This patch is against the -tip tree. Nevertheless, experience on an actual 1000+ CPU machine would still be most welcome. Most of the changes noted below were found while creating an rcutiny (which should permit ejecting the current rcuclassic) and while doing detailed line-by-line documentation. Updates from v9 (http://lkml.org/lkml/2008/12/2/334): o Fixes from remainder of line-by-line code walkthrough, including comment spelling, initialization, undesirable narrowing due to type conversion, removing redundant memory barriers, removing redundant local-variable initialization, and removing redundant local variables. I do not believe that any of these fixes address the CPU-hotplug issues that Andi Kleen was seeing, but please do give it a whirl in case the machine is smarter than I am. A writeup from the walkthrough may be found at the following URL, in case you are suffering from terminal insomnia or masochism: http://www.kernel.org/pub/linux/kernel/people/paulmck/tmp/rcutree-walkthrough.2008.12.16a.pdf o Made rcutree tracing use seq_file, as suggested some time ago by Lai Jiangshan. o Added a .csv variant of the rcudata debugfs trace file, to allow people having thousands of CPUs to drop the data into a spreadsheet. Tested with oocalc and gnumeric. Updated documentation to suit. Updates from v8 (http://lkml.org/lkml/2008/11/15/139): o Fix a theoretical race between grace-period initialization and force_quiescent_state() that could occur if more than three jiffies were required to carry out the grace-period initialization. Which it might, if you had enough CPUs. o Apply Ingo's printk-standardization patch. o Substitute local variables for repeated accesses to global variables. o Fix comment misspellings and redundant (but harmless) increments of ->n_rcu_pending (this latter after having explicitly added it). o Apply checkpatch fixes. Updates from v7 (http://lkml.org/lkml/2008/10/10/291): o Fixed a number of problems noted by Gautham Shenoy, including the cpu-stall-detection bug that he was having difficulty convincing me was real. ;-) o Changed cpu-stall detection to wait for ten seconds rather than three in order to reduce false positive, as suggested by Ingo Molnar. o Produced a design document (http://lwn.net/Articles/305782/). The act of writing this document uncovered a number of both theoretical and "here and now" bugs as noted below. o Fix dynticks_nesting accounting confusion, simplify WARN_ON() condition, fix kerneldoc comments, and add memory barriers in dynticks interface functions. o Add more data to tracing. o Remove unused "rcu_barrier" field from rcu_data structure. o Count calls to rcu_pending() from scheduling-clock interrupt to use as a surrogate timebase should jiffies stop counting. o Fix a theoretical race between force_quiescent_state() and grace-period initialization. Yes, initialization does have to go on for some jiffies for this race to occur, but given enough CPUs... Updates from v6 (http://lkml.org/lkml/2008/9/23/448): o Fix a number of checkpatch.pl complaints. o Apply review comments from Ingo Molnar and Lai Jiangshan on the stall-detection code. o Fix several bugs in !CONFIG_SMP builds. o Fix a misspelled config-parameter name so that RCU now announces at boot time if stall detection is configured. o Run tests on numerous combinations of configurations parameters, which after the fixes above, now build and run correctly. Updates from v5 (http://lkml.org/lkml/2008/9/15/92, bad subject line): o Fix a compiler error in the !CONFIG_FANOUT_EXACT case (blew a changeset some time ago, and finally got around to retesting this option). o Fix some tracing bugs in rcupreempt that caused incorrect totals to be printed. o I now test with a more brutal random-selection online/offline script (attached). Probably more brutal than it needs to be on the people reading it as well, but so it goes. o A number of optimizations and usability improvements: o Make rcu_pending() ignore the grace-period timeout when there is no grace period in progress. o Make force_quiescent_state() avoid going for a global lock in the case where there is no grace period in progress. o Rearrange struct fields to improve struct layout. o Make call_rcu() initiate a grace period if RCU was idle, rather than waiting for the next scheduling clock interrupt. o Invoke rcu_irq_enter() and rcu_irq_exit() only when idle, as suggested by Andi Kleen. I still don't completely trust this change, and might back it out. o Make CONFIG_RCU_TRACE be the single config variable manipulated for all forms of RCU, instead of the prior confusion. o Document tracing files and formats for both rcupreempt and rcutree. Updates from v4 for those missing v5 given its bad subject line: o Separated dynticks interface so that NMIs and irqs call separate functions, greatly simplifying it. In particular, this code no longer requires a proof of correctness. ;-) o Separated dynticks state out into its own per-CPU structure, avoiding the duplicated accounting. o The case where a dynticks-idle CPU runs an irq handler that invokes call_rcu() is now correctly handled, forcing that CPU out of dynticks-idle mode. o Review comments have been applied (thank you all!!!). For but one example, fixed the dynticks-ordering issue that Manfred pointed out, saving me much debugging. ;-) o Adjusted rcuclassic and rcupreempt to handle dynticks changes. Attached is an updated patch to Classic RCU that applies a hierarchy, greatly reducing the contention on the top-level lock for large machines. This passes 10-hour concurrent rcutorture and online-offline testing on 128-CPU ppc64 without dynticks enabled, and exposes some timekeeping bugs in presence of dynticks (exciting working on a system where "sleep 1" hangs until interrupted...), which were fixed in the 2.6.27 kernel. It is getting more reliable than mainline by some measures, so the next version will be against -tip for inclusion. See also Manfred Spraul's recent patches (or his earlier work from 2004 at http://marc.info/?l=linux-kernel&m=108546384711797&w=2). We will converge onto a common patch in the fullness of time, but are currently exploring different regions of the design space. That said, I have already gratefully stolen quite a few of Manfred's ideas. This patch provides CONFIG_RCU_FANOUT, which controls the bushiness of the RCU hierarchy. Defaults to 32 on 32-bit machines and 64 on 64-bit machines. If CONFIG_NR_CPUS is less than CONFIG_RCU_FANOUT, there is no hierarchy. By default, the RCU initialization code will adjust CONFIG_RCU_FANOUT to balance the hierarchy, so strongly NUMA architectures may choose to set CONFIG_RCU_FANOUT_EXACT to disable this balancing, allowing the hierarchy to be exactly aligned to the underlying hardware. Up to two levels of hierarchy are permitted (in addition to the root node), allowing up to 16,384 CPUs on 32-bit systems and up to 262,144 CPUs on 64-bit systems. I just know that I am going to regret saying this, but this seems more than sufficient for the foreseeable future. (Some architectures might wish to set CONFIG_RCU_FANOUT=4, which would limit such architectures to 64 CPUs. If this becomes a real problem, additional levels can be added, but I doubt that it will make a significant difference on real hardware.) In the common case, a given CPU will manipulate its private rcu_data structure and the rcu_node structure that it shares with its immediate neighbors. This can reduce both lock and memory contention by multiple orders of magnitude, which should eliminate the need for the strange manipulations that are reported to be required when running Linux on very large systems. Some shortcomings: o More bugs will probably surface as a result of an ongoing line-by-line code inspection. Patches will be provided as required. o There are probably hangs, rcutorture failures, &c. Seems quite stable on a 128-CPU machine, but that is kind of small compared to 4096 CPUs. However, seems to do better than mainline. Patches will be provided as required. o The memory footprint of this version is several KB larger than rcuclassic. A separate UP-only rcutiny patch will be provided, which will reduce the memory footprint significantly, even compared to the old rcuclassic. One such patch passes light testing, and has a memory footprint smaller even than rcuclassic. Initial reaction from various embedded guys was "it is not worth it", so am putting it aside. Credits: o Manfred Spraul for ideas, review comments, and bugs spotted, as well as some good friendly competition. ;-) o Josh Triplett, Ingo Molnar, Peter Zijlstra, Mathieu Desnoyers, Lai Jiangshan, Andi Kleen, Andy Whitcroft, and Andrew Morton for reviews and comments. o Thomas Gleixner for much-needed help with some timer issues (see patches below). o Jon M. Tollefson, Tim Pepper, Andrew Theurer, Jose R. Santos, Andy Whitcroft, Darrick Wong, Nishanth Aravamudan, Anton Blanchard, Dave Kleikamp, and Nathan Lynch for keeping machines alive despite my heavy abuse^Wtesting. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-18 21:56:04 +01:00
Ingo Molnar	d110ec3a1e	Merge branch 'linus' into core/rcu	2008-12-18 21:54:49 +01:00
Linus Torvalds	b3806c3b94	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: bnx2: Fix bug in bnx2_free_rx_mem(). irda: Add irda_skb_cb qdisc related padding jme: Fixed a typo net: kernel BUG at drivers/net/phy/mdio_bus.c:165! drivers/net: starfire: Fix napi ->poll() weight handling tlan: Fix pci memory unmapping enc28j60: use netif_rx_ni() to deliver RX packets tlan: Fix small (< 64 bytes) datagram transmissions netfilter: ctnetlink: fix missing CTA_NAT_SEQ_UNSPEC	2008-12-18 12:00:46 -08:00
Mark Brown	40aa4a30d0	ASoC: Add WM8350 AudioPlus codec driver The WM8350 is an integrated audio and power management subsystem which provides a single-chip solution for portable audio and multimedia systems. The integrated audio CODEC provides all the necessary functions for high-quality stereo recording and playback. Programmable on-chip amplifiers allow for the direct connection of headphones and microphones with a minimum of external components. A programmable low-noise bias voltage is available to feed one or more electret microphones. Additional audio features include programmable high-pass filter in the ADC input path. This driver was originally written by Liam Girdwood with further updates from me. Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>	2008-12-18 17:21:07 +00:00
KOSAKI Motohiro	74c8a61304	locking, irq: enclose irq_desc_lock_class in CONFIG_LOCKDEP Impact: simplify code commit "08678b0: generic: sparse irqs: use irq_desc() [...]" introduced the irq_desc_lock_class variable. But it is used only if CONFIG_SPARSE_IRQ=Y or CONFIG_TRACE_IRQFLAGS=Y. Otherwise, following warnings happen: CC kernel/irq/handle.o kernel/irq/handle.c:26: warning: 'irq_desc_lock_class' defined but not used Actually, current early_init_irq_lock_class has a bit strange and messy ifdef. In addition, it is not valueable. 1. this function is protected by !CONFIG_SPARSE_IRQ, but that is not necessary. if CONFIG_SPARSE_IRQ=Y, desc of all irq number are initialized by NULL at first - then this function calling is safe. 2. this function protected by CONFIG_TRACE_IRQFLAGS too. but it is not necessary either, because lockdep_set_class() doesn't have bad side effect even if CONFIG_TRACE_IRQFLAGS=n. This patch bloat kernel size a bit on CONFIG_TRACE_IRQFLAGS=n and CONFIG_SPARSE_IRQ=Y - but that's ok. early_init_irq_lock_class() is not a fastpatch at all. To avoid messy ifdefs is more important than a few bytes diet. Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-18 14:35:53 +01:00
Ken Chen	9c2c48020e	schedstat: consolidate per-task cpu runtime stats Impact: simplify code When we turn on CONFIG_SCHEDSTATS, per-task cpu runtime is accumulated twice. Once in task->se.sum_exec_runtime and once in sched_info.cpu_time. These two stats are exactly the same. Given that task->se.sum_exec_runtime is always accumulated by the core scheduler, sched_info can reuse that data instead of duplicate the accounting. Signed-off-by: Ken Chen <kenchen@google.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-18 13:54:01 +01:00
Steven Rostedt	f38f1d2aa5	trace: add a way to enable or disable the stack tracer Impact: enhancement to stack tracer The stack tracer currently is either on when configured in or off when it is not. It can not be disabled when it is configured on. (besides disabling the function tracer that it uses) This patch adds a way to enable or disable the stack tracer at run time. It defaults off on bootup, but a kernel parameter 'stacktrace' has been added to enable it on bootup. A new sysctl has been added "kernel.stack_tracer_enabled" to let the user enable or disable the stack tracer at run time. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-18 12:56:24 +01:00
Ingo Molnar	b9974dc6bd	Merge branch 'linus' into cpus4096	2008-12-18 11:48:30 +01:00
Paul Mackerras	c280266a32	Merge branch 'linux-2.6' into next	2008-12-18 11:06:12 +11:00
Rémi Denis-Courmont	57c81fffc8	Phonet: allocate separate ARP type for GPRS over a Phonet pipe A separate xmit lock class supports GPRS over a Phonet pipe over a TUN device (type ARPHRD_NONE). Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-12-17 15:47:48 -08:00
Rémi Denis-Courmont	2d91d78b68	Phonet: allocate a non-Ethernet ARP type Also leave some room for more 802.11 types. Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-12-17 15:47:29 -08:00
Phil Endecott	9a9fafb894	USB: fix comment about endianness of descriptors This patch fixes a comment and clarifies the documentation about the endianness of descriptors. The current policy is that descriptors will be little-endian at the API even on big-endian systems; however the /proc/bus/usb API predates this policy and presents descriptors with some multibyte fields byte-swapped. Signed-off-by: Phil Endecott <usb_endian_patch@chezphil.org> Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-12-17 10:49:14 -08:00
Ian Campbell	b81ea27b23	swiotlb: add arch hook to force mapping Impact: generalize the sw-IOTLB range checks Some architectures require special rules to determine whether a range needs mapping or not. This adds a weak function for architectures to override. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-17 18:58:13 +01:00
Ian Campbell	e08e1f7adb	swiotlb: allow architectures to override phys<->bus<->phys conversions Impact: generalize phys<->bus<->phys conversions in the swiotlb code Architectures may need to override these conversions. Implement a __weak hook point containing the default implementation. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-17 18:58:09 +01:00
Ingo Molnar	855caa37b9	Merge branch 'x86/crashdump' into cpus4096 Conflicts: arch/x86/kernel/crash.c Merged for semantic conflict: arch/x86/kernel/reboot.c	2008-12-17 13:24:52 +01:00
Ingo Molnar	948a7b2b5e	Merge branch 'irq/sparseirq' into cpus4096 Conflicts: arch/x86/kernel/io_apic.c Merge irq/sparseirq here, to resolve conflicts.	2008-12-17 13:16:08 +01:00
Ingo Molnar	1f3f424a6b	Merge branch 'linus' into cpus4096	2008-12-17 13:07:48 +01:00
Andy Fleming	b31a1d8b41	gianfar: Convert gianfar to an of_platform_driver Does the same for the accompanying MDIO driver, and then modifies the TBI configuration method. The old way used fields in einfo, which no longer exists. The new way is to create an MDIO device-tree node for each instance of gianfar, and create a tbi-handle property to associate ethernet controllers with the TBI PHYs they are connected to. Signed-off-by: Andy Fleming <afleming@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-12-16 15:29:15 -08:00
David S. Miller	354ade9058	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/enc28j60.c	2008-12-16 15:23:54 -08:00
Yinghai Lu	48a1b10aff	x86, sparseirq: move irq_desc according to smp_affinity, v7 Impact: improve NUMA handling by migrating irq_desc on smp_affinity changes if CONFIG_NUMA_MIGRATE_IRQ_DESC is set: - make irq_desc to go with affinity aka irq_desc moving etc - call move_irq_desc in irq_complete_move() - legacy irq_desc is not moved, because they are allocated via static array for logical apic mode, need to add move_desc_in_progress_in_same_domain, otherwise it will not be moved ==> also could need two phases to get irq_desc moved. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-17 00:14:01 +01:00
Ian Campbell	0016fdee92	swiotlb: move some definitions to header Impact: cleanup Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-16 21:31:40 +01:00
Jeremy Fitzhardinge	8c5df16bec	swiotlb: allow architectures to override swiotlb pool allocation Impact: generalize swiotlb allocation code Architectures may need to allocate memory specially for use with the swiotlb. Create the weak function swiotlb_alloc_boot() and swiotlb_alloc() defaulting to the current behaviour. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-16 21:31:38 +01:00
Ingo Molnar	9dfc3bc7d2	Merge branches 'tracing/fastboot', 'tracing/ftrace', 'tracing/function-graph-tracer' and 'tracing/hw-branch-tracing' into tracing/core	2008-12-16 12:03:38 +01:00
Yang Hongyang	b24a2516d1	ipv6: Add IPV6_PKTINFO sticky option support to setsockopt() There are three reasons for me to add this support: 1.When no interface is specified in an IPV6_PKTINFO ancillary data item, the interface specified in an IPV6_PKTINFO sticky optionis is used. RFC3542: 6.7. Summary of Outgoing Interface Selection This document and [RFC-3493] specify various methods that affect the selection of the packet's outgoing interface. This subsection summarizes the ordering among those in order to ensure deterministic behavior. For a given outgoing packet on a given socket, the outgoing interface is determined in the following order: 1. if an interface is specified in an IPV6_PKTINFO ancillary data item, the interface is used. 2. otherwise, if an interface is specified in an IPV6_PKTINFO sticky option, the interface is used. 2.When no IPV6_PKTINFO ancillary data is received,getsockopt() should return the sticky option value which set with setsockopt(). RFC 3542: Issuing getsockopt() for the above options will return the sticky option value i.e., the value set with setsockopt(). If no sticky option value has been set getsockopt() will return the following values: 3.Make the setsockopt implementation POSIX compliant. Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-12-16 02:06:23 -08:00
Steve Glendinning	bc02ff95fe	net: Refactor full duplex flow control resolution These 4 drivers have identical full duplex flow control resolution functions. This patch changes them all to use one common function. The function in question decides whether a device should enable TX and RX flow control in a standard way (IEEE 802.3-2005 table 28B-3), so this should also be useful for other drivers. Signed-off-by: Steve Glendinning <steve.glendinning@smsc.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-12-16 02:00:48 -08:00
Steve Glendinning	e18ce34654	net: Move flow control definitions to mii.h flags used within drivers for indicating tx and rx flow control are defined in 4 drivers (and probably more), move these constants to mii.h. The 3 SMSC drivers use the same constants (FLOW_CTRL_TX), but TG3 uses TG3_FLOW_CTRL_TX, so this patch also renames the constants within TG3. Signed-off-by: Steve Glendinning <steve.glendinning@smsc.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-12-16 02:00:00 -08:00
Pablo Neira Ayuso	092cab7e2c	netfilter: ctnetlink: fix missing CTA_NAT_SEQ_UNSPEC This patch fixes an inconsistency in nfnetlink_conntrack.h that I introduced myself. The problem is that CTA_NAT_SEQ_UNSPEC is missing from enum ctattr_natseq. This inconsistency may lead to problems in the message parsing in userspace (if the message contains the CTA_NAT_SEQ_* attributes, of course). This patch breaks backward compatibility, however, the only known client of this code is libnetfilter_conntrack which indeed crashes because it assumes the existence of CTA_NAT_SEQ_UNSPEC to do the parsing. The CTA_NAT_SEQ_* attributes were introduced in 2.6.25. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-12-16 01:19:41 -08:00
Herbert Xu	b240a0e564	ethtool: Add GGRO and SGRO ops This patch adds the ethtool ops to enable and disable GRO. It also makes GRO depend on RX checksum offload much the same as how TSO depends on SG support. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-12-15 23:44:31 -08:00
Herbert Xu	71d93b39e5	net: Add skb_gro_receive This patch adds the helper skb_gro_receive to merge packets for GRO. The current method is to allocate a new header skb and then chain the original packets to its frag_list. This is done to make it easier to integrate into the existing GSO framework. In future as GSO is moved into the drivers, we can undo this and simply chain the original packets together. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-12-15 23:42:33 -08:00
Herbert Xu	d565b0a1a9	net: Add Generic Receive Offload infrastructure This patch adds the top-level GRO (Generic Receive Offload) infrastructure. This is pretty similar to LRO except that this is protocol-independent. Instead of holding packets in an lro_mgr structure, they're now held in napi_struct. For drivers that intend to use this, they can set the NETIF_F_GRO bit and call napi_gro_receive instead of netif_receive_skb or just call netif_rx. The latter will call napi_receive_skb automatically. When napi_gro_receive is used, the driver must either call napi_complete/napi_rx_complete, or call napi_gro_flush in softirq context if the driver uses the primitives __napi_complete/__napi_rx_complete. Protocols will set the gro_receive and gro_complete function pointers in order to participate in this scheme. In addition to the packet, gro_receive will get a list of currently held packets. Each packet in the list has a same_flow field which is non-zero if it is a potential match for the new packet. For each packet that may match, they also have a flush field which is non-zero if the held packet must not be merged with the new packet. Once gro_receive has determined that the new skb matches a held packet, the held packet may be processed immediately if the new skb cannot be merged with it. In this case gro_receive should return the pointer to the existing skb in gro_list. Otherwise the new skb should be merged into the existing packet and NULL should be returned, unless the new skb makes it impossible for any further merges to be made (e.g., FIN packet) where the merged skb should be returned. Whenever the skb is merged into an existing entry, the gro_receive function should set NAPI_GRO_CB(skb)->same_flow. Note that if an skb merely matches an existing entry but can't be merged with it, then this shouldn't be set. If gro_receive finds it pointless to hold the new skb for future merging, it should set NAPI_GRO_CB(skb)->flush. Held packets will be flushed by napi_gro_flush which is called by napi_complete and napi_rx_complete. Currently held packets are stored in a singly liked list just like LRO. The list is limited to a maximum of 8 entries. In future, this may be expanded to use a hash table to allow more flows to be held for merging. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-12-15 23:38:52 -08:00
Herbert Xu	1a881f27c5	net: Add frag_list support to GSO This patch allows GSO to handle frag_list in a limited way for the purposes of allowing packets merged by GRO to be refragmented on output. Most hardware won't (and aren't expected to) support handling GRO frag_list packets directly. Therefore we will perform GSO in software for those cases. However, for drivers that can support it (such as virtual NICs) we may not have to segment the packets at all. Whether the added overhead of GRO/GSO is worthwhile for bridges and routers when weighed against the benefit of potentially increasing the MTU within the host is still an open question. However, for the case of host nodes this is undoubtedly a win. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-12-15 23:27:47 -08:00
Kay Sievers	b53c7583e2	rapidio: struct device - replace bus_id with dev_name(), dev_set_name() Cc: Matt Porter <mporter@kernel.crashing.org> Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-12-16 15:53:41 +11:00
David S. Miller	eb14f01959	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/e1000e/ich8lan.c	2008-12-15 20:03:50 -08:00
Paul Mackerras	1e1c568d6c	Merge branch 'merge' into next	2008-12-16 14:38:58 +11:00
Linus Torvalds	7004405cb8	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: Phonet: keep TX queue disabled when the device is off SCHED: netem: Correct documentation comment in code. netfilter: update rwlock initialization for nat_table netlabel: Compiler warning and NULL pointer dereference fix e1000e: fix double release of mutex IA64: HP_SIMETH needs to depend upon NET netpoll: fix race on poll_list resulting in garbage entry ipv6: silence log messages for locally generated multicast sungem: improve ethtool output with internal pcs and serdes tcp: tcp_vegas cong avoid fix sungem: Make PCS PHY support partially work again.	2008-12-15 16:30:22 -08:00
Rusty Russell	d2ff911882	Define smp_call_function_many for UP Otherwise those using it in transition patches (eg. kvm) can't compile with CONFIG_SMP=n: arch/x86/kvm/../../../virt/kvm/kvm_main.c: In function 'make_all_cpus_request': arch/x86/kvm/../../../virt/kvm/kvm_main.c:380: error: implicit declaration of function 'smp_call_function_many' Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-12-15 16:28:57 -08:00
Ben Dooks	b690ace50b	[ARM] S3C6400: serial support for S3C6400 and S3C6410 SoCs Add support to the Samsung serial driver for the S3C6400 and S3C6410 serial ports. Signed-off-by: Ben Dooks <ben-linux@fluff.org>	2008-12-15 21:58:11 +00:00
Rusty Russell	968ea6d80e	Merge ../linux-2.6-x86 Conflicts: arch/x86/kernel/io_apic.c kernel/sched.c kernel/sched_stats.h	2008-12-13 21:55:51 +10:30
Rusty Russell	7be7585393	cpumask: Use all NR_CPUS bits unless CONFIG_CPUMASK_OFFSTACK Impact: futureproof as we convert more code to new APIs The old cpumask operators treat all NR_CPUS bits as relevent, the new ones use nr_cpumask_bits. For large NR_CPUS and small nr_cpu_ids, this makes a difference. However, mixing the two can cause problems with undefined bits. An arch which sets CONFIG_CPUMASK_OFFSTACK should have converted across to the new operators, so it's safe in that case. (Thanks to Stephen Rothwell for bisecting the initial unused-bits bug, and Mike Travis for this solution). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Cc: Mike Travis <travis@sgi.com>	2008-12-13 21:20:28 +10:30
Rusty Russell	320ab2b0b1	cpumask: convert struct clock_event_device to cpumask pointers. Impact: change calling convention of existing clock_event APIs struct clock_event_timer's cpumask field gets changed to take pointer, as does the ->broadcast function. Another single-patch change. For safety, we BUG_ON() in clockevents_register_device() if it's not set. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Cc: Ingo Molnar <mingo@elte.hu>	2008-12-13 21:20:26 +10:30
Rusty Russell	0de26520c7	cpumask: make irq_set_affinity() take a const struct cpumask Impact: change existing irq_chip API Not much point with gentle transition here: the struct irq_chip's setaffinity method signature needs to change. Fortunately, not widely used code, but hits a few architectures. Note: In irq_select_affinity() I save a temporary in by mangling irq_desc[irq].affinity directly. Ingo, does this break anything? (Folded in fix from KOSAKI Motohiro) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Mike Travis <travis@sgi.com> Reviewed-by: Grant Grundler <grundler@parisc-linux.org> Acked-by: Ingo Molnar <mingo@redhat.com> Cc: ralf@linux-mips.org Cc: grundler@parisc-linux.org Cc: jeremy@xensource.com Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>	2008-12-13 21:20:26 +10:30
Rusty Russell	29c0177e6a	cpumask: change cpumask_scnprintf, cpumask_parse_user, cpulist_parse, and cpulist_scnprintf to take pointers. Impact: change calling convention of existing cpumask APIs Most cpumask functions started with cpus_: these have been replaced by cpumask_ ones which take struct cpumask pointers as expected. These four functions don't have good replacement names; fortunately they're rarely used, so we just change them over. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Mike Travis <travis@sgi.com> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: paulus@samba.org Cc: mingo@redhat.com Cc: tony.luck@intel.com Cc: ralf@linux-mips.org Cc: Greg Kroah-Hartman <gregkh@suse.de> Cc: cl@linux-foundation.org Cc: srostedt@redhat.com	2008-12-13 21:20:25 +10:30
Johannes Berg	4dec9b807b	rfkill: strip pointless notifier chain No users, so no reason to have it. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Acked-by: Ivo van Doorn <IvDoorn@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2008-12-12 14:45:25 -05:00
Senthil Balasubramanian	bb608e9db7	wireless: Incorrect LEAP authentication algorithm identifier. This patch fixes a regression introduced by "wireless: avoid some net/ieee80211.h vs. linux/ieee80211.h conflicts" LEAP authentication algorithm identifier should be 128. Signed-off-by: Senthil Balasubramanian <senthilkumar@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2008-12-12 13:48:20 -05:00
Mike Frysinger	c29541b24f	linux/timex.h: cleanup for userspace Impact: fix user-space exported use Move all the kernel-specific defines and includes into the __KERNEL__ section so that they don't get leaked into userspace. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Mike Frysinger <vapier@gentoo.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-12-12 17:01:38 +01:00
Oleg Nesterov	27af4245b6	posix-timers: use "struct pid" instead of "struct task_struct" Impact: restructure, clean up code k_itimer holds the ref to the ->it_process until sys_timer_delete(). This allows to pin up to RLIMIT_SIGPENDING dead task_struct's. Change the code to use "struct pid *" instead. The patch doesn't kill ->it_process, it places ->it_pid into the union. ->it_process is still used by do_cpu_nanosleep() as before. It would be trivial to change the nanosleep code as well, but since it uses it_process in a special way I think it is better to keep this field for grep. The patch bloats the kernel by 104 bytes and it also adds the new pointer, ->it_signal, to k_itimer. It is used by lock_timer() to verify that the found timer was not created by another process. It is not clear why do we use the global database (and thus the global idr_lock) for posix timers. We still need the signal_struct->posix_timers which contains all useable timers, perhaps it is better to use some form of per-process array instead. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-12-12 17:00:07 +01:00
Stefano Panella	5b37717a23	uwb: improved MAS allocator and reservation conflict handling Greatly enhance the MAS allocator: - Handle row and column reservations. - Permit all the available MAS to be allocated. - Follows the WiMedia rules on MAS selection. Take appropriate action when reservation conflicts are detected. - Correctly identify which reservation wins the conflict. - Protect alien BP reservations. - If an owned reservation loses, resize/move it. - Follow the backoff procedure before requesting additional MAS. When reservations are terminated, move the remaining reservations (if necessary) so they keep following the MAS allocation rules. Signed-off-by: Stefano Panella <stefano.panella@csr.com> Signed-off-by: David Vrabel <david.vrabel@csr.com>	2008-12-12 13:00:06 +00:00
Ingo Molnar	8299608f14	Merge branches 'irq/sparseirq', 'x86/quirks' and 'x86/reboot' into cpus4096 We merge the irq/sparseirq, x86/quirks and x86/reboot trees into the cpus4096 tree because the io-apic changes in the sparseirq change conflict with the cpumask changes in the cpumask tree, and we want to resolve those.	2008-12-12 13:49:24 +01:00
Ingo Molnar	45ab6b0c76	Merge branch 'sched/core' into cpus4096 Conflicts: include/linux/ftrace.h kernel/sched.c	2008-12-12 13:48:57 +01:00
Heiko Carstens	ee79d1bdb6	sched: let arch_update_cpu_topology indicate if topology changed Change arch_update_cpu_topology so it returns 1 if the cpu topology changed and 0 if it didn't change. This will be useful for the next patch which adds a call to this function in partition_sched_domains. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-12 13:47:21 +01:00
Ingo Molnar	81444a7995	Merge branch 'tracing/fastboot' into cpus4096	2008-12-12 12:43:05 +01:00
Ingo Molnar	30cb367ea2	sparse irqs: add irqnr.h to the user headers list Impact: fix build error /home/mingo/tip/usr/include/linux/random.h:11: included file 'linux/irqnr.h' is not exported Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-12 12:29:10 +01:00
Ingo Molnar	0ebb26e7a4	sparse irqs: handle !GENIRQ platforms Impact: build fix fix: In file included from /home/mingo/tip/arch/m68k/amiga/amiints.c:39: /home/mingo/tip/include/linux/interrupt.h:21: error: expected identifier or '(' /home/mingo/tip/arch/m68k/amiga/amiints.c: In function 'amiga_init_IRQ': Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-12 12:28:50 +01:00
Ingo Molnar	fd10902797	Merge commit 'v2.6.28-rc8' into x86/irq	2008-12-12 11:59:39 +01:00
Frederic Weisbecker	bcbc4f20b5	tracing/function-graph-tracer: annotate do_IRQ and smp_apic_timer_interrupt Impact: move most important x86 irq entry-points to a separate subsection Annotate do_IRQ and smp_apic_timer_interrupt to put them into the .irqentry.text subsection. These function will so be recognized as hardirq entrypoints for the function-graph-tracer. We could also annotate other irq entries but the others are far less important but they can be added on request. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-12 11:14:08 +01:00
Ingo Molnar	c1dfdc7597	Merge commit 'v2.6.28-rc8' into sched/core	2008-12-12 10:29:35 +01:00
Markus Metzger	c2724775ce	x86, bts: provide in-kernel branch-trace interface Impact: cleanup Move the BTS bits from ptrace.c into ds.c. Signed-off-by: Markus Metzger <markus.t.metzger@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-12 08:08:12 +01:00
Ingo Molnar	f3134de606	Merge branches 'tracing/function-graph-tracer' and 'tracing/ring-buffer' into tracing/core	2008-12-12 07:40:08 +01:00
Christoph Hellwig	4d4be482a4	[XFS] add a FMODE flag to make XFS invisible I/O less hacky XFS has a mode called invisble I/O that doesn't update any of the timestamps. It's used for HSM-style applications and exposed through the nasty open by handle ioctl. Instead of doing directly assignment of file operations that set an internal flag for it add a new FMODE_NOCMTIME flag that we can check in the normal file operations. (addition of the generic VFS flag has been ACKed by Al as an interims solution) Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-12-11 13:14:41 +11:00
Benjamin Thery	8229efdaef	netns: ip6mr: enable namespace support in ipv6 multicast forwarding code This last patch makes the appropriate changes to use and propagate the network namespace where needed in IPv6 multicast forwarding code. This consists mainly in replacing all the remaining init_net occurences with current netns pointer retrieved from sockets, net devices or mfc6_caches depending on the routines' contexts. Some routines receive a new 'struct net' parameter to propagate the current netns: * ip6mr_get_route * ip6mr_cache_report * ip6mr_cache_find * ip6mr_cache_unresolved * mif6_add/mif6_delete * ip6mr_mfc_add/ip6mr_mfc_delete * ip6mr_reg_vif All the IPv6 multicast forwarding variables moved to struct netns_ipv6 by the previous patches are now referenced in the correct namespace. Changelog: ========== * Take into account the net associated to mfc6_cache when matching entries in mfc_unres_queue list. * Call mroute_clean_tables() in ip6mr_net_exit() to free memory allocated per-namespace. * Call dev_net_set() in ip6mr_reg_vif() to initialize dev->nd_net correctly. Signed-off-by: Benjamin Thery <benjamin.thery@bull.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-12-10 16:30:15 -08:00
Benjamin Thery	58701ad411	netns: ip6mr: store netns in struct mfc6_cache This patch stores into struct mfc6_cache the network namespace each mfc6_cache belongs to. The new member is mfc6_net. mfc6_net is assigned at cache allocation and doesn't change during the rest of the cache entry life. This will help to retrieve the current netns around the IPv6 multicast forwarding code. At the moment, all mfc6_cache are allocated in init_net. Changelog: ========== * Use write_pnet()/read_pnet() to set and get mfc6_net. Signed-off-by: Benjamin Thery <benjamin.thery@bull.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-12-10 16:22:34 -08:00
Benjamin Thery	bd91b8bf37	netns: ip6mr: allocate mroute6_socket per-namespace. Preliminary work to make IPv6 multicast forwarding netns-aware. Make IPv6 multicast forwarding mroute6_socket per-namespace, moves it into struct netns_ipv6. At the moment, mroute6_socket is only referenced in init_net. Signed-off-by: Benjamin Thery <benjamin.thery@bull.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-12-10 16:07:08 -08:00
Steve Glendinning	2107fb8b5b	smsc911x: add dynamic bus configuration Convert the driver to select 16-bit or 32-bit bus access at runtime, at a small performance cost. Signed-off-by: Steve Glendinning <steve.glendinning@smsc.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-12-10 15:12:45 -08:00
Hugh Dickins	9c24624727	KSYM_SYMBOL_LEN fixes Miles Lane tailing /sys files hit a BUG which Pekka Enberg has tracked to my `966c8c12dc` sprint_symbol(): use less stack exposing a bug in slub's list_locations() - kallsyms_lookup() writes a 0 to namebuf[KSYM_NAME_LEN-1], but that was beyond the end of page provided. The 100 slop which list_locations() allows at end of page looks roughly enough for all the other stuff it might print after the symbol before it checks again: break out KSYM_SYMBOL_LEN earlier than before. Latencytop and ftrace and are using KSYM_NAME_LEN buffers where they need KSYM_SYMBOL_LEN buffers, and vmallocinfo a 2*KSYM_NAME_LEN buffer where it wants a KSYM_SYMBOL_LEN buffer: fix those before anyone copies them. [akpm@linux-foundation.org: ftrace.h needs module.h] Signed-off-by: Hugh Dickins <hugh@veritas.com> Cc: Christoph Lameter <cl@linux-foundation.org> Cc Miles Lane <miles.lane@gmail.com> Acked-by: Pekka Enberg <penberg@cs.helsinki.fi> Acked-by: Steven Rostedt <srostedt@redhat.com> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-12-10 08:01:54 -08:00
Andrew Morton	02d2116887	revert "percpu_counter: new function percpu_counter_sum_and_set" Revert commit `e8ced39d5e` Author: Mingming Cao <cmm@us.ibm.com> Date: Fri Jul 11 19:27:31 2008 -0400 percpu_counter: new function percpu_counter_sum_and_set As described in revert "percpu counter: clean up percpu_counter_sum_and_set()" the new percpu_counter_sum_and_set() is racy against updates to the cpu-local accumulators on other CPUs. Revert that change. This means that ext4 will be slow again. But correct. Reported-by: Eric Dumazet <dada1@cosmosbay.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mingming Cao <cmm@us.ibm.com> Cc: <linux-ext4@vger.kernel.org> Cc: <stable@kernel.org> [2.6.27.x] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-12-10 08:01:52 -08:00
Andrew Morton	71c5576fbd	revert "percpu counter: clean up percpu_counter_sum_and_set()" Revert commit `1f7c14c62c` Author: Mingming Cao <cmm@us.ibm.com> Date: Thu Oct 9 12:50:59 2008 -0400 percpu counter: clean up percpu_counter_sum_and_set() Before this patch we had the following: percpu_counter_sum(): return the percpu_counter's value percpu_counter_sum_and_set(): return the percpu_counter's value, copying that value into the central value and zeroing the per-cpu counters before returning. After this patch, percpu_counter_sum_and_set() has gone, and percpu_counter_sum() gets the old percpu_counter_sum_and_set() functionality. Problem is, as Eric points out, the old percpu_counter_sum_and_set() functionality was racy and wrong. It zeroes out counters on "other" cpus, without holding any locks which will prevent races agaist updates from those other CPUS. This patch reverts `1f7c14c62c`. This means that percpu_counter_sum_and_set() still has the race, but percpu_counter_sum() does not. Note that this is not a simple revert - ext4 has since started using percpu_counter_sum() for its dirty_blocks counter as well. Note that this revert patch changes percpu_counter_sum() semantics. Before the patch, a call to percpu_counter_sum() will bring the counter's central counter mostly up-to-date, so a following percpu_counter_read() will return a close value. After this patch, a call to percpu_counter_sum() will leave the counter's central accumulator unaltered, so a subsequent call to percpu_counter_read() can now return a significantly inaccurate result. If there is any code in the tree which was introduced after `e8ced39d5e` was merged, and which depends upon the new percpu_counter_sum() semantics, that code will break. Reported-by: Eric Dumazet <dada1@cosmosbay.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mingming Cao <cmm@us.ibm.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-12-10 08:01:52 -08:00
Mark Brown	cdc6936432	ALSA: Add support for mechanical jack insertion Some systems support both mechanical and electrical jack detection, allowing them to report that a jack is physically present but does not have any functioning connections. Add a new jack type for these, allowing user space to report faulty connections. Thanks to Guillem Jover for the suggestion. Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Takashi Iwai <tiwai@suse.de>	2008-12-10 15:10:44 +01:00
Robert Richter	e09373f22e	ring_buffer: add remaining cpu functions to ring_buffer.h These functions are not yet in ring_buffer.h though they seems to be part of the API. Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Robert Richter <robert.richter@amd.com>	2008-12-10 14:20:17 +01:00
Robert Richter	5849448758	oprofile: update comment for oprofile_add_sample() The cpu argument is no longer part of the parameter list. Signed-off-by: Robert Richter <robert.richter@amd.com>	2008-12-10 14:20:03 +01:00
Neil Horman	7b363e4400	netpoll: fix race on poll_list resulting in garbage entry A few months back a race was discused between the netpoll napi service path, and the fast path through net_rx_action: http://kerneltrap.org/mailarchive/linux-netdev/2007/10/16/345470 A patch was submitted for that bug, but I think we missed a case. Consider the following scenario: INITIAL STATE CPU0 has one napi_struct A on its poll_list CPU1 is calling netpoll_send_skb and needs to call poll_napi on the same napi_struct A that CPU0 has on its list CPU0 CPU1 net_rx_action poll_napi !list_empty (returns true) locks poll_lock for A poll_one_napi napi->poll netif_rx_complete __napi_complete (removes A from poll_list) list_entry(list->next) In the above scenario, net_rx_action assumes that the per-cpu poll_list is exclusive to that cpu. netpoll of course violates that, and because the netpoll path can dequeue from the poll list, its possible for CPU0 to detect a non-empty list at the top of the while loop in net_rx_action, but have it become empty by the time it calls list_entry. Since the poll_list isn't surrounded by any other structure, the returned data from that list_entry call in this situation is garbage, and any number of crashes can result based on what exactly that garbage is. Given that its not fasible for performance reasons to place exclusive locks arround each cpus poll list to provide that mutal exclusion, I think the best solution is modify the netpoll path in such a way that we continue to guarantee that the poll_list for a cpu is in fact exclusive to that cpu. To do this I've implemented the patch below. It adds an additional bit to the state field in the napi_struct. When executing napi->poll from the netpoll_path, this bit will be set. When a driver calls netif_rx_complete, if that bit is set, it will not remove the napi_struct from the poll_list. That work will be saved for the next iteration of net_rx_action. I've tested this and it seems to work well. About the biggest drawback I can see to it is the fact that it might result in an extra loop through net_rx_action in the event that the device is actually contended for (i.e. the netpoll path actually preforms all the needed work no the device, and the call to net_rx_action winds up doing nothing, except removing the napi_struct from the poll_list. However I think this is probably a small price to pay, given that the alternative is a crash. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-12-09 23:22:26 -08:00
Linus Torvalds	b749e3f8d7	Merge branch 'audit.b59' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current * 'audit.b59' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current: [PATCH] fix broken timestamps in AVC generated by kernel threads [patch 1/1] audit: remove excess kernel-doc [PATCH] asm/generic: fix bug - kernel fails to build when enable some common audit code on Blackfin [PATCH] return records for fork() both to child and parent [PATCH] Audit: make audit=0 actually turn off audit	2008-12-09 08:28:13 -08:00
Al Viro	1e641743f0	Audit: Log TIOCSTI AUDIT_TTY records currently log all data read by processes marked for TTY input auditing, even if the data was "pushed back" using the TIOCSTI ioctl, not typed by the user. This patch records all TIOCSTI calls to disambiguate the input. It generates one audit message per character pushed back; considering TIOCSTI is used very rarely, this simple solution is probably good enough. (The only program I could find that uses TIOCSTI is mailx/nail in "header editing" mode, e.g. using the ~h escape. mailx is used very rarely, and the escapes are used even rarer.) Signed-Off-By: Miloslav Trmac <mitr@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: James Morris <jmorris@namei.org>	2008-12-09 20:32:06 +11:00
Al Viro	48887e63d6	[PATCH] fix broken timestamps in AVC generated by kernel threads Timestamp in audit_context is valid only if ->in_syscall is set. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-12-09 02:27:41 -05:00
Al Viro	a64e64944f	[PATCH] return records for fork() both to child and parent Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-12-09 02:27:38 -05:00
Linus Torvalds	f7a8db89c1	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: tproxy: fixe a possible read from an invalid location in the socket match zd1211rw: use unaligned safe memcmp() in-place of compare_ether_addr() mac80211: use unaligned safe memcmp() in-place of compare_ether_addr() ipw2200: fix netif_*_queue() removal regression iwlwifi: clean key table in iwl_clear_stations_table function tcp: tcp_vegas ssthresh bug fix can: omit received RTR frames for single ID filter lists ATM: CVE-2008-5079: duplicate listen() on socket corrupts the vcc table netx-eth: initialize per device spinlock tcp: make urg+gso work for real this time enc28j60: Fix sporadic packet loss (corrected again) hysdn: fix writing outside the field on 64 bits b1isa: fix b1isa_exit() to really remove registered capi controllers can: Fix CAN_(EFF\|RTR)_FLAG handling in can_filter Phonet: do not dump addresses from other namespaces netlabel: Fix a potential NULL pointer dereference bnx2: Add workaround to handle missed MSI. xfrm: Fix kernel panic when flush and dump SPD entries	2008-12-08 19:52:43 -08:00
Yinghai Lu	240d367b4e	sparseirq: fix Alpha build failure Impact: build fix on Alpha -tip testing found this build failure on the Alpha defconfig: /home/mingo/tip/fs/proc/stat.c: In function 'show_stat': /home/mingo/tip/fs/proc/stat.c:48: error: implicit declaration of function 'for_each_irq_desc' /home/mingo/tip/fs/proc/stat.c:48: error: expected ';' before '{' token can not use irq_desc() in stat.c on older architectures. Signed-off-by: Yinghai Lu <yinghai@kernel.orgg> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-09 04:16:54 +01:00
David Vrabel	c35fa3ea1a	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 into for-upstream	2008-12-08 16:18:47 +00:00
Frederic Weisbecker	380c4b1411	tracing/function-graph-tracer: append the tracing_graph_flag Impact: Provide a way to pause the function graph tracer As suggested by Steven Rostedt, the previous patch that prevented from spinlock function tracing shouldn't use the raw_spinlock to fix it. It's much better to follow lockdep with normal spinlock, so this patch adds a new flag for each task to make the function graph tracer able to be paused. We also can send an ftrace_printk whithout worrying of the irrelevant traced spinlock during insertion. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-08 15:11:45 +01:00
Frederic Weisbecker	8b96f01198	tracing/function-graph-tracer: introduce __notrace_funcgraph to filter special functions Impact: trace more functions When the function graph tracer is configured, three more files are not traced to prevent only four functions to be traced. And this impacts the normal function tracer too. arch/x86/kernel/process_64/32.c: I had crashes when I let this file traced. After some debugging, I saw that the "current" task point was changed inside__swtich_to(), ie: "write_pda(pcurrent, next_p);" inside process_64.c Since the tracer store the original return address of the function inside current, we had crashes. Only __switch_to() has to be excluded from tracing. kernel/module.c and kernel/extable.c: Because of a function used internally by the function graph tracer: __kernel_text_address() To let the other functions inside these files to be traced, this patch introduces the __notrace_funcgraph function prefix which is __notrace if function graph tracer is configured and nothing if not. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-08 15:11:44 +01:00
Yinghai Lu	3145e941fc	x86, MSI: pass irq_cfg and irq_desc Impact: simplify code Pass irq_desc and cfg around, instead of raw IRQ numbers - this way we dont have to look it up again and again. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-08 14:31:59 +01:00
Yinghai Lu	0b8f1efad3	sparse irq_desc[] array: core kernel and x86 changes Impact: new feature Problem on distro kernels: irq_desc[NR_IRQS] takes megabytes of RAM with NR_CPUS set to large values. The goal is to be able to scale up to much larger NR_IRQS value without impacting the (important) common case. To solve this, we generalize irq_desc[NR_IRQS] to an (optional) array of irq_desc pointers. When CONFIG_SPARSE_IRQ=y is used, we use kzalloc_node to get irq_desc, this also makes the IRQ descriptors NUMA-local (to the site that calls request_irq()). This gets rid of the irq_cfg[] static array on x86 as well: irq_cfg now uses desc->chip_data for x86 to store irq_cfg. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-08 14:31:51 +01:00
Lai Jiangshan	361b73d5c3	ring_buffer: fix comments Impact: comments cleanup fix incorrect comments for enum ring_buffer_type Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-08 13:54:05 +01:00
Ingo Molnar	4d117c5c6b	Merge branch 'sched/urgent' into sched/core	2008-12-08 13:52:00 +01:00
Gerrit Renker	6fdd34d43b	dccp ccid-2: Phase out the use of boolean Ack Vector sysctl This removes the use of the sysctl and the minisock variable for the Send Ack Vector feature, as it now is handled fully dynamically via feature negotiation (i.e. when CCID-2 is enabled, Ack Vectors are automatically enabled as per RFC 4341, 4.). Using a sysctl in parallel to this implementation would open the door to crashes, since much of the code relies on tests of the boolean minisock / sysctl variable. Thus, this patch replaces all tests of type if (dccp_msk(sk)->dccpms_send_ack_vector) /* ... / with if (dp->dccps_hc_rx_ackvec != NULL) / ... / The dccps_hc_rx_ackvec is allocated by the dccp_hdlr_ackvec() when feature negotiation concluded that Ack Vectors are to be used on the half-connection. Otherwise, it is NULL (due to dccp_init_sock/dccp_create_openreq_child), so that the test is a valid one. The activation handler for Ack Vectors is called as soon as the feature negotiation has concluded at the server when the Ack marking the transition RESPOND => OPEN arrives; * client after it has sent its ACK, marking the transition REQUEST => PARTOPEN. Adding the sequence number of the Response packet to the Ack Vector has been removed, since (a) connection establishment implies that the Response has been received; (b) the CCIDs only look at packets received in the (PART)OPEN state, i.e. this entry will always be ignored; (c) it can not be used for anything useful - to detect loss for instance, only packets received after the loss can serve as pseudo-dupacks. There was a FIXME to change the error code when dccp_ackvec_add() fails. I removed this after finding out that: * the check whether ackno < ISN is already made earlier, * this Response is likely the 1st packet with an Ackno that the client gets, * so when dccp_ackvec_add() fails, the reason is likely not a packet error. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-12-08 01:19:06 -08:00
Gerrit Renker	4098dce5be	dccp: Remove manual influence on NDP Count feature Updating the NDP count feature is handled automatically now: * for CCID-2 it is disabled, since the code does not use NDP counts; * for CCID-3 it is enabled, as NDP counts are used to determine loss lengths. Allowing the user to change NDP values leads to unpredictable and failing behaviour, since it is then possible to disable NDP counts even when they are needed (e.g. in CCID-3). This means that only those user settings are sensible that agree with the values for Send NDP Count implied by the choice of CCID. But those settings are already activated by the feature negotiation (CCID dependency tracking), hence this form of support is redundant. At startup the initialisation of the NDP count feature uses the default value of 0, which is done implicitly by the zeroing-out of the socket when it is allocated. If the choice of CCID or feature negotiation enables NDP count, this will then be updated via the NDP activation handler. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-12-08 01:18:37 -08:00
Gerrit Renker	0049bab5e7	dccp: Remove obsolete parts of the old CCID interface The TX/RX CCIDs of the minisock are now redundant: similar to the Ack Vector case, their value equals initially that of the sysctl, but at the end of feature negotiation may be something different. The old interface removed by this patch thus has been replaced by the newer interface to dynamically query the currently loaded CCIDs. Also removed are the constructors for the TX CCID and the RX CCID, since the switch "rx <-> non-rx" is done by the handler in minisocks.c (and the handler is the only place in the code where CCIDs are loaded). Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-12-08 01:18:05 -08:00
Wang Chen	b74ca3a896	netdevice: Kill netdev->priv This is the last shoot of this series. After I removing all directly reference of netdev->priv, I am killing "priv" of "struct net_device" and fixing relative comments/docs. Anyone will not be allowed to reference netdev->priv directly. If you want to reference the memory of private data, use netdev_priv() instead. If the private data is not allocted when alloc_netdev(), use netdev->ml_priv to point that memory after you creating that private data. Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-12-08 01:14:16 -08:00
David S. Miller	730c30ec64	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/wireless/iwlwifi/iwl-core.c drivers/net/wireless/iwlwifi/iwl-sta.c	2008-12-05 22:54:40 -08:00
Linus Torvalds	f2f1fa78a1	Enforce a minimum SG_IO timeout There's no point in having too short SG_IO timeouts, since if the command does end up timing out, we'll end up through the reset sequence that is several seconds long in order to abort the command that timed out. As a result, shorter timeouts than a few seconds simply do not make sense, as the recovery would be longer than the timeout itself. Add a BLK_MIN_SG_TIMEOUT to match the existign BLK_DEFAULT_SG_TIMEOUT. Suggested-by: Alan Cox <alan@lxorguk.ukuu.org.uk> Acked-by: Tejun Heo <tj@kernel.org> Acked-by: Jens Axboe <jens.axboe@oracle.com> Cc: Jeff Garzik <jeff@garzik.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-12-05 14:49:18 -08:00
Matthew Garrett	e088e4c9cd	[CPUFREQ] Disable sysfs ui for p4-clockmod. p4-clockmod has a long history of abuse. It pretends to be a CPU frequency scaling driver, even though it doesn't actually change the CPU frequency, but instead just modulates the frequency with wait-states. The biggest misconception is that when running at the lower 'frequency' p4-clockmod is saving power. This isn't the case, as workloads running slower take longer to complete, preventing the CPU from entering deep C states. However p4-clockmod does have a purpose. It can prevent overheating. Having it hooked up to the cpufreq interfaces is the wrong way to achieve cooling however. It should instead be hooked up to ACPI. This diff introduces a means for a cpufreq driver to register with the cpufreq core, but not present a sysfs interface. Signed-off-by: Matthew Garrett <mjg@redhat.com> Signed-off-by: Dave Jones <davej@redhat.com>	2008-12-05 15:20:10 -05:00
Luis R. Rodriguez	10ec4f1d08	nl80211: relicense nl80211.h under the ISC We have a few BSD/ISC licensed userspace applications which include nl80211.h from the kernel. To avoid legal ambiguity for usage of the header file in these projects we rather simply relicense the header file under the ISC. We've received consent from all contributors to it. Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Acked-by: Johannes Berg <johannes@sipsolutions.net> Acked-by: Michael Wu <flamingice@sourmilk.net> Acked-by: Luis Carlos Cobo <luisca@cozybit.com> Acked-by: Michael Buesch <mb@bu3sch.de> Acked-by: Jouni Malinen <jouni.malinen@atheros.com> Acked-by: Colin McCabe <colin@cozybit.com> Acked-by: Javier Cardona <javier@cozybit.com> Cc: johannes@sipsolutions.net Cc: altape@eden.rutgers.edu Cc: luisca@cozybit.com Cc: mb@bu3sch.de Cc: jouni.malinen@atheros.com Cc: colin@cozybit.com Cc: javier@cozybit.com Signed-off-by: John W. Linville <linville@tuxdriver.com>	2008-12-05 09:32:12 -05:00
Jouni Malinen	72bdcf3438	nl80211: Add frequency configuration (including HT40) This patch adds new NL80211_CMD_SET_WIPHY attributes NL80211_ATTR_WIPHY_FREQ and NL80211_ATTR_WIPHY_SEC_CHAN_OFFSET to allow userspace to set the operating channel (e.g., hostapd for AP mode). Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com> Acked-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2008-12-05 09:32:11 -05:00
Frederic Weisbecker	21a8c466f9	tracing/ftrace: provide the macro task_curr_ret_stack() Impact: cleanup As suggested by Steven Rostedt, this patch provide a new macro task_curr_ret_stack() to move the cpp conditionnal CONFIG into the linux/ftrace.h headers. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-05 14:47:44 +01:00
Ingo Molnar	970987beb9	Merge branches 'tracing/ftrace', 'tracing/function-graph-tracer' and 'tracing/urgent' into tracing/core	2008-12-05 14:45:22 +01:00
Lachlan McIlroy	14d676f56f	Merge git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6	2008-12-05 15:27:43 +11:00
David S. Miller	c9bb6003dd	of: Fix comment, sparc no longer uses of_device objects on special busses. It only uses of_platform_bus_type. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-12-04 09:16:45 -08:00
Christoph Hellwig	fc9161e54d	[PATCH 2/2] documnt FMODE_ constants Make sure all FMODE_ constants are documents, and ensure a coherent style for the already existing comments. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-12-04 04:22:58 -05:00
Christoph Hellwig	fd4ce1acd0	[PATCH 1/2] kill FMODE_NDELAY_NOW Update FMODE_NDELAY before each ioctl call so that we can kill the magic FMODE_NDELAY_NOW. It would be even better to do this directly in setfl(), but for that we'd need to have FMODE_NDELAY for all files, not just block special files. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-12-04 04:22:57 -05:00
Peter Zijlstra	00ef9f7348	lockdep: change a held lock's class Impact: introduce new lockdep API Allow to change a held lock's class. Basically the same as the existing code to change a subclass therefore reuse all that. The XFS code will be able to use this to annotate their inode locking. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-04 10:08:18 +01:00
Steven Rostedt	5ef6476190	pid: fix the do_each_pid_task() macro Impact: macro side-effects fix This patch adds parenthesis around 'pid' in the do_each_pid_task macro to allow callers to pass in more complex parameters. e.g. do_each_pid_task(*pid, type, task) Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-04 09:09:37 +01:00
Steven Rostedt	ea4e2bc4d9	ftrace: graph of a single function This patch adds the file: /debugfs/tracing/set_graph_function which can be used along with the function graph tracer. When this file is empty, the function graph tracer will act as usual. When the file has a function in it, the function graph tracer will only trace that function. For example: # echo blk_unplug > /debugfs/tracing/set_graph_function # cat /debugfs/tracing/trace [...] ------------------------------------------ \| 2) make-19003 => kjournald-2219 ------------------------------------------ 2) \| blk_unplug() { 2) \| dm_unplug_all() { 2) \| dm_get_table() { 2) 1.381 us \| _read_lock(); 2) 0.911 us \| dm_table_get(); 2) 1. 76 us \| _read_unlock(); 2) + 12.912 us \| } 2) \| dm_table_unplug_all() { 2) \| blk_unplug() { 2) 0.778 us \| generic_unplug_device(); 2) 2.409 us \| } 2) 5.992 us \| } 2) 0.813 us \| dm_table_put(); 2) + 29. 90 us \| } 2) + 34.532 us \| } You can add up to 32 functions into this file. Currently we limit it to 32, but this may change with later improvements. To add another function, use the append '>>': # echo sys_read >> /debugfs/tracing/set_graph_function # cat /debugfs/tracing/set_graph_function blk_unplug sys_read Using the '>' will clear out the function and write anew: # echo sys_write > /debug/tracing/set_graph_function # cat /debug/tracing/set_graph_function sys_write Note, if you have function graph running while doing this, the small time between clearing it and updating it will cause the graph to record all functions. This should not be an issue because after it sets the filter, only those functions will be recorded from then on. If you need to only record a particular function then set this file first before starting the function graph tracer. In the future this side effect may be corrected. The set_graph_function file is similar to the set_ftrace_filter but it does not take wild cards nor does it allow for more than one function to be set with a single write. There is no technical reason why this is the case, I just do not have the time yet to implement that. Note, dynamic ftrace must be enabled for this to appear because it uses the dynamic ftrace records to match the name to the mcount call sites. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-04 09:09:34 +01:00
Ingo Molnar	cb9c34e6d0	Merge commit 'v2.6.28-rc7' into core/locking	2008-12-04 08:52:14 +01:00
James Morris	ec98ce480a	Merge branch 'master' into next Conflicts: fs/nfsd/nfs4recover.c Manually fixed above to use new creds API functions, e.g. nfs4_save_creds(). Signed-off-by: James Morris <jmorris@namei.org>	2008-12-04 17:16:36 +11:00
David Woodhouse	8865c418ca	atm: 32-bit ioctl compatibility We lack compat ioctl support through most of the ATM code. This patch deals with most of it, and I can now at least use BR2684 and PPPoATM with 32-bit userspace. I haven't added a .compat_ioctl method to struct atm_ioctl, because AFAICT none of the current users need any conversion -- so we can just call the ->ioctl() method in every case. I looked at br2684, clip, lec, mpc, pppoatm and atmtcp. In svc_compat_ioctl() the only mangling which is needed is to change COMPAT_ATM_ADDPARTY to ATM_ADDPARTY. Although it's defined as _IOW('a', ATMIOC_SPECIAL+4,struct atm_iobuf) it doesn't actually _take_ a struct atm_iobuf as an argument -- it takes a struct sockaddr_atmsvc, which _is_ the same between 32-bit and 64-bit code, so doesn't need conversion. Almost all of vcc_ioctl() would have been identical, so I converted that into a core do_vcc_ioctl() function with an 'int compat' argument. I've done the same with atm_dev_ioctl(), where there _are_ a few differences, but still it's relatively contained and there would otherwise have been a lot of duplication. I haven't done any of the actual device-specific ioctls, although I've added a compat_ioctl method to struct atmdev_ops. Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-12-03 22:12:38 -08:00
Oliver Hartkopp	d253eee201	can: Fix CAN_(EFF\|RTR)_FLAG handling in can_filter Due to a wrong safety check in af_can.c it was not possible to filter for SFF frames with a specific CAN identifier without getting the same selected CAN identifier from a received EFF frame also. This fix has a minimum (but user visible) impact on the CAN filter API and therefore the CAN version is set to a new date. Indeed the 'old' API is still working as-is. But when now setting CAN_(EFF\|RTR)_FLAG in can_filter.can_mask you might get less traffic than before - but still the stuff that you expected to get for your defined filter ... Thanks to Kurt Van Dijck for pointing at this issue and for the review. Signed-off-by: Oliver Hartkopp <oliver@hartkopp.net> Acked-by: Kurt Van Dijck <kurt.van.dijck@eia.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-12-03 15:52:35 -08:00
Milan Broz	0e435ac26e	block: fix setting of max_segment_size and seg_boundary mask Fix setting of max_segment_size and seg_boundary mask for stacked md/dm devices. When stacking devices (LVM over MD over SCSI) some of the request queue parameters are not set up correctly in some cases by default, namely max_segment_size and and seg_boundary mask. If you create MD device over SCSI, these attributes are zeroed. Problem become when there is over this mapping next device-mapper mapping - queue attributes are set in DM this way: request_queue max_segment_size seg_boundary_mask SCSI 65536 0xffffffff MD RAID1 0 0 LVM 65536 -1 (64bit) Unfortunately bio_add_page (resp. bio_phys_segments) calculates number of physical segments according to these parameters. During the generic_make_request() is segment cout recalculated and can increase bio->bi_phys_segments count over the allowed limit. (After bio_clone() in stack operation.) Thi is specially problem in CCISS driver, where it produce OOPS here BUG_ON(creq->nr_phys_segments > MAXSGENTRIES); (MAXSEGENTRIES is 31 by default.) Sometimes even this command is enough to cause oops: dd iflag=direct if=/dev/<vg>/<lv> of=/dev/null bs=128000 count=10 This command generates bios with 250 sectors, allocated in 32 4k-pages (last page uses only 1024 bytes). For LVM layer, it allocates bio with 31 segments (still OK for CCISS), unfortunatelly on lower layer it is recalculated to 32 segments and this violates CCISS restriction and triggers BUG_ON(). The patch tries to fix it by: * initializing attributes above in queue request constructor blk_queue_make_request() * make sure that blk_queue_stack_limits() inherits setting (DM uses its own function to set the limits because it blk_queue_stack_limits() was introduced later. It should probably switch to use generic stack limit function too.) * sets the default seg_boundary value in one place (blkdev.h) * use this mask as default in DM (instead of -1, which differs in 64bit) Bugs related to this: https://bugzilla.redhat.com/show_bug.cgi?id=471639 http://bugzilla.kernel.org/show_bug.cgi?id=8672 Signed-off-by: Milan Broz <mbroz@redhat.com> Reviewed-by: Alasdair G Kergon <agk@redhat.com> Cc: Neil Brown <neilb@suse.de> Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Tejun Heo <htejun@gmail.com> Cc: Mike Miller <mike.miller@hp.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-12-03 12:55:55 +01:00
Tejun Heo	53a08807c0	block: internal dequeue shouldn't start timer blkdev_dequeue_request() and elv_dequeue_request() are equivalent and both start the timeout timer. Barrier code dequeues the original barrier request but doesn't passes the request itself to lower level driver, only broken down proxy requests; however, as the original barrier code goes through the same dequeue path and timeout timer is started on it. If barrier sequence takes long enough, this timer expires but the low level driver has no idea about this request and oops follows. Timeout timer shouldn't have been started on the original barrier request as it never goes through actual IO. This patch unexports elv_dequeue_request(), which has no external user anyway, and makes it operate on elevator proper w/o adding the timer and make blkdev_dequeue_request() call elv_dequeue_request() and add timer. Internal users which don't pass the request to driver - barrier code and end_that_request_last() - are converted to use elv_dequeue_request(). Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Mike Anderson <andmike@linux.vnet.ibm.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-12-03 12:41:26 +01:00
Anton Vorontsov	b908b53d58	of/gpio: Implement of_get_gpio_flags() This adds a new function, of_get_gpio_flags, which is like of_get_gpio(), but accepts a new "flags" argument. This new function will be used by the drivers that need to retrieve additional GPIO information, such as active-low flag. Also, this changes the default ("simple") .xlate routine to warn about bogus (< 2) #gpio-cells usage: the second cell should always be present for GPIO flags. Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com> Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-12-03 21:04:05 +11:00
Paul Mackerras	5274918855	Merge branch 'merge'	2008-12-03 20:11:06 +11:00
Steven Rostedt	e49dc19c6a	ftrace: function graph return for function entry Impact: feature, let entry function decide to trace or not This patch lets the graph tracer entry function decide if the tracing should be done at the end as well. This requires all function graph entry functions return 1 if it should trace, or 0 if the return should not be traced. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-03 08:56:26 +01:00
Steven Rostedt	14a866c567	ftrace: add ftrace_graph_stop() Impact: new ftrace_graph_stop function While developing more features of function graph, I hit a bug that caused the WARN_ON to trigger in the prepare_ftrace_return function. Well, it was hard for me to find out that was happening because the bug would not print, it would just cause a hard lockup or reboot. The reason is that it is not safe to call printk from this function. Looking further, I also found that it calls unregister_ftrace_graph, which grabs a mutex and calls kstop machine. This would definitely lock the box up if it were to trigger. This patch adds a fast and safe ftrace_graph_stop() which will stop the function tracer. Then it is safe to call the WARN ON. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-03 08:56:23 +01:00
Steven Rostedt	8789a9e7df	ring-buffer: read page interface Impact: new API to ring buffer This patch adds a new interface into the ring buffer that allows a page to be read from the ring buffer on a given CPU. For every page read, one must also be given to allow for a "swap" of the pages. rpage = ring_buffer_alloc_read_page(buffer); if (!rpage) goto err; ret = ring_buffer_read_page(buffer, &rpage, cpu, full); if (!ret) goto empty; process_page(rpage); ring_buffer_free_read_page(rpage); The caller of these functions must handle any waits that are needed to wait for new data. The ring_buffer_read_page will simply return 0 if there is no data, or if "full" is set and the writer is still on the current page. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-03 08:56:21 +01:00
Ingo Molnar	dfdc5437bd	Merge commit 'v2.6.28-rc7'; branch 'x86/dumpstack' into tracing/ftrace Merge x86/dumpstack into tracing/ftrace because upcoming ftrace changes depend on cleanups already in x86/dumpstack. Also merge to latest upstream -rc.	2008-12-03 08:55:34 +01:00
David S. Miller	aa2ba5f108	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/ixgbe/ixgbe_main.c drivers/net/smc91x.c	2008-12-02 19:50:27 -08:00
Linus Torvalds	e1825e7515	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (30 commits) MAINTAINERS: add netdev to ATM ATM: horizon, fix hrz_probe fail path pppol2tp: Add missing sock_put() in pppol2tp_release() net: Fix soft lockups/OOM issues w/ unix garbage collector macvlan: don't broadcast PAUSE frames to macvlan devices Phonet: fix oops in phonet_address_del() on non-Phonet device netfilter: ctnetlink: fix GFP_KERNEL allocation under spinlock sungem: Fix PCS_MIICTRL register write in gem_init_phy(). net: make skb_truesize_bug() call WARN() net: hp-plus uses eip_poll net/wireless/reg.c: fix bad WARN_ON in if statement ath5k: disable beacon filter when station is not associated ath5k: fix Security issue in DebugFS part of ath5k ath9k: correct expected max RX buffer size ath9k: Fix SW-IOMMU bounce buffer starvation mac80211 : Fix setting ad-hoc mode and non-ibss channel iwlagn: fix DMA sync phylib: Add Vitesse VSC8221 SGMII PHY rose: zero length frame filtering in af_rose.c bridge: netfilter: fix update_pmtu crash with GRE ...	2008-12-02 15:55:05 -08:00
Linus Torvalds	e2e29831cc	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6: alim15x3: fix sparse warning ide: remove dead code from drive_is_ready() ide: fix build for DEBUG_PM ide: respect current DMA setting during resume ide: add SAMSUNG SP0822N with firmware WA100-10 to ivb_list[] amd74xx: workaround unreliable AltStatus register for nVidia controllers ide: fix the ide_release_lock imbalance	2008-12-02 15:53:10 -08:00
Junjiro R. Okajima	1b79cd04fa	nfsd: fix vm overcommit crash fix #2 The previous patch from Alan Cox ("nfsd: fix vm overcommit crash", commit `731572d39f`) fixed the problem where knfsd crashes on exported shmemfs objects and strict overcommit is set. But the patch forgot supporting the case when CONFIG_SECURITY is disabled. This patch copies a part of his fix which is mainly for detecting a bug earlier. Acked-by: James Morris <jmorris@namei.org> Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Junjiro R. Okajima <hooanon05@yahoo.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-12-02 15:50:40 -08:00

... 3 4 5 6 7 ...

13801 commits