There are a couple of places where (P)Dprintk is used which is an old
compile time enabled printk wrapper. Convert it to the generic
pr_debug().
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
netfilter: nf_conntrack_sctp: fix sparse warnings
netfilter: nf_nat_sip: c= is optional for session
netfilter: xt_TCPMSS: collapse tcpmss_reverse_mtu{4,6} into one function
netfilter: nfnetlink_log: send complete hardware header
netfilter: xt_time: fix time's time_mt()'s use of do_div()
netfilter: accounting rework: ct_extend + 64bit counters (v4)
netlink: add NLA_PUT_BE64 macro
netfilter: nf_nat_core: eliminate useless find_appropriate_src for IP_NAT_RANGE_PROTO_RANDOM
hdlcdrv: Fix CRC calculation.
Revert "pkt_sched: Make default qdisc nonshared-multiqueue safe."
net: In __netif_schedule() use WARN_ON instead of BUG_ON
net: Improve simple_tx_hash().
pkt_sched: Remove unused variable skb in dev_deactivate_queue function.
sunhme: Remove stop/wake TX queue calls in set-multicast-list handler.
ucc_geth: do not touch net queue in adjust_link phylib callback
gianfar: do not touch net queue in adjust_link phylib callback
atl1: Do not wake queue before queue has been started.
* 'x86/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (160 commits)
x86: remove extra calling to get ext cpuid level
x86: use setup_clear_cpu_cap() when disabling the lapic
KVM: fix exception entry / build bug, on 64-bit
x86: add unknown_nmi_panic kernel parameter
x86, VisWS: turn into generic arch, eliminate leftover files
x86: add ->pre_time_init to x86_quirks
x86: extend and use x86_quirks to clean up NUMAQ code
x86: introduce x86_quirks
x86: improve debug printout: add target bootmem range in early_res_to_bootmem()
Subject: devmem, x86: fix rename of CONFIG_NONPROMISC_DEVMEM
x86: remove arch_get_ram_range
x86: Add a debugfs interface to dump PAT memtype
x86: Add a arch directory for x86 under debugfs
x86: i386: reduce boot fixmap space
i386/xen: add proper unwind annotations to xen_sysenter_target
x86: reduce force_mwait visibility
x86: reduce forbid_dac's visibility
x86: fix two modpost warnings
x86: check function status in EDD boot code
x86_64: ia32_signal.c: remove signal number conversion
...
* 'for-linus' of git://neil.brown.name/md: (52 commits)
md: Protect access to mddev->disks list using RCU
md: only count actual openers as access which prevent a 'stop'
md: linear: Make array_size sector-based and rename it to array_sectors.
md: Make mddev->array_size sector-based.
md: Make super_type->rdev_size_change() take sector-based sizes.
md: Fix check for overlapping devices.
md: Tidy up rdev_size_store a bit:
md: Remove some unused macros.
md: Turn rdev->sb_offset into a sector-based quantity.
md: Make calc_dev_sboffset() return a sector count.
md: Replace calc_dev_size() by calc_num_sectors().
md: Make update_size() take the number of sectors.
md: Better control of when do_md_stop is allowed to stop the array.
md: get_disk_info(): Don't convert between signed and unsigned and back.
md: Simplify restart_array().
md: alloc_disk_sb(): Return proper error value.
md: Simplify sb_equal().
md: Simplify uuid_equal().
md: sb_equal(): Fix misleading printk.
md: Fix a typo in the comment to cmd_match().
...
This patch adds some fields to NFLOG to be able to send the complete
hardware header with all necessary informations.
It sends to userspace:
* the type of hardware link
* the lenght of hardware header
* the hardware header
Signed-off-by: Eric Leblond <eric@inl.fr>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Initially netfilter has had 64bit counters for conntrack-based accounting, but
it was changed in 2.6.14 to save memory. Unfortunately in-kernel 64bit counters are
still required, for example for "connbytes" extension. However, 64bit counters
waste a lot of memory and it was not possible to enable/disable it runtime.
This patch:
- reimplements accounting with respect to the extension infrastructure,
- makes one global version of seq_print_acct() instead of two seq_print_counters(),
- makes it possible to enable it at boot time (for CONFIG_SYSCTL/CONFIG_SYSFS=n),
- makes it possible to enable/disable it at runtime by sysctl or sysfs,
- extends counters from 32bit to 64bit,
- renames ip_conntrack_counter -> nf_conn_counter,
- enables accounting code unconditionally (no longer depends on CONFIG_NF_CT_ACCT),
- set initial accounting enable state based on CONFIG_NF_CT_ACCT
- removes buggy IPCT_COUNTER_FILLING event handling.
If accounting is enabled newly created connections get additional acct extend.
Old connections are not changed as it is not possible to add a ct_extend area
to confirmed conntrack. Accounting is performed for all connections with
acct extend regardless of a current state of "net.netfilter.nf_conntrack_acct".
Signed-off-by: Krzysztof Piotr Oledzki <ole@ans.pl>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add NLA_PUT_BE64 macro required for 64bit counters in netfilter
Signed-off-by: Krzysztof Piotr Oledzki <ole@ans.pl>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Introduce a bvec merge function for device mapper devices
for dynamic size restrictions.
This code ensures the requested biovec lies within a single
target and then calls a target-specific function to check
against any constraints imposed by underlying devices.
Signed-off-by: Milan Broz <mbroz@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
-tip testing found this build bug:
arch/x86/kvm/built-in.o:(.text.fixup+0x1): relocation truncated to fit: R_X86_64_32 against `.text'
arch/x86/kvm/built-in.o:(.text.fixup+0xb): relocation truncated to fit: R_X86_64_32 against `.text'
arch/x86/kvm/built-in.o:(.text.fixup+0x15): relocation truncated to fit: R_X86_64_32 against `.text'
arch/x86/kvm/built-in.o:(.text.fixup+0x1f): relocation truncated to fit: R_X86_64_32 against `.text'
arch/x86/kvm/built-in.o:(.text.fixup+0x29): relocation truncated to fit: R_X86_64_32 against `.text'
Introduced by commit 4ecac3fd. The problem is that 'push' will default
to 32-bit, which is not wide enough as a fixup address. (and which would
crash on any real fixup event even if it was wide enough)
Introduce KVM_EX_PUSH to get the proper address push width on 64-bit too.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
All modifications and most access to the mddev->disks list are made
under the reconfig_mutex lock. However there are three places where
the list is walked without any locking. If a reconfig happens at this
time, havoc (and oops) can ensue.
So use RCU to protect these accesses:
- wrap them in rcu_read_{,un}lock()
- use list_for_each_entry_rcu
- add to the list with list_add_rcu
- delete from the list with list_del_rcu
- delay the 'free' with call_rcu rather than schedule_work
Note that export_rdev did a list_del_init on this list. In almost all
cases the entry was not in the list anymore so it was a no-op and so
safe. It is no longer safe as after list_del_rcu we may not touch
the list_head.
An audit shows that export_rdev is called:
- after unbind_rdev_from_array, in which case the delete has
already been done,
- after bind_rdev_to_array fails, in which case the delete isn't needed.
- before the device has been put on a list at all (e.g. in
add_new_disk where reading the superblock fails).
- and in autorun devices after a failure when the device is on a
different list.
So remove the list_del_init call from export_rdev, and add it back
immediately before the called to export_rdev for that last case.
Note also that ->same_set is sometimes used for lists other than
mddev->list (e.g. candidates). In these cases rcu is not needed.
Signed-off-by: NeilBrown <neilb@suse.de>
Open isn't the only thing that increments ->active. e.g. reading
/proc/mdstat will increment it briefly. So to avoid false positives
in testing for concurrent access, introduce a new counter that counts
just the number of times the md device it open.
Signed-off-by: NeilBrown <neilb@suse.de>
This patch renames the array_size field of struct mddev_s to array_sectors
and converts all instances to use units of 512 byte sectors instead of 1k
blocks.
Signed-off-by: Andre Noll <maan@systemlinux.org>
Signed-off-by: NeilBrown <neilb@suse.de>
* 'for-2.6.27' of git://linux-nfs.org/~bfields/linux: (51 commits)
nfsd: nfs4xdr.c do-while is not a compound statement
nfsd: Use C99 initializers in fs/nfsd/nfs4xdr.c
lockd: Pass "struct sockaddr *" to new failover-by-IP function
lockd: get host reference in nlmsvc_create_block() instead of callers
lockd: minor svclock.c style fixes
lockd: eliminate duplicate nlmsvc_lookup_host call from nlmsvc_lock
lockd: eliminate duplicate nlmsvc_lookup_host call from nlmsvc_testlock
lockd: nlm_release_host() checks for NULL, caller needn't
file lock: reorder struct file_lock to save space on 64 bit builds
nfsd: take file and mnt write in nfs4_upgrade_open
nfsd: document open share bit tracking
nfsd: tabulate nfs4 xdr encoding functions
nfsd: dprint operation names
svcrdma: Change WR context get/put to use the kmem cache
svcrdma: Create a kmem cache for the WR contexts
svcrdma: Add flush_scheduled_work to module exit function
svcrdma: Limit ORD based on client's advertised IRD
svcrdma: Remove unused wait q from svcrdma_xprt structure
svcrdma: Remove unneeded spin locks from __svc_rdma_free
svcrdma: Add dma map count and WARN_ON
...
* 'for-linus' of git://git.o-hand.com/linux-mfd:
mfd: let asic3 use mem resource instead of bus_shift
mfd: remove DS1WM register definitions from asic3.h
mfd: add ASIC3_CONFIG_GPIO templates
mfd: fix the asic3 irq demux code
mfd: asic3 should depend on gpiolib
mfd: fix asic3 config array initialisation
mfd: move asic3 probe functions into __init section
mfd: Use uppercase only for asic3 macros and defines
mfd: use dev_* macros for asic3 debugging
mfd: New asic3 gpio configuration code
mfd: asic3 children platform data removal
mfd: asic3 gpiolib support
* 'kvm-updates-2.6.27' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm: (70 commits)
KVM: Adjust smp_call_function_mask() callers to new requirements
KVM: MMU: Fix potential race setting upper shadow ptes on nonpae hosts
KVM: x86 emulator: emulate clflush
KVM: MMU: improve invalid shadow root page handling
KVM: MMU: nuke shadowed pgtable pages and ptes on memslot destruction
KVM: Prefix some x86 low level function with kvm_, to avoid namespace issues
KVM: check injected pic irq within valid pic irqs
KVM: x86 emulator: Fix HLT instruction
KVM: Apply the kernel sigmask to vcpus blocked due to being uninitialized
KVM: VMX: Add ept_sync_context in flush_tlb
KVM: mmu_shrink: kvm_mmu_zap_page requires slots_lock to be held
x86: KVM guest: make kvm_smp_prepare_boot_cpu() static
KVM: SVM: fix suspend/resume support
KVM: s390: rename private structures
KVM: s390: Set guest storage limit and offset to sane values
KVM: Fix memory leak on guest exit
KVM: s390: dont allocate dirty bitmap
KVM: move slots_lock acquision down to vapic_exit
KVM: VMX: Fake emulate Intel perfctr MSRs
KVM: VMX: Fix a wrong usage of vmcs_config
...
The stab bits can't be referenced uniless the full
packet scheduler layer is enabled.
Reported by Stephen Rothwell.
Signed-off-by: David S. Miller <davem@davemloft.net>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (1232 commits)
iucv: Fix bad merging.
net_sched: Add size table for qdiscs
net_sched: Add accessor function for packet length for qdiscs
net_sched: Add qdisc_enqueue wrapper
highmem: Export totalhigh_pages.
ipv6 mcast: Omit redundant address family checks in ip6_mc_source().
net: Use standard structures for generic socket address structures.
ipv6 netns: Make several "global" sysctl variables namespace aware.
netns: Use net_eq() to compare net-namespaces for optimization.
ipv6: remove unused macros from net/ipv6.h
ipv6: remove unused parameter from ip6_ra_control
tcp: fix kernel panic with listening_get_next
tcp: Remove redundant checks when setting eff_sacks
tcp: options clean up
tcp: Fix MD5 signatures for non-linear skbs
sctp: Update sctp global memory limit allocations.
sctp: remove unnecessary byteshifting, calculate directly in big-endian
sctp: Allow only 1 listening socket with SO_REUSEADDR
sctp: Do not leak memory on multiple listen() calls
sctp: Support ipv6only AF_INET6 sockets.
...
* 'for-linus' of git://www.jni.nu/cris:
[CRISv10] Clean up compressed/misc.c
[CRISv10] Correct whitespace damage.
[CRIS] Correct definition of subdirs for install_headers.
[CRIS] Correct image makefiles to allow using a separate OBJ-directory.
[CRIS] Build fixes for compressed and rescue images for v10 and v32:
It looks at least odd to apply spin_unlock to a mutex.
cris: compile fixes for 2.6.26-rc5
This patch contains the following possible cleanups:
- amiints.c: add a proper prototype for amiga_init_IRQ() in
include/asm-m68k/amigaints.h
- make the following needlessly global code static:
- config.c: amiga_model
- config.c: amiga_psfreq
- config.c: amiga_serial_console_write()
- #if 0 the following unused functions:
- config.c: amiga_serial_puts()
- config.c: amiga_serial_console_wait_key()
- config.c: amiga_serial_gets()
- remove the following unused variable:
- config.c: amiga_masterclock
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Unless I miss something that's code for a sparc machine even the sparc
code no longer supports that got copied to m68k when these files were
copied.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Allow no CPU/platform type for allnoconfig
- Provide a dummy value for FPSTATESIZE if no CPU type was selected
- Provide a dummy value for NR_IRQS if no platform type was selected
- Warn the user if no CPU or platform type was selected
Note: you still cannot build an allnoconfig kernel, as CONFIG_SWAP=n doesn't
build and we cannot easily fix that
(http://groups.google.com/group/linux.kernel/browse_thread/thread/d430c78b07e1827b)
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch removes CVS keywords that weren't updated for a long time
from comments.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'configfs-fixup-ptr-error' of git://oss.oracle.com/git/jlbec/linux-2.6:
configfs: Allow ->make_item() and ->make_group() to return detailed errors.
Revert "configfs: Allow ->make_item() and ->make_group() to return detailed errors."
Fix up the termios of the people who have not yet got with the program
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Switch cyclades to use the new tty_port structure
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Switch the stallion driver to use the tty_port structure
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Switch istallion to use the new tty_port structure
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Switch drivers using the old "generic serial" driver to use the tty_port
structures
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Switch the serial_core based drivers to use the new tty_port structure.
We can't quite use all of it yet because of the dynamically allocated
extras in the serial_core layer.
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Every tty driver has its own concept of a port structure and because
they all differ we cannot extract commonality. Begin fixing this by
creating a structure drivers can elect to use so that over time we can
push fields into this and create commonality and then introduce common
methods.
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Move the line disciplines towards a conventional ->ops arrangement. For
the moment the actual 'tty_ldisc' struct in the tty is kept as part of
the tty struct but this can then be changed if it turns out that when it
all settles down we want to refcount ldiscs separately to the tty.
Pull the ldisc code out of /proc and put it with our ldisc code.
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Serial drivers using DMA (like the atmel_serial driver) tend to get very
confused when the xmit buffer is flushed and nobody told them. They
also tend to spew a lot of garbage since the DMA engine keeps running
after the buffer is flushed and possibly refilled with unrelated data.
This patch adds a new flush_buffer operation to the uart_ops struct,
along with a call to it from uart_flush_buffer() right after the xmit
buffer has been cleared. The driver can implement this in order to
syncronize its internal DMA state with the xmit buffer when the buffer
is flushed.
Signed-off-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com>
Acked-by: Alan Cox <alan@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The bus_shift parameter in platform_data is not needed
as we can tell the driver with the IOMEM_RESOURCE whether
the ASIC is located on a 16bit or 32bit memory bus.
The htc-egpio driver uses a more descriptive bus_width parameter,
but for drivers where the register map size fixed, we don't even
need this.
Signed-off-by: Philipp Zabel <philipp.zabel@gmail.com>
Signed-off-by: Samuel Ortiz <sameo@openedhand.com>
There is a dedicated ds1wm driver, no need to duplicate this
information here.
Signed-off-by: Philipp Zabel <philipp.zabel@gmail.com>
Signed-off-by: Samuel Ortiz <sameo@openedhand.com>
As ASIC3 GPIO alternate function configuration is expected to be similar
for several devices, it is convenient to define descriptive macros. This
patch is inspired by the PXA MFP configuration, the alternate functions
were observed on hx4700 and blueangel.
Signed-off-by: Philipp Zabel <philipp.zabel@gmail.com>
Signed-off-by: Samuel Ortiz <sameo@openedhand.com>
Let's be consistent and use uppercase only, for both macro and defines.
Signed-off-by: Samuel Ortiz <sameo@openedhand.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The ASIC3 GPIO configuration code is a bit obscure and hardly readable.
This patch changes it so that it is now more readable and understandable,
by being more explicit.
Signed-off-by: Samuel Ortiz <sameo@openedhand.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Platform devices should be dynamically allocated, and each supported
device should have its own platform data.
For now we just remove this buggy code.
Signed-off-by: Samuel Ortiz <sameo@openedhand.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
ASIC3 is, among other things, a GPIO extender. We should thus have it
supporting the current gpiolib API.
Signed-off-by: Samuel Ortiz <sameo@openedhand.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
SYS_SUPPORTS_64BIT_KERNEL is enabled for RBTX4927/RBTX4938, but
actually it was broken for long time (or from the beginning). Now it
should work.
Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
include/asm-mips/mips-boards/sead{,int}.h are now obsolete.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
[...]
CC init/main.o
include/asm/bitops.h: In function `start_kernel':
include/asm/bitops.h:76: warning: asm operand 2 probably doesn't match
constraints
include/asm/bitops.h:76: warning: asm operand 2 probably doesn't match
constraints
include/asm/bitops.h:76: warning: asm operand 2 probably doesn't match
constraints
include/asm/bitops.h:76: error: impossible constraint in `asm'
include/asm/bitops.h:76: error: impossible constraint in `asm'
include/asm/bitops.h:76: error: impossible constraint in `asm'
make[1]: *** [init/main.o] Error 1
[...]
The build error is caused by the ages old gcc bug where gcc at the time of
analyzing the constraints is unable to figure out that an "i" constraint
actually can be satisfied and thus will abort unless an "r" is added to
the constraint. For the actual code generation gcc will only ever use the
"i" constraint.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
This patch fixes the following sparse warnings:
>>>>>>>>>>>>>>>>>>
arch/mips/mm/page.c:284:16: warning: symbol
'build_clear_page' was not declared. Should it be static?
arch/mips/mm/page.c:426:16: warning: symbol 'build_copy_page'
was not declared. Should it be static?
>>>>>>>>>>>>>>>>>>
The fix is to add appropriate prototypes to the header
include/asm-mips/page.h.
Build-tested against Malta defconfig.
Signed-off-by: Dmitri Vorobiev <dmitri.vorobiev@movial.fi>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
While building the Malta defconfig, sparse spat the following
warnings:
>>>>>>>>>>>>>>>>>>
arch/mips/math-emu/kernel_linkage.c:31:6: warning: symbol
'fpu_emulator_init_fpu' was not declared. Should it be static?
arch/mips/math-emu/kernel_linkage.c:54:5: warning: symbol
'fpu_emulator_save_context' was not declared. Should it be
static?
arch/mips/math-emu/kernel_linkage.c:68:5: warning: symbol
'fpu_emulator_restore_context' was not declared. Should it be
static?
>>>>>>>>>>>>>>>>>>
This patch fixes these errors by adding the proper prototypes
to the include/asm-mips/fpu.h header, and actually using this
header in the sparse-spotted source file.
Build-tested with Malta defconfig.
Signed-off-by: Dmitri Vorobiev <dmitri.vorobiev@movial.fi>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The pcibios_max_latency variable is needlessly defined global, and this
patch makes it static.
Build-tested using malta_defconfig.
Signed-off-by: Dmitri Vorobiev <dmitri.vorobiev@movial.fi>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Currently, saa7134 is dependent of ir-kbd-i2c, since it uses a symbol that is
defined there. However, as this symbol is used only on saa7134, there's no
sense on keeping it defined there (or on ir-commons).
So, let's move it to saa7134 and remove one symbol for being exported.
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
Those changes, together with some proper patches, will allow out-of-tree
compilation for for kernels < 2.6.19
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
This patch adds a simple platform camera device. Useful for testing
cameras with SoC camera host drivers. Only one single pixel format
and resolution combination is supported.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@pengutronix.de>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
This is V3 of the SuperH Mobile CEU soc_camera driver.
The CEU hardware block is configured in a transparent data fetch
mode, frames are captured from the attached camera and written to
physically contiguous memory buffers provided by the newly added
videobuf-dma-contig queue. Tested on sh7722 and sh7723 processors.
Changes since V2:
- remove SUPERH Kconfig dependency
- move sh_mobile_ceu.h to include/media
- add board callback support with enable_camera()/disable_camera()
- add support for declare_coherent_memory
- rework video memory limit
- more verbose error messages
Changes since V1:
- fixed the CEU driver to work with the newly updated patches
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@pengutronix.de>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
This is V3 of the physically contiguous videobuf queues patch.
Useful for hardware such as the SuperH Mobile CEU which doesn't
support scatter gatter bus mastering.
Since it may be difficult to allocate large chunks of physically
contiguous memory after some uptime due to fragmentation, this code
allocates memory using dma_alloc_coherent(). Architectures supporting
dma_declare_coherent_memory() can easily avoid fragmentation issues
by using dma_declare_coherent_memory() to force dma_alloc_coherent()
to allocate from a certain pre-allocated memory area.
Changes since V2
- use dma_handle for physical address
- use "scatter gather" instead of "scatter gatter"
Changes since V1:
- use dev_err() instead of pr_err()
- remember size in struct videobuf_dma_contig_memory
- keep struct videobuf_dma_contig_memory in .c file
- let videobuf_to_dma_contig() return dma_addr_t
- implement __videobuf_sync()
- return statements, white space and other minor fixes
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@pengutronix.de>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
The SuperH Mobile CEU hardware supports 16-bit width bus,
so extend the soc_camera code with SOCAM_DATAWIDTH_16.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@pengutronix.de>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
This patch moves the spinlock handling from soc_camera.c to the actual
camera host driver. The spinlock_alloc/free callbacks are replaced with
code in init_videobuf(). So far all camera host drivers implement their
own spinlock_alloc/free methods anyway, and videobuf_queue_core_init()
BUGs on a NULL spinlock argument, so, new camera host drivers will not
forget to provide a spinlock when initialising their videobuf queues.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@pengutronix.de>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
Makes SoC camera videobuf independent. Includes all necessary changes for
PXA camera driver (currently the only driver using soc_camera in the mainline).
These changes are important for the future soc_camera based drivers.
Signed-off-by: Paulius Zaleckas <paulius.zaleckas@teltonika.lt>
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@pengutronix.de>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
The tvaudio driver is using "official" I2C device IDs for internal
purpose. There must be some historical reason behind this but anyway,
it shouldn't do that. As the stored values are never used, the easiest
way to fix the problem is simply to remove them altogether.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
I2C_HW_SMBUS_OVFX2 is referenced in ovcamchip_core.c, but no bus uses
this driver ID, so we can remove the reference. As far as I can see,
the Cypress FX2 webcam is handled by a different driver (dvb-usb).
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
Also remove some blank lines that were used to split compat code at -devel
tree.
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
videodev2: New pixfmt
pac207: Remove the specific decoding.
main: get_buff_size operation added for the subdriver.
Signed-off-by: Hans de Goede <j.w.r.degoede@hhs.nl>
Signed-off-by: Jean-Francois Moine <moinejf@free.fr>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
The low (half) res modes of the spca561 are not spca561 compressed, but are
raw bayer, this patches fixes this and adds a PIX_FMT define for the GBRG
bayer format used by the spca561 in low res mode.
Signed-off-by: Hans de Goede <j.w.r.degoede@hhs.nl>
Signed-off-by: Jean-Francois Moine <moinejf@free.fr>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
Annotations + stop saa7146_i2c from playing fast and loose with
reuse of ->cpu_addr for host-endian.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
The cx18 can support transport streams with newer firmwares. Add a TS
capability to the generic cx2341x module.
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
Various ioctl debugging fixes and improvements:
- use %x rather than %d for control IDs and bitmask fields
- make two arrays const
- show the whole control array for the ext_ctrl ioctls
- print pix_fmt for V4L2_BUF_TYPE_VIDEO_OUTPUT
- show full type name rather than an integer
- fix CROPCAP debugging
- fix G/S_TUNER debugging
- show error code in case of an error
- other small cleanups
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
A number of V4L drivers have a mod param to specify their preferred minors.
This is because it is often desirable for applications to have a static /dev
name for a particular device. However, using minors has several disadvantages:
1) the requested minor may already be taken
2) using a mod param is driver specific
3) it requires every driver to add a param
4) requires configuration by hand
This patch introduces an "index" attribute that when combined with udev rules
can create static device paths like this:
/dev/v4l/by-path/pci-0000\:00\:1d.2-usb-0\:1\:1.0-video0
/dev/v4l/by-path/pci-0000\:00\:1d.2-usb-0\:1\:1.0-video1
/dev/v4l/by-path/pci-0000\:00\:1d.2-usb-0\:1\:1.0-video2
$ ls -la /dev/v4l/by-path/pci-0000\:00\:1d.2-usb-0\:1\:1.0-video0
lrwxrwxrwx 1 root root 12 2008-04-28 00:02 /dev/v4l/by-path/pci-0000:00:1d.2-usb-0:1:1.0-video0 -> ../../video1
These paths are steady across reboots and should be resistant to rearranging
across Kernel versions.
video_register_device_index is available to drivers to request a
specific index number.
Signed-off-by: Brandon Philips <bphilips@suse.de>
Signed-off-by: Kees Cook <kees@outflux.net>
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
The naming for the callbacks that handle the VIDIOC_ENUM_FMT and
VIDIOC_S/G/TRY_FMT ioctls was very confusing. Renamed it to match
the v4l2_buf_type name.
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
There was no vidioc_try_fmt_sliced_vbi_output, instead vidioc_try_fmt_vbi_output
was reused.
The VIDIOC_ENUMOUTPUT handling was missing altogether, even though the callback
existed.
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
The default videodev behavior for VIDIOC_G_STD is not correct for all devices.
Add a new callback that drivers can use instead.
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
Flush the shadow mmu before removing regions to avoid stale entries.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
While doing some tests with our lcrash implementation I have seen a
naming conflict with prefix_info in kvm_host.h vs. addrconf.h
To avoid future conflicts lets rename private definitions in
asm/kvm_host.h by adding the kvm_s390 prefix.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
Instead of prefetching all segment bases before emulation, read them at the
last moment. Since most of them are unneeded, we save some cycles on
Intel machines where this is a bit expensive.
Signed-off-by: Avi Kivity <avi@qumranet.com>
rip relative decoding is relative to the instruction pointer of the next
instruction; by moving address adjustment until after decoding is complete,
we remove the need to determine the instruction size.
Signed-off-by: Avi Kivity <avi@qumranet.com>
Currently kvmtrace is not portable. This will prevent from copying a
trace file from big-endian target to little-endian workstation for analysis.
In the patch, kernel outputs metadata containing a magic number to trace
log, and changes 64-bit words to be u64 instead of a pair of u32s.
Signed-off-by: Tan Li <li.tan@intel.com>
Acked-by: Jerone Young <jyoung5@us.ibm.com>
Acked-by: Hollis Blanchard <hollisb@us.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
This patch enables coalesced MMIO for ia64 architecture.
It defines KVM_MMIO_PAGE_OFFSET and KVM_CAP_COALESCED_MMIO.
It enables the compilation of coalesced_mmio.c.
[akpm: fix compile error on ia64]
Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Avi Kivity <avi@qumranet.com>
This patch enables coalesced MMIO for powerpc architecture.
It defines KVM_MMIO_PAGE_OFFSET and KVM_CAP_COALESCED_MMIO.
It enables the compilation of coalesced_mmio.c.
Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net>
Signed-off-by: Avi Kivity <avi@qumranet.com>
This patch enables coalesced MMIO for x86 architecture.
It defines KVM_MMIO_PAGE_OFFSET and KVM_CAP_COALESCED_MMIO.
It enables the compilation of coalesced_mmio.c.
Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net>
Signed-off-by: Avi Kivity <avi@qumranet.com>
This patch adds all needed structures to coalesce MMIOs.
Until an architecture uses it, it is not compiled.
Coalesced MMIO introduces two ioctl() to define where are the MMIO zones that
can be coalesced:
- KVM_REGISTER_COALESCED_MMIO registers a coalesced MMIO zone.
It requests one parameter (struct kvm_coalesced_mmio_zone) which defines
a memory area where MMIOs can be coalesced until the next switch to
user space. The maximum number of MMIO zones is KVM_COALESCED_MMIO_ZONE_MAX.
- KVM_UNREGISTER_COALESCED_MMIO cancels all registered zones inside
the given bounds (bounds are also given by struct kvm_coalesced_mmio_zone).
The userspace client can check kernel coalesced MMIO availability by asking
ioctl(KVM_CHECK_EXTENSION) for the KVM_CAP_COALESCED_MMIO capability.
The ioctl() call to KVM_CAP_COALESCED_MMIO will return 0 if not supported,
or the page offset where will be stored the ring buffer.
The page offset depends on the architecture.
After an ioctl(KVM_RUN), the first page of the KVM memory mapped points to
a kvm_run structure. The offset given by KVM_CAP_COALESCED_MMIO is
an offset to the coalesced MMIO ring expressed in PAGE_SIZE relatively
to the address of the start of th kvm_run structure. The MMIO ring buffer
is defined by the structure kvm_coalesced_mmio_ring.
[akio: fix oops during guest shutdown]
Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net>
Signed-off-by: Akio Takebe <takebe_akio@jp.fujitsu.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
Modify member in_range() of structure kvm_io_device to pass length and the type
of the I/O (write or read).
This modification allows to use kvm_io_device with coalesced MMIO.
Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net>
Signed-off-by: Avi Kivity <avi@qumranet.com>
Prefixes functions that will be exported with kvm_.
We also prefixed set_segment() even if it still static
to be coherent.
signed-off-by: Guillaume Thouvenin <guillaume.thouvenin@ext.bull.net>
Signed-off-by: Laurent Vivier <laurent.vivier@bull.net>
Signed-off-by: Avi Kivity <avi@qumranet.com>
Add emulation for the memory type range registers, needed by VMware esx 3.5,
and by pci device assignment.
Signed-off-by: Avi Kivity <avi@qumranet.com>
KVM turns off hardware virtualization extensions during reboot, in order
to disassociate the memory used by the virtualization extensions from the
processor, and in order to have the system in a consistent state.
Unfortunately virtual machines may still be running while this goes on,
and once virtualization extensions are turned off, any virtulization
instruction will #UD on execution.
Fix by adding an exception handler to virtualization instructions; if we get
an exception during reboot, we simply spin waiting for the reset to complete.
If it's a true exception, BUG() so we can have our stack trace.
Signed-off-by: Avi Kivity <avi@qumranet.com>
The KVM MMU tries to detect when a speculative pte update is not actually
used by demand fault, by checking the accessed bit of the shadow pte. If
the shadow pte has not been accessed, we deem that page table flooded and
remove the shadow page table, allowing further pte updates to proceed
without emulation.
However, if the pte itself points at a page table and only used for write
operations, the accessed bit will never be set since all access will happen
through the emulator.
This is exactly what happens with kscand on old (2.4.x) HIGHMEM kernels.
The kernel points a kmap_atomic() pte at a page table, and then
proceeds with read-modify-write operations to look at the dirty and accessed
bits. We get a false flood trigger on the kmap ptes, which results in the
mmu spending all its time setting up and tearing down shadows.
Fix by setting the shadow accessed bit on emulated accesses.
Signed-off-by: Avi Kivity <avi@qumranet.com>
To distinguish between real page faults and nested page faults they should be
traced as different events. This is implemented by this patch.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
random uvesafb failures were reported against Gentoo:
http://bugs.gentoo.org/show_bug.cgi?id=222799
and Mihai Moldovan bisected it back to:
> 8f4d37ec07 is first bad commit
> commit 8f4d37ec07
> Author: Peter Zijlstra <a.p.zijlstra@chello.nl>
> Date: Fri Jan 25 21:08:29 2008 +0100
>
> sched: high-res preemption tick
Linus suspected it to be hrtick + vm86 interaction and observed:
> Btw, Peter, Ingo: I think that commit is doing bad things. They aren't
> _incorrect_ per se, but they are definitely bad.
>
> Why?
>
> Using random _TIF_WORK_MASK flags is really impolite for doing
> "scheduling" work. There's a reason that arch/x86/kernel/entry_32.S
> special-cases the _TIF_NEED_RESCHED flag: we don't want to exit out of
> vm86 mode unnecessarily.
>
> See the "work_notifysig_v86" label, and how it does that
> "save_v86_state()" thing etc etc.
Right, I never liked having to fiddle with those TIF flags. Initially I
needed it because the hrtimer base lock could not nest in the rq lock.
That however is fixed these days.
Currently the only reason left to fiddle with the TIF flags is remote
wakeups. We cannot program a remote cpu's hrtimer. I've been thinking
about using the new and improved IPI function call stuff to implement
hrtimer_start_on().
However that does require that smp_call_function_single(.wait=0) works
from interrupt context - /me looks at the latest series from Jens - Yes
that does seem to be supported, good.
Here's a stab at cleaning this stuff up ...
Mihai reported test success as well.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Tested-by: Mihai Moldovan <ionic@ionic.de>
Cc: Michal Januszewski <spock@gentoo.org>
Cc: Antonino Daplas <adaplas@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* Rename CPUMASK_VAR --> CPUMASK_PTR (and simplify)
* Fix a semantic error in CPUMASK_ALLOC
* Add a bit of commentry to cpumask.h
Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
so NUMAQ can use that to call numaq_pre_time_init()
This allows us to remove a NUMAQ special from arch/x86/kernel/setup.c.
(and paves the way to remove the NUMAQ subarch)
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
add these new x86_quirks methods:
int *mpc_record;
int (*mpc_apic_id)(struct mpc_config_processor *m);
void (*mpc_oem_bus_info)(struct mpc_config_bus *m, char *name);
void (*mpc_oem_pci_bus)(struct mpc_config_bus *m);
void (*smp_read_mpc_oem)(struct mp_config_oemtable *oemtable,
unsigned short oemsize);
... and move NUMAQ related mps table handling to numaq_32.c.
also move the call to smp_read_mpc_oem() to smp_read_mpc() directly.
Should not change functionality, albeit it would be nice to get it
tested on real NUMAQ as well ...
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
introduce x86_quirks array of boot-time quirk methods.
No change in functionality intended.
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Add size table functions for qdiscs and calculate packet size in
qdisc_enqueue().
Based on patch by Patrick McHardy
http://marc.info/?l=linux-netdev&m=115201979221729&w=2
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use sockaddr_storage{} for generic socket address storage
and ensures proper alignment.
Use sockaddr{} for pointers to omit several casts.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove redundant checks when setting eff_sacks and make the number of SACKs a
compile time constant. Now that the options code knows how many SACK blocks can
fit in the header, we don't need to have the SACK code guessing at it.
Signed-off-by: Adam Langley <agl@imperialviolet.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This should fix the following bugs:
* Connections with MD5 signatures produce invalid packets whenever SACK
options are included
* MD5 signatures are counted twice in the MSS calculations
Behaviour changes:
* A SYN with MD5 + SACK + TS elicits a SYNACK with MD5 + SACK
This is because we can't fit any SACK blocks in a packet with MD5 + TS
options. There was discussion about disabling SACK rather than TS in
order to fit in better with old, buggy kernels, but that was deemed to
be unnecessary.
* SYNs with MD5 don't include a TS option
See above.
Additionally, it removes a bunch of duplicated logic for calculating options,
which should help avoid these sort of issues in the future.
Signed-off-by: Adam Langley <agl@imperialviolet.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, the MD5 code assumes that the SKBs are linear and, in the case
that they aren't, happily goes off and hashes off the end of the SKB and
into random memory.
Reported by Stephen Hemminger in [1]. Advice thanks to Stephen and Evgeniy
Polyakov. Also includes a couple of missed route_caps from Stephen's patch
in [2].
[1] http://marc.info/?l=linux-netdev&m=121445989106145&w=2
[2] http://marc.info/?l=linux-netdev&m=121459157816964&w=2
Signed-off-by: Adam Langley <agl@imperialviolet.org>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some of the metrics (RTT, RTTVAR and RTAX_RTO_MIN) are stored in
kernel units (jiffies) and this leaks out through the netlink API to
user space where the units for jiffies are unknown.
This patches changes the kernel to convert to/from milliseconds. This
changes the ABI, but milliseconds seemed like the most natural unit
for these parameters. Values available via syscall in
/proc/net/rt_cache and netlink will be in milliseconds.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Idea is from Patrick McHardy.
Instead of managing the list of qdiscs on the device level, manage it
in the root qdisc of a netdev_queue. This solves all kinds of
visibility issues during qdisc destruction.
The way to iterate over all qdiscs of a netdev_queue is to visit
the netdev_queue->qdisc, and then traverse it's list.
The only special case is to ignore builting qdiscs at the root when
dumping or doing a qdisc_lookup(). That was not needed previously
because builtin qdiscs were not added to the device's qdisc_list.
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a SW_DOCK switch to input.h. ACPI docks currently send their docking
status as a uevent, but not all docks are ACPI or correspond to a device.
In that case, it makes more sense to simply generate an input event on
docking or undocking.
Signed-off-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
Add a new switch type to the input API for reporting microphone
insertion. This will be used by the ALSA jack reporting API.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
The u32_list is just an indirect way of maintaining a reference
to a U32 node on a per-qdisc basis.
Just add an explicit node pointer for u32 to struct Qdisc an do
away with this global list.
Signed-off-by: David S. Miller <davem@davemloft.net>
Add new sockopt to reserve some headroom in the mmaped ring frames in
front of the packet payload. This can be used f.i. when the VLAN header
needs to be (re)constructed to avoid moving the entire payload.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a directory for x86 arch under debugfs. Can be used to accumulate all
x86 specific debugfs files.
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
As 256 entries are needed, aligning to a 256-entry boundary is
sufficient and still guarantees the single pte table requirement.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
It's not used anywhere outside its single referencing file.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
* Provide a generic set of CPUMASK_ALLOC macros patterned after the
SCHED_CPUMASK_ALLOC macros. This is used where multiple cpumask_t
variables are declared on the stack to reduce the amount of stack
space required.
Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* This patch replaces the dangerous lvalue version of cpumask_of_cpu
with new cpumask_of_cpu_ptr macros. These are patterned after the
node_to_cpumask_ptr macros.
In general terms, if there is a cpumask_of_cpu_map[] then a pointer to
the cpumask_of_cpu_map[cpu] entry is used. The cpumask_of_cpu_map
is provided when there is a large NR_CPUS count, reducing
greatly the amount of code generated and stack space used for
cpumask_of_cpu(). The pointer to the cpumask_t value is needed for
calling set_cpus_allowed_ptr() to reduce the amount of stack space
needed to pass the cpumask_t value.
If there isn't a cpumask_of_cpu_map[], then a temporary variable is
declared and filled in with value from cpumask_of_cpu(cpu) as well as
a pointer variable pointing to this temporary variable. Afterwards,
the pointer is used to reference the cpumask value. The compiler
will optimize out the extra dereference through the pointer as well
as the stack space used for the pointer, resulting in identical code.
A good example of the orthogonal usages is in net/sunrpc/svc.c:
case SVC_POOL_PERCPU:
{
unsigned int cpu = m->pool_to[pidx];
cpumask_of_cpu_ptr(cpumask, cpu);
*oldmask = current->cpus_allowed;
set_cpus_allowed_ptr(current, cpumask);
return 1;
}
case SVC_POOL_PERNODE:
{
unsigned int node = m->pool_to[pidx];
node_to_cpumask_ptr(nodecpumask, node);
*oldmask = current->cpus_allowed;
set_cpus_allowed_ptr(current, nodecpumask);
return 1;
}
Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Declaring x86 traps under one hood.
Declaring x86 do_traps before defining them.
Signed-off-by: Jaswinder Singh <jaswinder@infradead.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The force_mwait variable iss defined either in
arch/x86/kernel/cpu/amd.c or in arch/x86/kernel/setup_64.c, but it is
only initialized and used in arch/x86/kernel/process.c. This patch
moves the declaration to arch/x86/kernel/process.c.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Cc: michael@free-electrons.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Jack Ren and Eric Miao tracked down the following long standing
problem in the NOHZ code:
scheduler switch to idle task
enable interrupts
Window starts here
----> interrupt happens (does not set NEED_RESCHED)
irq_exit() stops the tick
----> interrupt happens (does set NEED_RESCHED)
return from schedule()
cpu_idle(): preempt_disable();
Window ends here
The interrupts can happen at any point inside the race window. The
first interrupt stops the tick, the second one causes the scheduler to
rerun and switch away from idle again and we end up with the tick
disabled.
The fact that it needs two interrupts where the first one does not set
NEED_RESCHED and the second one does made the bug obscure and extremly
hard to reproduce and analyse. Kudos to Jack and Eric.
Solution: Limit the NOHZ functionality to the idle loop to make sure
that we can not run into such a situation ever again.
cpu_idle()
{
preempt_disable();
while(1) {
tick_nohz_stop_sched_tick(1); <- tell NOHZ code that we
are in the idle loop
while (!need_resched())
halt();
tick_nohz_restart_sched_tick(); <- disables NOHZ mode
preempt_enable_no_resched();
schedule();
preempt_disable();
}
}
In hindsight we should have done this forever, but ...
/me grabs a large brown paperbag.
Debugged-by: Jack Ren <jack.ren@marvell.com>,
Debugged-by: eric miao <eric.y.miao@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Fallout from commit 33185c504f ("x86:
merge signal_32/64.h")
Thanks to Dick Streefland who provided an useful testcase on
http://lkml.org/lkml/2008/3/17/205 (only applicable to 2.6.24.x), that
helped a lot as a deterministic way to bisect an issue that leaded to
this fix.
Signed-off-by: Herton Ronaldo Krzesinski <herton@mandriva.com.br>
Signed-off-by: Luiz Fernando N. Capitulino <lcapitulino@mandriva.com.br>
Cc: Roland McGrath <roland@redhat.com>
Cc: <stable@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Real-time code needs to know the number of cycles per second
on SGI UV. The information is provided via a run time BIOS
call. This patch provides the linux side of that interface.
This is the first of several run time BIOS calls to be defined
in uv/bios.h and bios_uv.c.
Note that BIOS_CALL() is just a stub for now. The bios
side is being worked on.
Signed-off-by: Russ Anderson <rja@sgi.com>
Cc: Jack Steiner <steiner@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ricardo M. Correia spotted that the use of __fls() in fls64() did
not seem to make sense. In fact fls64()'s implementation is fine,
but the description of __fls() was wrong. Fix that.
Reported-by: "Ricardo M. Correia" <Ricardo.M.Correia@Sun.COM>
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
[ mingo@elte.hu: picked up this patch from Maciej, lets make apic=debug
print out more info - we had a lot of APIC changes ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
As a microoptimisation, make apic_verbosity unsigned. This will make
apic_printk(APIC_QUIET, ...) expand into just printk(...) with the
surrounding condition and a reference to apic_verbosity removed.
Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
it's separate functionality that deserves its own file.
This also prepares 32-bit memtest support.
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
provide an empty partition_sched_domains() definition for the UP case:
include/linux/cpuset.h: In function ‘rebuild_sched_domains':
include/linux/cpuset.h:163: error: implicit declaration of function ‘partition_sched_domains'
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This is based on Linus' idea of creating cpu_active_map that prevents
scheduler load balancer from migrating tasks to the cpu that is going
down.
It allows us to simplify domain management code and avoid unecessary
domain rebuilds during cpu hotplug event handling.
Please ignore the cpusets part for now. It needs some more work in order
to avoid crazy lock nesting. Although I did simplfy and unify domain
reinitialization logic. We now simply call partition_sched_domains() in
all the cases. This means that we're using exact same code paths as in
cpusets case and hence the test below cover cpusets too.
Cpuset changes to make rebuild_sched_domains() callable from various
contexts are in the separate patch (right next after this one).
This not only boots but also easily handles
while true; do make clean; make -j 8; done
and
while true; do on-off-cpu 1; done
at the same time.
(on-off-cpu 1 simple does echo 0/1 > /sys/.../cpu1/online thing).
Suprisingly the box (dual-core Core2) is quite usable. In fact I'm typing
this on right now in gnome-terminal and things are moving just fine.
Also this is running with most of the debug features enabled (lockdep,
mutex, etc) no BUG_ONs or lockdep complaints so far.
I believe I addressed all of the Dmitry's comments for original Linus'
version. I changed both fair and rt balancer to mask out non-active cpus.
And replaced cpu_is_offline() with !cpu_active() in the main scheduler
code where it made sense (to me).
Signed-off-by: Max Krasnyanskiy <maxk@qualcomm.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Gregory Haskins <ghaskins@novell.com>
Cc: dmitry.adamushko@gmail.com
Cc: pj@sgi.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
There are already 7 of them - time to kill some duplicate code.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Proc temporary uses stats from init_net.
BTW, TCP_XXX_STATS are beautiful (w/o do { } while (0) facing) again :)
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The only structure declared within is the netns_mib, which will
carry all our mibs within. I didn't put the mibs in the existing
netns_xxx structures to make it possible to mark this one as
properly aligned and get in a separate "read-mostly" cache-line.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use alternatives to select the workaround for the 11AP Pentium erratum
for the affected steppings on the fly rather than build time. Remove the
X86_GOOD_APIC configuration option and replace all the calls to
apic_write_around() with plain apic_write(), protecting accesses to the
ESR as appropriate due to the 3AP Pentium erratum. Remove
apic_read_around() and all its invocations altogether as not needed.
Remove apic_write_atomic() and all its implementing backends. The use of
ASM_OUTPUT2() is not strictly needed for input constraints, but I have
used it for readability's sake.
I had the feeling no one else was brave enough to do it, so I went ahead
and here it is. Verified by checking the generated assembly and tested
with both a 32-bit and a 64-bit configuration, also with the 11AP
"feature" forced on and verified with gdb on /proc/kcore to work as
expected (as an 11AP machines are quite hard to get hands on these days).
Some script complained about the use of "volatile", but apic_write() needs
it for the same reason and is effectively a replacement for writel(), so I
have disregarded it.
I am not sure what the policy wrt defconfig files is, they are generated
and there is risk of a conflict resulting from an unrelated change, so I
have left changes to them out. The option will get removed from them at
the next run.
Some testing with machines other than mine will be needed to avoid some
stupid mistake, but despite its volume, the change is not really that
intrusive, so I am fairly confident that because it works for me, it will
everywhere.
Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Adrian Bunk reported that enabling 4MB page size breaks the build.
The problem is that MAX_ORDER combined with the page shift exceeds the
SECTION_SIZE_BITS we use in asm-sparc64/sparsemem.h
There are several ways I suppose we could work around this. For one
we could define a CONFIG_FORCE_MAX_ZONEORDER to decrease MAX_ORDER in
these higher page size cases.
But I also know that these page size cases are broken wrt. TLB miss
handling especially on pre-hypervisor systems, and there isn't an easy
way to fix that.
These options were meant to be fun experimental hacks anyways, and
only 8K and 64K make any sense to support.
So remove 512K and 4M base page size support. Of course, we still
support these page sizes for huge pages.
Signed-off-by: David S. Miller <davem@davemloft.net>
With this commit all sparc64 header files are moved to asm-sparc.
The remaining files (71 files) were too different to be trivially
merged so divide them up in a _32.h and a _64.h file which
are both included from the file with no bit size.
The following script were used:
cd include
FILES=`wc -l asm-sparc64/*h | grep -v '^ 1' | cut -b 20-`
for FILE in ${FILES}; do
echo $FILE:
BASE=`echo $FILE | cut -d '.' -f 1`
FN32=${BASE}_32.h
FN64=${BASE}_64.h
GUARD=___ASM_SPARC_`echo $BASE | tr '-' '_' | tr [:lower:] [:upper:]`_H
git mv asm-sparc/$FILE asm-sparc/$FN32
git mv asm-sparc64/$FILE asm-sparc/$FN64
echo git mv done
printf "#ifndef %s\n" $GUARD > asm-sparc/$FILE
printf "#define %s\n" $GUARD >> asm-sparc/$FILE
printf "#if defined(__sparc__) && defined(__arch64__)\n" >> asm-sparc/$FILE
printf "#include <asm-sparc/%s>\n" $FN64 >> asm-sparc/$FILE
printf "#else\n" >> asm-sparc/$FILE
printf "#include <asm-sparc/%s>\n" $FN32 >> asm-sparc/$FILE
printf "#endif\n" >> asm-sparc/$FILE
printf "#endif\n" >> asm-sparc/$FILE
git add asm-sparc/$FILE
echo new file done
printf "#include <asm-sparc/%s>\n" $FILE > asm-sparc64/$FILE
git add asm-sparc64/$FILE
echo sparc64 file done
done
The guard contains three '_' to avoid conflict with existing guards.
In additing the two Kbuild files are emptied to avoid breaking
headers_* targets.
We will reintroduce the exported header files when the necessary
kbuild changes are merged.
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
A manual inspection revealed that the following headerfiles
contained only trivial differences:
hw_irq.h idprom.h kmap_types.h kvm.h spinlock_types.h sunbpp.h unaligned.h
The only noteworthy change are that sparc64 had a volatile
qualifer that sparc missed in spinlock_types.h.
In addition a few comments were updated.
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Used the following script to find equal header files:
SPARC64=`ls asm-sparc64`
for FILE in ${SPARC64}; do
cmp -s asm-sparc/$FILE asm-sparc64/$FILE;
if [ $? = 0 ]; then
printf "#include <asm-sparc/%s>\n" $FILE > asm-sparc64/$FILE
fi
done
A few of the equal files are a simple include from
asm-generic, but by including the file from asm-sparc
we know they are equal for sparc and sparc64.
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Used the following script to copy the files:
cd include
set -e
SPARC64=`ls asm-sparc64`
for FILE in ${SPARC64}; do
if [ -f asm-sparc/$FILE ]; then
echo $FILE exist in asm-sparc
else
git mv asm-sparc64/$FILE asm-sparc/$FILE
printf "#include <asm-sparc/$FILE>\n" > asm-sparc64/$FILE
git add asm-sparc64/$FILE
fi
done
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Joined the two files as they contain distinct definitions.
Inspired by patch from: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Cc: Adrian Bunk <bunk@kernel.org>
sparc64 exports openprom.h to userspace so let sparc follow
the example.
As openprom.h pulled in another not-for-export vaddrs.h header
file it required a few changes to fix the build.
The definition af VMALLOC_* were moved to pgtable as this is
where sparc64 has them.
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Copy content of sparc64 file to sparc file.
There is only minimal possibilities for further unification.
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Bring the commit e55c57e0b5
("[SPARC64]: Report any user access faults in termios accessors")
over to sparc when unifying the two files.
The diff was manually inspected to contain no
other relevant changes.
This unification therefore changes functionality of sparc.
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
The type of tcflag_t differs from 32 and 64 bit.
For 32 bit it is long
For 64 bit it is int
Altough these have same size then I was not sure that
it was OK to change the 64 bit version to long as this
is part of the ABI so it was made conditional.
:$ diff -u include/asm-sparc/termbits.h include/asm-sparc64/termbits.h
:-- include/asm-sparc/termbits.h 2008-06-13 06:42:07.000000000 +0200
:++ include/asm-sparc64/termbits.h 2008-06-13 06:42:07.000000000 +0200
:@@ -1,11 +1,11 @@
:-#ifndef _SPARC_TERMBITS_H
:-#define _SPARC_TERMBITS_H
:+#ifndef _SPARC64_TERMBITS_H
:+#define _SPARC64_TERMBITS_H
:
: #include <linux/posix_types.h>
:
: typedef unsigned char cc_t;
: typedef unsigned int speed_t;
:-typedef unsigned long tcflag_t;
:+typedef unsigned int tcflag_t;
:
: #define NCC 8
: struct termio {
:@@ -102,7 +102,7 @@
: #define IXANY 0x00000800
: #define IXOFF 0x00001000
: #define IMAXBEL 0x00002000
:-#define IUTF8 0x00004000
:+#define IUTF8 0x00004000
:
: /* c_oflag bits */
: #define OPOST 0x00000001
:@@ -171,7 +171,6 @@
: #define HUPCL 0x00000400
: #define CLOCAL 0x00000800
: #define CBAUDEX 0x00001000
:-/* We'll never see these speeds with the Zilogs, but for completeness... */
: #define BOTHER 0x00001000
: #define B57600 0x00001001
: #define B115200 0x00001002
:@@ -199,7 +198,7 @@
: #define B3500000 0x00001012
: #define B4000000 0x00001013 */
: #define CIBAUD 0x100f0000 /* input baud rate (not used) */
:-#define CMSPAR 0x40000000 /* mark or space (stick) parity */
:+#define CMSPAR 0x40000000 /* mark or space (stick) parity */
: #define CRTSCTS 0x80000000 /* flow control */
:
: #define IBSHIFT 16 /* Shift from CBAUD to CIBAUD */
:@@ -258,4 +257,4 @@
: #define TCSADRAIN 1
: #define TCSAFLUSH 2
:
:-#endif /* !(_SPARC_TERMBITS_H) */
:+#endif /* !(_SPARC64_TERMBITS_H) */
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
RLIM_INFINITY differ from 32 and 64 bit.
The rest is equal.
:$ diff -u include/asm-sparc/resource.h include/asm-sparc64/resource.h
:-- include/asm-sparc/resource.h 2008-06-13 06:46:39.000000000 +0200
:++ include/asm-sparc64/resource.h 2008-06-13 06:46:39.000000000 +0200
:@@ -1,11 +1,11 @@
: /*
: * resource.h: Resource definitions.
: *
:- * Copyright (C) 1995 David S. Miller (davem@caip.rutgers.edu)
:+ * Copyright (C) 1996 David S. Miller (davem@caip.rutgers.edu)
: */
:
:-#ifndef _SPARC_RESOURCE_H
:-#define _SPARC_RESOURCE_H
:+#ifndef _SPARC64_RESOURCE_H
:+#define _SPARC64_RESOURCE_H
:
: /*
: * These two resource limit IDs have a Sparc/Linux-specific ordering,
:@@ -14,13 +14,6 @@
: #define RLIMIT_NOFILE 6 /* max number of open files */
: #define RLIMIT_NPROC 7 /* max number of processes */
:
:-/*
:- * SuS says limits have to be unsigned.
:- * We make this unsigned, but keep the
:- * old value for compatibility:
:- */
:-#define RLIM_INFINITY 0x7fffffff
:-
: #include <asm-generic/resource.h>
:
:-#endif /* !(_SPARC_RESOURCE_H) */
:+#endif /* !(_SPARC64_RESOURCE_H) */
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
There were only a few trivial changes and a few additions
in the sparc64 variant of this file.
This patch copies the sparc64 specific bits to the sparc version
of fbio.h so they are equal. A later patch will merge the two.
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Padding in the sembuf structure made conditional
as only 32 bit sparc did so.
:$ diff -u include/asm-sparc/sembuf.h include/asm-sparc64/sembuf.h
:-- include/asm-sparc/sembuf.h 2008-06-13 06:42:07.000000000 +0200
:++ include/asm-sparc64/sembuf.h 2008-06-13 06:42:07.000000000 +0200
:@@ -1,21 +1,18 @@
:-#ifndef _SPARC_SEMBUF_H
:-#define _SPARC_SEMBUF_H
:+#ifndef _SPARC64_SEMBUF_H
:+#define _SPARC64_SEMBUF_H
:
: /*
:- * The semid64_ds structure for sparc architecture.
:+ * The semid64_ds structure for sparc64 architecture.
: * Note extra padding because this structure is passed back and forth
: * between kernel and user space.
: *
: * Pad space is left for:
:- * - 64-bit time_t to solve y2038 problem
:- * - 2 miscellaneous 32-bit values
:+ * - 2 miscellaneous 64-bit values
: */
:
: struct semid64_ds {
: struct ipc64_perm sem_perm; /* permissions .. see ipc.h */
:- unsigned int __pad1;
: __kernel_time_t sem_otime; /* last semop time */
:- unsigned int __pad2;
: __kernel_time_t sem_ctime; /* last change time */
: unsigned long sem_nsems; /* no. of semaphores in array */
: unsigned long __unused1;
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Padding from 32 bit sparc kept using preprocessor magic
:$ diff -u include/asm-sparc/msgbuf.h include/asm-sparc64/msgbuf.h
:-- include/asm-sparc/msgbuf.h 2008-06-13 06:42:07.000000000 +0200
:++ include/asm-sparc64/msgbuf.h 2008-06-13 06:42:07.000000000 +0200
:@@ -7,17 +7,13 @@
: * between kernel and user space.
: *
: * Pad space is left for:
:- * - 64-bit time_t to solve y2038 problem
:- * - 2 miscellaneous 32-bit values
:+ * - 2 miscellaneous 64-bit values
: */
:
: struct msqid64_ds {
: struct ipc64_perm msg_perm;
:- unsigned int __pad1;
: __kernel_time_t msg_stime; /* last msgsnd time */
:- unsigned int __pad2;
: __kernel_time_t msg_rtime; /* last msgrcv time */
:- unsigned int __pad3;
: __kernel_time_t msg_ctime; /* last change time */
: unsigned long msg_cbytes; /* current number of bytes on queue */
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Trivial differenses in comments - used the version from sparc64
:$ diff -u include/asm-sparc/ioctls.h include/asm-sparc64/ioctls.h
:-- include/asm-sparc/ioctls.h 2008-06-13 08:46:29.000000000 +0200
:++ include/asm-sparc64/ioctls.h 2008-06-13 08:46:29.000000000 +0200
:@@ -1,5 +1,5 @@
:-#ifndef _ASM_SPARC_IOCTLS_H
:-#define _ASM_SPARC_IOCTLS_H
:+#ifndef _ASM_SPARC64_IOCTLS_H
:+#define _ASM_SPARC64_IOCTLS_H
:
: #include <asm/ioctl.h>
:
:@@ -22,7 +22,7 @@
:
: /* Note that all the ioctls that are not available in Linux have a
: * double underscore on the front to: a) avoid some programs to
:- * thing we support some ioctls under Linux (autoconfiguration stuff)
:+ * think we support some ioctls under Linux (autoconfiguration stuff)
: */
: /* Little t */
: #define TIOCGETD _IOR('t', 0, int)
:@@ -110,7 +110,7 @@
: #define TIOCSERGETLSR 0x5459 /* Get line status register */
: #define TIOCSERGETMULTI 0x545A /* Get multiport config */
: #define TIOCSERSETMULTI 0x545B /* Set multiport config */
:-#define TIOCMIWAIT 0x545C /* Wait input */
:+#define TIOCMIWAIT 0x545C /* Wait for change on serial input line(s) */
: #define TIOCGICOUNT 0x545D /* Read serial port inline interrupt counts */
:
: /* Kernel definitions */
:@@ -133,4 +133,4 @@
: #define TIOCPKT_NOSTOP 16
: #define TIOCPKT_DOSTOP 32
:
:-#endif /* !(_ASM_SPARC_IOCTLS_H) */
:+#endif /* !(_ASM_SPARC64_IOCTLS_H) */
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Copy was done using the following simple script:
set -e
SPARC64="h display7seg.h envctrl.h psrcompat.h pstate.h uctx.h utrap.h watchdog.h"
for FILE in ${SPARC64}; do
if [ -f asm-sparc/$FILE ]; then
echo $FILE exist in asm-sparc
fi
cat asm-sparc64/$FILE > asm-sparc/$FILE
printf "#include <asm-sparc/$FILE>\n" > asm-sparc64/$FILE
done
The name of the copied files are added to asm-sparc/Kbuild
to keep "make headers_check" functional.
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
This seems to be left from the long gone AP1000 support.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
We have to have exclusive access to the given qdisc anyways, so
doing even more locking is superfluous.
Signed-off-by: David S. Miller <davem@davemloft.net>
Make sch_tree_lock() lock the qdisc's root. All of the
users hold the RTNL semaphore and the root qdisc is not
changing.
Implement tbf_tree_{lock,unlock}() simply in terms of
sch_tree_{lock,unlock}().
Signed-off-by: David S. Miller <davem@davemloft.net>
When we have shared qdiscs, packets come out of the qdiscs
for multiple transmit queues.
Therefore it doesn't make any sense to schedule the transmit
queue when logically we cannot know ahead of time the TX
queue of the SKB that the qdisc->dequeue() will give us.
Just for sanity I added a BUG check to make sure we never
get into a state where the noop_qdisc is scheduled.
Signed-off-by: David S. Miller <davem@davemloft.net>
When code wants to lock the qdisc tree state, the logic
operation it's doing is locking the top-level qdisc that
sits of the root of the netdev_queue.
Add qdisc_root_lock() to represent this and convert the
easiest cases.
In order for this to work out in all cases, we have to
hook up the noop_qdisc to a dummy netdev_queue.
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently it is associated with a netdev_queue, but when we have
qdisc sharing that no longer makes any sense.
Signed-off-by: David S. Miller <davem@davemloft.net>
We liberate any dangling gso_skb during qdisc destruction.
It really only matters for the root qdisc. But when qdiscs
can be shared by multiple netdev_queue objects, we can't
have the gso_skb in the netdev_queue any more.
Signed-off-by: David S. Miller <davem@davemloft.net>
The only behavior change is that we do not drop packets under any
circumstances. If that is absolutely needed, we could easily add it
back.
With cleanups and help from Johannes Berg.
Signed-off-by: David S. Miller <davem@davemloft.net>
Devices or device layers can set this to control the queue selection
performed by dev_pick_tx().
This function runs under RCU protection, which allows overriding
functions to have some way of synchronizing with things like dynamic
->real_num_tx_queues adjustments.
This makes the spinlock prefetch in dev_queue_xmit() a little bit
less effective, but that's the price right now for correctness.
Signed-off-by: David S. Miller <davem@davemloft.net>
The private area of a netdev is now at a fixed offset once more.
Unfortunately, some assumptions that netdev_priv() == netdev->priv
crept back into the tree. In particular this happened in the
loopback driver. Make it use netdev->ml_priv.
Signed-off-by: David S. Miller <davem@davemloft.net>
This effectively "flips the switch" by making the core networking
and multiqueue-aware drivers use the new TX multiqueue structures.
Non-multiqueue drivers need no changes. The interfaces they use such
as netif_stop_queue() degenerate into an operation on TX queue zero.
So everything "just works" for them.
Code that really wants to do "X" to all TX queues now invokes a
routine that does so, such as netif_tx_wake_all_queues(),
netif_tx_stop_all_queues(), etc.
pktgen and netpoll required a little bit more surgery than the others.
In particular the pktgen changes, whilst functional, could be largely
improved. The initial check in pktgen_xmit() will sometimes check the
wrong queue, which is mostly harmless. The thing to do is probably to
invoke fill_packet() earlier.
The bulk of the netpoll changes is to make the code operate solely on
the TX queue indicated by by the SKB queue mapping.
Setting of the SKB queue mapping is entirely confined inside of
net/core/dev.c:dev_pick_tx(). If we end up needing any kind of
special semantics (drops, for example) it will be implemented here.
Finally, we now have a "real_num_tx_queues" which is where the driver
indicates how many TX queues are actually active.
With IGB changes from Jeff Kirsher.
Signed-off-by: David S. Miller <davem@davemloft.net>
This actually fixes a bug added by the RR scheduler changes. The
->bands and ->prio2band parameters were being set outside of the
sch_tree_lock() and thus could result in strange behavior and
inconsistencies.
It might be possible, in the new design (where there will be one qdisc
per device TX queue) to allow similar functionality via a TX hash
algorithm for RR but I really see no reason to export this aspect of
how these multiqueue cards actually implement the scheduling of the
the individual DMA TX rings and the single physical MAC/PHY port.
Signed-off-by: David S. Miller <davem@davemloft.net>
There is no need for a feature bit for something that
can be tested by simply checking the TX queue count.
Signed-off-by: David S. Miller <davem@davemloft.net>
alloc_netdev_mq() now allocates an array of netdev_queue
structures for TX, based upon the queue_count argument.
Furthermore, all accesses to the TX queues are now vectored
through the netdev_get_tx_queue() and netdev_for_each_tx_queue()
interfaces. This makes it easy to grep the tree for all
things that want to get to a TX queue of a net device.
Problem spots which are not really multiqueue aware yet, and
only work with one queue, can easily be spotted by grepping
for all netdev_get_tx_queue() calls that pass in a zero index.
Signed-off-by: David S. Miller <davem@davemloft.net>
All callers of async_tx_sync_epilog have called async_tx_quiesce on the
depend_tx, so async_tx_sync_epilog need only call the callback to
complete the operation.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The configfs operations ->make_item() and ->make_group() currently
return a new item/group. A return of NULL signifies an error. Because
of this, -ENOMEM is the only return code bubbled up the stack.
Multiple folks have requested the ability to return specific error codes
when these operations fail. This patch adds that ability by changing the
->make_item/group() ops to return ERR_PTR() values. These errors are
bubbled up appropriately. NULL returns are changed to -ENOMEM for
compatibility.
Also updated are the in-kernel users of configfs.
This is a rework of reverted commit 11c3b79218.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Merge the GDT_ENTRY() macro between arch/x86/boot/pm.c and
arch/x86/kernel/acpi/sleep.c and put the new one in
<asm-x86/segment.h>.
While we're at it, correct the bitmasks for the limit and flags. The
new version relies on using ULL constants in order to cause type
promotion rather than explicit casts; this avoids having to include
<linux/types.h> in <asm-x86/segments.h>.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2:
[PATCH] ocfs2: fix oops in mmap_truncate testing
configfs: call drop_link() to cleanup after create_link() failure
configfs: Allow ->make_item() and ->make_group() to return detailed errors.
configfs: Fix failing mkdir() making racing rmdir() fail
configfs: Fix deadlock with racing rmdir() and rename()
configfs: Make configfs_new_dirent() return error code instead of NULL
configfs: Protect configfs_dirent s_links list mutations
configfs: Introduce configfs_dirent_lock
ocfs2: Don't snprintf() without a format.
ocfs2: Fix CONFIG_OCFS2_DEBUG_FS #ifdefs
ocfs2/net: Silence build warnings on sparc64
ocfs2: Handle error during journal load
ocfs2: Silence an error message in ocfs2_file_aio_read()
ocfs2: use simple_read_from_buffer()
ocfs2: fix printk format warnings with OCFS2_FS_STATS=n
[PATCH 2/2] ocfs2: Instrument fs cluster locks
[PATCH 1/2] ocfs2: Add CONFIG_OCFS2_FS_STATS config option
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86: fix asm/e820.h for userspace inclusion
x86: fix numaq_tsc_disable
x86: fix kernel_physical_mapping_init() for large x86 systems
asm-x86/e820.h is included from userspace. 'x86: make e820.c to have
common functions' (b79cd8f126) broke it:
make -C Documentation/lguest
cc -Wall -Wmissing-declarations -Wmissing-prototypes -O3 -I../../include
lguest.c -lz -o lguest
In file included from ../../include/asm-x86/bootparam.h:8,
from lguest.c:45:
../../include/asm/e820.h:66: error: expected ‘)’ before ‘start’
../../include/asm/e820.h:67: error: expected ‘)’ before ‘start’
../../include/asm/e820.h:68: error: expected ‘)’ before ‘start’
../../include/asm/e820.h:72: error: expected ‘=’, ‘,’, ‘;’, ‘asm’
or ‘__attribute__’ before ‘e820_update_range’
...
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* 'ptrace-cleanup' of git://git.kernel.org/pub/scm/linux/kernel/git/frob/linux-2.6-utrace:
fix dangling zombie when new parent ignores children
do_wait: return security_task_wait() error code in place of -ECHILD
ptrace children revamp
do_wait reorganization
in __neigh_event_send, if we have a neighbour entry which is in
NUD_INCOMPLETE state, we enqueue any outbound frames to that neighbour
to the neighbours arp_queue, which is default capped to a length of 3
skbs. If that queue exceeds its set length, it will drop an skb on
the queue to enqueue the newly arrived skb. This results in a drop
for which we have no statistics incremented. This patch adds an
unresolved_discards stat to /proc/net/stat/ndisc_cache to track these
lost frames.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Done with NET_XXX_STATS macros :)
To be continued...
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This one is tricky.
The thing is that this macro is only used when killing tw buckets,
but since this killer is promiscuous wrt to which net each particular
tw belongs to, I have to use it only when NET_NS is off. When the net
namespaces are on, I use the INET_INC_STATS_BH for each bucket.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The tcp_enter_memory_pressure calls NET_INC_STATS, but doesn't
have where to get the net from.
I decided to add a sk argument, not the net itself, only to factor
all the required sock_net(sk) calls inside the enter_memory_pressure
callback itself.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Same as before - the sock is always there to get the net from,
but there are also some places with the net already saved on
the stack.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fortunately (almost) all the TCP code has a sock to get the net from :)
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This one sets TCP MIBs after zeroing them, and thus requires
the net.
The existing single caller can use init_net (temporarily).
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
TCP_INC_STATS_USER and TCP_ADD_STATS_BH are currently unused.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Very simple - only ip_evictor (fragments) requires such.
This patch ends up the IP_XXX_STATS patching.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
All the callers already have either the net itself, or the place
where to get it from.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
ptrace no longer fiddles with the children/sibling links, and the
old ptrace_children list is gone. Now ptrace, whether of one's own
children or another's via PTRACE_ATTACH, just uses the new ptraced
list instead.
There should be no user-visible difference that matters. The only
change is the order in which do_wait() sees multiple stopped
children and stopped ptrace attachees. Since wait_task_stopped()
was changed earlier so it no longer reorders the children list, we
already know this won't cause any new problems.
Signed-off-by: Roland McGrath <roland@redhat.com>
* 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: (72 commits)
Revert "x86/PCI: ACPI based PCI gap calculation"
PCI: remove unnecessary volatile in PCIe hotplug struct controller
x86/PCI: ACPI based PCI gap calculation
PCI: include linux/pm_wakeup.h for device_set_wakeup_capable
PCI PM: Fix pci_prepare_to_sleep
x86/PCI: Fix PCI config space for domains > 0
Fix acpi_pm_device_sleep_wake() by providing a stub for CONFIG_PM_SLEEP=n
PCI: Simplify PCI device PM code
PCI PM: Introduce pci_prepare_to_sleep and pci_back_from_sleep
PCI ACPI: Rework PCI handling of wake-up
ACPI: Introduce new device wakeup flag 'prepared'
ACPI: Introduce acpi_device_sleep_wake function
PCI: rework pci_set_power_state function to call platform first
PCI: Introduce platform_pci_power_manageable function
ACPI: Introduce acpi_bus_power_manageable function
PCI: make pci_name use dev_name
PCI: handle pci_name() being const
PCI: add stub for pci_set_consistent_dma_mask()
PCI: remove unused arch pcibios_update_resource() functions
PCI: fix pci_setup_device()'s sprinting into a const buffer
...
Fixed up conflicts in various files (arch/x86/kernel/setup_64.c,
arch/x86/pci/irq.c, arch/x86/pci/pci.h, drivers/acpi/sleep/main.c,
drivers/pci/pci.c, drivers/pci/pci.h, include/acpi/acpi_bus.h) from x86
and ACPI updates manually.
This converts the FSL Book-E PTE access and TLB miss handling to match
with the recent changes to 44x that introduce support for non-atomic PTE
operations in pgtable-ppc32.h and removes write back to the PTE from
the TLB miss handlers. In addition, the DSI interrupt code no longer
tries to fixup write permission, this is left to generic code, and
_PAGE_HWWRITE is gone.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Now that arch/ppc is gone we always define CONFIG_PPC_CPM_NEW_BINDING so
we can remove all the code associated with !CONFIG_PPC_CPM_NEW_BINDING.
Also fixed some asm/of_platform.h to linux/of_platform.h (and of_device.h)
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Mostly having to do with not marking things __iomem. And some failure
to use appropriate accessors to read MMIO regs.
Signed-off-by: Andy Fleming <afleming@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Basic PM support for 83xx. Standby is implemented as sleep.
Suspend-to-RAM is implemented as "deep sleep" (with the processor
turned off) on 831x.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
* 'for_linus' of git://git.infradead.org/~dedekind/ubifs-2.6:
UBIFS: include to compilation
UBIFS: add new flash file system
UBIFS: add brief documentation
MAINTAINERS: add UBIFS section
do_mounts: allow UBI root device name
VFS: export sync_sb_inodes
VFS: move inode_lock into sync_sb_inodes
* git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6: (76 commits)
IDE: Report errors during drive reset back to user space
Update documentation of HDIO_DRIVE_RESET ioctl
IDE: Remove unused code
IDE: Fix HDIO_DRIVE_RESET handling
hd.c: remove the #include <linux/mc146818rtc.h>
update the BLK_DEV_HD help text
move ide/legacy/hd.c to drivers/block/
ide/legacy/hd.c: use late_initcall()
remove BLK_DEV_HD_ONLY
ide: endian annotations in ide-floppy.c
ide-floppy: zero out the whole struct ide_atapi_pc on init
ide-floppy: fold idefloppy_create_test_unit_ready_cmd into idefloppy_open
ide-cd: move request prep chunk from cdrom_do_newpc_cont to rq issue path
ide-cd: move request prep from cdrom_start_rw_cont to rq issue path
ide-cd: move request prep from cdrom_start_seek_continuation to rq issue path
ide-cd: fold cdrom_start_seek into ide_cd_do_request
ide-cd: simplify request issuing path
ide-cd: mv ide_do_rw_cdrom ide_cd_do_request
ide-cd: cdrom_start_seek: remove unused argument block
ide-cd: ide_do_rw_cdrom: add the catch-all bad request case to the if-else block
...
* 'release-2.6.27' of git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-acpi-merge-2.6: (87 commits)
Fix FADT parsing
Add the ability to reset the machine using the RESET_REG in ACPI's FADT table.
ACPI: use dev_printk when possible
PNPACPI: add support for HP vendor-specific CCSR descriptors
PNP: avoid legacy IDE IRQs
PNP: convert resource options to single linked list
ISAPNP: handle independent options following dependent ones
PNP: remove extra 0x100 bit from option priority
PNP: support optional IRQ resources
PNP: rename pnp_register_*_resource() local variables
PNPACPI: ignore _PRS interrupt numbers larger than PNP_IRQ_NR
PNP: centralize resource option allocations
PNP: remove redundant pnp_can_configure() check
PNP: make resource assignment functions return 0 (success) or -EBUSY (failure)
PNP: in debug resource dump, make empty list obvious
PNP: improve resource assignment debug
PNP: increase I/O port & memory option address sizes
PNP: introduce pnp_irq_mask_t typedef
PNP: make resource option structures private to PNP subsystem
PNP: define PNP-specific IORESOURCE_IO_* flags alongside IRQ, DMA, MEM
...
Fail integrity check gracefully when request does not have a bio
attached (BLOCK_PC).
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* git://git.linux-nfs.org/projects/trondmy/nfs-2.6: (82 commits)
NFSv4: Remove BKL from the nfsv4 state recovery
SUNRPC: Remove the BKL from the callback functions
NFS: Remove BKL from the readdir code
NFS: Remove BKL from the symlink code
NFS: Remove BKL from the sillydelete operations
NFS: Remove the BKL from the rename, rmdir and unlink operations
NFS: Remove BKL from NFS lookup code
NFS: Remove the BKL from nfs_link()
NFS: Remove the BKL from the inode creation operations
NFS: Remove BKL usage from open()
NFS: Remove BKL usage from the write path
NFS: Remove the BKL from the permission checking code
NFS: Remove attribute update related BKL references
NFS: Remove BKL requirement from attribute updates
NFS: Protect inode->i_nlink updates using inode->i_lock
nfs: set correct fl_len in nlmclnt_test()
SUNRPC: Support registering IPv6 interfaces with local rpcbind daemon
SUNRPC: Refactor rpcb_register to make rpcbindv4 support easier
SUNRPC: None of rpcb_create's callers wants a privileged source port
SUNRPC: Introduce a specific rpcb_create for contacting localhost
...
ISAPNP, PNPBIOS, and ACPI describe the "possible resource settings" of
a device, i.e., the possibilities an OS bus driver has when it assigns
I/O port, MMIO, and other resources to the device.
PNP used to maintain this "possible resource setting" information in
one independent option structure and a list of dependent option
structures for each device. Each of these option structures had lists
of I/O, memory, IRQ, and DMA resources, for example:
dev
independent options
ind-io0 -> ind-io1 ...
ind-mem0 -> ind-mem1 ...
...
dependent option set 0
dep0-io0 -> dep0-io1 ...
dep0-mem0 -> dep0-mem1 ...
...
dependent option set 1
dep1-io0 -> dep1-io1 ...
dep1-mem0 -> dep1-mem1 ...
...
...
This data structure was designed for ISAPNP, where the OS configures
device resource settings by writing directly to configuration
registers. The OS can write the registers in arbitrary order much
like it writes PCI BARs.
However, for PNPBIOS and ACPI devices, the OS uses firmware interfaces
that perform device configuration, and it is important to pass the
desired settings to those interfaces in the correct order. The OS
learns the correct order by using firmware interfaces that return the
"current resource settings" and "possible resource settings," but the
option structures above doesn't store the ordering information.
This patch replaces the independent and dependent lists with a single
list of options. For example, a device might have possible resource
settings like this:
dev
options
ind-io0 -> dep0-io0 -> dep1->io0 -> ind-io1 ...
All the possible settings are in the same list, in the order they
come from the firmware "possible resource settings" list. Each entry
is tagged with an independent/dependent flag. Dependent entries also
have a "set number" and an optional priority value. All dependent
entries must be assigned from the same set. For example, the OS can
use all the entries from dependent set 0, or all the entries from
dependent set 1, but it cannot mix entries from set 0 with entries
from set 1.
Prior to this patch PNP didn't keep track of the order of this list,
and it assigned all independent options first, then all dependent
ones. Using the example above, that resulted in a "desired
configuration" list like this:
ind->io0 -> ind->io1 -> depN-io0 ...
instead of the list the firmware expects, which looks like this:
ind->io0 -> depN-io0 -> ind-io1 ...
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Rene Herman <rene.herman@gmail.com>
Signed-off-by: Len Brown <len.brown@intel.com>
This patch adds an IORESOURCE_IRQ_OPTIONAL flag for use when
assigning resources to a device. If the flag is set and we are
unable to assign an IRQ to the device, we can leave the IRQ
disabled but allow the overall resource allocation to succeed.
Some devices request an IRQ, but can run without an IRQ
(possibly with degraded performance). This flag lets us run
the device without the IRQ instead of just leaving the
device disabled.
This is a reimplementation of this previous change by Rene
Herman <rene.herman@gmail.com>:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=3b73a223661ed137c5d3d2635f954382e94f5a43
I reimplemented this for two reasons:
- to prepare for converting all resource options into a single linked
list, as opposed to the per-resource-type lists we have now, and
- to preserve the order and number of resource options.
In PNPBIOS and ACPI, we configure a device by giving firmware a
list of resource assignments. It is important that this list
has exactly the same number of resources, in the same order,
as the "template" list we got from the firmware in the first
place.
The problem of a sound card MPU401 being left disabled for want of
an IRQ was reported by Uwe Bugla <uwe.bugla@gmx.de>.
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Rene Herman <rene.herman@gmail.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Nothing outside the PNP subsystem should need access to a
device's resource options, so this patch moves the option
structure declarations to a private header file.
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Rene Herman <rene.herman@gmail.com>
Signed-off-by: Len Brown <len.brown@intel.com>
PNP previously defined PNP_PORT_FLAG_16BITADDR and PNP_PORT_FLAG_FIXED
in a private header file, but put those flags in struct resource.flags
fields. Better to make them IORESOURCE_IO_* flags like the existing
IRQ, DMA, and MEM flags.
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Rene Herman <rene.herman@gmail.com>
Signed-off-by: Len Brown <len.brown@intel.com>
As part of a heuristic to identify modem devices, 8250_pnp.c
checks to see whether a device can be configured at any of the
legacy COM port addresses.
This patch moves the code that traverses the PNP "possible resource
options" from 8250_pnp.c to the PNP subsystem. This encapsulation
is important because a future patch will change the implementation
of those resource options.
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Rene Herman <rene.herman@gmail.com>
Signed-off-by: Len Brown <len.brown@intel.com>
PNP used to have a fixed-size pnp_resource_table for tracking the
resources used by a device. This table often overflowed, so we've
had to increase the table size, which wastes memory because most
devices have very few resources.
This patch replaces the table with a linked list of resources where
the entries are allocated on demand.
This removes messages like these:
pnpacpi: exceeded the max number of IO resources
00:01: too many I/O port resources
References:
http://bugzilla.kernel.org/show_bug.cgi?id=9535http://bugzilla.kernel.org/show_bug.cgi?id=9740http://lkml.org/lkml/2007/11/30/110
This patch also changes the way PNP uses the IORESOURCE_UNSET,
IORESOURCE_AUTO, and IORESOURCE_DISABLED flags.
Prior to this patch, the pnp_resource_table entries used the flags
like this:
IORESOURCE_UNSET
This table entry is unused and available for use. When this flag
is set, we shouldn't look at anything else in the resource structure.
This flag is set when a resource table entry is initialized.
IORESOURCE_AUTO
This resource was assigned automatically by pnp_assign_{io,mem,etc}().
This flag is set when a resource table entry is initialized and
cleared whenever we discover a resource setting by reading an ISAPNP
config register, parsing a PNPBIOS resource data stream, parsing an
ACPI _CRS list, or interpreting a sysfs "set" command.
Resources marked IORESOURCE_AUTO are reinitialized and marked as
IORESOURCE_UNSET by pnp_clean_resource_table() in these cases:
- before we attempt to assign resources automatically,
- if we fail to assign resources automatically,
- after disabling a device
IORESOURCE_DISABLED
Set by pnp_assign_{io,mem,etc}() when automatic assignment fails.
Also set by PNPBIOS and PNPACPI for:
- invalid IRQs or GSI registration failures
- invalid DMA channels
- I/O ports above 0x10000
- mem ranges with negative length
After this patch, there is no pnp_resource_table, and the resource list
entries use the flags like this:
IORESOURCE_UNSET
This flag is no longer used in PNP. Instead of keeping
IORESOURCE_UNSET entries in the resource list, we remove
entries from the list and free them.
IORESOURCE_AUTO
No change in meaning: it still means the resource was assigned
automatically by pnp_assign_{port,mem,etc}(), but these functions
now set the bit explicitly.
We still "clean" a device's resource list in the same places,
but rather than reinitializing IORESOURCE_AUTO entries, we
just remove them from the list.
Note that IORESOURCE_AUTO entries are always at the end of the
list, so removing them doesn't reorder other list entries.
This is because non-IORESOURCE_AUTO entries are added by the
ISAPNP, PNPBIOS, or PNPACPI "get resources" methods and by the
sysfs "set" command. In each of these cases, we completely free
the resource list first.
IORESOURCE_DISABLED
In addition to the cases where we used to set this flag, ISAPNP now
adds an IORESOURCE_DISABLED resource when it reads a configuration
register with a "disabled" value.
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Some callers use pnp_port_start() and similar functions without
making sure the resource is valid. This patch makes us fall
back to returning the initial values if the resource is not
valid or not even present.
This mostly preserves the previous behavior, where we would just
return the initial values set by pnp_init_resource_table(). The
original 2.6.25 code didn't range-check the "bar", so it would
return garbage if the bar exceeded the table size. This code
returns sensible values instead.
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
"idle=nomwait" disables the use of the MWAIT
instruction from both C1 (C1_FFH) and deeper (C2C3_FFH)
C-states.
When MWAIT is unavailable, the BIOS and OS generally
negotiate to use the HALT instruction for C1,
and use IO accesses for deeper C-states.
This option is useful for power and performance
comparisons, and also to work around BIOS bugs
where broken MWAIT support is advertised.
http://bugzilla.kernel.org/show_bug.cgi?id=10807http://bugzilla.kernel.org/show_bug.cgi?id=10914
Signed-off-by: Zhao Yakui <yakui.zhao@intel.com>
Signed-off-by: Li Shaohua <shaohua.li@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
"idle=halt" limits the idle loop to using
the halt instruction. No MWAIT, no IO accesses,
no C-states deeper than C1.
If something is broken in the idle code,
"idle=halt" is a less severe workaround
than "idle=poll" which disables all power savings.
Signed-off-by: Zhao Yakui <yakui.zhao@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Allow users to enable/disable/clear a specific & valid GPE/Fixed Event
in user space.
This is useful for debugging, especially for some
interrupt storm issues.
All wakeup GPEs are disabled and they can not be enabled at runtime,
and we mark them as invalid.
All GPEs that don't have a _Lxx/_Exx method are marked as invalid.
All Fixed Events that don't have an event handler are marked as invalid
and they can't be enabled until an event handler is registered.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Ling Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Eliminated unnecessary operands; eliminated use of negative index
in loop. Operands now displayed in correct order, not backwards.
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Now supports the 2007 intel Virtualization Technology for Directed
I/O specification.
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Synchronized tables with current specifications.
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Update version to 20080514
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Mostly MODULE_NAME and printf format strings.
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
No longer needed; replaced mostly with u32, but also acpi_size
where a type that changes 32/64 bit on 32/64-bit platforms is
required.
v2: Fix a cast of a 32-bit int to a pointer in ACPI to avoid a compiler warning.
from David Howells
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Added NULL fields to the exception string arrays to eliminate
the -1 subtraction on the SubStatus field.
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Changed ACPI_MODULE_NAME and ACPI_FUNCTION_NAME to use arrays of
strings instead of pointers to static strings. Jan Beulich and
Bob Moore.
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Fixes problem where the new method argument count validation mechanism
will enter an infinite loop when a GPE method is dispatched.
Problem fixed be removing the obsolete code that passes GPE block
information to the notify handler via the control method parameter pointer.
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Error if too few arguments, warning if too many. This applies
only to external programmatic control method execution, not
method-to-method calls within the AML.
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
The freezer currently attempts to distinguish kernel threads from
user space tasks by checking if their mm pointer is unset and it
does not send fake signals to kernel threads. However, there are
kernel threads, mostly related to networking, that behave like
user space tasks and may want to be sent a fake signal to be frozen.
Introduce the new process flag PF_FREEZER_NOSIG that will be set
by default for all kernel threads and make the freezer only send
fake signals to the tasks having PF_FREEZER_NOSIG unset. Provide
the set_freezable_with_signal() function to be called by the kernel
threads that want to be sent a fake signal for freezing.
This patch should not change the freezer's observable behavior.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Pavel Machek <pavel@suse.cz>
Signed-off-by: Len Brown <len.brown@intel.com>
Get rid of a superfluous acpi_pm_device_sleep_state() parameter. The
only legitimate value of that parameter must be derived from the first
parameter, which is what all the callers already do. (However, this
does not address the fact that ACPI still doesn't set up those flags.)
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Len Brown <len.brown@intel.com>
Reorder the mutex names to match the preceding #defines
Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Implemented another change for the GPE disable. We now perform a
read-change-write of the enable register instead of simply writing out the
cached enable mask. This will prevent inadvertent enabling of GPEs if a rogue
GPE is received during initialization (before GPE handlers are installed.)
http://bugzilla.kernel.org/show_bug.cgi?id=6217
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Change processors from an array sized by NR_CPUS to a per_cpu variable.
Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
This closes some arcane holes in single-step handling that can arise
only when user programs set TF directly (via popf or sigreturn) and
then use vDSO (syscall/sysenter) system call entry. In those entry
paths, the clear_TF_reenable case hits and we must check TIF_SINGLESTEP
to be sure our bookkeeping stays correct wrt the user's view of TF.
Signed-off-by: Roland McGrath <roland@redhat.com>
This unifies and cleans up the syscall tracing code on i386 and x86_64.
Using a single function for entry and exit tracing on 32-bit made the
do_syscall_trace() into some terrible spaghetti. The logic is clear and
simple using separate syscall_trace_enter() and syscall_trace_leave()
functions as on 64-bit.
The unification adds PTRACE_SYSEMU and PTRACE_SYSEMU_SINGLESTEP support
on x86_64, for 32-bit ptrace() callers and for 64-bit ptrace() callers
tracing either 32-bit or 64-bit tasks. It behaves just like 32-bit.
Changing syscall_trace_enter() to return the syscall number shortens
all the assembly paths, while adding the SYSEMU feature in a simple way.
Signed-off-by: Roland McGrath <roland@redhat.com>
This unifies the treatment of TIF_SINGLESTEP on i386 and x86_64.
The bit is now excluded from _TIF_WORK_MASK on i386 as it has been
on x86_64. This means the do_notify_resume() path using it is never
used, so TIF_SINGLESTEP is not cleared on returning to user mode.
Both now leave TIF_SINGLESTEP set when returning to user, so that
it's already set on an int $0x80 system call entry. This removes
the need for testing TF on the system_call path. Doing it this way
fixes the regression for PTRACE_SINGLESTEP into a sigreturn syscall,
introduced by commit 1e2e99f0e4.
The clear_TF_reenable case that sets TIF_SINGLESTEP can only happen
on a non-exception kernel entry, i.e. sysenter/syscall instruction.
That will always get to the syscall exit tracing path.
Signed-off-by: Roland McGrath <roland@redhat.com>
Remove some code which has been made obsolete and hasn't worked properly
before anyway. Part of the infrastructure may be reintroduced in a
follow up patch to implement a working command aborting facility.
Signed-off-by: Elias Oltmanns <eo@nebensachen.de>
Cc: "Alan Cox" <alan@lxorguk.ukuu.org.uk>
Cc: "Randy Dunlap" <randy.dunlap@oracle.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Currently, the code path executing an HDIO_DRIVE_RESET ioctl is broken
in various ways. Most importantly, it is treated as an out of band
request in an illegal way which may very likely lead to system lock ups.
Use the drive's request queue to avoid this problem (and fix a locking
issue for free along the way).
Signed-off-by: Elias Oltmanns <eo@nebensachen.de>
Cc: "Alan Cox" <alan@lxorguk.ukuu.org.uk>
Cc: "Randy Dunlap" <randy.dunlap@oracle.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Change ->port_init_devs method to take 'ide_drive_t *' as an argument
instead of 'ide_hwif_t *' and rename it to ->init_dev.
There should be no functional changes caused by this patch.
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Add 'parent' field to hw_regs_t for optional parent device pointer (needed
by macio PMAC IDE controllers) and set hwif->dev in ide_init_port_hw().
* Update au1xxx-ide.c, sgiioc4.c, pmac.c and setup-pci.c accordingly.
v2:
* Update scc_pata.c.
There should be no functional changes caused by this patch.
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Move PIO blacklist to ide-pio-blacklist.c.
While at it:
- fix comment
- fix whitespace damage
There should be no functional changes caused by this patch.
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
All ide_pio_cycle_time() users already select CONFIG_IDE_TIMINGS
so move the function from ide-lib.c to ide-timings.c.
While at it:
- convert ide_pio_cycle_time() to use ide_timing_find_mode()
- cleanup ide_pio_cycle_time() a bit
There should be no functional changes caused by this patch.
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Don't include ide-timing.h in cs5535 and sis5513 host drivers
(they don't need it currently).
* Convert ide-timing.h to ide-timings.c library and add CONFIG_IDE_TIMINGS
config option to be selected by host drivers using the library.
While at it:
- fix ide_timing_find_mode() placement
v2:
* Add missing EXPORT_SYMBOLs. (Stephen Rothwell <sfr@canb.auug.org.au>)
There should be no functional changes caused by this patch.
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Move struct ide_timing and IDE_TIMING_* defines to <linux/ide.h>
from drivers/ide/ide-timing.h.
While at it:
- use u8/u16 instead of short for struct ide_timing fields
- use enum for IDE_TIMING_*
There should be no functional changes caused by this patch.
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
The standard ticket spinlocks are very expensive in a virtual
environment, because their performance depends on Xen's scheduler
giving vcpus time in the order that they're supposed to take the
spinlock.
This implements a Xen-specific spinlock, which should be much more
efficient.
The fast-path is essentially the old Linux-x86 locks, using a single
lock byte. The locker decrements the byte; if the result is 0, then
they have the lock. If the lock is negative, then locker must spin
until the lock is positive again.
When there's contention, the locker spin for 2^16[*] iterations waiting
to get the lock. If it fails to get the lock in that time, it adds
itself to the contention count in the lock and blocks on a per-cpu
event channel.
When unlocking the spinlock, the locker looks to see if there's anyone
blocked waiting for the lock by checking for a non-zero waiter count.
If there's a waiter, it traverses the per-cpu "lock_spinners"
variable, which contains which lock each CPU is waiting on. It picks
one CPU waiting on the lock and sends it an event to wake it up.
This allows efficient fast-path spinlock operation, while allowing
spinning vcpus to give up their processor time while waiting for a
contended lock.
[*] 2^16 iterations is threshold at which 98% locks have been taken
according to Thomas Friebel's Xen Summit talk "Preventing Guests from
Spinning Around". Therefore, we'd expect the lock and unlock slow
paths will only be entered 2% of the time.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Christoph Lameter <clameter@linux-foundation.org>
Cc: Petr Tesarik <ptesarik@suse.cz>
Cc: Virtualization <virtualization@lists.linux-foundation.org>
Cc: Xen devel <xen-devel@lists.xensource.com>
Cc: Thomas Friebel <thomas.friebel@amd.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Implement a version of the old spinlock algorithm, in which everyone
spins waiting for a lock byte. In order to be compatible with the
ticket-lock's use of a zero initializer, this uses the convention of
'0' for unlocked and '1' for locked.
This algorithm is much better than ticket locks in a virtual
envionment, because it doesn't interact badly with the vcpu scheduler.
If there are multiple vcpus spinning on a lock and the lock is
released, the next vcpu to be scheduled will take the lock, rather
than cycling around until the next ticketed vcpu gets it.
To use this, you must call paravirt_use_bytelocks() very early, before
any spinlocks have been taken.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Christoph Lameter <clameter@linux-foundation.org>
Cc: Petr Tesarik <ptesarik@suse.cz>
Cc: Virtualization <virtualization@lists.linux-foundation.org>
Cc: Xen devel <xen-devel@lists.xensource.com>
Cc: Thomas Friebel <thomas.friebel@amd.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ticket spinlocks have absolutely ghastly worst-case performance
characteristics in a virtual environment. If there is any contention
for physical CPUs (ie, there are more runnable vcpus than cpus), then
ticket locks can cause the system to end up spending 90+% of its time
spinning.
The problem is that (v)cpus waiting on a ticket spinlock will be
granted access to the lock in strict order they got their tickets. If
the hypervisor scheduler doesn't give the vcpus time in that order,
they will burn timeslices waiting for the scheduler to give the right
vcpu some time. In the worst case it could take O(n^2) vcpu scheduler
timeslices for everyone waiting on the lock to get it, not counting
new cpus trying to take the lock while the log-jam is sorted out.
These hooks allow a paravirt backend to replace the spinlock
implementation.
At the very least, this could revert the implementation back to the
old lock algorithm, which allows the next scheduled vcpu to take the
lock, and has basically fairly good performance.
It also allows the spinlocks to take advantages of the hypervisor
features to make locks more efficient (spin and block, for example).
The cost to native execution is an extra direct call when using a
spinlock function. There's no overhead if CONFIG_PARAVIRT is turned
off.
The lock structure is fixed at a single "unsigned int", initialized to
zero, but the spinlock implementation can use it as it wishes.
Thanks to Thomas Friebel's Xen Summit talk "Preventing Guests from
Spinning Around" for pointing out this problem.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Christoph Lameter <clameter@linux-foundation.org>
Cc: Petr Tesarik <ptesarik@suse.cz>
Cc: Virtualization <virtualization@lists.linux-foundation.org>
Cc: Xen devel <xen-devel@lists.xensource.com>
Cc: Thomas Friebel <thomas.friebel@amd.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
AMD only supports "syscall" from 32-bit compat usermode.
Intel and Centaur(?) only support "sysenter" from 32-bit compat usermode.
Set the X86 feature bits accordingly, and set up the vdso in
accordance with those bits. On the offchance we run on in a 64-bit
environment which supports neither syscall nor sysenter from 32-bit
mode, then fall back to the int $0x80 vdso.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
fix:
arch/x86/xen/built-in.o: In function `set_page_prot':
enlighten.c:(.text+0x111d): undefined reference to `xen_raw_printk'
arch/x86/xen/built-in.o: In function `xen_start_kernel':
: undefined reference to `xen_raw_console_write'
arch/x86/xen/built-in.o: In function `xen_start_kernel':
: undefined reference to `xen_raw_console_write'
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The Xen hypercall interface is allowed to trash any or all of the
argument registers, so we need to be careful that the kernel state
isn't damaged. On 32-bit kernels, the hypercall parameter registers
same as a regparm function call, so we've got away without explicit
clobbering so far. The 64-bit ABI defines lots of caller-save
registers, so save them all for safety. We can trim this set later by
re-distributing the responsibility for saving all these registers.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
64-bit hypercall interface can pass a maddr in one argument rather
than splitting it.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Use callback_op hypercall to register callbacks in a 32/64-bit
independent way (64-bit doesn't need a code segment, but that detail
is hidden in XEN_CALLBACK).
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
When building initial pagetables in 64-bit kernel the pud/pmd pointer may
be in ioremap/fixmap space, so we need to walk the pagetable to look up the
physical address.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Set up the initial pagetables to map the kernel mapping into the
physical mapping space. This makes __va() usable, since it requires
physical mappings.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
As a stopgap until Mike Travis's x86-64 gs-based percpu patches are
ready, provide workaround functions for x86_read/write_percpu for
Xen's use.
Specifically, this means that we can't really make use of vcpu
placement, because we can't use a single gs-based memory access to get
to vcpu fields. So disable all that for now.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
We need extra pv_mmu_ops for 64-bit, to deal with the extra level of
pagetable.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The 64-bit calling convention for hypercalls uses different registers
from 32-bit. Annoyingly, gcc's asm syntax doesn't have a way to
specify one of the extra numeric reigisters in a constraint, so we
must use explicitly placed register variables. Given that we have to
do it for some args, may as well do it for all.
Also fix syntax gcc generates for the call instruction itself. We
need a plain direct call, but the asm expansion which works on 32-bit
generates a rip-relative addressing mode in 64-bit, which is treated
as an indirect call. The alternative is to pass the hypercall page
offset into the asm, and have it add it to the hypercall page start
address to generate the call.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
64-bit guests can pass 64-bit quantities in a single argument,
so fix up the hypercalls.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Copy 64-bit definitions of various interface structures into place.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
add xen_timer_resume() hook.
Timer resume should be done after event channel is resumed.
add xen_arch_resume() hook when ipi becomes usable after resume.
After resume, some cpu specific resource must be reinitialized
on ia64 that can't be set by another cpu.
However available hooks is run once on only one cpu so that ipi has
to be used.
During stop_machine_run() ipi can't be used because interrupt is masked.
So add another hook after stop_machine_run().
Another approach might be use resume hook which is run by
device_resume(). However device_resume() may be executed on
suspend error recovery path.
So it is necessary to determine whether it is executed on real resume path
or error recovery path.
Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This allows Xen's xen_cpu_up() to allocate a pda for the new CPU.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Call paravirt_pagetable_setup_{start,done}
These paravirt_ops functions were not being called on x86_64.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (249 commits)
powerpc: Fix pte_update for CONFIG_PTE_64BIT and !PTE_ATOMIC_UPDATES
powerpc: Fix a build problem on ppc32 with new DMA_ATTRs
ibm_newemac: Add MII mode support to the EMAC RGMII bridge.
powerpc: Don't spin on sync instruction at boot time
powerpc: Add VSX load/store alignment exception handler
powerpc: fix giveup_vsx to save registers correctly
powerpc: support for latencytop
powerpc: Remove unnecessary condition when sanity-checking WIMG bits
powerpc: Add PPC_FEATURE_PSERIES_PERFMON_COMPAT
powerpc: Add driver for Barrier Synchronization Register
powerpc: mman.h export fixups
powerpc/fsl: update crypto node definition and device tree instances
powerpc/fsl: Refactor device bindings
powerpc/85xx: Minor fixes for 85xxds and 8536ds board.
powerpc: Add 82xx/83xx/86xx to 6xx Multiplatform
powerpc/85xx: publish of device for cds platforms
powerpc/booke: don't reinitialize time base
powerpc/86xx: Refactor pic init
powerpc/CPM: Add i2c pins to dts and board setup
cpm_uart: Support uart_wait_until_sent()
...
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (102 commits)
[SCSI] scsi_dh: fix kconfig related build errors
[SCSI] sym53c8xx: Fix bogus sym_que_entry re-implementation of container_of
[SCSI] scsi_cmnd.h: remove double inclusion of linux/blkdev.h
[SCSI] make struct scsi_{host,target}_type static
[SCSI] fix locking in host use of blk_plug_device()
[SCSI] zfcp: Cleanup external header file
[SCSI] zfcp: Cleanup code in zfcp_erp.c
[SCSI] zfcp: zfcp_fsf cleanup.
[SCSI] zfcp: consolidate sysfs things into one file.
[SCSI] zfcp: Cleanup of code in zfcp_aux.c
[SCSI] zfcp: Cleanup of code in zfcp_scsi.c
[SCSI] zfcp: Move status accessors from zfcp to SCSI include file.
[SCSI] zfcp: Small QDIO cleanups
[SCSI] zfcp: Adapter reopen for large number of unsolicited status
[SCSI] zfcp: Fix error checking for ELS ADISC requests
[SCSI] zfcp: wait until adapter is finished with ERP during auto-port
[SCSI] ibmvfc: IBM Power Virtual Fibre Channel Adapter Client Driver
[SCSI] sg: Add target reset support
[SCSI] lib: Add support for the T10 (SCSI) Data Integrity Field CRC
[SCSI] sd: Move scsi_disk() accessor function to sd.h
...
Introduce a new API to register RPC services on IPv6 interfaces to allow
the NFS server and lockd to advertise on IPv6 networks.
Unlike rpcb_register(), the new rpcb_v4_register() function uses rpcbind
protocol version 4 to contact the local rpcbind daemon. The version 4
SET/UNSET procedures allow services to register address families besides
AF_INET, register at specific network interfaces, and register transport
protocols besides UDP and TCP. All of this functionality is exposed via
the new rpcb_v4_register() kernel API.
A user-space rpcbind daemon implementation that supports version 4 of the
rpcbind protocol is required in order to make use of this new API.
Note that rpcbind version 3 is sufficient to support the new rpcbind
facilities listed above, but most extant implementations use version 4.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus: (54 commits)
[MIPS] Remove mips_machtype for LASAT machines
[MIPS] Remove mips_machtype from EMMA2RH machines
[MIPS] Remove mips_machtype from ARC based machines
[MIPS] MTX-1 flash partition setup move to platform devices registration
[MIPS] TXx9: cleanup and fix some sparse warnings
[MIPS] TXx9: rename asm-mips/mach-jmr3927 to asm-mips/mach-tx39xx
[MIPS] remove machtype for group Toshiba
[MIPS] separate rbtx4927_time_init() and rbtx4937_time_init()
[MIPS] separate rbtx4927_arch_init() and rbtx4937_arch_init()
[MIPS] txx9_cpu_clock setup move to rbtx4927_time_init()
[MIPS] txx9_board_vec set directly without mips_machtype
[MIPS] IP22: Add platform device for Indy volume buttons
[MIPS] cmbvr4133: Remove support
[MIPS] remove wrppmc_machine_power_off()
[MIPS] replace inline assembler to cpu_wait()
[MIPS] IP22/28: Add platform devices for HAL2
[MIPS] TXx9: Update and merge defconfigs
[MIPS] TXx9: Make single kernel can support multiple boards
[MIPS] TXx9: Update defconfigs
[MIPS] TXx9: Reorganize PCI code
...
* 'generic-ipi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (22 commits)
generic-ipi: more merge fallout
generic-ipi: merge fix
x86, visws: use mach-default/entry_arch.h
x86, visws: fix generic-ipi build
generic-ipi: fixlet
generic-ipi: fix s390 build bug
generic-ipi: fix linux-next tree build failure
fix: "smp_call_function: get rid of the unused nonatomic/retry argument"
fix: "smp_call_function: get rid of the unused nonatomic/retry argument"
fix "smp_call_function: get rid of the unused nonatomic/retry argument"
on_each_cpu(): kill unused 'retry' parameter
smp_call_function: get rid of the unused nonatomic/retry argument
sh: convert to generic helpers for IPI function calls
parisc: convert to generic helpers for IPI function calls
mips: convert to generic helpers for IPI function calls
m32r: convert to generic helpers for IPI function calls
arm: convert to generic helpers for IPI function calls
alpha: convert to generic helpers for IPI function calls
ia64: convert to generic helpers for IPI function calls
powerpc: convert to generic helpers for IPI function calls
...
Fix trivial conflicts due to rcu updates in kernel/rcupdate.c manually
* 'core/rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (23 commits)
rcu classic: update qlen when cpu offline
rcu: make rcutorture even more vicious: invoke RCU readers from irq handlers (timers)
rcu: make quiescent rcutorture less power-hungry
rcu, rcutorture: make quiescent rcutorture less power-hungry
rcu: make rcutorture more vicious: reinstate boot-time testing
rcu: make rcutorture more vicious: add stutter feature
rcutorture: WARN_ON_ONCE(1) when detecting an error
rcu: remove unused field struct rcu_data::rcu_tasklet
Revert "prohibit rcutorture from being compiled into the kernel"
rcu: fix nf_conntrack_helper.c build bug
rculist.h: fix include in net/netfilter/nf_conntrack_netlink.c
rcu: remove duplicated include in kernel/rcupreempt.c
rcu: remove duplicated include in kernel/rcupreempt_trace.c
RCU, rculist.h: fix list iterators
rcu: fix rcu_try_flip_waitack_needed() to prevent grace-period stall
rculist.h: use the rcu API
rcu: split list.h and move rcu-protected lists into rculist.h
sched: 1Q08 RCU doc update, add call_rcu_sched()
rcu: add call_rcu_sched() and friends to rcutorture
rcu: add rcu_barrier_sched() and rcu_barrier_bh()
...
Commit 1ea0704e0d aka "mm: add a ptep_modify_prot transaction abstraction"
caused:
| CC init/main.o
|In file included from include2/asm/pgtable.h:68,
| from /home/bigeasy/git/linux-2.6-m68k/include/linux/mm.h:39,
| from include2/asm/uaccess.h:8,
| from /home/bigeasy/git/linux-2.6-m68k/include/linux/poll.h:13,
| from /home/bigeasy/git/linux-2.6-m68k/include/linux/rtc.h:113,
| from /home/bigeasy/git/linux-2.6-m68k/include/linux/efi.h:19,
| from /home/bigeasy/git/linux-2.6-m68k/init/main.c:43:
|/linux-2.6/include/asm-generic/pgtable.h: In function '__ptep_modify_prot_start':
|/linux-2.6/include/asm-generic/pgtable.h:209: error: implicit declaration of function 'ptep_get_and_clear'
|/linux-2.6/include/asm-generic/pgtable.h:209: error: incompatible types in return
|/linux-2.6/include/asm-generic/pgtable.h: In function '__ptep_modify_prot_commit':
|/linux-2.6/include/asm-generic/pgtable.h:220: error: implicit declaration of function 'set_pte_at'
|make[2]: *** [init/main.o] Error 1
|make[1]: *** [init] Error 2
|make: *** [sub-make] Error 2
on my m68knommu box.
Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Sebastian Siewior <bigeasy@linutronix.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pass a more generic socket address type to nlmsvc_unlock_all_by_ip() to
allow for future support of IPv6. Also provide additional sanity
checking in failover_unlock_ip() when constructing the server's IP
address.
As an added bonus, provide clean kerneldoc comments on related NLM
interfaces which were recently added.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* 'sbp2-spindown' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6:
ieee1394: sbp2: spin disks down on suspend and shutdown
firewire: fw-sbp2: spin disks down on suspend and shutdown
ieee1394: sbp2: fix spindown for PL-3507 and TSB42AA9 firmwares
firewire: fw-sbp2: fix spindown for PL-3507 and TSB42AA9 firmwares
scsi: sd: optionally set power condition in START STOP UNIT