Commit graph

9317 commits

Author SHA1 Message Date
Paul Mackerras
aa04b4cc5b KVM: PPC: Allocate RMAs (Real Mode Areas) at boot for use by guests
This adds infrastructure which will be needed to allow book3s_hv KVM to
run on older POWER processors, including PPC970, which don't support
the Virtual Real Mode Area (VRMA) facility, but only the Real Mode
Offset (RMO) facility.  These processors require a physically
contiguous, aligned area of memory for each guest.  When the guest does
an access in real mode (MMU off), the address is compared against a
limit value, and if it is lower, the address is ORed with an offset
value (from the Real Mode Offset Register (RMOR)) and the result becomes
the real address for the access.  The size of the RMA has to be one of
a set of supported values, which usually includes 64MB, 128MB, 256MB
and some larger powers of 2.

Since we are unlikely to be able to allocate 64MB or more of physically
contiguous memory after the kernel has been running for a while, we
allocate a pool of RMAs at boot time using the bootmem allocator.  The
size and number of the RMAs can be set using the kvm_rma_size=xx and
kvm_rma_count=xx kernel command line options.

KVM exports a new capability, KVM_CAP_PPC_RMA, to signal the availability
of the pool of preallocated RMAs.  The capability value is 1 if the
processor can use an RMA but doesn't require one (because it supports
the VRMA facility), or 2 if the processor requires an RMA for each guest.

This adds a new ioctl, KVM_ALLOCATE_RMA, which allocates an RMA from the
pool and returns a file descriptor which can be used to map the RMA.  It
also returns the size of the RMA in the argument structure.

Having an RMA means we will get multiple KMV_SET_USER_MEMORY_REGION
ioctl calls from userspace.  To cope with this, we now preallocate the
kvm->arch.ram_pginfo array when the VM is created with a size sufficient
for up to 64GB of guest memory.  Subsequently we will get rid of this
array and use memory associated with each memslot instead.

This moves most of the code that translates the user addresses into
host pfns (page frame numbers) out of kvmppc_prepare_vrma up one level
to kvmppc_core_prepare_memory_region.  Also, instead of having to look
up the VMA for each page in order to check the page size, we now check
that the pages we get are compound pages of 16MB.  However, if we are
adding memory that is mapped to an RMA, we don't bother with calling
get_user_pages_fast and instead just offset from the base pfn for the
RMA.

Typically the RMA gets added after vcpus are created, which makes it
inconvenient to have the LPCR (logical partition control register) value
in the vcpu->arch struct, since the LPCR controls whether the processor
uses RMA or VRMA for the guest.  This moves the LPCR value into the
kvm->arch struct and arranges for the MER (mediated external request)
bit, which is the only bit that varies between vcpus, to be set in
assembly code when going into the guest if there is a pending external
interrupt request.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2011-07-12 13:16:57 +03:00
Paul Mackerras
371fefd6f2 KVM: PPC: Allow book3s_hv guests to use SMT processor modes
This lifts the restriction that book3s_hv guests can only run one
hardware thread per core, and allows them to use up to 4 threads
per core on POWER7.  The host still has to run single-threaded.

This capability is advertised to qemu through a new KVM_CAP_PPC_SMT
capability.  The return value of the ioctl querying this capability
is the number of vcpus per virtual CPU core (vcore), currently 4.

To use this, the host kernel should be booted with all threads
active, and then all the secondary threads should be offlined.
This will put the secondary threads into nap mode.  KVM will then
wake them from nap mode and use them for running guest code (while
they are still offline).  To wake the secondary threads, we send
them an IPI using a new xics_wake_cpu() function, implemented in
arch/powerpc/sysdev/xics/icp-native.c.  In other words, at this stage
we assume that the platform has a XICS interrupt controller and
we are using icp-native.c to drive it.  Since the woken thread will
need to acknowledge and clear the IPI, we also export the base
physical address of the XICS registers using kvmppc_set_xics_phys()
for use in the low-level KVM book3s code.

When a vcpu is created, it is assigned to a virtual CPU core.
The vcore number is obtained by dividing the vcpu number by the
number of threads per core in the host.  This number is exported
to userspace via the KVM_CAP_PPC_SMT capability.  If qemu wishes
to run the guest in single-threaded mode, it should make all vcpu
numbers be multiples of the number of threads per core.

We distinguish three states of a vcpu: runnable (i.e., ready to execute
the guest), blocked (that is, idle), and busy in host.  We currently
implement a policy that the vcore can run only when all its threads
are runnable or blocked.  This way, if a vcpu needs to execute elsewhere
in the kernel or in qemu, it can do so without being starved of CPU
by the other vcpus.

When a vcore starts to run, it executes in the context of one of the
vcpu threads.  The other vcpu threads all go to sleep and stay asleep
until something happens requiring the vcpu thread to return to qemu,
or to wake up to run the vcore (this can happen when another vcpu
thread goes from busy in host state to blocked).

It can happen that a vcpu goes from blocked to runnable state (e.g.
because of an interrupt), and the vcore it belongs to is already
running.  In that case it can start to run immediately as long as
the none of the vcpus in the vcore have started to exit the guest.
We send the next free thread in the vcore an IPI to get it to start
to execute the guest.  It synchronizes with the other threads via
the vcore->entry_exit_count field to make sure that it doesn't go
into the guest if the other vcpus are exiting by the time that it
is ready to actually enter the guest.

Note that there is no fixed relationship between the hardware thread
number and the vcpu number.  Hardware threads are assigned to vcpus
as they become runnable, so we will always use the lower-numbered
hardware threads in preference to higher-numbered threads if not all
the vcpus in the vcore are runnable, regardless of which vcpus are
runnable.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2011-07-12 13:16:57 +03:00
David Gibson
54738c0971 KVM: PPC: Accelerate H_PUT_TCE by implementing it in real mode
This improves I/O performance for guests using the PAPR
paravirtualization interface by making the H_PUT_TCE hcall faster, by
implementing it in real mode.  H_PUT_TCE is used for updating virtual
IOMMU tables, and is used both for virtual I/O and for real I/O in the
PAPR interface.

Since this moves the IOMMU tables into the kernel, we define a new
KVM_CREATE_SPAPR_TCE ioctl to allow qemu to create the tables.  The
ioctl returns a file descriptor which can be used to mmap the newly
created table.  The qemu driver models use them in the same way as
userspace managed tables, but they can be updated directly by the
guest with a real-mode H_PUT_TCE implementation, reducing the number
of host/guest context switches during guest IO.

There are certain circumstances where it is useful for userland qemu
to write to the TCE table even if the kernel H_PUT_TCE path is used
most of the time.  Specifically, allowing this will avoid awkwardness
when we need to reset the table.  More importantly, we will in the
future need to write the table in order to restore its state after a
checkpoint resume or migration.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2011-07-12 13:16:56 +03:00
Paul Mackerras
de56a948b9 KVM: PPC: Add support for Book3S processors in hypervisor mode
This adds support for KVM running on 64-bit Book 3S processors,
specifically POWER7, in hypervisor mode.  Using hypervisor mode means
that the guest can use the processor's supervisor mode.  That means
that the guest can execute privileged instructions and access privileged
registers itself without trapping to the host.  This gives excellent
performance, but does mean that KVM cannot emulate a processor
architecture other than the one that the hardware implements.

This code assumes that the guest is running paravirtualized using the
PAPR (Power Architecture Platform Requirements) interface, which is the
interface that IBM's PowerVM hypervisor uses.  That means that existing
Linux distributions that run on IBM pSeries machines will also run
under KVM without modification.  In order to communicate the PAPR
hypercalls to qemu, this adds a new KVM_EXIT_PAPR_HCALL exit code
to include/linux/kvm.h.

Currently the choice between book3s_hv support and book3s_pr support
(i.e. the existing code, which runs the guest in user mode) has to be
made at kernel configuration time, so a given kernel binary can only
do one or the other.

This new book3s_hv code doesn't support MMIO emulation at present.
Since we are running paravirtualized guests, this isn't a serious
restriction.

With the guest running in supervisor mode, most exceptions go straight
to the guest.  We will never get data or instruction storage or segment
interrupts, alignment interrupts, decrementer interrupts, program
interrupts, single-step interrupts, etc., coming to the hypervisor from
the guest.  Therefore this introduces a new KVMTEST_NONHV macro for the
exception entry path so that we don't have to do the KVM test on entry
to those exception handlers.

We do however get hypervisor decrementer, hypervisor data storage,
hypervisor instruction storage, and hypervisor emulation assist
interrupts, so we have to handle those.

In hypervisor mode, real-mode accesses can access all of RAM, not just
a limited amount.  Therefore we put all the guest state in the vcpu.arch
and use the shadow_vcpu in the PACA only for temporary scratch space.
We allocate the vcpu with kzalloc rather than vzalloc, and we don't use
anything in the kvmppc_vcpu_book3s struct, so we don't allocate it.
We don't have a shared page with the guest, but we still need a
kvm_vcpu_arch_shared struct to store the values of various registers,
so we include one in the vcpu_arch struct.

The POWER7 processor has a restriction that all threads in a core have
to be in the same partition.  MMU-on kernel code counts as a partition
(partition 0), so we have to do a partition switch on every entry to and
exit from the guest.  At present we require the host and guest to run
in single-thread mode because of this hardware restriction.

This code allocates a hashed page table for the guest and initializes
it with HPTEs for the guest's Virtual Real Memory Area (VRMA).  We
require that the guest memory is allocated using 16MB huge pages, in
order to simplify the low-level memory management.  This also means that
we can get away without tracking paging activity in the host for now,
since huge pages can't be paged or swapped.

This also adds a few new exports needed by the book3s_hv code.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2011-07-12 13:16:54 +03:00
Scott Wood
a4cd8b23ac KVM: PPC: e500: enable magic page
This is a shared page used for paravirtualization.  It is always present
in the guest kernel's effective address space at the address indicated
by the hypercall that enables it.

The physical address specified by the hypercall is not used, as
e500 does not have real mode.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2011-07-12 13:16:37 +03:00
Avi Kivity
411c588dfb KVM: MMU: Adjust shadow paging to work when SMEP=1 and CR0.WP=0
When CR0.WP=0, we sometimes map user pages as kernel pages (to allow
the kernel to write to them).  Unfortunately this also allows the kernel
to fetch from these pages, even if CR4.SMEP is set.

Adjust for this by also setting NX on the spte in these circumstances.

Signed-off-by: Avi Kivity <avi@redhat.com>
2011-07-12 13:16:26 +03:00
Jan Kiszka
58f0964ee4 KVM: Fix KVM_ASSIGN_SET_MSIX_ENTRY documentation
The documented behavior did not match the implemented one (which also
never changed).

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-07-12 13:16:19 +03:00
Jan Kiszka
91e3d71db2 KVM: Clarify KVM_ASSIGN_PCI_DEVICE documentation
Neither host_irq nor the guest_msi struct are used anymore today.
Tag the former, drop the latter to avoid confusion.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-07-12 13:16:16 +03:00
Jan Kiszka
7f4382e8fd KVM: Fixup documentation section numbering
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-07-12 13:16:10 +03:00
Sasha Levin
55399a02e9 KVM: Document KVM_IOEVENTFD
Document KVM_IOEVENTFD that can be used to receive
notifications of PIO/MMIO events without triggering
an exit.

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-07-12 13:15:56 +03:00
Nadav Har'El
823e396558 KVM: nVMX: Documentation
This patch includes a brief introduction to the nested vmx feature in the
Documentation/kvm directory. The document also includes a copy of the
vmcs12 structure, as requested by Avi Kivity.

[marcelo: move to Documentation/virtual/kvm]

Signed-off-by: Nadav Har'El <nyh@il.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-07-12 13:15:22 +03:00
Kevin Hilman
f3393b62f1 PM / Runtime: Add new helper function: pm_runtime_status_suspended()
This boolean function simply returns whether or not the runtime status
of the device is 'suspended'.  Unlike pm_runtime_suspended(), this
function returns the runtime status whether or not runtime PM for the
device has been disabled or not.

Also add entry to Documentation/power/runtime.txt

Signed-off-by: Kevin Hilman <khilman@ti.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-07-12 11:17:09 +02:00
Avi Kivity
e76779339b KVM: Document KVM_GET_LAPIC, KVM_SET_LAPIC ioctl
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-07-12 11:44:55 +03:00
Linus Torvalds
5adaf851d2 Documentation/Changes: remove some really obsolete text
That file harkens back to the days of the big 2.4 -> 2.6 version jump,
and was based even then on older versions.  Some of it is just obsolete,
and Jesper Juhl points out that it talks about kernel versions 2.6 and
should be updated to 3.0.

Remove some obsolete text, and re-phrase some other to not be 2.6-specific.

Reported-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-07-11 16:48:38 -07:00
Linus Torvalds
c15000b40d Merge branch 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6
* 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6:
  [media] msp3400: fill in v4l2_tuner based on vt->type field
  [media] tuner-core.c: don't change type field in g_tuner or g_frequency
  [media] cx18/ivtv: fix g_tuner support
  [media] tuner-core: power up tuner when called with s_power(1)
  [media] v4l2-ioctl.c: check for valid tuner type in S_HW_FREQ_SEEK
  [media] tuner-core: simplify the standard fixup
  [media] tuner-core/v4l2-subdev: document that the type field has to be filled in
  [media] v4l2-subdev.h: remove unused s_mode tuner op
  [media] feature-removal-schedule: change in how radio device nodes are handled
  [media] bttv: fix s_tuner for radio
  [media] pvrusb2: fix g/s_tuner support
  [media] v4l2-ioctl.c: prefill tuner type for g_frequency and g/s_tuner
  [media] tuner-core: fix tuner_resume: use t->mode instead of t->type
  [media] tuner-core: fix s_std and s_tuner
2011-07-11 16:43:27 -07:00
Linus Torvalds
145628130b Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mjg59/platform-drivers-x86
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mjg59/platform-drivers-x86:
  hp-wmi: fix use after free
  dell-laptop - using buffer without mutex_lock
  Revert: "dell-laptop: Toggle the unsupported hardware killswitch"
  platform-drivers-x86: set backlight type to BACKLIGHT_PLATFORM
  thinkpad-acpi: handle HKEY 0x4010, 0x4011 events
  drivers/platform/x86: Fix memory leak
  thinkpad-acpi: handle some new HKEY 0x60xx events
  acer-wmi: fix bitwise bug when set device state
  acer-wmi: Only update rfkill status for associated hotkey events
2011-07-11 12:47:09 -07:00
Muthu Kumar
0580181784 Documentation/spinlocks.txt: Remove reference to sti()/cli()
Since we removed sti()/cli() and related, how about removing it from
Documentation/spinlocks.txt?

Signed-off-by: Muthukumar R <muthur@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-07-11 12:45:04 -07:00
Johannes Berg
f15220eaa7 mac80211: fix docbook
I changed the TKIP key functions, but forgot to
update the documentation includes, fix that.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-07-11 15:02:18 -04:00
David Herrmann
3c1c2fce64 HID: wiimote: Add sysfs support to wiimote driver
Add sysfs files for each led of the wiimote. Writing 1 to the file
enables the led and 0 disables the led.

We do not need memory barriers when checking wdata->ready since we use
a spinlock directly after it.

Signed-off-by: David Herrmann <dh.herrmann@googlemail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2011-07-11 14:30:24 +02:00
Jiri Kosina
b7e9c223be Merge branch 'master' into for-next
Sync with Linus' tree to be able to apply pending patches that
are based on newer code already present upstream.
2011-07-11 14:15:55 +02:00
David Jander
fd05d08920 Input: gpio_keys - add support for device-tree platform data
This patch enables fetching configuration data, which is normally provided
via platform_data, from the device-tree instead.

If the device is configured from device-tree data, the platform_data struct
is not used, and button data needs to be allocated dynamically. Big part of
this patch deals with confining pdata usage to the probe function, to make
this possible.

Signed-off-by: David Jander <david@protonic.nl>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
2011-07-10 16:08:28 -07:00
Linus Torvalds
2169ce92ca Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
  PCI: conditional resource-reallocation through kernel parameter pci=realloc
2011-07-10 07:28:51 -07:00
Grant Likely
2e39e5be1d tty/serial: Add devicetree support for nVidia Tegra serial ports
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2011-07-10 06:37:43 +09:00
Binghua Duan
02c981c07b ARM: CSR: Adding CSR SiRFprimaII board support
SiRFprimaII is the latest generation application processor from CSR’s
Multifunction SoC product family. Designed around an ARM cortex A9 core,
high-speed memory bus, advanced 3D accelerator and full-HD multi-format
video decoder, SiRFprimaII is able to meet the needs of complicated
applications for modern multifunction devices that require heavy concurrent
applications and fluid user experience. Integrated with GPS baseband,
analog and PMU, this new platform is designed to provide a cost effective
solution for Automotive and Consumer markets.

This patch adds the basic support for this SoC and EVB board based on device
tree. It is following the ZYNQ of Xilinx in some degree.

Signed-off-by: Binghua Duan <Binghua.Duan@csr.com>
Signed-off-by: Rongjun Ying <Rongjun.Ying@csr.com>
Signed-off-by: Zhiwu Song <Zhiwu.Song@csr.com>
Signed-off-by: Yuping Luo <Yuping.Luo@csr.com>
Signed-off-by: Bin Shi <Bin.Shi@csr.com>
Signed-off-by: Huayi Li <Huayi.Li@csr.com>
Signed-off-by: Barry Song <Baohua.Song@csr.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
2011-07-09 07:19:28 +08:00
Ram Pai
f483d3923d PCI: conditional resource-reallocation through kernel parameter pci=realloc
Multiple attempts to dynamically reallocate pci resources have
unfortunately lead to regressions. Though we continue to fix the
regressions and fine tune the dynamic-reallocation behavior, we have not
reached a acceptable state yet.
    
This patch provides a interim solution. It disables dynamic reallocation
by default, but adds the ability to enable it through pci=realloc kernel
command line parameter.
    
Tested-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Ram Pai <linuxram@us.ibm.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-07-08 15:49:20 -07:00
Kirill Smelkov
cc62a7eb63 USB: EHCI: Allow users to override 80% max periodic bandwidth
There are cases, when 80% max isochronous bandwidth is too limiting.

For example I have two USB video capture cards which stream uncompressed
video, and to stream full NTSC + PAL videos we'd need

    NTSC 640x480 YUV422 @30fps      ~17.6 MB/s
    PAL  720x576 YUV422 @25fps      ~19.7 MB/s

isoc bandwidth.

Now, due to limited alt settings in capture devices NTSC one ends up
streaming with max_pkt_size=2688  and  PAL with max_pkt_size=2892, both
with interval=1. In terms of microframe time allocation this gives

    NTSC    ~53us
    PAL     ~57us

and together

    ~110us  >  100us == 80% of 125us uframe time.

So those two devices can't work together simultaneously because the'd
over allocate isochronous bandwidth.

80% seemed a bit arbitrary to me, and I've tried to raise it to 90% and
both devices started to work together, so I though sometimes it would be
a good idea for users to override hardcoded default of max 80% isoc
bandwidth.

After all, isn't it a user who should decide how to load the bus? If I
can live with 10% or even 5% bulk bandwidth that should be ok. I'm a USB
newcomer, but that 80% set in stone by USB 2.0 specification seems to be
chosen pretty arbitrary to me, just to serve as a reasonable default.

NOTE 1
~~~~~~

for two streams with max_pkt_size=3072 (worst case) both time
allocation would be 60us+60us=120us which is 96% periodic bandwidth
leaving 4% for bulk and control.  Alan Stern suggested that bulk then
would be problematic (less than 300*8 bittimes left per microframe), but
I think that is still enough for control traffic.

NOTE 2
~~~~~~

Sarah Sharp expressed concern that maxing out periodic bandwidth
could lead to vendor-specific hardware bugs on host controllers, because

> It's entirely possible that you'll run into
> vendor-specific bugs if you try to pack the schedule with isochronous
> transfers.  I don't think any hardware designer would seriously test or
> validate their hardware with a schedule that is basically a violation of
> the USB bus spec (more than 80% for periodic transfers).

So far I've only tested this patch on my HP Mini 5103 with N10 chipset

    kirr@mini:~$ lspci
    00:00.0 Host bridge: Intel Corporation N10 Family DMI Bridge
    00:02.0 VGA compatible controller: Intel Corporation N10 Family Integrated Graphics Controller
    00:02.1 Display controller: Intel Corporation N10 Family Integrated Graphics Controller
    00:1b.0 Audio device: Intel Corporation N10/ICH 7 Family High Definition Audio Controller (rev 02)
    00:1c.0 PCI bridge: Intel Corporation N10/ICH 7 Family PCI Express Port 1 (rev 02)
    00:1c.3 PCI bridge: Intel Corporation N10/ICH 7 Family PCI Express Port 4 (rev 02)
    00:1d.0 USB Controller: Intel Corporation N10/ICH 7 Family USB UHCI Controller #1 (rev 02)
    00:1d.1 USB Controller: Intel Corporation N10/ICH 7 Family USB UHCI Controller #2 (rev 02)
    00:1d.2 USB Controller: Intel Corporation N10/ICH 7 Family USB UHCI Controller #3 (rev 02)
    00:1d.3 USB Controller: Intel Corporation N10/ICH 7 Family USB UHCI Controller #4 (rev 02)
    00:1d.7 USB Controller: Intel Corporation N10/ICH 7 Family USB2 EHCI Controller (rev 02)
    00:1e.0 PCI bridge: Intel Corporation 82801 Mobile PCI Bridge (rev e2)
    00:1f.0 ISA bridge: Intel Corporation NM10 Family LPC Controller (rev 02)
    00:1f.2 SATA controller: Intel Corporation N10/ICH7 Family SATA AHCI Controller (rev 02)
    01:00.0 Network controller: Broadcom Corporation BCM4313 802.11b/g/n Wireless LAN Controller (rev 01)
    02:00.0 Ethernet controller: Marvell Technology Group Ltd. 88E8059 PCI-E Gigabit Ethernet Controller (rev 11)

and the system works stable with 110us/uframe (~88%) isoc bandwith allocated for
above-mentioned isochronous transfers.

NOTE 3
~~~~~~

This feature is off by default. I mean max periodic bandwidth is set to
100us/uframe by default exactly as it was before the patch. So only those of us
who need the extreme settings are taking the risk - normal users who do not
alter uframe_periodic_max sysfs attribute should not see any change at all.

NOTE 4
~~~~~~

I've tried to update documentation in Documentation/ABI/ thoroughly, but
only "TBD" was put into Documentation/usb/ehci.txt -- the text there seems
to be outdated and much needing refreshing, before it could be amended.

Cc: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-07-08 14:51:33 -07:00
Shawn Guo
8937cb602b gpio/mxc: add device tree probe support
The patch adds device tree probe support for gpio-mxc driver.

Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2011-07-08 12:38:18 -06:00
David S. Miller
06b8fc5d30 net: Fix default in docs for tcp_orphan_retries.
Default should be listed at 8 instead of 7.

Reported-by: Denys Fedoryshchenko <denys@visp.net.lb>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-07-08 09:31:31 -07:00
John W. Linville
204d1641d2 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 into for-davem 2011-07-08 11:03:36 -04:00
Timur Tabi
6db7199407 drivers/virt: introduce Freescale hypervisor management driver
Add the drivers/virt directory, which houses drivers that support
virtualization environments, and add the Freescale hypervisor management
driver.

The Freescale hypervisor management driver provides several services to
drivers and applications related to the Freescale hypervisor:

1. An ioctl interface for querying and managing partitions

2. A file interface to reading incoming doorbells

3. An interrupt handler for shutting down the partition upon receiving the
   shutdown doorbell from a manager partition

4. A kernel interface for receiving callbacks when a managed partition
   shuts down.

Signed-off-by: Timur Tabi <timur@freescale.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2011-07-08 00:21:27 -05:00
David Howells
c902ce1bfb FS-Cache: Add a helper to bulk uncache pages on an inode
Add an FS-Cache helper to bulk uncache pages on an inode.  This will
only work for the circumstance where the pages in the cache correspond
1:1 with the pages attached to an inode's page cache.

This is required for CIFS and NFS: When disabling inode cookie, we were
returning the cookie and setting cifsi->fscache to NULL but failed to
invalidate any previously mapped pages.  This resulted in "Bad page
state" errors and manifested in other kind of errors when running
fsstress.  Fix it by uncaching mapped pages when we disable the inode
cookie.

This patch should fix the following oops and "Bad page state" errors
seen during fsstress testing.

  ------------[ cut here ]------------
  kernel BUG at fs/cachefiles/namei.c:201!
  invalid opcode: 0000 [#1] SMP
  Pid: 5, comm: kworker/u:0 Not tainted 2.6.38.7-30.fc15.x86_64 #1 Bochs Bochs
  RIP: 0010: cachefiles_walk_to_object+0x436/0x745 [cachefiles]
  RSP: 0018:ffff88002ce6dd00  EFLAGS: 00010282
  RAX: ffff88002ef165f0 RBX: ffff88001811f500 RCX: 0000000000000000
  RDX: 0000000000000000 RSI: 0000000000000100 RDI: 0000000000000282
  RBP: ffff88002ce6dda0 R08: 0000000000000100 R09: ffffffff81b3a300
  R10: 0000ffff00066c0a R11: 0000000000000003 R12: ffff88002ae54840
  R13: ffff88002ae54840 R14: ffff880029c29c00 R15: ffff88001811f4b0
  FS:  00007f394dd32720(0000) GS:ffff88002ef00000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
  CR2: 00007fffcb62ddf8 CR3: 000000001825f000 CR4: 00000000000006e0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
  Process kworker/u:0 (pid: 5, threadinfo ffff88002ce6c000, task ffff88002ce55cc0)
  Stack:
   0000000000000246 ffff88002ce55cc0 ffff88002ce6dd58 ffff88001815dc00
   ffff8800185246c0 ffff88001811f618 ffff880029c29d18 ffff88001811f380
   ffff88002ce6dd50 ffffffff814757e4 ffff88002ce6dda0 ffffffff8106ac56
  Call Trace:
   cachefiles_lookup_object+0x78/0xd4 [cachefiles]
   fscache_lookup_object+0x131/0x16d [fscache]
   fscache_object_work_func+0x1bc/0x669 [fscache]
   process_one_work+0x186/0x298
   worker_thread+0xda/0x15d
   kthread+0x84/0x8c
   kernel_thread_helper+0x4/0x10
  RIP  cachefiles_walk_to_object+0x436/0x745 [cachefiles]
  ---[ end trace 1d481c9af1804caa ]---

I tested the uncaching by the following means:

 (1) Create a big file on my NFS server (104857600 bytes).

 (2) Read the file into the cache with md5sum on the NFS client.  Look in
     /proc/fs/fscache/stats:

	Pages  : mrk=25601 unc=0

 (3) Open the file for read/write ("bash 5<>/warthog/bigfile").  Look in proc
     again:

	Pages  : mrk=25601 unc=25601

Reported-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-and-Tested-by: Suresh Jayaraman <sjayaraman@suse.de>
cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-07-07 13:21:56 -07:00
Kim Phillips
5a9ebe9599 dt: bindings: move SEC node under new crypto/
Since technically it's not powerpc arch-specific.  Also rename it sec2
to differentiate it from its incompatible successor, the SEC 4.

Signed-off-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2011-07-07 13:51:19 -06:00
Hans Verkuil
6293698277 [media] feature-removal-schedule: change in how radio device nodes are handled
Radio devices have weird side-effects when used with combined TV/radio
tuners and the V4L2 spec is ambiguous on how it should work. This results
in inconsistent driver behavior which makes life hard for everyone.

Be more strict in when and how the switch between radio and tv mode
takes place and make sure all drivers behave the same.

Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2011-07-07 15:03:47 -03:00
Henrique de Moraes Holschuh
a50245af78 thinkpad-acpi: handle HKEY 0x4010, 0x4011 events
Handle events 0x4010 and 0x4011 so that we do not pester users about them.

These events report when the thinkpad is docked/undocked to a native
hotplug dock (i.e. one that does not need ACPI handling, nor is represented
in the ACPI device tree).  Such docks are based on USB 2.0/3.0, and also
work as port replicators.

We really want a proper dock class to report these, or at least new input
EV_SW events.  Since it is not clear which one to use yet, keep reporting
them as vendor-specific ThinkPad events.

WARNING: As defined by the thinkpad-acpi sysfs ABI rules of engagement, the
vendor-specific events will be REMOVED as soon as generic events are made
available (duplicate events are a big problem), with an appropriate update
to the thinkpad-acpi sysfs/event ABI versioning.  Userspace is already
prepared to provide easy backwards compatibility for such changes when
convenient to the distro (see acpi-fakekey).

* Event 0x4010: docking to hotplug dock/port replicator
* Event 0x4011: undocking from hotplug dock/port replicator

Typical usecase would be to trigger display reconfiguration.

Reports mention T410, T510, and series 3 docks/port replicators.  Special
thanks to Robert de Rooy for his extensive report and analysis of the
situation.

http://www.thinkwiki.org/wiki/ThinkPad_Port_Replicator_Series_3
http://www.thinkwiki.org/wiki/ThinkPad_Mini_Dock_Series_3
http://www.thinkwiki.org/wiki/ThinkPad_Mini_Dock_Plus_Series_3
http://www.thinkwiki.org/wiki/ThinkPad_Mini_Dock_Plus_Series_3_for_Mobile_Workstations
http://lenovoblogs.com/insidethebox/?p=290

Signed-off-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Reported-by: Claudius Hubig <claudiushubig@chubig.net>
Reported-by: Doctor Bill <docbill@gmail.com>
Reported-by: Korte Noack <gbk.noack@gmx.de>
Reported-by: Robert de Rooy <robert.de.rooy@gmail.com>
Reported-by: Sebastian Will <swill@csail.mit.edu>
Signed-off-by: Matthew Garrett <mjg@redhat.com>
2011-07-07 10:39:05 -04:00
Henrique de Moraes Holschuh
2d43f671c8 thinkpad-acpi: handle some new HKEY 0x60xx events
Handle some user interface events from the newer Lenovo models.  We are likely
to do something smart with these events in the future, for now, hide the ones
we are already certain about from the user and userspace both.

* Events 0x6000 and 0x6005 are key-related.  0x6005 is not properly identified
  yet.  Ignore these events, and do not report them.

* Event 0x6040 has not been properly identified yet, and we don't know if it
  is important (looks like it isn't, but still...).  Keep reporting it.

* Change the message the driver outputs on unknown 0x6xxx events, as all
  recent events are not related to thermal alarms.  Degrade log level from
  ALERT to WARNING.

Thanks to all users who reported these events or asked about them in a number
of mailing lists.  Your help is highly appreciated, even if I did took a lot of
time to act on them.  For that I apologise.

I will list those that identified the reasons for the events as "reported-by",
and I apologise in advance if I leave anyone out: it was not done on purpose, I
made the mistake of not properly tagging all event report emails separately,
and might have missed some.

Signed-off-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
Reported-by: Markus Malkusch <markus@malkusch.de>
Reported-by: Peter Giles <g1l3sp@gmail.com>
Signed-off-by: Matthew Garrett <mjg@redhat.com>
2011-07-07 10:39:00 -04:00
Michael Büsch
eb032b9837 Update my e-mail address
Signed-off-by: Michael Buesch <m@bues.ch>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2011-07-07 15:18:01 +02:00
Shan Wei
d804c6f212 net: doc: fix compile warning of no format arguments in ifenslave.c
Fix following warning in ifenslave.c with gcc version 4.5.2 (Ubuntu/Linaro 4.5.2-8ubuntu4).

Documentation/networking/ifenslave.c:263:4: warning: format not a string literal and no format arguments
Documentation/networking/ifenslave.c:271:3: warning: format not a string literal and no format arguments
Documentation/networking/ifenslave.c:277:3: warning: format not a string literal and no format arguments
Documentation/networking/ifenslave.c:285:3: warning: format not a string literal and no format arguments
Documentation/networking/ifenslave.c:291:3: warning: format not a string literal and no format arguments
Documentation/networking/ifenslave.c:292:3: warning: format not a string literal and no format arguments
Documentation/networking/ifenslave.c:312:4: warning: format not a string literal and no format arguments
Documentation/networking/ifenslave.c:323:3: warning: format not a string literal and no format arguments
Documentation/networking/ifenslave.c:342:4: warning: format not a string literal and no format arguments


Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-07-07 00:27:49 -07:00
Daniel Drake
cfee95977b x86, olpc: Add XO-1 RTC driver
Add a driver to configure the XO-1 RTC via CS5536 MSRs, to be used as a
system wakeup source via olpc-xo1-pm.

Device detection is based on finding the relevant device tree node.

Signed-off-by: Daniel Drake <dsd@laptop.org>
Link: http://lkml.kernel.org/r/1309019658-1712-11-git-send-email-dsd@laptop.org
Acked-by: Andres Salomon <dilinger@queued.net>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: devicetree-discuss@lists.ozlabs.org
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2011-07-06 14:44:42 -07:00
Andrea Righi
9b61fc4cf3 Documentation: fix cgroup blkio throttle filenames
All the blkio.throttle.* file names are incorrectly reported without
".throttle" in the documentation. Fix it.

Signed-off-by: Andrea Righi <andrea@betterlinux.com>
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-07-06 13:17:51 -07:00
Jesper Juhl
316b379988 Documentation: update CodingStyle memory allocators
The list of available general purpose memory allocators in
Documentation/CodingStyle chapter 14 is incomplete. This patch adds
the missing vzalloc() to the list.

Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-07-06 13:17:51 -07:00
Rafael J. Wysocki
62052ab1d1 PM / Runtime: Replace "run-time" with "runtime" in documentation
The runtime PM documentation and kerneldoc comments sometimes spell
"runtime" with a dash (i.e. "run-time").  Replace all of those
instances with "runtime" to make the naming consistent.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-07-06 10:52:13 +02:00
Rafael J. Wysocki
e358bad75f PM / Runtime: Improve documentation of enable, disable and barrier
The runtime PM documentation in Documentation/power/runtime_pm.txt
doesn't say that pm_runtime_enable() and pm_runtime_disable() work by
operating on power.disable_depth, which is wrong, because the
possibility of nesting disables doesn't follow from the description
of these functions.  Also, there is no description of
pm_runtime_barrier() at all in the document, which is confusing.
Improve the documentation by fixing those issues.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-07-06 10:52:06 +02:00
Rafael J. Wysocki
1e2ef05bb8 PM: Limit race conditions between runtime PM and system sleep (v2)
One of the roles of the PM core is to prevent different PM callbacks
executed for the same device object from racing with each other.
Unfortunately, after commit e866500247
(PM: Allow pm_runtime_suspend() to succeed during system suspend)
runtime PM callbacks may be executed concurrently with system
suspend/resume callbacks for the same device.

The main reason for commit e866500247
was that some subsystems and device drivers wanted to use runtime PM
helpers, pm_runtime_suspend() and pm_runtime_put_sync() in
particular, for carrying out the suspend of devices in their
.suspend() callbacks.  However, as it's been determined recently,
there are multiple reasons not to do so, inlcuding:

 * The caller really doesn't control the runtime PM usage counters,
   because user space can access them through sysfs and effectively
   block runtime PM.  That means using pm_runtime_suspend() or
   pm_runtime_get_sync() to suspend devices during system suspend
   may or may not work.

 * If a driver calls pm_runtime_suspend() from its .suspend()
   callback, it causes the subsystem's .runtime_suspend() callback to
   be executed, which leads to the call sequence:

   subsys->suspend(dev)
      driver->suspend(dev)
         pm_runtime_suspend(dev)
            subsys->runtime_suspend(dev)

   recursive from the subsystem's point of view.  For some subsystems
   that may actually work (e.g. the platform bus type), but for some
   it will fail in a rather spectacular fashion (e.g. PCI).  In each
   case it means a layering violation.

 * Both the subsystem and the driver can provide .suspend_noirq()
   callbacks for system suspend that can do whatever the
   .runtime_suspend() callbacks do just fine, so it really isn't
   necessary to call pm_runtime_suspend() during system suspend.

 * The runtime PM's handling of wakeup devices is usually different
   from the system suspend's one, so .runtime_suspend() may simply be
   inappropriate for system suspend.

 * System suspend is supposed to work even if CONFIG_PM_RUNTIME is
   unset.

 * The runtime PM workqueue is frozen before system suspend, so if
   whatever the driver is going to do during system suspend depends
   on it, that simply won't work.

Still, there is a good reason to allow pm_runtime_resume() to
succeed during system suspend and resume (for instance, some
subsystems and device drivers may legitimately use it to ensure that
their devices are in full-power states before suspending them).
Moreover, there is no reason to prevent runtime PM callbacks from
being executed in parallel with the system suspend/resume .prepare()
and .complete() callbacks and the code removed by commit
e866500247 went too far in this
respect.  On the other hand, runtime PM callbacks, including
.runtime_resume(), must not be executed during system suspend's
"late" stage of suspending devices and during system resume's "early"
device resume stage.

Taking all of the above into consideration, make the PM core
acquire a runtime PM reference to every device and resume it if
there's a runtime PM resume request pending right before executing
the subsystem-level .suspend() callback for it.  Make the PM core
drop references to all devices right after executing the
subsystem-level .resume() callbacks for them.  Additionally,
make the PM core disable the runtime PM framework for all devices
during system suspend, after executing the subsystem-level .suspend()
callbacks for them, and enable the runtime PM framework for all
devices during system resume, right before executing the
subsystem-level .resume() callbacks for them.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Kevin Hilman <khilman@ti.com>
2011-07-06 10:51:58 +02:00
David S. Miller
e12fe68ce3 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2011-07-05 23:23:37 -07:00
Stephen Warren
22032c7774 spi/tegra: Use engineering names in DT compatible property
Engineering names are more stable than marketing names. Hence, use them
for Device Tree compatible properties instead.

Signed-off-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2011-07-05 23:38:30 -06:00
Stephen Warren
f7f678a063 gpio/tegra: Use engineering names in DT compatible property
Engineering names are more stable than marketing names. Hence, use them
for Device Tree compatible properties instead.

Signed-off-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2011-07-05 23:36:32 -06:00
Aloisio Almeida Jr
14205aa21c NFC: add Documentation/networking/nfc.txt
Signed-off-by: Aloisio Almeida Jr <aloisio.almeida@openbossa.org>
Signed-off-by: Lauro Ramos Venancio <lauro.venancio@openbossa.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-07-05 15:26:58 -04:00
Greg Kroah-Hartman
0c9e98af5e Merge Linux 3.0-rc6 into staging-next
This handles the merge conflicts with the
drivers/staging/brcm80211/Kconfig file due to changes on the two
different branches.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-07-05 07:35:09 -07:00
Max Matveev
6539fefd9b Update documented default values for various TCP/UDP tunables
tcp_rmem and tcp_wmem use 1 page as default value for the minimum
amount of memory to be used, same as udp_wmem_min and udp_rmem_min.
Pages are different size on different architectures - use the right
units when describing the defaults.

Reviewed-by: Shan Wei <shanwei@cn.fujitsu.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Max Matveev <makc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-07-04 19:14:09 -07:00
Max Matveev
a6e1204b82 Update description of net.sctp.sctp_rmem and net.sctp.sctp_wmem tunables
sctp does not use second and third ("default" and "max") values
of sctp_rmem tunable. The format is the same as tcp_rmem
but the meaning is different so make the documentation explicit to
avoid confusion.

sctp_wmem is not used at all.

Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Max Matveev <makc@redhat.com>
Reviewed-by: Shan Wei <shanwei@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-07-04 19:14:09 -07:00