We now use load_gs_index() to load gs safely; unfortunately this also
changes MSR_KERNEL_GS_BASE, which we managed separately. This resulted
in confusion and breakage running 32-bit host userspace on a 64-bit kernel.
Fix by
- saving guest MSR_KERNEL_GS_BASE before we we reload the host's gs
- doing the host save/load unconditionally, instead of only when in guest
long mode
Things can be cleaned up further, but this is the minmal fix for now.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
If fs or gs refer to the ldt, they must be reloaded after the ldt. Reorder
the code to that effect.
Userspace code that uses the ldt with kvm is nonexistent, so this doesn't fix
a user-visible bug.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Making /proc/kallsyms readable only for root by default makes it
slightly harder for attackers to write generic kernel exploits by
removing one source of knowledge where things are in the kernel.
This is the second submit, discussion happened on this on first submit
and mostly concerned that this is just one hole of the sieve ... but
one of the bigger ones.
Changing the permissions of at least System.map and vmlinux is also
required to fix the same set, but a packaging issue.
Target of this starter patch and follow ups is removing any kind of
kernel space address information leak from the kernel.
[ Side note: the default of root-only reading is the "safe" value, and
it's easy enough to then override at any time after boot. The /proc
filesystem allows root to change the permissions with a regular
chmod, so you can "revert" this at run-time by simply doing
chmod og+r /proc/kallsyms
as root if you really want regular users to see the kernel symbols.
It does help some tools like "perf" figure them out without any
setup, so it may well make sense in some situations. - Linus ]
Signed-off-by: Marcus Meissner <meissner@suse.de>
Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: Eugene Teo <eugeneteo@kernel.org>
Reviewed-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6:
nfs: Ignore kmemleak false positive in nfs_readdir_make_qstr
SUNRPC: Simplify rpc_alloc_iostats by removing pointless local variable
nfs: trivial: remove unused nfs_wait_event macro
NFS: readdir shouldn't read beyond the reply returned by the server
NFS: Fix a couple of regressions in readdir.
Revert "NFSv4: Fall back to ordinary lookup if nfs4_atomic_open() returns EISDIR"
Regression: fix mounting NFS when NFSv3 support is not compiled
NLM: Fix a regression in lockd
* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
sched: Fix cross-sched-class wakeup preemption
sched: Fix runnable condition for stoptask
sched: Use group weight, idle cpu metrics to fix imbalances during idle
* 'pm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6:
PM / PM QoS: Fix reversed min and max
PM / OPP: Hide OPP configuration when SoCs do not provide an implementation
PM: Allow devices to be removed during late suspend and early resume
Move the mid-layer's ->queuecommand() invocation from being locked
with the host lock to being unlocked to facilitate speeding up the
critical path for drivers who don't need this lock taken anyway.
The patch below presents a simple SCSI host lock push-down as an
equivalent transformation. No locking or other behavior should change
with this patch. All existing bugs and locking orders are preserved.
Additionally, add one parameter to queuecommand,
struct Scsi_Host *
and remove one parameter from queuecommand,
void (*done)(struct scsi_cmnd *)
Scsi_Host* is a convenient pointer that most host drivers need anyway,
and 'done' is redundant to struct scsi_cmnd->scsi_done.
Minimal code disturbance was attempted with this change. Most drivers
needed only two one-line modifications for their host lock push-down.
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Acked-by: James Bottomley <James.Bottomley@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6:
[S390] kprobes: Fix the return address of multiple kretprobes
[S390] kprobes: disable interrupts throughout
[S390] ftrace: build without frame pointers on s390
[S390] mm: add devmem_is_allowed() for STRICT_DEVMEM checking
[S390] vmlogrdr: purge after recording is switched off
[S390] cio: fix incorrect ccw_device_init_count
[S390] tape: add medium state notifications
[S390] fix get_user_pages_fast
I just loaded 2.6.37-rc2 on my machines, and I noticed that X no longer starts.
Running an strace of the X server shows that it's doing this:
open("/sys/bus/pci/devices/0000:07:00.0/resource0", O_RDWR) = 10
mmap(NULL, 16777216, PROT_READ|PROT_WRITE, MAP_SHARED, 10, 0) = -1 EINVAL (Invalid argument)
This code seems to be asking for a shared read/write mapping of 16MB worth of
BAR0 starting at file offset 0, and letting the kernel assign a starting
address. Unfortunately, this -EINVAL causes X not to start. Looking into
dmesg, there's a complaint like so:
process "Xorg" tried to map 0x01000000 bytes at page 0x00000000 on 0000:07:00.0 BAR 0 (start 0x 96000000, size 0x 1000000)
...with the following code in pci_mmap_fits:
pci_start = (mmap_api == PCI_MMAP_SYSFS) ?
pci_resource_start(pdev, resno) >> PAGE_SHIFT : 0;
if (start >= pci_start && start < pci_start + size &&
start + nr <= pci_start + size)
It looks like the logic here is set up such that when the mmap call comes via
sysfs, the check in pci_mmap_fits wants vma->vm_pgoff to be between the
resource's start and end address, and the end of the vma to be no farther than
the end. However, the sysfs PCI resource files always start at offset zero,
which means that this test always fails for programs that mmap the sysfs files.
Given the comment in the original commit
3b519e4ea6, I _think_ the old procfs files
require that the file offset be equal to the resource's base address when
mmapping.
I think what we want here is for pci_start to be 0 when mmap_api ==
PCI_MMAP_PROCFS. The following patch makes that change, after which the Matrox
and Mach64 X drivers work again.
Acked-by: Martin Wilck <martin.wilck@ts.fujitsu.com>
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Strings allocated via kmemdup() in nfs_readdir_make_qstr() are
referenced from the nfs_cache_array which is stored in a page cache
page. Kmemleak does not scan such pages and it reports several false
positives. This patch annotates the string->name pointer so that
kmemleak does not consider it a real leak.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Bryan Schumaker <bjschuma@netapp.com>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Hi,
We can simplify net/sunrpc/stats.c::rpc_alloc_iostats() a bit by getting
rid of the unneeded local variable 'new'.
Please CC me on replies.
Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Sigh...
Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix up the issue that array->eof_index needs to be able to be set
even if array->size == 0.
Ensure that we catch all important memory allocation error conditions
and/or kmap() failures.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
This reverts commit 80e60639f1.
This change requires further fixes to ensure that the open doesn't
succeed if the lookup later results in a regular file being created.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trying to mount NFS (root partition in my case) fails if CONFIG_NFS_V3
is not selected. nfs_validate_mount_data() returns EPROTONOSUPPORT,
because of this check:
#ifndef CONFIG_NFS_V3
if (args->version == 3)
goto out_v3_not_compiled;
#endif /* !CONFIG_NFS_V3 */
and args->version was always initialized to 3.
It was working in 2.6.36
Signed-off-by: Paulius Zaleckas <paulius.zaleckas@gmail.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Nick Bowler reports:
There are no unusual messages on the client... but I just logged into
the server and I see lots of messages of the following form:
nfsd: request from insecure port (192.168.8.199:35766)!
nfsd: request from insecure port (192.168.8.199:35766)!
nfsd: request from insecure port (192.168.8.199:35766)!
nfsd: request from insecure port (192.168.8.199:35766)!
nfsd: request from insecure port (192.168.8.199:35766)!
Bisected to commit 9247685088 (SUNRPC:
Properly initialize sock_xprt.srcaddr in all cases)
Apparently, removing the 'transport->srcaddr.ss_family = family' from
xs_create_sock() triggers this due to nlmclnt_lookup_host() incorrectly
initialising the srcaddr family to AF_UNSPEC.
Reported-by: Nick Bowler <nbowler@elliptictech.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
The addition of CONFIG_SECURITY_DMESG_RESTRICT resulted in a build
failure when CONFIG_PRINTK=n. This is because the capabilities code
which used the new option was built even though the variable in question
didn't exist.
The patch here fixes this by moving the capabilities checks out of the
LSM and into the caller. All (known) LSMs should have been calling the
capabilities hook already so it actually makes the code organization
better to eliminate the hook altogether.
Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'omap-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap-2.6:
arm: omap1: devices: need to return with a value
OMAP1: camera.h: add missing include
omap: dma: Add read-back to DMA interrupt handler to avoid spuriousinterrupts
OMAP2: Devkit8000: Fix mmc regulator failure
* 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging:
hwmon: (w83795) Check for BEEP pin availability
hwmon: (w83795) Clear intrusion alarm immediately
hwmon: (w83795) Read the intrusion state properly
hwmon: (w83795) Print the actual temperature channels as sources
hwmon: (w83795) List all usable temperature sources
hwmon: (w83795) Expose fan control method
hwmon: (w83795) Fix fan control mode attributes
hwmon: (lm95241) Check validity of input values
hwmon: Change mail address of Hans J. Koch
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
PCI: sysfs: fix printk warnings
PCI: fix pci_bus_alloc_resource() hang, prefer positive decode
PCI: read current power state at enable time
PCI: fix size checks for mmap() on /proc/bus/pci files
x86/PCI: coalesce overlapping host bridge windows
PCI hotplug: ibmphp: Add check to prevent reading beyond mapped area
pm_qos_get_value had min and max reversed, causing all pm_qos
requests to have no effect.
Signed-off-by: Colin Cross <ccross@android.com>
Acked-by: mark <markgross@thegnar.org>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Cc: stable@kernel.org
Make sure I2C adapters being registered have the required struct
fields set. If they don't, problems will happen later.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
It's about time to make it clear that i2c_adapter.id is deprecated.
Hopefully this will remind the last user to move over to a different
strategy.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Acked-by: Jarod Wilson <jarod@redhat.com>
Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Acked-by: Hans Verkuil <hverkuil@xs4all.nl>
Drivers don't need to include <linux/i2c-id.h>, especially not when
they don't use anything that header file provides.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: Michael Hunold <michael@mihu.de>
Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Delete unused I2C adapter IDs. Special cases are:
* I2C_HW_B_RIVA was still set in driver rivafb, however no other
driver is ever looking for this value, so we can safely remove it.
* I2C_HW_B_HDPVR is used in staging driver lirc_zilog, however no
adapter ID is ever set to this value, so the code in question never
runs. As the code additionally expects that I2C_HW_B_HDPVR may not
be defined, we can delete it now and let the lirc_zilog driver
maintainer rewrite this piece of code.
Big thanks for Hans Verkuil for doing all the hard work :)
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Acked-by: Jarod Wilson <jarod@redhat.com>
Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Acked-by: Hans Verkuil <hverkuil@xs4all.nl>
A few new i2c-drivers came into the kernel which clear the clientdata-pointer
on exit. This is obsolete meanwhile, so fix it and hope the word will spread.
Signed-off-by: Wolfram Sang <w.sang@pengutronix.de>
Acked-by: Alan Cox <alan@linux.intel.com>
Acked-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Move the logging bits from kernel.h into printk.h so that
there is a bit more logical separation of the generic from
the printk logging specific parts.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The fix in commit 6b4e81db25 ("i8k: Tell gcc that *regs gets
clobbered") to work around the gcc miscompiling i8k.c to add "+m
(*regs)" caused register pressure problems and a build failure.
Changing the 'asm' statement to 'asm volatile' instead should prevent
that and works around the gcc bug as well, so we can remove the "+m".
[ Background on the gcc bug: a memory clobber fails to mark the function
the asm resides in as non-pure (aka "__attribute__((const))"), so if
the function does nothing else that triggers the non-pure logic, gcc
will think that that function has no side effects at all. As a result,
callers will be mis-compiled.
Adding the "+m" made gcc see that it's not a pure function, and so
does "asm volatile". The problem was never really the need to mark
"*regs" as changed, since the memory clobber did that part - the
problem was just a bug in the gcc "pure" function analysis - Linus ]
Signed-off-by: Jim Bos <jim876@xs4all.nl>
Acked-by: Jakub Jelinek <jakub@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Andreas Schwab <schwab@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
On the W83795ADG, there's a single pin for BEEP and OVT#, so you
can't have both. Check the configuration and don't create beep
attributes when BEEP pin is not available.
The W83795G has a dedicated BEEP pin so the functionality is always
available there.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Acked-by: Guenter Roeck <guenter.roeck@ericsson.com>
When asked to clear the intrusion alarm, do so immediately. We have to
invalidate the cache to make sure the new status will be read. But we
also have to read from the status register once to clear the pending
alarm, as writing to CLR_CHS surprising won't clear it automatically.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Acked-by: Guenter Roeck <guenter.roeck@ericsson.com>
We can't read the intrusion state from the real-time alarm registers
as we do for all other alarm flags, because real-time alarm bits don't
stick (by definition) and the intrusion state has to stick until
explicitly cleared (otherwise it has little value.)
So we have to use the interrupt status register instead, which is read
from the same address but with a configuration bit flipped in another
register.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Acked-by: Guenter Roeck <guenter.roeck@ericsson.com>
Don't expose raw register values to user-space. Decode and encode
temperature channels selected as temperature sources as needed.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Acked-by: Guenter Roeck <guenter.roeck@ericsson.com>
Temperature sources are not correlated directly with temperature
channels. A look-up table is required to find out which temperature
sources can be used depending on which temperature channels (both
analog and digital) are enabled.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Acked-by: Guenter Roeck <guenter.roeck@ericsson.com>
Expose fan control method (DC vs. PWM) using the standard sysfs
attributes. I've made it read-only as the board should be wired for
a given mode, the BIOS should have set up the chip for this mode, and
you shouldn't have to change it. But it would be easy enough to make
it changeable if someone comes up with a use case.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Acked-by: Guenter Roeck <guenter.roeck@ericsson.com>
There were two bugs:
* Speed cruise mode was improperly reported for all fans but fan1.
* Fan control method (PWM vs. DC) was mixed with the control mode.
It will be added back as a separate attribute, as per the standard
sysfs interface.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Acked-by: Guenter Roeck <guenter.roeck@ericsson.com>
This clears the following build-time warnings I was seeing:
drivers/hwmon/lm95241.c: In function "set_interval":
drivers/hwmon/lm95241.c:132:15: warning: ignoring return value of "strict_strtol", declared with attribute warn_unused_result
drivers/hwmon/lm95241.c: In function "set_max2":
drivers/hwmon/lm95241.c:278:1: warning: ignoring return value of "strict_strtol", declared with attribute warn_unused_result
drivers/hwmon/lm95241.c: In function "set_max1":
drivers/hwmon/lm95241.c:277:1: warning: ignoring return value of "strict_strtol", declared with attribute warn_unused_result
drivers/hwmon/lm95241.c: In function "set_min2":
drivers/hwmon/lm95241.c:249:1: warning: ignoring return value of "strict_strtol", declared with attribute warn_unused_result
drivers/hwmon/lm95241.c: In function "set_min1":
drivers/hwmon/lm95241.c:248:1: warning: ignoring return value of "strict_strtol", declared with attribute warn_unused_result
drivers/hwmon/lm95241.c: In function "set_type2":
drivers/hwmon/lm95241.c:220:1: warning: ignoring return value of "strict_strtol", declared with attribute warn_unused_result
drivers/hwmon/lm95241.c: In function "set_type1":
drivers/hwmon/lm95241.c:219:1: warning: ignoring return value of "strict_strtol", declared with attribute warn_unused_result
This also fixes a small race in set_interval() as a side effect: by
working with a temporary local variable we prevent data->interval from
being accessed at a time it contains the interval value in the wrong
unit.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: Davide Rizzo <elpa.rizzo@gmail.com>
My old mail address doesn't exist anymore. This changes all occurrences
to my new address.
Signed-off-by: Hans J. Koch <hjk@hansjkoch.de>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cast pci_resource_start() and pci_resource_len() to u64 for printk.
drivers/pci/pci-sysfs.c:753: warning: format '%16Lx' expects type 'long long unsigned int', but argument 9 has type 'resource_size_t'
drivers/pci/pci-sysfs.c:753: warning: format '%16Lx' expects type 'long long unsigned int', but argument 10 has type 'resource_size_t'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
* 'fbdev-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/fbdev-2.6:
fsl-diu-fb: drop dead ioctl define
MAINTAINERS: Add an fbdev git tree entry.
OMAP: DSS: Fix documentation regarding 'vram' kernel parameter
OMAP: VRAM: Fix boot-time memory allocation
OMAP: VRAM: improve VRAM error prints
sisfb: limit POST memory test according to PCI resource length
fbdev: sh_mobile_lcdc: use correct number of modes, when using the default
fbdev: sh_mobile_lcdc: use the standard CEA-861 720p timing
fbdev: sh_mobile_hdmi: properly clean up modedb on monitor unplug
* 'sh-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6:
sh: intc: Fix up build failure introduced by radix tree changes.
MAINTAINERS: update the sh git tree entry.
sh: clkfwk: fix up compiler warnings.
sh: intc: Fix up initializers for gcc 4.5.
rtc: rtc-sh - fix a memory leak
* 'rmobile-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6:
ARM: mach-shmobile: ap4evb: add fsib 44100Hz rate
MAINTAINERS: update the ARM SH-Mobile git tree entry.
ARM: mach-shmobile: ap4evb: Mark NOR boot loader partitions read-only.
ARM: mach-shmobile: intc-sh7372: fix interrupt number
This area of the code has always been a bit delicate due to the
subtleties of lock ordering. The problem is that for "normal"
alloc/dealloc, we always grab the inode locks first and the rgrp lock
later.
In order to ensure no races in looking up the unlinked, but still
allocated inodes, we need to hold the rgrp lock when we do the lookup,
which means that we can't take the inode glock.
The solution is to borrow the technique already used by NFS to solve
what is essentially the same problem (given an inode number, look up
the inode carefully, checking that it really is in the expected
state).
We cannot do that directly from the allocation code (lock ordering
again) so we give the job to the pre-existing delete workqueue and
carry on with the allocation as normal.
If we find there is no space, we do a journal flush (required anyway
if space from a deallocation is to be released) which should block
against the pending deallocations, so we should always get the space
back.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>