Commit graph

156956 commits

Author SHA1 Message Date
Benjamin Herrenschmidt
066c4b87e9 powerpc/mm: Fix definitions of FORCE_MAX_ZONEORDER in Kconfig
The current definitions set ranges and defaults for 32 and 64-bit
only using "PPC_STD_MMU" which means hash based MMU. This uselessly
restrict the usefulness for the upcoming 64-bit BookE port, but more
than that, it's broken on 32-bit since the only 32-bit platform
supporting multiple page sizes currently is 44x which does -not-
have PPC_STD_MMU_32 set.

This fixes it by using PPC64 and PPC32 instead.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-08-20 10:12:30 +10:00
roel kluin
2e2ddb24d3 powerpc/cell: Replace strncpy by strlcpy
Replace strncpy() and explicit null-termination by strlcpy()

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-08-20 10:12:29 +10:00
Benjamin Herrenschmidt
c5a8c0c99f powerpc: Remove use of a second scratch SPRG in STAB code
The STAB code used on Power3 and RS/64 uses a second scratch SPRG to
save a GPR in order to decide whether to go to do_stab_bolted_* or
to handle a normal data access exception.

This prevents our scheme of freeing SPRG3 which is user visible for
user uses since we cannot use SPRG0 which, on RS/64, seems to be
read-only for supervisor mode (like POWER4).

This reworks the STAB exception entry to use the PACA as temporary
storage instead.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-08-20 10:12:28 +10:00
Benjamin Herrenschmidt
ee43eb788b powerpc: Use names rather than numbers for SPRGs (v2)
The kernel uses SPRG registers for various purposes, typically in
low level assembly code as scratch registers or to hold per-cpu
global infos such as the PACA or the current thread_info pointer.

We want to be able to easily shuffle the usage of those registers
as some implementations have specific constraints realted to some
of them, for example, some have userspace readable aliases, etc..
and the current choice isn't always the best.

This patch should not change any code generation, and replaces the
usage of SPRN_SPRGn everywhere in the kernel with a named replacement
and adds documentation next to the definition of the names as to
what those are used for on each processor family.

The only parts that still use the original numbers are bits of KVM
or suspend/resume code that just blindly needs to save/restore all
the SPRGs.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-08-20 10:12:27 +10:00
Benjamin Herrenschmidt
8aa34ab8b2 powerpc: Rename exception.h to exception-64s.h
The file include/asm/exception.h contains definitions
that are specific to exception handling on 64-bit server
type processors.

This renames the file to exception-64s.h to reflect that
fact and avoid confusion.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-08-20 10:12:26 +10:00
Anton Blanchard
de4376c284 powerpc: Preload application text segment instead of TASK_UNMAPPED_BASE
TASK_UNMAPPED_BASE is not used with the new top down mmap layout. We can
reuse this preload slot by loading in the segment at 0x10000000, where almost
all PowerPC binaries are linked at.

On a microbenchmark that bounces a token between two 64bit processes over pipes
and calls gettimeofday each iteration (to access the VDSO), both the 32bit and
64bit context switch rate improves (tested on a 4GHz POWER6):

32bit: 273k/sec -> 283k/sec
64bit: 277k/sec -> 284k/sec

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-08-20 10:12:26 +10:00
Anton Blanchard
5eb9bac040 powerpc: Rearrange SLB preload code
With the new top down layout it is likely that the pc and stack will be in the
same segment, because the pc is most likely in a library allocated via a top
down mmap. Right now we bail out early if these segments match.

Rearrange the SLB preload code to sanity check all SLB preload addresses
are not in the kernel, then check all addresses for conflicts.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-08-20 10:12:25 +10:00
Anton Blanchard
30d0b36828 powerpc: Move 64bit VDSO to improve context switch performance
On 64bit applications the VDSO is the only thing in segment 0. Since the VDSO
is position independent we can remove the hint and let get_unmapped_area pick
an area. This will mean the vdso will be near other mmaps and will share
an SLB entry:

10000000-10001000 r-xp 00000000 08:06 5778459        /root/context_switch_64
10010000-10011000 r--p 00000000 08:06 5778459        /root/context_switch_64
10011000-10012000 rw-p 00001000 08:06 5778459        /root/context_switch_64
fffa92ae000-fffa92b0000 rw-p 00000000 00:00 0
fffa92b0000-fffa9453000 r-xp 00000000 08:06 4334051  /lib64/power6/libc-2.9.so
fffa9453000-fffa9462000 ---p 001a3000 08:06 4334051  /lib64/power6/libc-2.9.so
fffa9462000-fffa9466000 r--p 001a2000 08:06 4334051  /lib64/power6/libc-2.9.so
fffa9466000-fffa947c000 rw-p 001a6000 08:06 4334051  /lib64/power6/libc-2.9.so
fffa947c000-fffa9480000 rw-p 00000000 00:00 0
fffa9480000-fffa94a8000 r-xp 00000000 08:06 4333852  /lib64/ld-2.9.so
fffa94b3000-fffa94b4000 rw-p 00000000 00:00 0

fffa94b4000-fffa94b7000 r-xp 00000000 00:00 0        [vdso] <----- here I am

fffa94b7000-fffa94b8000 r--p 00027000 08:06 4333852  /lib64/ld-2.9.so
fffa94b8000-fffa94bb000 rw-p 00028000 08:06 4333852  /lib64/ld-2.9.so
fffa94bb000-fffa94bc000 rw-p 00000000 00:00 0
fffe4c10000-fffe4c25000 rw-p 00000000 00:00 0        [stack]

On a microbenchmark that bounces a token between two 64bit processes over pipes
and calls gettimeofday each iteration (to access the VDSO), our context switch
rate goes from 268k to 277k ctx switches/sec (tested on a 4GHz POWER6).

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-08-20 10:12:24 +10:00
Geoff Thorpe
0d2d3e38f7 powerpc: expose the multi-bit ops that underlie single-bit ops.
The bitops.h functions that operate on a single bit in a bitfield are
implemented by operating on the corresponding word location. In all
cases the inner logic is valid if the mask being applied has more than
one bit set, so this patch exposes those inner operations. Indeed,
set_bits() was already available, but it duplicated code from
set_bit() (rather than making the latter a wrapper) - it was also
missing the PPC405_ERR77() workaround and the "volatile" address
qualifier present in other APIs. This corrects that, and exposes the
other multi-bit equivalents.

One advantage of these multi-bit forms is that they allow word-sized
variables to essentially be their own spinlocks, eg. very useful for
state machines where an atomic "flags" variable can obviate the need
for any additional locking.

Signed-off-by: Geoff Thorpe <geoff@geoffthorpe.net>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-08-20 10:12:23 +10:00
Michael Ellerman
11a6b292c1 powerpc/mpic: Fix MPIC_BROKEN_REGREAD on non broken MPICs
The workaround enabled by CONFIG_MPIC_BROKEN_REGREAD does not work
on non-broken MPICs. The symptom is no interrupts being received.

The fix is twofold. Firstly the code was broken for multiple isus,
we need to index into the shadow array with the src_no, not the idx.
Secondly, we always do the read, but only use the VECPRI_MASK and
VECPRI_ACTIVITY bits from the hardware, the rest of "val" comes
from the shadow.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-08-20 10:12:22 +10:00
Gerhard Pircher
66dc3304f3 powerpc/amigaone: Convert amigaone_init() to a machine_device_initcall()
This allows to remove the ppc_md.init() hook in the setup code.

Signed-off-by: Gerhard Pircher <gerhard_pircher@gmx.net>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-08-20 10:12:21 +10:00
Linus Torvalds
c124891f50 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6:
  security: Fix prompt for LSM_MMAP_MIN_ADDR
  security: Make LSM_MMAP_MIN_ADDR default match its help text.
2009-08-18 19:41:47 -07:00
Linus Torvalds
77f312a96d Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
  percpu: use the right flag for get_vm_area()
  percpu, sparc64: fix sparse possible cpu map handling
  init: set nr_cpu_ids before setup_per_cpu_areas()
2009-08-18 19:41:05 -07:00
Linus Torvalds
dc8ed71eeb Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, mce: Don't initialize MCEs on unknown CPUs
  x86, mce: don't log boot MCEs on Pentium M (model == 13) CPUs
  x86: Annotate section mismatch warnings in kernel/apic/x2apic_uv_x.c
  x86, mce: therm_throt: Don't log redundant normality
  x86: Fix UV BAU destination subnode id
2009-08-18 16:55:43 -07:00
Bo Liu
7f9cfb3103 mm: build_zonelists(): move clear node_load[] to __build_all_zonelists()
If node_load[] is cleared everytime build_zonelists() is
called,node_load[] will have no help to find the next node that should
appear in the given node's fallback list.

Because of the bug, zonelist's node_order is not calculated as expected.
This bug affects on big machine, which has asynmetric node distance.

[synmetric NUMA's node distance]
     0    1    2
0   10   12   12
1   12   10   12
2   12   12   10

[asynmetric NUMA's node distance]
     0    1    2
0   10   12   20
1   12   10   14
2   20   14   10

This (my bug) is very old but no one has reported this for a long time.
Maybe because the number of asynmetric NUMA is very small and they use
cpuset for customizing node memory allocation fallback.

[akpm@linux-foundation.org: fix CONFIG_NUMA=n build]
Signed-off-by: Bo Liu <bo-liu@hotmail.com>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-08-18 16:31:13 -07:00
Joe Perches
503f7944fa REPORTING-BUGS: add get_maintainer.pl blurb
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-08-18 16:31:13 -07:00
Graff Yang
28d7a6ae92 nommu: check fd read permission in validate_mmap_request()
According to the POSIX (1003.1-2008), the file descriptor shall have been
opened with read permission, regardless of the protection options specified to
mmap().  The ltp test cases mmap06/07 need this.

Signed-off-by: Graff Yang <graff.yang@gmail.com>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Greg Ungerer <gerg@snapgear.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-08-18 16:31:13 -07:00
Ben Dooks
1915297566 spi_s3c24xx: fix transfer setup code
Since the changes to the bitbang driver, there is the possibility we will
be called with either the speed_hz or bpw values zero.  We take these to
mean that the default values (8 bits per word, or maximum bus speed).

Signed-off-by: Ben Dooks <ben@simtec.co.uk>
Cc: David Brownell <david-b@pacbell.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-08-18 16:31:13 -07:00
Ben Dooks
b897878454 spi_s3c24xx: fix clock rate calculation
Currently the clock rate calculation may round as pleased, which means
that it is possible that we will round down and end up with a faster clock
rate than intended.

Change the calculation to use DIV_ROUND_UP() to ensure that we end up with
a clock rate either the same as or lower than the user requested one.

Signed-off-by: Ben Dooks <ben@simtec.co.uk>
Cc: David Brownell <david-b@pacbell.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-08-18 16:31:13 -07:00
Andrew Morton
b2503a9408 mmc: add the new linux-mmc mailing list to MAINTAINERS
There are a number of individual MMC drivers listed in MAINTAINERS.  I
didn't modify those records.  Perhaps I should have.

Cc: <linux-mmc@vger.kernel.org>
Cc: Manuel Lauss <manuel.lauss@gmail.com>
Cc: Nicolas Pitre <nico@cam.org>
Cc: Pierre Ossman <drzeus@drzeus.cx>
Cc: Pavel Pisa <ppisa@pikron.com>
Cc: Jarkko Lavinen <jarkko.lavinen@nokia.com>
Cc: Ben Dooks <ben-linux@fluff.org>
Cc: Sascha Sommer <saschasommer@freenet.de>
Cc: Ian Molton <ian@mnementh.co.uk>
Cc: Joseph Chan <JosephChan@via.com.tw>
Cc: Harald Welte <HaraldWelte@viatech.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-08-18 16:31:13 -07:00
KOSAKI Motohiro
0753ba01e1 mm: revert "oom: move oom_adj value"
The commit 2ff05b2b (oom: move oom_adj value) moveed the oom_adj value to
the mm_struct.  It was a very good first step for sanitize OOM.

However Paul Menage reported the commit makes regression to his job
scheduler.  Current OOM logic can kill OOM_DISABLED process.

Why? His program has the code of similar to the following.

	...
	set_oom_adj(OOM_DISABLE); /* The job scheduler never killed by oom */
	...
	if (vfork() == 0) {
		set_oom_adj(0); /* Invoked child can be killed */
		execve("foo-bar-cmd");
	}
	....

vfork() parent and child are shared the same mm_struct.  then above
set_oom_adj(0) doesn't only change oom_adj for vfork() child, it's also
change oom_adj for vfork() parent.  Then, vfork() parent (job scheduler)
lost OOM immune and it was killed.

Actually, fork-setting-exec idiom is very frequently used in userland program.
We must not break this assumption.

Then, this patch revert commit 2ff05b2b and related commit.

Reverted commit list
---------------------
- commit 2ff05b2b4e (oom: move oom_adj value from task_struct to mm_struct)
- commit 4d8b9135c3 (oom: avoid unnecessary mm locking and scanning for OOM_DISABLE)
- commit 8123681022 (oom: only oom kill exiting tasks with attached memory)
- commit 933b787b57 (mm: copy over oom_adj value at fork time)

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Paul Menage <menage@google.com>
Cc: David Rientjes <rientjes@google.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-08-18 16:31:13 -07:00
Jeff Layton
89a4eb4b66 vfs: make get_sb_pseudo set s_maxbytes to value that can be cast to signed
get_sb_pseudo sets s_maxbytes to ~0ULL which becomes negative when cast
to a signed value.  Fix it to use MAX_LFS_FILESIZE which casts properly
to a positive signed value.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Steve French <smfrench@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Robert Love <rlove@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-08-18 16:31:12 -07:00
Joe Perches
6b6f0b6c13 MAINTAINERS: OSD LIBRARY and FILESYSTEM pattern fix
Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Benny Halevy <bhalevy@panasas.com>
Cc: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-08-18 16:31:12 -07:00
Andreas Schwab
024e6cb408 security: Fix prompt for LSM_MMAP_MIN_ADDR
Fix prompt for LSM_MMAP_MIN_ADDR.

(Verbs are cool!)

Signed-off-by: Andreas Schwab <schwab@linux-m68k.org>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-08-19 08:42:56 +10:00
Dave Jones
a58578e47f security: Make LSM_MMAP_MIN_ADDR default match its help text.
Commit 788084aba2 added the LSM_MMAP_MIN_ADDR
option, whose help text states "For most ia64, ppc64 and x86 users with lots
of address space a value of 65536 is reasonable and should cause no problems."
Which implies that it's default setting was typoed.

Signed-off-by: Dave Jones <davej@redhat.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-08-19 08:38:29 +10:00
Linus Torvalds
dcd94dbdaf Merge branch 'irq-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'irq-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  genirq: Wake up irq thread after action has been installed
2009-08-18 13:57:38 -07:00
Linus Torvalds
8486a0f95c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (60 commits)
  net: restore gnet_stats_basic to previous definition
  NETROM: Fix use of static buffer
  e1000e: fix use of pci_enable_pcie_error_reporting
  e1000e: WoL does not work on 82577/82578 with manageability enabled
  cnic: Fix locking in init/exit calls.
  cnic: Fix locking in start/stop calls.
  bnx2: Use mutex on slow path cnic calls.
  cnic: Refine registration with bnx2.
  cnic: Fix symbol_put_addr() panic on ia64.
  gre: Fix MTU calculation for bound GRE tunnels
  pegasus: Add new device ID.
  drivers/net: fixed drivers that support netpoll use ndo_start_xmit()
  via-velocity: Fix test of mii_status bit VELOCITY_DUPLEX_FULL
  rt2x00: fix memory corruption in rf cache, add a sanity check
  ixgbe: Fix receive on real device when VLANs are configured
  ixgbe: Do not return 0 in ixgbe_fcoe_ddp() upon FCP_RSP in DDP completion
  netxen: free napi resources during detach
  netxen: remove netxen workqueue
  ixgbe: fix issues setting rx-usecs with legacy interrupts
  can: fix oops caused by wrong rtnl newlink usage
  ...
2009-08-18 13:55:01 -07:00
Linus Torvalds
b9d030a123 Merge branch 'sh/for-2.6.31' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6
* 'sh/for-2.6.31' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6:
  sh: sh7724 ddr self-refresh changes
  sh: use in-soc KEYSC on se7724
  sh: CMT suspend/resume
  sh: skip disabled LCDC channels
2009-08-18 13:54:26 -07:00
Linus Torvalds
435a71d9ef Merge branch 'for-linus' of git://neil.brown.name/md
* 'for-linus' of git://neil.brown.name/md:
  Fix new incorrect error return from do_md_stop.
2009-08-18 13:54:08 -07:00
Thomas Gleixner
69ab849439 genirq: Wake up irq thread after action has been installed
The wake_up_process() of the new irq thread in __setup_irq() is too
early as the irqaction is not yet fully initialized especially
action->irq is not yet set. The interrupt thread might dereference the
wrong irq descriptor.

Move the wakeup after the action is installed and action->irq has been
set.

Reported-by: Michael Buesch <mb@bu3sch.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Michael Buesch <mb@bu3sch.de>
2009-08-18 17:22:43 +02:00
Eric Dumazet
c1a8f1f1c8 net: restore gnet_stats_basic to previous definition
In 5e140dfc1f "net: reorder struct Qdisc
for better SMP performance" the definition of struct gnet_stats_basic
changed incompatibly, as copies of this struct are shipped to
userland via netlink.

Restoring old behavior is not welcome, for performance reason.

Fix is to use a private structure for kernel, and
teach gnet_stats_copy_basic() to convert from kernel to user land,
using legacy structure (struct gnet_stats_basic)

Based on a report and initial patch from Michael Spang.

Reported-by: Michael Spang <mspang@csclub.uwaterloo.ca>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-17 21:33:49 -07:00
Ralf Baechle
c6ba973b8f NETROM: Fix use of static buffer
The static variable used by nr_call_to_digi might result in corruption if
multiple threads are trying to usee a node or neighbour via ioctl.  Fixed
by having the caller pass a structure in.  This is safe because nr_add_node
rsp. nr_add_neigh will allocate a permanent structure, if needed.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-17 18:05:32 -07:00
NeilBrown
80ffb3ccea Fix new incorrect error return from do_md_stop.
Recent commit c8c00a6915
changed the exit paths in do_md_stop and was not quite
careful enough.  There is one path were 'err' now needs
to be cleared but it isn't.
So setting an array to readonly (with mdadm --readonly) will
work, but will incorrectly report and error: ENXIO.

Signed-off-by: NeilBrown <neilb@suse.de>
2009-08-18 10:35:26 +10:00
Linus Torvalds
df4ecf1524 Merge branch 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus
* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus:
  MIPS: Fix HPAGE_SIZE redefinition
2009-08-17 13:39:52 -07:00
Linus Torvalds
c58afec8b2 Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfs
* 'for-linus' of git://oss.sgi.com/xfs/xfs:
  xfs: fix locking in xfs_iget_cache_hit
2009-08-17 13:39:30 -07:00
Linus Torvalds
52dec22e73 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6:
  security: define round_hint_to_min in !CONFIG_SECURITY
  Security/SELinux: seperate lsm specific mmap_min_addr
  SELinux: call cap_file_mmap in selinux_file_mmap
  Capabilities: move cap_file_mmap to commoncap.c
2009-08-17 13:38:58 -07:00
Eric Paris
08e53fcb0d inotify: start watch descriptor count at 1
The inotify_add_watch man page specifies that inotify_add_watch() will
return a non-negative integer.  However, historically the inotify
watches started at 1, not at 0.

Turns out that the inotifywait program provided by the inotify-tools
package doesn't properly handle a 0 watch descriptor.  In 7e790dd5 we
changed from starting at 1 to starting at 0.  This patch starts at 1,
just like in previous kernels, but also just like in previous kernels
it's possible for it to wrap back to 0.  This preserves the kernel
functionality exactly like it was before the patch (neither method broke
the spec)

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-08-17 13:37:37 -07:00
Eric Paris
cd94c8bbef inotify: tail drop inotify q_overflow events
In f44aebcc the tail drop logic of events with no file backing
(q_overflow and in_ignored) was reversed so IN_IGNORED events would
never be tail dropped.  This now means that Q_OVERFLOW events are NOT
tail dropped.  The fix is to not tail drop IN_IGNORED, but to tail drop
Q_OVERFLOW.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-08-17 13:37:37 -07:00
Eric Paris
eef3a116be notify: unused event private race
inotify decides if private data it passed to get added to an event was
used by checking list_empty().  But it's possible that the event may
have been dequeued and the private event removed so it would look empty.

The fix is to use the return code from fsnotify_add_notify_event rather
than looking at the list.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-08-17 13:37:37 -07:00
Linus Torvalds
0f66f96d21 Merge master.kernel.org:/home/rmk/linux-2.6-arm
* master.kernel.org:/home/rmk/linux-2.6-arm: (37 commits)
  ARM: 5673/1: U300 fix initsection compile warning
  ARM: Fix broken highmem support
  mx31moboard: invert sdhc ro signal sense
  ARM: S3C24XX: Fix clkout mpx error
  ARM: S3C64XX: serial: Fix a typo in Kconfig
  IXP4xx: Fix IO_SPACE_LIMIT for 2.6.31-rc core PCI changes
  OMAP3: RX51: Updated rx51_defconfig
  OMAP2/3: mmc-twl4030: Free up MMC regulators while cleaning up
  OMAP3: RX51: Define TWL4030 USB transceiver in board file
  OMAP3: Overo: Fix smsc911x platform device resource value
  OMAP3: Fix omap3 sram virtual addres overlap vmalloc space after increasing vmalloc size
  OMAP2/3: DMA errata correction
  OMAP: Fix testing of cpu defines for mach-omap1
  OMAP3: Overo: add missing pen-down GPIO definition
  OMAP: GPIO: clear/restore level/edge detect settings on mask/unmask
  OMAP3: PM: Fix wrong sequence in suspend.
  OMAP: PM: CPUfreq: obey min/max settings of policy
  OMAP2/3/4: UART: allow in-order port traversal
  OMAP2/3/4: UART: Allow per-UART disabling wakeup for serial ports
  OMAP3: Fixed crash bug with serial + suspend
  ...
2009-08-17 13:36:39 -07:00
Atsushi Nemoto
87c62a66ed MIPS: Fix HPAGE_SIZE redefinition
This patch fixes warnings like this:
  CC      fs/proc/meminfo.o
In file included from /work/linux/include/linux/mmzone.h:20,
                 from /work/linux/include/linux/gfp.h:4,
                 from /work/linux/include/linux/mm.h:8,
                 from /work/linux/fs/proc/meminfo.c:5:
/work/linux/arch/mips/include/asm/page.h:36:1: warning: "HPAGE_SIZE" redefined
In file included from /work/linux/fs/proc/meminfo.c:2:
/work/linux/include/linux/hugetlb.h:107:1: warning: this is the location of the previous definition

Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Acked-by: David Daney <ddaney@caviumnetworks.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2009-08-17 17:27:57 +01:00
Ingo Molnar
e412cd257e x86, mce: Don't initialize MCEs on unknown CPUs
An older test-box started hanging at the following point during
bootup:

 [    0.022996] Mount-cache hash table entries: 512
 [    0.024996] Initializing cgroup subsys debug
 [    0.025996] Initializing cgroup subsys cpuacct
 [    0.026995] Initializing cgroup subsys devices
 [    0.027995] Initializing cgroup subsys freezer
 [    0.028995] mce: CPU supports 5 MCE banks

I've bisected it down to commit 4efc0670 ("x86, mce: use 64bit
machine check code on 32bit"), which utilizes the MCE code on
32-bit systems too.

The problem is caused by this detail in my config:

  # CONFIG_CPU_SUP_INTEL is not set

This disables the quirks in mce_cpu_quirks() but still enables
MCE support - which then hangs due to the missing quirk
workaround needed on this CPU:

	if (c->x86 == 6 && c->x86_model < 0x1A && banks > 0)
		mce_banks[0].init = 0;

The safe solution is to not initialize MCEs if we dont know on
what CPU we are running (or if that CPU's support code got
disabled in the config).

Also be a bit more defensive on 32-bit systems: dont do a
boot-time dump of pending MCEs not just on the specific system
that we found a problem with (Pentium-M), but earlier ones as
well.

Now this problem is probably not common and disabling CPU
support is rare - but still being more defensive in something
we turned on for a wide range of CPUs is prudent.

Cc: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
LKML-Reference: Message-ID: <4A88E3E4.40506@jp.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-17 13:28:25 +02:00
Bartlomiej Zolnierkiewicz
c7f6fa4411 x86, mce: don't log boot MCEs on Pentium M (model == 13) CPUs
On my legacy Pentium M laptop (Acer Extensa 2900) I get bogus MCE on a cold
boot with CONFIG_X86_NEW_MCE enabled, i.e. (after decoding it with mcelog):

MCE 0
HARDWARE ERROR. This is *NOT* a software problem!
Please contact your hardware vendor
CPU 0 BANK 1 MCG status:
MCi status:
Error overflow
Uncorrected error
Error enabled
Processor context corrupt
MCA: Data CACHE Level-1 UNKNOWN Error
STATUS f200000000000195 MCGSTATUS 0

[ The other STATUS values observed: f2000000000001b5 (... UNKNOWN error)
  and f200000000000115 (... READ Error).

  To verify that this is not a CONFIG_X86_NEW_MCE bug I also modified
  the CONFIG_X86_OLD_MCE code (which doesn't log any MCEs) to dump
  content of STATUS MSR before it is cleared during initialization. ]

Since the bogus MCE results in a kernel taint (which in turn disables
lockdep support) don't log boot MCEs on Pentium M (model == 13) CPUs
by default ("mce=bootlog" boot parameter can be be used to get the old
behavior).

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Reviewed-by: Andi Kleen <andi@firstfloor.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-17 10:17:02 +02:00
Christoph Hellwig
bc990f5cb4 xfs: fix locking in xfs_iget_cache_hit
The locking in xfs_iget_cache_hit currently has numerous problems:

 - we clear the reclaim tag without i_flags_lock which protects
   modifications to it
 - we call inode_init_always which can sleep with pag_ici_lock
   held (this is oss.sgi.com BZ #819)
 - we acquire and drop i_flags_lock a lot and thus provide no
   consistency between the various flags we set/clear under it

This patch fixes all that with a major revamp of the locking in
the function.  The new version acquires i_flags_lock early and
only drops it once we need to call into inode_init_always or before
calling xfs_ilock.

This patch fixes a bug seen in the wild where we race modifying the
reclaim tag.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Felix Blyakher <felixb@sgi.com>
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-08-17 01:23:48 -05:00
Eric Paris
1d9959734a security: define round_hint_to_min in !CONFIG_SECURITY
Fix the header files to define round_hint_to_min() and to define
mmap_min_addr_handler() in the !CONFIG_SECURITY case.

Built and tested with !CONFIG_SECURITY

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-08-17 15:09:27 +10:00
Eric Paris
788084aba2 Security/SELinux: seperate lsm specific mmap_min_addr
Currently SELinux enforcement of controls on the ability to map low memory
is determined by the mmap_min_addr tunable.  This patch causes SELinux to
ignore the tunable and instead use a seperate Kconfig option specific to how
much space the LSM should protect.

The tunable will now only control the need for CAP_SYS_RAWIO and SELinux
permissions will always protect the amount of low memory designated by
CONFIG_LSM_MMAP_MIN_ADDR.

This allows users who need to disable the mmap_min_addr controls (usual reason
being they run WINE as a non-root user) to do so and still have SELinux
controls preventing confined domains (like a web server) from being able to
map some area of low memory.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-08-17 15:09:11 +10:00
Eric Paris
8cf948e744 SELinux: call cap_file_mmap in selinux_file_mmap
Currently SELinux does not check CAP_SYS_RAWIO in the file_mmap hook.  This
means there is no DAC check on the ability to mmap low addresses in the
memory space.  This function adds the DAC check for CAP_SYS_RAWIO while
maintaining the selinux check on mmap_zero.  This means that processes
which need to mmap low memory will need CAP_SYS_RAWIO and mmap_zero but will
NOT need the SELinux sys_rawio capability.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-08-17 15:08:48 +10:00
Eric Paris
9c0d90103c Capabilities: move cap_file_mmap to commoncap.c
Currently we duplicate the mmap_min_addr test in cap_file_mmap and in
security_file_mmap if !CONFIG_SECURITY.  This patch moves cap_file_mmap
into commoncap.c and then calls that function directly from
security_file_mmap ifndef CONFIG_SECURITY like all of the other capability
checks are done.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-08-17 15:08:35 +10:00
Leonardo Potenza
52459ab913 x86: Annotate section mismatch warnings in kernel/apic/x2apic_uv_x.c
The function uv_acpi_madt_oem_check() has been marked __init,
the struct apic_x2apic_uv_x has been marked __refdata.

The aim is to address the following section mismatch messages:

WARNING: arch/x86/kernel/apic/built-in.o(.data+0x1368): Section mismatch in reference from the variable apic_x2apic_uv_x to the function .cpuinit.text:uv_wakeup_secondary()
The variable apic_x2apic_uv_x references
the function __cpuinit uv_wakeup_secondary()
If the reference is valid then annotate the
variable with __init* or __refdata (see linux/init.h) or name the variable:
*driver, *_template, *_timer, *_sht, *_ops, *_probe, *_probe_one, *_console,

WARNING: arch/x86/kernel/built-in.o(.data+0x68e8): Section mismatch in reference from the variable apic_x2apic_uv_x to the function .cpuinit.text:uv_wakeup_secondary()
The variable apic_x2apic_uv_x references
the function __cpuinit uv_wakeup_secondary()
If the reference is valid then annotate the
variable with __init* or __refdata (see linux/init.h) or name the variable:
*driver, *_template, *_timer, *_sht, *_ops, *_probe, *_probe_one, *_console,

WARNING: arch/x86/built-in.o(.text+0x7b36f): Section mismatch in reference from the function uv_acpi_madt_oem_check() to the function .init.text:early_ioremap()
The function uv_acpi_madt_oem_check() references
the function __init early_ioremap().
This is often because uv_acpi_madt_oem_check lacks a __init
annotation or the annotation of early_ioremap is wrong.

WARNING: arch/x86/built-in.o(.text+0x7b38d): Section mismatch in reference from the function uv_acpi_madt_oem_check() to the function .init.text:early_iounmap()
The function uv_acpi_madt_oem_check() references
the function __init early_iounmap().
This is often because uv_acpi_madt_oem_check lacks a __init
annotation or the annotation of early_iounmap is wrong.

WARNING: arch/x86/built-in.o(.data+0x8668): Section mismatch in reference from the variable apic_x2apic_uv_x to the function .cpuinit.text:uv_wakeup_secondary()
The variable apic_x2apic_uv_x references
the function __cpuinit uv_wakeup_secondary()
If the reference is valid then annotate the
variable with __init* or __refdata (see linux/init.h) or name the variable:
*driver, *_template, *_timer, *_sht, *_ops, *_probe, *_probe_one, *_console,

Signed-off-by: Leonardo Potenza <lpotenza@inwind.it>
LKML-Reference: <200908161855.48302.lpotenza@inwind.it>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-16 19:44:13 +02:00
Randy Dunlap
894ef820b1 dm-log-userspace: fix printk format warning
drivers/md/dm-log-userspace-transfer.c:110: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'size_t'

Previously posted and acked, but apparently lost.
http://lkml.indiana.edu/hypermail/linux/kernel/0906.2/02074.html

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: dm-devel@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-08-16 08:35:58 -07:00