Commit graph

416 commits

Author SHA1 Message Date
Linus Torvalds
47a5c6fa0e x86: save/restore eflags in context switch
(And reset it on new thread creation)

It turns out that eflags is important to save and restore not just
because of iopl, but due to the magic bits like the NT bit, which we
don't want leaking between different threads.

Tested-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-18 16:20:40 -07:00
David Woodhouse
e5fa6d7031 [PATCH] Fix 'make headers_check' on i386
This brings i386 asm/unistd.h into consistency with other architectures by not
exporting functionality which is not necessary.

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-16 12:54:32 -07:00
David Woodhouse
f01f0f052d [PATCH] headers_check: don't expose PFN stuff to userspace in <asm-i386/setup.h>
The header file <linux/pfn.h> doesn't exist in userspace and probably
shouldn't -- but it's used unconditionally in <asm-i386/setup.h>.  Protect it
with #ifdef __KERNEL__ and move setup.h from $(header-y) to $(unifdef-y) in
Kbuild accordingly.

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-13 07:32:16 -07:00
David Woodhouse
651c923a44 [PATCH] headers_check: move kernel-only #includes within <asm-i386/elf.h>
Some files which don't exist in userspace were being included unconditionally
in asm-i386/elf.h.  Move the offending #includes down a few lines so that
they're protected by #ifdef __KERNEL__

In fact, we probably want to kill off all userspace use of asm/elf.h -- but we
aren't there yet, so we should at least make it possible to include it for
now.

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-13 07:32:15 -07:00
David Woodhouse
b40c274a03 [PATCH] headers_check: move inclusion of <linux/linkage.h> in <asm-i386/signal.h>
Because <linux/linkage.h> doesn't exist in userspace, it should be only
included from within #ifdef __KERNEL__.  Move the corresponding #include

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-13 07:32:15 -07:00
Jeff Garzik
97148ba223 Merge branch 'master' into upstream 2006-09-12 12:03:21 -04:00
David Woodhouse
b4a228346c [PATCH] Remove unneeded asm-i386/cpufeature.h from user visibility.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-06 11:00:02 -07:00
Jeff Garzik
f9bcda7760 Merge branch 'master' into upstream 2006-09-04 06:41:37 -04:00
Andrew Morton
a9aa141cfc [PATCH] x86: increase MAX_MP_BUSSES on default arch
Vitezslav Samel <samel@mail.cz> reports that an HP DL380 g4 fails using the
default arch due to the ISA bus having an ID of 32.

It would have worked OK with the generic arch - for some reason the default
arch doesn't support as many busses.

So bump that up to support 256 busses, but leave it at 32 if we're building a
tiny system to save a bit of memory.

Cc: Vitezslav Samel <samel@mail.cz>
Acked-by: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-01 11:39:08 -07:00
Chris Wright
22db37ec5f [PATCH] i386: rwlock.h fix smp alternatives fix
Commit 8c74932779 ("i386: Remove
alternative_smp") did not actually compile on x86 with CONFIG_SMP.

This fixes the __build_read/write_lock helpers.  I've boot tested on
SMP.

[ Andi: "Oops, I think that was a quilt unrefreshed patch.  Sorry.  I
  fixed those before testing, but then still send out the old patch." ]

Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Cc: Gerd Hoffmann <kraxel@suse.de>
Acked-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-08-31 10:46:07 -07:00
Andi Kleen
386dcafaac [PATCH] i386: Remove __KERNEL__ ifdef around _syscall*()
After all their only point is having them in user space.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-08-30 16:05:16 -07:00
Andi Kleen
8c74932779 [PATCH] i386: Remove alternative_smp
The .fill causes miscompilations with some binutils version.

Instead just patch the lock prefix in the lock constructs. That is the
majority of the cost and should be good enough.

Cc: Gerd Hoffmann <kraxel@suse.de>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-08-30 16:05:15 -07:00
Jan Beulich
ea424055b7 [PATCH] x86: Make backtracer fallback logic more bullet-proof
The unwinder fallback logic still had potential for falling through to
the legacy stack trace code without printing an indication (at once
serving as a separator) of this.

Further, the stack pointer retrieval for the fallback should be as
restrictive as possible (in order to avoid having the legacy stack
tracer try to access invalid memory). The patch tightens that, but
this could certainly be further improved.

Also making the call_trace command line option now conditional upon
CONFIG_STACK_UNWIND (as it's meaningless otherwise).

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-08-30 16:05:15 -07:00
Jan Beulich
61171b8dbd [PATCH] x86: fix x86 cpuid keys used in alternative_smp()
By hard-coding the cpuid keys for alternative_smp() rather than using
the symbolic constant it turned out that incorrect values were used on
both i386 (0x68 instead of 0x69) and x86-64 (0x66 instead of 0x68).

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-08-30 16:05:15 -07:00
Jeff Garzik
b01e86fee6 Merge /spare/repo/linux-2.6 into upstream 2006-08-29 17:55:59 -04:00
KAMEZAWA Hiroyuki
4e54bdaa9c [PATCH] CONFIG_ACPI_SRAT NUMA build fix
In file included from include/asm/mmzone.h:18,
                   from include/linux/mmzone.h:439,
  <snip>
  include/asm/srat.h:31:2: error: #error CONFIG_ACPI_SRAT not defined, and srat.h header has been included
  make[1]: *** [arch/i386/kernel/asm-offsets.s] Error 1

This can happen with CONFIG_NUMA && !CONFIG_ACPI && !CONFIG_X86_NUMAQ

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-08-27 11:01:32 -07:00
Alan Cox
2ec7df0457 [PATCH] libata: rework legacy handling to remove much of the cruft
Kill host_set->next
Fix simplex support
Allow per platform setting of IDE legacy bases

Some of this can be tidied further later on, in particular all the
legacy port gunge belongs as a PCI quirk/PCI header decode to understand
the special legacy IDE rules in the PCI spec.

Longer term Jeff also wants to move the request_irq/free_irq out of core
which will make this even cleaner.

tj: folded in three followup patches - ata_piix-fix, broken-arch-fix
and fix-new-legacy-handling, and separated per-dev xfermask into
separate patch preceding this one.  Folded in fixes are...

* ata_piix-fix: fix build failure due to host_set->next removal
* broken-arch-fix: add missing include/asm-*/libata-portmap.h
* fix-new-legacy-handling:
	* In ata_pci_init_legacy_port(), probe_num was incorrectly
          incremented during initialization of the secondary port and
          probe_ent->n_ports was incorrectly fixed to 1.

	* Both legacy ports ended up having the same hard_port_no.

	* When printing port information, both legacy ports printed
	  the first irq.

Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-08-10 16:59:10 +09:00
bibo, mao
a9ad965ea9 [PATCH] IA64: kprobe invalidate icache of jump buffer
Kprobe inserts breakpoint instruction in probepoint and then jumps to
instruction slot when breakpoint is hit, the instruction slot icache must
be consistent with dcache.  Here is the patch which invalidates instruction
slot icache area.

Without this patch, in some machines there will be fault when executing
instruction slot where icache content is inconsistent with dcache.

Signed-off-by: bibo,mao <bibo.mao@intel.com>
Acked-by: "Luck, Tony" <tony.luck@intel.com>
Acked-by: Keshavamurthy Anil S <anil.s.keshavamurthy@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-31 13:28:38 -07:00
Steven Rostedt
52393ccc0a [PATCH] remove set_wmb - arch removal
set_wmb should not be used in the kernel because it just confuses the
code more and has no benefit.  Since it is not currently used in the
kernel this patch removes it so that new code does not include it.

All archs define set_wmb(var, value) to do { var = value; wmb(); }
while(0) except ia64 and sparc which use a mb() instead.  But this is
still moot since it is not used anyway.

Hasn't been tested on any archs but x86 and x86_64 (and only compiled
tested)

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-14 21:56:14 -07:00
Chuck Ebbert
b43c7cec6b [PATCH] i386: system.h: remove extra semicolons and fix order
include/asm-i386/system.h has trailing semicolons in some of the
macros that cause legitimate code to fail compilation, so remove
them. Also remove extra blank lines within one group of macros.

And put stts() and clts() back together; they got separated somehow.

Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-13 07:48:28 -07:00
Stephane Eranian
b3cf257623 [PATCH] i386: use thread_info flags for debug regs and IO bitmaps
Use thread info flags to track use of debug registers and IO bitmaps.

 - add TIF_DEBUG to track when debug registers are active
 - add TIF_IO_BITMAP to track when I/O bitmap is used
 - modify __switch_to() to use the new TIF flags

Performance tested on Pentium II, ten runs of LMbench context switch
benchmark (smaller is better:)

	before	after
avg	3.65	3.39
min	3.55	3.33

Signed-off-by: Stephane Eranian <eranian@hpl.hp.com>
Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com>
Acked-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-09 18:47:12 -07:00
Linus Torvalds
b862f3b099 i386: improve and correct inline asm memory constraints
Use "+m" rather than a combination of "=m" and "m" for improved clarity
and consistency.

This also fixes some inlines that incorrectly didn't tell the compiler
that they read the old value at all, potentially causing the compiler to
generate bogus code.  It appear that all of those potential bugs were
hidden by the use of extra "volatile" specifiers on the data structures
in question, though.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-08 15:24:18 -07:00
Linus Torvalds
6fa0cb1141 Merge git://git.infradead.org/hdrinstall-2.6
* git://git.infradead.org/hdrinstall-2.6:
  Remove export of include/linux/isdn/tpam.h
  Remove <linux/i2c-id.h> and <linux/i2c-algo-ite.h> from userspace export
  Restrict headers exported to userspace for SPARC and SPARC64
  Add empty Kbuild files for 'make headers_install' in remaining arches.
  Add Kbuild file for Alpha 'make headers_install'
  Add Kbuild file for SPARC 'make headers_install'
  Add Kbuild file for IA64 'make headers_install'
  Add Kbuild file for S390 'make headers_install'
  Add Kbuild file for i386 'make headers_install'
  Add Kbuild file for x86_64 'make headers_install'
  Add Kbuild file for PowerPC 'make headers_install'
  Add generic Kbuild files for 'make headers_install'
  Basic implementation of 'make headers_check'
  Basic implementation of 'make headers_install'
2006-07-04 12:55:45 -07:00
Ingo Molnar
8a25d5debf [PATCH] lockdep: prove spinlock rwlock locking correctness
Use the lock validator framework to prove spinlock and rwlock locking
correctness.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-03 15:27:04 -07:00
Ingo Molnar
4ea2176dfa [PATCH] lockdep: prove rwsem locking correctness
Use the lock validator framework to prove rwsem locking correctness.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-03 15:27:04 -07:00
Ingo Molnar
c8558fcdec [PATCH] lockdep: irqtrace cleanup of include/asm-i386/irqflags.h
Clean up the x86 irqflags.h file:

 - macro => inline function transformation
 - simplifications
 - style fixes

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-03 15:27:03 -07:00
Ingo Molnar
55f327fa9e [PATCH] lockdep: irqtrace subsystem, i386 support
Add irqflags-tracing support to i386.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-03 15:27:03 -07:00
Ingo Molnar
c4e05116a2 [PATCH] lockdep: clean up rwsems
Clean up rwsems.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-03 15:27:01 -07:00
Thomas Gleixner
4879d77c4c [PATCH] irq-flags: i386: Use the new IRQF_ constants
Use the new IRQF_ constants and remove the SA_INTERRUPT define

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-02 13:58:47 -07:00
Gerd Hoffmann
8ec4d41f88 [PATCH] SMP alternatives: skip with UP kernels
Hide the magic in alternative.h and provide some dummy inline functions
for the UP case (gcc should manage to optimize away these calls).  No
changes in module.c.

Cc: Dave Jones <davej@codemonkey.org.uk>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-01 09:56:02 -07:00
Catherine Zhang
877ce7c1b3 [AF_UNIX]: Datagram getpeersec
This patch implements an API whereby an application can determine the
label of its peer's Unix datagram sockets via the auxiliary data mechanism of
recvmsg.

Patch purpose:

This patch enables a security-aware application to retrieve the
security context of the peer of a Unix datagram socket.  The application
can then use this security context to determine the security context for
processing on behalf of the peer who sent the packet.

Patch design and implementation:

The design and implementation is very similar to the UDP case for INET
sockets.  Basically we build upon the existing Unix domain socket API for
retrieving user credentials.  Linux offers the API for obtaining user
credentials via ancillary messages (i.e., out of band/control messages
that are bundled together with a normal message).  To retrieve the security
context, the application first indicates to the kernel such desire by
setting the SO_PASSSEC option via getsockopt.  Then the application
retrieves the security context using the auxiliary data mechanism.

An example server application for Unix datagram socket should look like this:

toggle = 1;
toggle_len = sizeof(toggle);

setsockopt(sockfd, SOL_SOCKET, SO_PASSSEC, &toggle, &toggle_len);
recvmsg(sockfd, &msg_hdr, 0);
if (msg_hdr.msg_controllen > sizeof(struct cmsghdr)) {
    cmsg_hdr = CMSG_FIRSTHDR(&msg_hdr);
    if (cmsg_hdr->cmsg_len <= CMSG_LEN(sizeof(scontext)) &&
        cmsg_hdr->cmsg_level == SOL_SOCKET &&
        cmsg_hdr->cmsg_type == SCM_SECURITY) {
        memcpy(&scontext, CMSG_DATA(cmsg_hdr), sizeof(scontext));
    }
}

sock_setsockopt is enhanced with a new socket option SOCK_PASSSEC to allow
a server socket to receive security context of the peer.

Testing:

We have tested the patch by setting up Unix datagram client and server
applications.  We verified that the server can retrieve the security context
using the auxiliary data mechanism of recvmsg.

Signed-off-by: Catherine Zhang <cxzhang@watson.ibm.com>
Acked-by: Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-29 16:58:06 -07:00
Ingo Molnar
c0ad90a32f [PATCH] genirq: add ->retrigger() irq op to consolidate hw_irq_resend()
Add ->retrigger() irq op to consolidate hw_irq_resend() implementations.
(Most architectures had it defined to NOP anyway.)

NOTE: ia64 needs testing. i386 and x86_64 tested.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-29 10:26:23 -07:00
Adrian Bunk
0686cd8fbe [PATCH] fix sgivwfb compile
drivers/built-in.o: In function `sgivwfb_set_par':
sgivwfb.c:(.text+0x88583): undefined reference to `sgivwfb_mem_phys'
sgivwfb.c:(.text+0x88596): undefined reference to `sgivwfb_mem_phys'
sgivwfb.c:(.text+0x885a8): undefined reference to `sgivwfb_mem_phys'
drivers/built-in.o: In function `sgivwfb_check_var':
sgivwfb.c:(.text+0x88ad0): undefined reference to `sgivwfb_mem_size'
drivers/built-in.o: In function `sgivwfb_mmap':
sgivwfb.c:(.text+0x88c75): undefined reference to `sgivwfb_mem_size'
sgivwfb.c:(.text+0x88c7f): undefined reference to `sgivwfb_mem_phys'
drivers/built-in.o: In function `sgivwfb_probe':
sgivwfb.c:(.init.text+0x4060): undefined reference to `sgivwfb_mem_size'
sgivwfb.c:(.init.text+0x4065): undefined reference to `sgivwfb_mem_phys'
sgivwfb.c:(.init.text+0x4076): undefined reference to `sgivwfb_mem_phys'
sgivwfb.c:(.init.text+0x409c): undefined reference to `sgivwfb_mem_size'
sgivwfb.c:(.init.text+0x410e): undefined reference to `sgivwfb_mem_size'
sgivwfb.c:(.init.text+0x4113): undefined reference to `sgivwfb_mem_phys'
sgivwfb.c:(.init.text+0x4162): undefined reference to `sgivwfb_mem_size'
sgivwfb.c:(.init.text+0x4168): undefined reference to `sgivwfb_mem_phys'
make: *** [.tmp_vmlinux1] Error 1

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-29 10:26:19 -07:00
Siddha, Suresh B
5c45bf279d [PATCH] sched: mc/smt power savings sched policy
sysfs entries 'sched_mc_power_savings' and 'sched_smt_power_savings' in
/sys/devices/system/cpu/ control the MC/SMT power savings policy for the
scheduler.

Based on the values (1-enable, 0-disable) for these controls, sched groups
cpu power will be determined for different domains.  When power savings
policy is enabled and under light load conditions, scheduler will minimize
the physical packages/cpu cores carrying the load and thus conserving
power(with a perf impact based on the workload characteristics...  see OLS
2005 CMP kernel scheduler paper for more details..)

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Con Kolivas <kernel@kolivas.org>
Cc: "Chen, Kenneth W" <kenneth.w.chen@intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:45 -07:00
Ingo Molnar
e6e5494cb2 [PATCH] vdso: randomize the i386 vDSO by moving it into a vma
Move the i386 VDSO down into a vma and thus randomize it.

Besides the security implications, this feature also helps debuggers, which
can COW a vma-backed VDSO just like a normal DSO and can thus do
single-stepping and other debugging features.

It's good for hypervisors (Xen, VMWare) too, which typically live in the same
high-mapped address space as the VDSO, hence whenever the VDSO is used, they
get lots of guest pagefaults and have to fix such guest accesses up - which
slows things down instead of speeding things up (the primary purpose of the
VDSO).

There's a new CONFIG_COMPAT_VDSO (default=y) option, which provides support
for older glibcs that still rely on a prelinked high-mapped VDSO.  Newer
distributions (using glibc 2.3.3 or later) can turn this option off.  Turning
it off is also recommended for security reasons: attackers cannot use the
predictable high-mapped VDSO page as syscall trampoline anymore.

There is a new vdso=[0|1] boot option as well, and a runtime
/proc/sys/vm/vdso_enabled sysctl switch, that allows the VDSO to be turned
on/off.

(This version of the VDSO-randomization patch also has working ELF
coredumping, the previous patch crashed in the coredumping code.)

This code is a combined work of the exec-shield VDSO randomization
code and Gerd Hoffmann's hypervisor-centric VDSO patch. Rusty Russell
started this patch and i completed it.

[akpm@osdl.org: cleanups]
[akpm@osdl.org: compile fix]
[akpm@osdl.org: compile fix 2]
[akpm@osdl.org: compile fix 3]
[akpm@osdl.org: revernt MAXMEM change]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@infradead.org>
Cc: Gerd Hoffmann <kraxel@suse.de>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Zachary Amsden <zach@vmware.com>
Cc: Andi Kleen <ak@muc.de>
Cc: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:38 -07:00
Chuck Ebbert
c723e08460 [PATCH] i386: use C code for current_thread_info()
Using C code for current_thread_info() lets the compiler optimize it.
With gcc 4.0.2, kernel is smaller:

    text           data     bss     dec     hex filename
 3645212         555556  312024 4512792  44dc18 2.6.17-rc6-nb-post/vmlinux
 3647276         555556  312024 4514856  44e428 2.6.17-rc6-nb/vmlinux
 -------
   -2064

Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:37 -07:00
Rohit Seth
4b89aff930 [PATCH] i386: move phys_proc_id and cpu_core_id to cpuinfo_x86
Move the phys_core_id and cpu_core_id to cpuinfo_x86 structure.  Similar
patch for x86_64 is already accepted by Andi earlier this week.

[akpm@osdl.org: fix warning]
Signed-off-by: Rohit Seth <rohitseth@google.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:37 -07:00
Yasunori Goto
0fc44159bf [PATCH] Register sysfs file for hotplugged new node
When new node becomes enable by hot-add, new sysfs file must be created for
new node.  So, if new node is enabled by add_memory(), register_one_node() is
called to create it.  In addition, I386's arch_register_node() and a part of
register_nodes() of powerpc are consolidated to register_one_node() as a
generic_code().

This is tested by Tiger4(IPF) with node hot-plug emulation.

Signed-off-by: Keiichiro Tokunaga <tokuanga.keiich@jp.fujitsu.com>
Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:36 -07:00
Linus Torvalds
da206c9e68 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial
* git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial:
  typo fixes
  Clean up 'inline is not at beginning' warnings for usb storage
  Storage class should be first
  i386: Trivial typo fixes
  ixj: make ixj_set_tone_off() static
  spelling fixes
  fix paniced->panicked typos
  Spelling fixes for Documentation/atomic_ops.txt
  move acknowledgment for Mark Adler to CREDITS
  remove the bouncing email address of David Campbell
2006-06-26 13:33:14 -07:00
Linus Torvalds
81a07d7588 Merge branch 'x86-64'
* x86-64: (83 commits)
  [PATCH] x86_64: x86_64 stack usage debugging
  [PATCH] x86_64: (resend) x86_64 stack overflow debugging
  [PATCH] x86_64: msi_apic.c build fix
  [PATCH] x86_64: i386/x86-64 Add nmi watchdog support for new Intel CPUs
  [PATCH] x86_64: Avoid broadcasting NMI IPIs
  [PATCH] x86_64: fix apic error on bootup
  [PATCH] x86_64: enlarge window for stack growth
  [PATCH] x86_64: Minor string functions optimizations
  [PATCH] x86_64: Move export symbols to their C functions
  [PATCH] x86_64: Standardize i386/x86_64 handling of NMI_VECTOR
  [PATCH] x86_64: Fix modular pc speaker
  [PATCH] x86_64: remove sys32_ni_syscall()
  [PATCH] x86_64: Do not use -ffunction-sections for modules
  [PATCH] x86_64: Add cpu_relax to apic_wait_icr_idle
  [PATCH] x86_64: adjust kstack_depth_to_print default
  [PATCH] i386/x86-64: adjust /proc/interrupts column headings
  [PATCH] x86_64: Fix race in cpu_local_* on preemptible kernels
  [PATCH] x86_64: Fix fast check in safe_smp_processor_id
  [PATCH] x86_64: x86_64 setup.c - printing cmp related boottime information
  [PATCH] i386/x86-64/ia64: Move polling flag into thread_info_status
  ...

Manual resolve of trivial conflict in arch/i386/kernel/Makefile
2006-06-26 10:51:09 -07:00
Venkatesh Pallipadi
0080e66755 [PATCH] x86_64: i386/x86-64 Add nmi watchdog support for new Intel CPUs
Intel now has support for Architectural Performance Monitoring Counters
( Refer to IA-32 Intel Architecture Software Developer's Manual
http://www.intel.com/design/pentium4/manuals/253669.htm ). This
feature is present starting from Intel Core Duo and Intel Core Solo processors.

What this means is, the performance monitoring counters and some performance
monitoring events are now defined in an architectural way (using cpuid).
And there will be no need to check for family/model etc for these architectural
events.

Below is the patch to use this performance counters in nmi watchdog driver.
Patch handles both i386 and x86-64 kernels.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:22 -07:00
Keith Owens
e77deacb7b [PATCH] x86_64: Avoid broadcasting NMI IPIs
On some i386/x86_64 systems, sending an NMI IPI as a broadcast will
reset the system.  This seems to be a BIOS bug which affects machines
where one or more cpus are not under OS control.  It occurs on HT
systems with a version of the OS that is not compiled without HT
support.  It also occurs when a system is booted with max_cpus=n where
2 <= n < cpus known to the BIOS.  The fix is to always send NMI IPI as
a mask instead of as a broadcast.

Signed-off-by: Keith Owens <kaos@sgi.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:22 -07:00
Keith Owens
45486f81c9 [PATCH] x86_64: Standardize i386/x86_64 handling of NMI_VECTOR
x86_64 and i386 behave inconsistently when sending an IPI on vector 2
(NMI_VECTOR).  Make both behave the same, so IPI 2 is sent as NMI.

The crash code was abusing send_IPI_allbutself() by passing a code
instead of a vector, it only worked because crash knew about the
internal code of send_IPI_allbutself().  Change crash to use NMI_VECTOR
instead, and remove the comment about how crash was abusing the function.

This patch is a pre-requisite for fixing the problem where sending an
IPI as NMI would reboot some Dell Xeon systems.  I cannot fix that
problem while crash continus to abuse send_IPI_allbutself().

It also removes the inconsistency between i386 and x86_64 for
NMI_VECTOR.  That will simplify all the RAS code that needs to bring
all the cpus to a clean stop, even when one or more cpus are spinning
disabled.

Signed-off-by: Keith Owens <kaos@sgi.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:22 -07:00
Andi Kleen
da5311258d [PATCH] x86_64: Fix race in cpu_local_* on preemptible kernels
When a process changes CPUs while doing the non atomic cpu_local_*
operations it might operate on the local_t of a different CPUs.

Fix that by disabling preemption.

Pointed out by Christopher Lameter

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:21 -07:00
Andi Kleen
495ab9c045 [PATCH] i386/x86-64/ia64: Move polling flag into thread_info_status
During some profiling I noticed that default_idle causes a lot of
memory traffic. I think that is caused by the atomic operations
to clear/set the polling flag in thread_info. There is actually
no reason to make this atomic - only the idle thread does it
to itself, other CPUs only read it. So I moved it into ti->status.

Converted i386/x86-64/ia64 for now because that was the easiest
way to fix ACPI which also manipulates these flags in its idle
function.

Cc: Nick Piggin <npiggin@novell.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Len Brown <len.brown@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:21 -07:00
Jan Beulich
c33bd9aac0 [PATCH] i386/x86-64: fall back to old-style call trace if no unwinding
If no unwinding is possible at all for a certain exception instance,
fall back to the old style call trace instead of not showing any trace
at all.

Also, allow setting the stack trace mode at the command line.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:18 -07:00
Jan Beulich
fe7cacc1c2 [PATCH] i386: reliable stack trace support i386 entry.S
To increase the usefulness of reliable stack unwinding, this adds CFI
unwind annotations to many low-level i386 routines.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:17 -07:00
Jan Beulich
176a2718f4 [PATCH] i386: reliable stack trace support (i386)
These are the i386-specific pieces to enable reliable stack traces. This is
going to be even more useful once CFI annotations get added to he assembly
code, namely to entry.S.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:17 -07:00
Don Zickus
3e4ff11574 [PATCH] x86_64: nmi watchdog header cleanup
Misc header cleanup for nmi watchdog.

Signed-off-by: Don Zickus <dzickus@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:16 -07:00
Andi Kleen
a32073bffc [PATCH] x86_64: Clean and enhance up K8 northbridge access code
- Factor out the duplicated access/cache code into a single file
   * Shared between i386/x86-64.
 - Share flush code between AGP and IOMMU
   * Fix a bug: AGP didn't wait for end of flush before
 - Drop 8 northbridges limit and allocate dynamically
 - Add lock to serialize AGP and IOMMU GART flushes
 - Add PCI ID for next AMD northbridge
 - Random related cleanups

The old K8 NUMA discovery code is unchanged. New systems
should all use SRAT for this.

Cc: "Navin Boppuri" <navin.boppuri@newisys.com>
Cc: Dave Jones <davej@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:15 -07:00
Gerd Hoffmann
d167a51877 [PATCH] x86_64: x86_64 version of the smp alternative patch.
Changes are largely identical to the i386 version:

 * alternative #define are moved to the new alternative.h file.
 * one new elf section with pointers to the lock prefixes which can be
   nop'ed out for non-smp.
 * two new elf sections simliar to the "classic" alternatives to
   replace SMP code with simpler UP code.
 * fixup headers to use alternative.h instead of defining their own
   LOCK / LOCK_PREFIX macros.

The patch reuses the i386 version of the alternatives code to avoid code
duplication.  The code in alternatives.c was shuffled around a bit to
reduce the number of #ifdefs needed.  It also got some tweaks needed for
x86_64 (vsyscall page handling) and new features (noreplacement option
which was x86_64 only up to now).  Debug printk's are changed from
compile-time to runtime.

Loosely based on a early version from Bastian Blank <waldi@debian.org>

Signed-off-by: Gerd Hoffmann <kraxel@suse.de>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:14 -07:00
Andi Kleen
240cd6a806 [PATCH] i386/x86-64: Emulate CPUID4 on AMD
Intel systems report the cache level data from CPUID 4 in sysfs.
Add a CPUID 4 emulation for AMD CPUs to report the same
information for them. This allows programs to read this
information in a uniform way.

The AMD way to report this is less flexible so some assumptions
are hardcoded (e.g. no L3)

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:14 -07:00
Anil S Keshavamurthy
e6f47f978b [PATCH] Notify page fault call chain
With this patch Kprobes now registers for page fault notifications only when
their is an active probe registered.  Once all the active probes are
unregistered their is no need to be notified of page faults and kprobes
unregisters itself from the page fault notifications.  Hence we will have ZERO
side effects when no probes are active.

Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:22 -07:00
Anil S Keshavamurthy
b71b5b6528 [PATCH] Notify page fault call chain for i386
Overloading of page fault notification with the notify_die() has performance
issues(since the only interested components for page fault is kprobes and/or
kdb) and hence this patch introduces the new notifier call chain exclusively
for page fault notifications their by avoiding notifying unnecessary
components in the do_page_fault() code path.

Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:22 -07:00
john stultz
6f84fa2f3e [PATCH] Time: i386 Conversion - part 3: Enable Generic Timekeeping
This converts the i386 arch to use the generic timeofday subsystem.  It
enabled the GENERIC_TIME option, disables the timer_opts code and other arch
specific timekeeping code and reworks the delay code.

While this patch enables the generic timekeeping, please note that this patch
does not provide any i386 clocksource.  Thus only the jiffies clocksource will
be available.  To get full replacements for the code being disabled here, the
timeofday-clocks-i386 patch will needed.

Signed-off-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:21 -07:00
john stultz
539eb11e6e [PATCH] Time: i386 Conversion - part 2: Rework TSC Support
As part of the i386 conversion to the generic timekeeping infrastructure, this
introduces a new tsc.c file.  The code in this file replaces the TSC
initialization, management and access code currently in timer_tsc.c (which
will be removed) that we want to preserve.

The code also introduces the following functionality:

o tsc_khz: like cpu_khz but stores the TSC frequency on systems that do not
  change TSC frequency w/ CPU frequency

o check/mark_tsc_unstable: accessor/modifier flag for TSC timekeeping
  usability

o minor cleanups to calibration math.

This patch also includes a one line __cpuinitdata fix from Zwane Mwaikambo.

Signed-off-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:21 -07:00
Andreas Mohr
d6e05edc59 spelling fixes
acquired (aquired)
contiguous (contigious)
successful (succesful, succesfull)
surprise (suprise)
whether (weather)
some other misspellings

Signed-off-by: Andreas Mohr <andi@lisas.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-06-26 18:35:02 +02:00
NeilBrown
7c12d81134 [PATCH] Make copy_from_user_inatomic NOT zero the tail on i386
As described in a previous patch and documented in mm/filemap.h,
copy_from_user_inatomic* shouldn't zero out the tail of the buffer after an
incomplete copy.

This patch implements that change for i386.

For the _nocache version, a new __copy_user_intel_nocache is defined similar
to copy_user_zeroio_intel_nocache, and this is ultimately used for the copy.

For the regular version, __copy_from_user_ll_nozero is defined which uses
__copy_user and __copy_user_intel - the later needs casts to reposition the
__user annotations.

If copy_from_user_atomic is given a constant length of 1, 2, or 4, then we do
still zero the destintion on failure.  This didn't seem worth the effort of
fixing as the places where it is used really don't care.

Signed-off-by: Neil Brown <neilb@suse.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: William Lee Irwin III <wli@holomorphy.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:09 -07:00
NeilBrown
01408c4939 [PATCH] Prepare for __copy_from_user_inatomic to not zero missed bytes
The problem is that when we write to a file, the copy from userspace to
pagecache is first done with preemption disabled, so if the source address is
not immediately available the copy fails *and* *zeros* *the* *destination*.

This is a problem because a concurrent read (which admittedly is an odd thing
to do) might see zeros rather that was there before the write, or what was
there after, or some mixture of the two (any of these being a reasonable thing
to see).

If the copy did fail, it will immediately be retried with preemption
re-enabled so any transient problem with accessing the source won't cause an
error.

The first copying does not need to zero any uncopied bytes, and doing so
causes the problem.  It uses copy_from_user_atomic rather than copy_from_user
so the simple expedient is to change copy_from_user_atomic to *not* zero out
bytes on failure.

The first of these two patches prepares for the change by fixing two places
which assume copy_from_user_atomic does zero the tail.  The two usages are
very similar pieces of code which copy from a userspace iovec into one or more
page-cache pages.  These are changed to remove the assumption.

The second patch changes __copy_from_user_inatomic* to not zero the tail.
Once these are accepted, I will look at similar patches of other architectures
where this is important (ppc, mips and sparc being the ones I can find).

This patch:

There is a problem with __copy_from_user_inatomic zeroing the tail of the
buffer in the case of an error.  As it is called in atomic context, the error
may be transient, so it results in zeros being written where maybe they
shouldn't be.

In the usage in filemap, this opens a window for a well timed read to see data
(zeros) which is not consistent with any ordering of reads and writes.

Most cases where __copy_from_user_inatomic is called, a failure results in
__copy_from_user being called immediately.  As long as the latter zeros the
tail, the former doesn't need to.  However in *copy_from_user_iovec
implementations (in both filemap and ntfs/file), it is assumed that
copy_from_user_inatomic will zero the tail.

This patch removes that assumption, so that after this patch it will
be safe for copy_from_user_inatomic to not zero the tail.

This patch also adds some commentary to filemap.h and asm-i386/uaccess.h.

After this patch, all architectures that might disable preempt when
kmap_atomic is called need to have their __copy_from_user_inatomic* "fixed".
This includes
 - powerpc
 - i386
 - mips
 - sparc

Signed-off-by: Neil Brown <neilb@suse.de>
Cc: David Howells <dhowells@redhat.com>
Cc: Anton Altaparmakov <aia21@cantab.net>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: William Lee Irwin III <wli@holomorphy.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:09 -07:00
Matt Mackall
afedfd016a [PATCH] random: remove SA_SAMPLE_RANDOM from floppy driver
The floppy driver is already calling add_disk_randomness as it should, so this
was redundant.

Signed-off-by: Matt Mackall <mpm@selenic.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:00 -07:00
Jeremy Fitzhardinge
e75eac33b5 [PATCH] Clean up and refactor i386 sub-architecture setup
Clean up and refactor i386 sub-architecture setup.

This change moves all the code from the
asm-i386/mach-*/setup_arch_pre/post.h headers, into
arch/i386/mach-*/setup.c.  mach-*/setup_arch_pre.h is renamed to
setup_arch.h, and contains only things which should be in header files.  It
is purely code-motion; there should be no functional changes at all.

Several functions in arch/i386/kernel/setup.c needed to be made non-static
so that they're visible to the code in mach-*/setup.c.  asm-i386/setup.h is
used to hold the prototypes for these functions.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Cc: Zachary Amsden <zach@vmware.com>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: Christian Limpach <Christian.Limpach@cl.cam.ac.uk>
Cc: Martin Bligh <mbligh@google.com>
Cc: James Bottomley <James.Bottomley@steeleye.com>
Cc: Andrey Panin <pazke@donpac.ru>
Cc: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:00:55 -07:00
Linus Torvalds
37224470c8 Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: (65 commits)
  ACPI: suppress power button event on S3 resume
  ACPI: resolve merge conflict between sem2mutex and processor_perflib.c
  ACPI: use for_each_possible_cpu() instead of for_each_cpu()
  ACPI: delete newly added debugging macros in processor_perflib.c
  ACPI: UP build fix for bugzilla-5737
  Enable P-state software coordination via _PDC
  P-state software coordination for speedstep-centrino
  P-state software coordination for acpi-cpufreq
  P-state software coordination for ACPI core
  ACPI: create acpi_thermal_resume()
  ACPI: create acpi_fan_suspend()/acpi_fan_resume()
  ACPI: pass pm_message_t from acpi_device_suspend() to root_suspend()
  ACPI: create acpi_device_suspend()/acpi_device_resume()
  ACPI: replace spin_lock_irq with mutex for ec poll mode
  ACPI: Allow a WAN module enable/disable on a Thinkpad X60.
  sem2mutex: acpi, acpi_link_lock
  ACPI: delete unused acpi_bus_drivers_lock
  sem2mutex: drivers/acpi/processor_perflib.c
  ACPI add ia64 exports to build acpi_memhotplug as a module
  ACPI: asus_acpi_init(): propagate correct return value
  ...

Manual resolve of conflicts in:

	arch/i386/kernel/cpu/cpufreq/acpi-cpufreq.c
	arch/i386/kernel/cpu/cpufreq/speedstep-centrino.c
	include/acpi/processor.h
2006-06-23 07:52:36 -07:00
Kirill Smelkov
30343d6c3d [PATCH] x86: compile fix for asm-i386/alternatives.h
compile fix:  <asm-i386/alternative.h>  needs  <asm/types.h> for 'u8' --
just look at struct alt_instr.

My module includes <asm/bitops.h> as the first header, and as of 2.6.17 this
leads to compilation errors.

Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:42:59 -07:00
Michal Ludvig
224f611c16 [PATCH] x86: VIA C7 CPU flags
New CPU flags for next generation of crypto engine as found in VIA C7
processors.

Signed-off-by: Michal Ludvig <michal@logix.cz>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:42:59 -07:00
Roman Zippel
722f4f5b26 [PATCH] x86: fix __range_ok constraint
An immediate operand can't be the destination of the cmpl instruction,
so exclude it.

Signed-off-by: Roman Zippel <zippel@linux-m68k.org>
Cc: Mattia Dongili <malattia@linux.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:42:58 -07:00
Alexey Dobriyan
a03a3e287b [PATCH] Don't trigger full rebuild via CONFIG_X86_MCE
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:42:56 -07:00
Alexey Dobriyan
27b07da733 [PATCH] Don't trigger full rebuild via CONFIG_MTRR
Only drm, framebuffer, mtrr parts + misc files here and there.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:42:56 -07:00
Adrian Bunk
a0b4da91f4 [PATCH] arch/i386/kernel/apic.c: make modern_apic() static
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:42:56 -07:00
Hiro Yoshioka
c22ce143d1 [PATCH] x86: cache pollution aware __copy_from_user_ll()
Use the x86 cache-bypassing copy instructions for copy_from_user().

Some performance data are

Total of GLOBAL_POWER_EVENTS (CPU cycle samples)

2.6.12.4.orig    1921587
2.6.12.4.nt      1599424
1599424/1921587=83.23% (16.77% reduction)

BSQ_CACHE_REFERENCE (L3 cache miss)
2.6.12.4.orig      57427
2.6.12.4.nt        20858
20858/57427=36.32% (63.7% reduction)

L3 cache miss reduction of __copy_from_user_ll
samples  %
37408    65.1412  vmlinux                  __copy_from_user_ll
23        0.1103  vmlinux                  __copy_user_zeroing_intel_nocache
23/37408=0.061% (99.94% reduction)

Top 5 of 2.6.12.4.nt
Counted GLOBAL_POWER_EVENTS events (time during which processor is not stopped) with a unit mask of 0x01 (mandatory) count 100000
samples  %        app name                 symbol name
128392    8.0274  vmlinux                  __copy_user_zeroing_intel_nocache
64206     4.0143  vmlinux                  journal_add_journal_head
59746     3.7355  vmlinux                  do_get_write_access
47674     2.9807  vmlinux                  journal_put_journal_head
46021     2.8774  vmlinux                  journal_dirty_metadata
pattern9-0-cpu4-0-09011728/summary.out

Counted BSQ_CACHE_REFERENCE events (cache references seen by the bus unit) with a unit mask of 0x3f (multiple flags) count 3000
samples  %        app name                 symbol name
69755     4.2861  vmlinux                  __copy_user_zeroing_intel_nocache
55685     3.4215  vmlinux                  journal_add_journal_head
52371     3.2179  vmlinux                  __find_get_block
45504     2.7960  vmlinux                  journal_put_journal_head
36005     2.2123  vmlinux                  journal_stop
pattern9-0-cpu4-0-09011744/summary.out

Counted BSQ_CACHE_REFERENCE events (cache references seen by the bus unit) with a unit mask of 0x200 (read 3rd level cache miss) count 3000
samples  %        app name                 symbol name
1147      5.4994  vmlinux                  journal_add_journal_head
881       4.2240  vmlinux                  journal_dirty_data
872       4.1809  vmlinux                  blk_rq_map_sg
734       3.5192  vmlinux                  journal_commit_transaction
617       2.9582  vmlinux                  radix_tree_delete
pattern9-0-cpu4-0-09011731/summary.out

iozone results are

original 2.6.12.4 CPU time = 207.768 sec
cache aware       CPU time = 184.783 sec
(three times run)
184.783/207.768=88.94% (11.06% reduction)

original:
pattern9-0-cpu4-0-08191720/iozone.out:  CPU Utilization: Wall time   45.997    CPU time   64.527    CPU utilization 140.28 %
pattern9-0-cpu4-0-08191741/iozone.out:  CPU Utilization: Wall time   46.878    CPU time   71.933    CPU utilization 153.45 %
pattern9-0-cpu4-0-08191743/iozone.out:  CPU Utilization: Wall time   45.152    CPU time   71.308    CPU utilization 157.93 %

cache awre:
pattern9-0-cpu4-0-09011728/iozone.out:  CPU Utilization: Wall time   44.842    CPU time   62.465    CPU utilization 139.30 %
pattern9-0-cpu4-0-09011731/iozone.out:  CPU Utilization: Wall time   44.718    CPU time   59.273    CPU utilization 132.55 %
pattern9-0-cpu4-0-09011744/iozone.out:  CPU Utilization: Wall time   44.367    CPU time   63.045    CPU utilization 142.10 %

Signed-off-by: Hiro Yoshioka <hyoshiok@miraclelinux.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:42:56 -07:00
Christoph Lameter
1b2db9fb7a [PATCH] sys_move_pages: 32bit support (i386, x86_64)
sys_move_pages() support for 32bit (i386 plus x86_64 compat layer)

Add support for move_pages() on i386 and also add the compat functions
necessary to run 32 bit binaries on x86_64.

Add compat_sys_move_pages to the x86_64 32bit binary layer.  Note that it is
not up to date so I added the missing pieces.  Not sure if this is done the
right way.

[akpm@osdl.org: compile fix]
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:42:53 -07:00
Linus Torvalds
6c763eb9ea Merge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/pci-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/pci-2.6: (27 commits)
  [PATCH] PCI: nVidia quirk to make AER PCI-E extended capability visible
  [PATCH] PCI: fix issues with extended conf space when MMCONFIG disabled because of e820
  [PATCH] PCI: Bus Parity Status sysfs interface
  [PATCH] PCI: fix memory leak in MMCONFIG error path
  [PATCH] PCI: fix error with pci_get_device() call in the mpc85xx driver
  [PATCH] PCI: MSI-K8T-Neo2-Fir: run only where needed
  [PATCH] PCI: fix race with pci_walk_bus and pci_destroy_dev
  [PATCH] PCI: clean up pci documentation to be more specific
  [PATCH] PCI: remove unneeded msi code
  [PATCH] PCI: don't move ioapics below PCI bridge
  [PATCH] PCI: cleanup unused variable about msi driver
  [PATCH] PCI: disable msi mode in pci_disable_device
  [PATCH] PCI: Allow MSI to work on kexec kernel
  [PATCH] PCI: AMD 8131 MSI quirk called too late, bus_flags not inherited ?
  [PATCH] PCI: Move various PCI IDs to header file
  [PATCH] PCI Bus Parity Status-broken hardware attribute, EDAC foundation
  [PATCH] PCI: i386/x86_84: disable PCI resource decode on device disable
  [PATCH] PCI ACPI: Rename the functions to avoid multiple instances.
  [PATCH] PCI: don't enable device if already enabled
  [PATCH] PCI: Add a "enable" sysfs attribute to the pci devices to allow userspace (Xorg) to enable devices without doing foul direct access
  ...
2006-06-22 15:07:59 -07:00
Bjorn Helgaas
4f1bcaf094 [PATCH] vgacon: make VGA_MAP_MEM take size, remove extra use
VGA_MAP_MEM translates to ioremap() on some architectures.  It makes sense
to do this to vga_vram_base, because we're going to access memory between
vga_vram_base and vga_vram_end.

But it doesn't really make sense to map starting at vga_vram_end, because
we aren't going to access memory starting there.  On ia64, which always has
to be different, ioremapping vga_vram_end gives you something completely
incompatible with ioremapped vga_vram_start, so vga_vram_size ends up being
nonsense.

As a bonus, we often know the size up front, so we can use ioremap()
correctly, rather than giving it a zero size.

Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Cc: "Antonino A. Daplas" <adaplas@pol.net>
Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-22 15:05:58 -07:00
bibo,mao
b209a6ee49 [PATCH] PCI: cleanup unused variable about msi driver
In IA64 platform, msi driver does not use irq_vector variable, and in
x86 platform LAST_DEVICE_VECTOR should one before FIRST_SYSTEM_VECTOR,
this patch modify this.

Signed-off-by: bibo, mao <bibo.mao@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-06-21 12:00:00 -07:00
Mark Maule
fd58e55fcf [PATCH] PCI: msi abstractions and support for altix
Abstract portions of the MSI core for platforms that do not use standard
APIC interrupt controllers.  This is implemented through a new arch-specific
msi setup routine, and a set of msi ops which can be set on a per platform
basis.

Signed-off-by: Mark Maule <maule@sgi.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-06-21 11:59:58 -07:00
David Woodhouse
c509075e71 Add Kbuild file for i386 'make headers_install'
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-06-18 12:19:46 +01:00
Len Brown
63518472c0 Pull trivial1 into release branch 2006-06-15 15:37:09 -04:00
David Woodhouse
66643de455 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6
Conflicts:

	include/asm-powerpc/unistd.h
	include/asm-sparc/unistd.h
	include/asm-sparc64/unistd.h

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-05-24 09:22:21 +01:00
Kimball Murray
e0c1e9bf81 [PATCH] x86_64: avoid IRQ0 ioapic pin collision
The patch addresses a problem with ACPI SCI interrupt entry, which gets
re-used, and the IRQ is assigned to another unrelated device.  The patch
corrects the code such that SCI IRQ is skipped and duplicate entry is
avoided.  Second issue came up with VIA chipset, the problem was caused by
original patch assigning IRQs starting 16 and up.  The VIA chipset uses
4-bit IRQ register for internal interrupt routing, and therefore cannot
handle IRQ numbers assigned to its devices.  The patch corrects this
problem by allowing PCI IRQs below 16.

Cc: len.brown@intel.com

Signed-off by: Natalie Protasevich <Natalie.Protasevich@unisys.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-05-08 09:34:56 -07:00
David Woodhouse
b07019f293 Merge git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-04-30 20:34:39 +01:00
Chuck Ebbert
543f2a3382 [PATCH] i386: fix broken FP exception handling
The FXSAVE information leak patch introduced a bug in FP exception
handling: it clears FP exceptions only when there are already
none outstanding.  Mikael Pettersson reported that causes problems
with the Erlang runtime and has tested this fix.

Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com>
Acked-by: Mikael Pettersson <mikpe@it.uu.se>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-29 14:13:16 -07:00
David Woodhouse
5614253686 Remove unneeded _syscallX macros from user view in asm-*/unistd.h
These aren't needed by glibc or klibc, and they're broken in some cases
anyway. The uClibc folks are apparently switching over to stop using
them too (now that we agreed that they should be dropped, at least).

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-04-29 01:51:47 +01:00
David Woodhouse
d6754b401a Merge git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 2006-04-29 01:42:26 +01:00
Zachary Amsden
6e5882cfa2 [PATCH] x86/PAE: Fix pte_clear for the >4GB RAM case
Proposed fix for ptep_get_and_clear_full PAE bug.  Pte_clear had the same bug,
so use the same fix for both.  Turns out pmd_clear had it as well, but pgds
are not affected.

The problem is rather intricate.  Page table entries in PAE mode are 64-bits
wide, but the only atomic 8-byte write operation available in 32-bit mode is
cmpxchg8b, which is expensive (at least on P4), and thus avoided.  But it can
happen that the processor may prefetch entries into the TLB in the middle of an
operation which clears a page table entry.  So one must always clear the P-bit
in the low word of the page table entry first when clearing it.

Since the sequence *ptep = __pte(0) leaves the order of the write dependent on
the compiler, it must be coded explicitly as a clear of the low word followed
by a clear of the high word.  Further, there must be a write memory barrier
here to enforce proper ordering by the compiler (and, in the future, by the
processor as well).

On > 4GB memory machines, the implementation of pte_clear for PAE was clearly
deficient, as it could leave virtual mappings of physical memory above 4GB
aliased to memory below 4GB in the TLB.  The implementation of
ptep_get_and_clear_full has a similar bug, although not nearly as likely to
occur, since the mappings being cleared are in the process of being destroyed,
and should never be dereferenced again.

But, as luck would have it, it is possible to trigger bugs even without ever
dereferencing these bogus TLB mappings, even if the clear is followed fairly
soon after with a TLB flush or invalidation.  The problem is that memory above
4GB may now be aliased into the first 4GB of memory, and in fact, may hit a
region of memory with non-memory semantics.  These regions include AGP and PCI
space.  As such, these memory regions are not cached by the processor.  This
introduces the bug.

The processor can speculate memory operations, including memory writes, as long
as they are committed with the proper ordering.  Speculating a memory write to
a linear address that has a bogus TLB mapping is possible.  Normally, the
speculation is harmless.  But for cached memory, it does leave the falsely
speculated cacheline unmodified, but in a dirty state.  This cache line will be
eventually written back.  If this cacheline happens to intersect a region of
memory that is not protected by the cache coherency protocol, it can corrupt
data in I/O memory, which is generally a very bad thing to do, and can cause
total system failure or just plain undefined behavior.

These bugs are extremely unlikely, but the severity is of such magnitude, and
the fix so simple that I think fixing them immediately is justified.  Also,
they are nearly impossible to debug.

Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-27 12:00:59 -07:00
David Woodhouse
cd469e0cc6 Exclude asm-generic/{page,memory_model}.h from user bits of i386/x86_64 page.h
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-04-27 15:48:08 +01:00
David Woodhouse
62c4f0a2d5 Don't include linux/config.h from anywhere else in include/
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-04-26 12:56:16 +01:00
Jens Axboe
912d35f867 [PATCH] Add support for the sys_vmsplice syscall
sys_splice() moves data to/from pipes with a file input/output. sys_vmsplice()
moves data to a pipe, with the input being a user address range instead.

This uses an approach suggested by Linus, where we can hold partial ranges
inside the pages[] map. Hopefully this will be useful for network
receive support as well.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-26 10:59:21 +02:00
Andi Kleen
18bd057b14 [PATCH] i386/x86-64: Fix x87 information leak between processes
AMD K7/K8 CPUs only save/restore the FOP/FIP/FDP x87 registers in FXSAVE
when an exception is pending.  This means the value leak through
context switches and allow processes to observe some x87 instruction
state of other processes.

This was actually documented by AMD, but nobody recognized it as
being different from Intel before.

The fix first adds an optimization: instead of unconditionally
calling FNCLEX after each FXSAVE test if ES is pending and skip
it when not needed. Then do a x87 load from a kernel variable to
clear FOP/FIP/FDP.

This means other processes always will only see a constant value
defined by the kernel in their FP state.

I took some pain to make sure to chose a variable that's already
in L1 during context switch to make the overhead of this low.

Also alternative() is used to patch away the new code on CPUs
who don't need it.

Patch for both i386/x86-64.

The problem was discovered originally by Jan Beulich. Richard
Brunner provided the basic code for the workarounds, with contribution
from Jan.

This is CVE-2006-1056

Cc: richard.brunner@amd.com
Cc: jbeulich@novell.com

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-20 07:58:11 -07:00
lepton
1bb858f27e [PATCH] asm-i386/atomic.h: local_irq_save should be used instead of local_irq_disable
atomic_add_return() if CONFIG_M386 can accidentally enable local interrupts.

Signed-off-by: Lepton Wu <ytht.net@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-19 09:13:50 -07:00
Jens Axboe
70524490ee [PATCH] splice: add support for sys_tee()
Basically an in-kernel implementation of tee, which uses splice and the
pipe buffers as an intelligent way to pass data around by reference.

Where the user space tee consumes the input and produces a stdout and
file output, this syscall merely duplicates the data inside a pipe to
another pipe. No data is copied, the output just grabs a reference to the
input pipe data.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-11 15:51:17 +02:00
Linus Torvalds
88dd9c16ce Merge branch 'splice' of git://brick.kernel.dk/data/git/linux-2.6-block
* 'splice' of git://brick.kernel.dk/data/git/linux-2.6-block:
  [PATCH] vfs: add splice_write and splice_read to documentation
  [PATCH] Remove sys_ prefix of new syscalls from __NR_sys_*
  [PATCH] splice: warning fix
  [PATCH] another round of fs/pipe.c cleanups
  [PATCH] splice: comment styles
  [PATCH] splice: add Ingo as addition copyright holder
  [PATCH] splice: unlikely() optimizations
  [PATCH] splice: speedups and optimizations
  [PATCH] pipe.c/fifo.c code cleanups
  [PATCH] get rid of the PIPE_*() macros
  [PATCH] splice: speedup __generic_file_splice_read
  [PATCH] splice: add direct fd <-> fd splicing support
  [PATCH] splice: add optional input and output offsets
  [PATCH] introduce a "kernel-internal pipe object" abstraction
  [PATCH] splice: be smarter about calling do_page_cache_readahead()
  [PATCH] splice: optimize the splice buffer mapping
  [PATCH] splice: cleanup __generic_file_splice_read()
  [PATCH] splice: only call wake_up_interruptible() when we really have to
  [PATCH] splice: potential !page dereference
  [PATCH] splice: mark the io page as accessed
2006-04-11 06:34:02 -07:00
Andrew Morton
491d4bed80 [PATCH] sys_kexec_load() naming fixups
__NR_sys_kexec_load should be __NR_kexec_load.  Mainly affects users of the
_syscallN() macros, and glibc is already checking for __NR_kexec_load.

Cc: Ulrich Drepper <drepper@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Mikael Starvik <starvik@axis.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-11 06:18:42 -07:00
Yasunori Goto
c80d79d746 [PATCH] Configurable NODES_SHIFT
Current implementations define NODES_SHIFT in include/asm-xxx/numnodes.h for
each arch.  Its definition is sometimes configurable.  Indeed, ia64 defines 5
NODES_SHIFT values in the current git tree.  But it looks a bit messy.

SGI-SN2(ia64) system requires 1024 nodes, and the number of nodes already has
been changeable by config.  Suitable node's number may be changed in the
future even if it is other architecture.  So, I wrote configurable node's
number.

This patch set defines just default value for each arch which needs multi
nodes except ia64.  But, it is easy to change to configurable if necessary.

On ia64 the number of nodes can be already configured in generic ia64 and SN2
config.  But, NODES_SHIFT is defined for DIG64 and HP'S machine too.  So, I
changed it so that all platforms can be configured via CONFIG_NODES_SHIFT.  It
would be simpler.

See also: http://marc.theaimsgroup.com/?l=linux-kernel&m=114358010523896&w=2

Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Andi Kleen <ak@muc.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Jack Steiner <steiner@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-11 06:18:39 -07:00
Randy Dunlap
dc8cbaed57 [PATCH] mptspec: remove duplicate #include
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-11 06:18:34 -07:00
OGAWA Hirofumi
7519fdc90f [PATCH] Remove sys_ prefix of new syscalls from __NR_sys_*
On i386, we don't use sys_ prefix for __NR_*. This patch removes it
[FWIW, _syscall*() macros will generate foo() instead of sys_foo().]

Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-11 14:00:04 +02:00
Jordan Hargrave
b20367a6c2 [PATCH] x86_64: Fix drift with HPET timer enabled
If the HPET timer is enabled, the clock can drift by ~3 seconds a day.
This is due to the HPET timer not being initialized with the correct
setting (still using PIT count).

If HZ changes, this drift can become even more pronounced.

HPET patch initializes tick_nsec with correct tick_nsec settings for
HPET timer.

Vojtech comments:

  "It's not entirely correct (it assumes the HPET ticks totally
   exactly), but it's significantly better than assuming the PIT error
   there."

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-09 11:53:53 -07:00
Andi Kleen
95d769aaf4 [PATCH] i386: Consolidate modern APIC handling
AMD systems have a modern APIC that supports 8 bit IDs, but
don't have a XAPIC version number.  Add a new "modern_apic"
subfunction that handles this correctly and use it (nearly)
everywhere where XAPIC is tested for.

I removed one wart: the code specified that external APICs
would use an 8bit APIC ID. But I checked a real 82093 data sheet
and it says clearly that they only use 4bit. So I removed
this special case since it would a bit awkward to implement now.

I removed the valid APIC tests in mptable parsing completely. On any modern
system they only check against the full field width (8bit) anyways
and are no-ops. This also fixes them doing the wrong thing
on >8 core Opterons.

This makes i386 boot again on 16 core Opterons.

Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-09 11:53:51 -07:00
Arjan van de Ven
952223683e [PATCH] x86_64: Introduce e820_all_mapped
Introduce a e820_all_mapped() function which checks if the entire range
<start,end> is mapped with type.

This is done by moving the local start variable to the end of each
known-good region; if at the end of the function the start address is
still before end, there must be a part that's not of the correct type;
otherwise it's a good region.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-09 11:53:50 -07:00
Ashok Raj
c12ea918ee x86_64: Remove stale lapic definition from apicdef.h
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Len Brown <len.brown@intel.com>
2006-04-01 22:50:03 -05:00
Andrew Morton
2cf8d82d63 [PATCH] make local_t signed
local_t's were defined to be unsigned.  This increases confusion because
atomic_t's are signed.  The patch goes through and changes all implementations
to use signed longs throughout.

Also, x86-64 was using 32-bit quantities for the value passed into local_add()
and local_sub().  Fixed.

All (actually, both) existing users have been audited.

(Also s/__inline__/inline/ in x86_64/local.h)

Cc: Andi Kleen <ak@muc.de>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Cc: Kyle McMartin <kyle@parisc-linux.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-31 12:18:55 -08:00
Andrew Morton
f79e2abb9b [PATCH] sys_sync_file_range()
Remove the recently-added LINUX_FADV_ASYNC_WRITE and LINUX_FADV_WRITE_WAIT
fadvise() additions, do it in a new sys_sync_file_range() syscall instead.
Reasons:

- It's more flexible.  Things which would require two or three syscalls with
  fadvise() can be done in a single syscall.

- Using fadvise() in this manner is something not covered by POSIX.

The patch wires up the syscall for x86.

The sycall is implemented in the new fs/sync.c.  The intention is that we can
move sys_fsync(), sys_fdatasync() and perhaps sys_sync() into there later.

Documentation for the syscall is in fs/sync.c.

A test app (sync_file_range.c) is in
http://www.zip.com.au/~akpm/linux/patches/stuff/ext3-tools.tar.gz.

The available-to-GPL-modules do_sync_file_range() is for knfsd: "A COMMIT can
say NFS_DATA_SYNC or NFS_FILE_SYNC.  I can skip the ->fsync call for
NFS_DATA_SYNC which is hopefully the more common."

Note: the `async' writeout mode SYNC_FILE_RANGE_WRITE will turn synchronous if
the queue is congested.  This is trivial to fix: add a new flag bit, set
wbc->nonblocking.  But I'm not sure that we want to expose implementation
details down to that level.

Note: it's notable that we can sync an fd which wasn't opened for writing.
Same with fsync() and fdatasync()).

Note: the code takes some care to handle attempts to sync file contents
outside the 16TB offset on 32-bit machines.  It makes such attempts appear to
succeed, for best 32-bit/64-bit compatibility.  Perhaps it should make such
requests fail...

Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Michael Kerrisk <mtk-manpages@gmx.net>
Cc: Ulrich Drepper <drepper@redhat.com>
Cc: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-31 12:18:54 -08:00