Commit graph

199 commits

Author SHA1 Message Date
Robert Love
725b38ab54 [PATCH] inotify: add x86-64 syscall entries
Add inotify syscall entries to x86-64.

Signed-off-by: Robert Love <rml@novell.com>
Signed-off-by: John McCutchan <ttb@tentacle.dhs.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-26 13:37:22 -07:00
Robert Love
d25cb934b0 [PATCH] inotify: add missing hook to sys32_open
Add missing fsnotify_open() hook to sys32_open().

Add fsnotify_open() hook to sys32_open() on x86-64.

Signed-off-by: Robert Love <rml@novell.com>
Signed-off-by: John McCutchan <ttb@tentacle.dhs.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-26 13:37:22 -07:00
Siddha, Suresh B
9fb1759a31 [PATCH] x86_64: TASK_SIZE fixes for compatibility mode processes
A malicious 32bit app can have an elf section at 0xffffe000.  During
exec of this app, we will have a memory leak as insert_vm_struct() is
not checking for return value in syscall32_setup_pages() and thus not
freeing the vma allocated for the vsyscall page.

Check the return value and free the vma incase of failure.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-15 22:56:11 -07:00
Andrew Morton
d312ceda56 [PATCH] x86_64: section alignment fix
This is the second time this has happened: inserting a new section requires
that we adjust the arithmetic which is used to calculate the vsyscall page's
offset.

Cc: Christoph Lameter <christoph@lameter.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-12 16:00:59 -07:00
Sam Ravnborg
d5950b4355 [NET]: add a top-level Networking menu to *config
Create a new top-level menu named "Networking" thus moving
net related options and protocol selection way from the drivers
menu and up on the top-level where they belong.

To implement this all architectures has to source "net/Kconfig" before
drivers/*/Kconfig in their Kconfig file. This change has been
implemented for all architectures.

Device drivers for ordinary NIC's are still to be found
in the Device Drivers section, but Bluetooth, IrDA and ax25
are located with their corresponding menu entries under the new
networking menu item.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-11 21:03:49 -07:00
Christoph Lameter
6c036527a6 [PATCH] mostly_read data section
Add a new section called ".data.read_mostly" for data items that are read
frequently and rarely written to like cpumaps etc.

If these maps are placed in the .data section then these frequenly read
items may end up in cachelines with data is is frequently updated.  In that
case all processors in an SMP system must needlessly reload the cachelines
again and again containing elements of those frequently used variables.

The ability to share these cachelines will allow each cpu in an SMP system
to keep local copies of those shared cachelines thereby optimizing
performance.

Signed-off-by: Alok N Kataria <alokk@calsoftinc.com>
Signed-off-by: Shobhit Dayal <shobhit@calsoftinc.com>
Signed-off-by: Christoph Lameter <christoph@scalex86.org>
Signed-off-by: Shai Fultheim <shai@scalex86.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07 18:23:46 -07:00
Shaohua Li
3b520b238e [PATCH] MTRR suspend/resume cleanup
There has been some discuss about solving the SMP MTRR suspend/resume
breakage, but I didn't find a patch for it.  This is an intent for it.  The
basic idea is moving mtrr initializing into cpu_identify for all APs (so it
works for cpu hotplug).  For BP, restore_processor_state is responsible for
restoring MTRR.

Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Acked-by: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07 18:23:42 -07:00
Andreas Steinmetz
a2a892a236 [CRYPTO] Add x86_64 asm AES
Implementation:
===============
The encrypt/decrypt code is based on an x86 implementation I did a while
ago which I never published. This unpublished implementation does
include an assembler based key schedule and precomputed tables. For
simplicity and best acceptance, however, I took Gladman's in-kernel code
for table generation and key schedule for the kernel port of my
assembler code and modified this code to produce the key schedule as
required by my assembler implementation. File locations and Kconfig are
kept similar to the i586 AES assembler implementation.
It may seem a little bit strange to use 32 bit I/O and registers in the
assembler implementation but this gives the best code size. My
implementation takes one instruction more per round compared to
Gladman's x86 assembler but it doesn't require any stack for local
variables or saved registers and it is less serialized than Gladman's
code.
Note that all comparisons to Gladman's code were done after my code was
implemented. I did only use FIPS PUB 197 for the implementation so my
implementation is independent work.
If anybody has a better assembler solution for x86_64 I'll be pleased to
have my code replaced with the better solution.

Testing:
========
The implementation passes the in-kernel crypto testing module and I'm
running it without any problems on my laptop where it is mainly used for
dm-crypt.

Microbenchmark:
===============
The microbenchmark was done in userspace with similar compile flags as
used during kernel compile.
Encrypt/decrypt is about 35% faster than the generic C implementation.
As the generic C as well as my assembler implementation are both table
I don't really expect that there is much room for further
improvements though I'll be glad to be corrected here.
The key schedule is about 5% slower than the generic C implementation.
This is due to the fact that some more work has to be done in the key
schedule routine to fit the schedule to the assembler implementation.

Code Size:
==========
Encrypt and decrypt are together about 2.1 Kbytes smaller than the
generic C implementation which is important with regard to L1 cache
usage. The key schedule routine is about 100 bytes larger than the
generic C implementation.

Data Size:
==========
There's no difference in data size requirements between the assembler
implementation and the generic C implementation.

License:
========
Gladmans's code is dual BSD/GPL whereas my assembler code is GPLv2 only
(I'm  not going to change the license for my code). So I had to change
the module license for the x86_64 aes module from 'Dual BSD/GPL' to
'GPL' to reflect the most restrictive license within the module.

Signed-off-by: Andreas Steinmetz <ast@domdv.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-06 13:55:00 -07:00
Rusty Lynch
6772926bef [PATCH] kprobes: fix namespace problem and sparc64 build
The following renames arch_init, a kprobes function for performing any
architecture specific initialization, to arch_init_kprobes in order to
cleanup the namespace.

Also, this patch adds arch_init_kprobes to sparc64 to fix the sparc64 kprobes
build from the last return probe patch.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-05 19:19:00 -07:00
Ingo Molnar
306e440daf [PATCH] x86: i8253/i8259A lock cleanup
Introduce proper declarations for i8253_lock and i8259A_lock.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-30 08:45:10 -07:00
Greg KH
8644d2a42b Merge rsync://rsync.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 2005-06-27 22:07:56 -07:00
Greg Kroah-Hartman
1cde8a1681 [PATCH] PCI: use the MCFG table to properly access pci devices (x86-64)
Now that we have access to the whole MCFG table, let's properly use it
for all pci device accesses (as that's what it is there for, some boxes
don't put all the busses into one entry.)

If, for some reason, the table is incorrect, we fallback to the "old
style" of mmconfig accesses, namely, we just assume the first entry in
the table is the one for us, and blindly use it.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-06-27 21:52:48 -07:00
Greg Kroah-Hartman
545493917d [PATCH] PCI: add proper MCFG table parsing to ACPI core.
This patch is the first step in properly handling the MCFG PCI table.
It defines the structures properly, and saves off the table so that the
pci mmconfig code can access it.  It moves the parsing of the table a
little later in the boot process, but still before the information is
needed.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-06-27 21:52:47 -07:00
Rusty Lynch
ba8af12f43 [PATCH] Return probe redesign: x86_64 specific changes
The following patch contains the x86_64 specific changes for the new
return probe design.  Changes include:
 * Removing the architecture specific functions for querying a return probe
   instance off a stack address
 * Complete rework onf arch_prepare_kretprobe() and trampoline_probe_handler()
 * Removing trampoline_post_handler()
 * Adding arch_init() so that now we handle registering the return probe
   trampoline instead of kernel/kprobes.c doing it

NOTE:
Note that with this new design, the dependency on calculating a pointer to
the task off the stack pointer no longer exist (resolving the problem of
interruption stacks as pointed out in the original feedback to this port.)

Signed-off-by: Rusty Lynch <rusty.lynch@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-27 15:23:53 -07:00
Ananth N Mavinakayanahalli
9ec4b1f356 [PATCH] kprobes: fix single-step out of line - take2
Now that PPC64 has no-execute support, here is a second try to fix the
single step out of line during kprobe execution.  Kprobes on x86_64 already
solved this problem by allocating an executable page and using it as the
scratch area for stepping out of line.  Reuse that.

Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-27 15:23:52 -07:00
Andrea Arcangeli
ffaa8bd6c9 [PATCH] seccomp: tsc disable
I believe at least for seccomp it's worth to turn off the tsc, not just for
HT but for the L2 cache too.  So it's up to you, either you turn it off
completely (which isn't very nice IMHO) or I recommend to apply this below
patch.

This has been tested successfully on x86-64 against current cogito
repository (i686 compiles so I didn't bother testing ;).  People selling
the cpu through cpushare may appreciate this bit for a peace of mind.

There's no way to get any timing info anymore with this applied
(gettimeofday is forbidden of course).  The seccomp environment is
completely deterministic so it can't be allowed to get timing info, it has
to be deterministic so in the future I can enable a computing mode that
does a parallel computing for each task with server side transparent
checkpointing and verification that the output is the same from all the 2/3
seller computers for each task, without the buyer even noticing (for now
the verification is left to the buyer client side and there's no
checkpointing, since that would require more kernel changes to track the
dirty bits but it'll be easy to extend once the basic mode is finished).

Eliminating a cold-cache read of the cr4 global variable will save one
cacheline during the tlb flush while making the code per-cpu-safe at the
same time.  Thanks to Mikael Pettersson for noticing the tlb flush wasn't
per-cpu-safe.

The global tlb flush can run from irq (IPI calling do_flush_tlb_all) but
it'll be transparent to the switch_to code since the IPI won't make any
change to the cr4 contents from the point of view of the interrupted code
and since it's now all per-cpu stuff, it will not race.  So no need to
disable irqs in switch_to slow path.

Signed-off-by: Andrea Arcangeli <andrea@cpushare.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-27 15:11:44 -07:00
Linus Torvalds
2031d0f586 Merge Christoph's freeze cleanup patch 2005-06-25 17:16:53 -07:00
Christoph Lameter
3e1d1d28d9 [PATCH] Cleanup patch for process freezing
1. Establish a simple API for process freezing defined in linux/include/sched.h:

   frozen(process)		Check for frozen process
   freezing(process)		Check if a process is being frozen
   freeze(process)		Tell a process to freeze (go to refrigerator)
   thaw_process(process)	Restart process
   frozen_process(process)	Process is frozen now

2. Remove all references to PF_FREEZE and PF_FROZEN from all
   kernel sources except sched.h

3. Fix numerous locations where try_to_freeze is manually done by a driver

4. Remove the argument that is no longer necessary from two function calls.

5. Some whitespace cleanup

6. Clear potential race in refrigerator (provides an open window of PF_FREEZE
   cleared before setting PF_FROZEN, recalc_sigpending does not check
   PF_FROZEN).

This patch does not address the problem of freeze_processes() violating the rule
that a task may only modify its own flags by setting PF_FREEZE. This is not clean
in an SMP environment. freeze(process) is therefore not SMP safe!

Signed-off-by: Christoph Lameter <christoph@lameter.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 17:10:13 -07:00
Domen Puncer
f45494480f [PATCH] x86_64: coding style and whitespace fixups
Remove some of the unnecessary differences between arch/i386 and
arch/x86_64.  This patch fixes more whitespace issues, some miscellaneous
typos, a wrong URL and a factually incorrect statement about the current
boot sector code.

Signed-off-by: Domen Puncer <domen@coderock.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:25:02 -07:00
randy_dunlap
09dbb4768c [PATCH] x86-64: add memcpy/memset prototypes
Put function prototypes for memset() and memcpy() ahead of where
there are used, to kill sparse warnings:

arch/x86_64/boot/compressed/../../../../lib/inflate.c:317:3: warning: undefined identifier 'memset'
arch/x86_64/boot/compressed/../../../../lib/inflate.c:601:11: warning: undefined identifier 'memcpy'
arch/x86_64/boot/compressed/misc.c:151:2: warning: undefined identifier 'memcpy'
arch/x86_64/boot/compressed/../../../../lib/inflate.c:317:3: warning: call with no type!
arch/x86_64/boot/compressed/../../../../lib/inflate.c:601:17: warning: call with no type!
arch/x86_64/boot/compressed/misc.c:151:9: warning: call with no type!

Signed-off-by: randy_dunlap <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:57 -07:00
Maneesh Soni
72414d3f1d [PATCH] kexec code cleanup
o Following patch provides purely cosmetic changes and corrects CodingStyle
  guide lines related certain issues like below in kexec related files

  o braces for one line "if" statements, "for" loops,
  o more than 80 column wide lines,
  o No space after "while", "for" and "switch" key words

o Changes:
  o take-2: Removed the extra tab before "case" key words.
  o take-3: Put operator at the end of line and space before "*/"

Signed-off-by: Maneesh Soni <maneesh@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:55 -07:00
Alexander Nyberg
6e274d1443 [PATCH] kdump: Use real pt_regs from exception
Makes kexec_crashdump() take a pt_regs * as an argument.  This allows to
get exact register state at the point of the crash.  If we come from direct
panic assertion NULL will be passed and the current registers saved before
crashdump.

This hooks into two places:
die(): check the conditions under which we will panic when calling
do_exit and go there directly with the pt_regs that caused the fatal
fault.

die_nmi(): If we receive an NMI lockup while in the kernel use the
pt_regs and go directly to crash_kexec(). We're probably nested up badly
at this point so this might be the only chance to escape with proper
information.

Signed-off-by: Alexander Nyberg <alexn@telia.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:54 -07:00
Vivek Goyal
625f1c8219 [PATCH] Kdump: Export crash notes section address through sysfs
o Following patch exports kexec global variable "crash_notes" to user space
  through sysfs as kernel attribute in /sys/kernel.

Signed-off-by: Maneesh Soni <maneesh@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:51 -07:00
Eric W. Biederman
5f5609df0c [PATCH] crashdump: x86_64: crashkernel option
This is the x86_64 implementation of the crashkernel option.  It reserves
a window of memory very early in the bootup process, so we never use
it for anything but the kernel to switch to when the running
kernel panics.

In addition to reserving this memory a resource structure is registered
so looking at /proc/iomem it is clear what happened to that memory.

ISSUES:
Is it possible to implement this in a architecture generic way?
What should be done with architectures that always use an iommu and
thus don't report their RAM memory resources in /proc/iomem?

Signed-off-by: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:50 -07:00
Eric W. Biederman
5234f5eb04 [PATCH] kexec: x86_64 kexec implementation
This is the x86_64 implementation of machine kexec.  32bit compatibility
support has been implemented, and machine_kexec has been enhanced to not care
about the changing internal kernel paget table structures.

From: Alexander Nyberg <alexn@dsv.su.se>

      build fix

Signed-off-by: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:50 -07:00
Eric W. Biederman
d89559589a [PATCH] kexec: x86_64: factor out apic shutdown code
Factor out the apic and smp shutdown code from machine_restart so it can be
called by in the kexec reboot path as well.

Signed-off-by: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:50 -07:00
Eric W. Biederman
d0537508a9 [PATCH] kexec: x86_64: add CONFIG_PHYSICAL_START
For one kernel to report a crash another kernel has created we need
to have 2 kernels loaded simultaneously in memory.  To accomplish this
the two kernels need to built to run at different physical addresses.

This patch adds the CONFIG_PHYSICAL_START option to the x86_64 kernel
so we can do just that.  You need to know what you are doing and
the ramifications are before changing this value, and most users
won't care so I have made it depend on CONFIG_EMBEDDED

bzImage kernels will work and run at a different address when compiled
with this option but they will still load at 1MB.  If you need a kernel
loaded at a different address as well you need to boot a vmlinux.

Signed-off-by: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:48 -07:00
Eric W. Biederman
5ded01e83e [PATCH] kexec: x86_64: vmlinux: fix physical addresses
The vmlinux on x86_64 does not report the correct physical address of
the kernel.  Instead in the physical address field it currently
reports the virtual address of the kernel.

This is patch is a bug fix that corrects vmlinux to report the
proper physical addresses.

This is potentially a help for crash dump analysis tools.

This definitiely allows bootloaders that load vmlinux as a standard
ELF executable.  Bootloaders directly loading vmlinux become of
practical importance when we consider the kexec on panic case.

Signed-off-by: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:47 -07:00
Eric W. Biederman
208fb93162 [PATCH] kexec: x86_64: restore apic virtual wire mode on shutdown
When coming out of apic mode attempt to set the appropriate
apic back into virtual wire mode.  This improves on previous versions
of this patch by by never setting bot the local apic and the ioapic
into veritual wire mode.

This code looks at data from the mptable to see if an ioapic has
an ExtInt input to make this decision.  A future improvement
is to figure out which apic or ioapic was in virtual wire mode
at boot time and to remember it.  That is potentially a more accurate
method, of selecting which apic to place in virutal wire mode.

Signed-off-by: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:47 -07:00
Eric W. Biederman
719e711050 [PATCH] kexec: x86_64: add i8259 shutdown method
From: Eric W. Biederman <ebiederm@xmission.com

The following patch simply adds a shutdown method to the x86_64 i8259 code.

Signed-off-by: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:46 -07:00
Eric W. Biederman
70adada428 [PATCH] kexec: x86_64: e820 64bit fix
From: Eric W. Biederman <ebiederm@xmission.com>

It is ok to reserve resources > 4G on x86_64 struct resource is 64bit now :)

Signed-off-by: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:46 -07:00
Ingo Molnar
cc19ca86a0 [PATCH] consolidate PREEMPT options into kernel/Kconfig.preempt
This patch consolidates the CONFIG_PREEMPT and CONFIG_PREEMPT_BKL
preemption options into kernel/Kconfig.preempt.  This, besides reducing
source-code, also enables more centralized tweaking of preemption related
options.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:45 -07:00
Paul E. McKenney
b2b1866006 [PATCH] RCU: clean up a few remaining synchronize_kernel() calls
2.6.12-rc6-mm1 has a few remaining synchronize_kernel()s, some (but not
all) in comments.  This patch changes these synchronize_kernel() calls (and
comments) to synchronize_rcu() or synchronize_sched() as follows:

- arch/x86_64/kernel/mce.c mce_read(): change to synchronize_sched() to
  handle races with machine-check exceptions (synchronize_rcu() would not cut
  it given RCU implementations intended for hardcore realtime use.

- drivers/input/serio/i8042.c i8042_stop(): change to synchronize_sched() to
  handle races with i8042_interrupt() interrupt handler.  Again,
  synchronize_rcu() would not cut it given RCU implementations intended for
  hardcore realtime use.

- include/*/kdebug.h comments: change to synchronize_sched() to handle races
  with NMIs.  As before, synchronize_rcu() would not cut it...

- include/linux/list.h comment: change to synchronize_rcu(), since this
  comment is for list_del_rcu().

- security/keys/key.c unregister_key_type(): change to synchronize_rcu(),
  since this is interacting with RCU read side.

- security/keys/process_keys.c install_session_keyring(): change to
  synchronize_rcu(), since this is interacting with RCU read side.

Signed-off-by: "Paul E. McKenney" <paulmck@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:38 -07:00
Pavel Machek
8d783b3e02 [PATCH] swsusp: clean assembly parts
This patch fixes register saving so that each register is only saved once,
and adds missing saving of %cr8 on x86-64.  Some reordering so that
save/restore is more logical/safer (segment registers should be restored
after gdt).

Signed-off-by: Pavel Machek <pavel@suse.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:33 -07:00
Pavel Machek
343c3f6428 [PATCH] s-t-RAM: load gdt the right way
Sleep code uses wrong version of lgdt, that does the wrong thing when
gdt is beyond 16MB or so.

Signed-off-by: Pavel Machek <pavel@suse.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:32 -07:00
Ashok Raj
a02c4cb67e [PATCH] x86_64: Provide ability to choose using shortcuts for IPI in flat mode.
This patch provides an option to switch broadcast or use mask version for
sending IPI's.  If CONFIG_HOTPLUG_CPU is defined, we choose not to use
broadcast shortcuts by default, otherwise we choose broadcast mode as default.

both cases, one can change this via startup cmd line option, to choose
no-broadcast mode.

	no_ipi_broadcast=1

This is provided on request from Andi Kleen, since he doesnt agree with
replacing IPI shortcuts as a solution for CPU hotplug.  Without removing
broadcast IPI's, it would mean lots of new code for __cpu_up() path, which
would acheive the same results.

Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Acked-by: Andi Kleen <ak@muc.de>
Acked-by: Zwane Mwaikambo <zwane@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:31 -07:00
Ashok Raj
884d9e40b4 [PATCH] x86_64: Dont use broadcast shortcut to make it cpu hotplug safe.
Broadcast IPI's provide un-expected behaviour for cpu hotplug.  CPU's in
offline state also end up receiving the IPI.  Once the cpus become online they
receive these stale IPI's which are bad and introduce unexpected behaviour.

This is easily avoided by not sending a broadcast and addressing just the
CPU's in online map.  Doing prelim cycle counts it appears there is no big
overhead and numbers seem around 0x3000-0x3900 on an average on x86 and x86_64
systems with CPUS running 3G, both for broadcast and mask version of the
API's.

The shortcuts are useful only for flat mode (where the perf shows no
degradation), and in cluster mode, its unicast anyway.  Its simpler to just
not use broadcast anymore.

Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Acked-by: Andi Kleen <ak@muc.de>
Acked-by: Zwane Mwaikambo <zwane@arm.linux.org.uk>
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:31 -07:00
Ashok Raj
cb0cd8d49a [PATCH] x86_64: CPU hotplug sibling map cleanup
This patch is a minor cleanup to the cpu sibling/core map.  It is required
that this setup happens on a per-cpu bringup time.

Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Acked-by: Andi Kleen <ak@muc.de>
Acked-by: Zwane Mwaikambo <zwane@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:31 -07:00
Ashok Raj
76e4f660d9 [PATCH] x86_64: CPU hotplug support
Experimental CPU hotplug patch for x86_64
  -----------------------------------------
This supports logical CPU online and offline.
- Test with maxcpus=1, and then kick other cpu's off to test if init code
  is all cleaned up. CONFIG_SCHED_SMT works as well.
- idle threads are forked on demand from keventd threads for clean startup

TBD:
1. Not tested on a real NUMA machine (tested with numa=fake=2)
2. Handle ACPI pieces for physical hotplug support.

Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Acked-by: Andi Kleen <ak@muc.de>
Acked-by: Zwane Mwaikambo <zwane@arm.linux.org.uk>
Signed-off-by: Shaohua.li<shaohua.li@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:30 -07:00
Ashok Raj
e6982c671c [PATCH] x86_64: Change init sections for CPU hotplug support
This patch adds __cpuinit and __cpuinitdata sections that need to exist past
boot to support cpu hotplug.

Caveat: This is done *only* for EM64T CPU Hotplug support, on request from
Andi Kleen.  Much of the generic hotplug code in kernel, and none of the other
archs that support CPU hotplug today, i386, ia64, ppc64, s390 and parisc dont
mark sections with __cpuinit, but only mark them as __devinit, and
__devinitdata.

If someone is motivated to change generic code, we need to make sure all
existing hotplug code does not break, on other arch's that dont use __cpuinit,
and __cpudevinit.

Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Acked-by: Andi Kleen <ak@muc.de>
Acked-by: Zwane Mwaikambo <zwane@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:30 -07:00
Prasanna S Panchamukhi
aa3d7e3d78 [PATCH] kprobes: Temporary disarming of reentrant probe for x86_64
This patch includes x86_64 architecture specific changes to support temporary
disarming on reentrancy of probes.

Signed-of-by: Prasanna S Panchamukhi <prasanna@in.ibm.com>

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:24 -07:00
Rusty Lynch
7e1048b11c [PATCH] Move kprobe [dis]arming into arch specific code
The architecture independent code of the current kprobes implementation is
arming and disarming kprobes at registration time.  The problem is that the
code is assuming that arming and disarming is a just done by a simple write
of some magic value to an address.  This is problematic for ia64 where our
instructions look more like structures, and we can not insert break points
by just doing something like:

*p->addr = BREAKPOINT_INSTRUCTION;

The following patch to 2.6.12-rc4-mm2 adds two new architecture dependent
functions:

     * void arch_arm_kprobe(struct kprobe *p)
     * void arch_disarm_kprobe(struct kprobe *p)

and then adds the new functions for each of the architectures that already
implement kprobes (spar64/ppc64/i386/x86_64).

I thought arch_[dis]arm_kprobe was the most descriptive of what was really
happening, but each of the architectures already had a disarm_kprobe()
function that was really a "disarm and do some other clean-up items as
needed when you stumble across a recursive kprobe." So...  I took the
liberty of changing the code that was calling disarm_kprobe() to call
arch_disarm_kprobe(), and then do the cleanup in the block of code dealing
with the recursive kprobe case.

So far this patch as been tested on i386, x86_64, and ppc64, but still
needs to be tested in sparc64.

Signed-off-by: Rusty Lynch <rusty.lynch@intel.com>
Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:21 -07:00
Rusty Lynch
73649dab0f [PATCH] x86_64 specific function return probes
The following patch adds the x86_64 architecture specific implementation
for function return probes.

Function return probes is a mechanism built on top of kprobes that allows
a caller to register a handler to be called when a given function exits.
For example, to instrument the return path of sys_mkdir:

static int sys_mkdir_exit(struct kretprobe_instance *i, struct pt_regs *regs)
{
	printk("sys_mkdir exited\n");
	return 0;
}
static struct kretprobe return_probe = {
	.handler = sys_mkdir_exit,
};

<inside setup function>

return_probe.kp.addr = (kprobe_opcode_t *) kallsyms_lookup_name("sys_mkdir");
if (register_kretprobe(&return_probe)) {
	printk(KERN_DEBUG "Unable to register return probe!\n");
	/* do error path */
}

<inside cleanup function>
unregister_kretprobe(&return_probe);

The way this works is that:

* At system initialization time, kernel/kprobes.c installs a kprobe
  on a function called kretprobe_trampoline() that is implemented in
  the arch/x86_64/kernel/kprobes.c  (More on this later)

* When a return probe is registered using register_kretprobe(),
  kernel/kprobes.c will install a kprobe on the first instruction of the
  targeted function with the pre handler set to arch_prepare_kretprobe()
  which is implemented in arch/x86_64/kernel/kprobes.c.

* arch_prepare_kretprobe() will prepare a kretprobe instance that stores:
  - nodes for hanging this instance in an empty or free list
  - a pointer to the return probe
  - the original return address
  - a pointer to the stack address

  With all this stowed away, arch_prepare_kretprobe() then sets the return
  address for the targeted function to a special trampoline function called
  kretprobe_trampoline() implemented in arch/x86_64/kernel/kprobes.c

* The kprobe completes as normal, with control passing back to the target
  function that executes as normal, and eventually returns to our trampoline
  function.

* Since a kprobe was installed on kretprobe_trampoline() during system
  initialization, control passes back to kprobes via the architecture
  specific function trampoline_probe_handler() which will lookup the
  instance in an hlist maintained by kernel/kprobes.c, and then call
  the handler function.

* When trampoline_probe_handler() is done, the kprobes infrastructure
  single steps the original instruction (in this case just a top), and
  then calls trampoline_post_handler().  trampoline_post_handler() then
  looks up the instance again, puts the instance back on the free list,
  and then makes a long jump back to the original return instruction.

So to recap, to instrument the exit path of a function this implementation
will cause four interruptions:

  - A breakpoint at the very beginning of the function allowing us to
    switch out the return address
  - A single step interruption to execute the original instruction that
    we replaced with the break instruction (normal kprobe flow)
  - A breakpoint in the trampoline function where our instrumented function
    returned to
  - A single step interruption to execute the original instruction that
    we replaced with the break instruction (normal kprobe flow)

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:21 -07:00
Vincent Hanquez
76381fee7e [PATCH] xen: x86_64: use more usermode macro
Make use of the user_mode macro where it's possible.  This is useful for Xen
because it will need only to redefine only the macro to a hypervisor call.

Signed-off-by: Vincent Hanquez <vincent.hanquez@cl.cam.ac.uk>
Cc: Ian Pratt <m+Ian.Pratt@cl.cam.ac.uk>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:14 -07:00
Vincent Hanquez
e9129e56e9 [PATCH] xen: x86_64: Add macro for debugreg
Add 2 macros to set and get debugreg on x86_64.  This is useful for Xen
because it will need only to redefine each macro to a hypervisor call.

Signed-off-by: Vincent Hanquez <vincent.hanquez@cl.cam.ac.uk>
Cc: Ian Pratt <m+Ian.Pratt@cl.cam.ac.uk>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:14 -07:00
Natalie Protasevich
701067c466 [PATCH] x86_64: avoid wasting IRQs
I suggest to change the way IRQs are handed out to PCI devices.

Currently, each I/O APIC pin gets associated with an IRQ, no matter if the
pin is used or not.  It is expected that each pin can potentually be
engaged by a device inserted into the corresponding PCI slot.  However,
this imposes severe limitation on systems that have designs that employ
many I/O APICs, only utilizing couple lines of each, such as P64H2 chipset.

It is used in ES7000, and currently, there is no way to boot the system
with more that 9 I/O APICs.

The simple change below allows to boot a system with say 64 (or more) I/O
APICs, each providing 1 slot, which otherwise impossible because of the IRQ
gaps created for unused lines on each I/O APIC.  It does not resolve the
problem with number of devices that exceeds number of possible IRQs, but
eases up a tension for IRQs on any large system with potentually large
number of devices.

I only implemented this for the ACPI boot, since if the system is this big
and using newer chipsets it is probably (better be!) an ACPI based system
:).  The change is completely "mechanical" and does not alter any internal
structures or interrupt model/implementation.  The patch works for both
i386 and x86_64 archs.  It works with MSIs just fine, and should not
intervene with implementations like shared vectors, when they get worked
out and incorporated.

To illustrate, below is the interrupt distribution for 2-cell ES7000 with
20 I/O APICs, and an Ethernet card in the last slot, which should be eth1
and which was not configured because its IRQ exceeded allowable number (it
actially turned out huge - 480!):

zorro-tb2:~ # cat /proc/interrupts
           CPU0       CPU1       CPU2       CPU3       CPU4       CPU5       CPU6       CPU7
  0:      65716      30012      30007      30002      30009      30010      30010      30010    IO-APIC-edge  timer
  4:        373          0        725        280          0          0          0          0    IO-APIC-edge  serial
  8:          0          0          0          0          0          0          0          0    IO-APIC-edge  rtc
  9:          0          0          0          0          0          0          0          0   IO-APIC-level  acpi
 14:         39          3          0          0          0          0          0          0    IO-APIC-edge  ide0
 16:        108         13          0          0          0          0          0          0   IO-APIC-level  uhci_hcd:usb1
 18:          0          0          0          0          0          0          0          0   IO-APIC-level  uhci_hcd:usb3
 19:         15          0          0          0          0          0          0          0   IO-APIC-level  uhci_hcd:usb2
 23:          3          0          0          0          0          0          0          0   IO-APIC-level  ehci_hcd:usb4
 96:       4240        397         18          0          0          0          0          0   IO-APIC-level  aic7xxx
 97:         15          0          0          0          0          0          0          0   IO-APIC-level  aic7xxx
192:        847          0          0          0          0          0          0          0   IO-APIC-level  eth0
NMI:          0          0          0          0          0          0          0          0
LOC:     273423     274528     272829     274228     274092     273761     273827     273694
ERR:          7
MIS:          0

Even though the system doesn't have that many devices, some don't get
enabled only because of IRQ numbering model.

This is the IRQ picture after the patch was applied:

zorro-tb2:~ # cat /proc/interrupts
           CPU0       CPU1       CPU2       CPU3       CPU4       CPU5       CPU6       CPU7
  0:      44169      10004      10004      10001      10004      10003      10004       6135    IO-APIC-edge  timer
  4:        345          0          0          0          0        244          0          0    IO-APIC-edge  serial
  8:          0          0          0          0          0          0          0          0    IO-APIC-edge  rtc
  9:          0          0          0          0          0          0          0          0   IO-APIC-level  acpi
 14:         39          0          3          0          0          0          0          0    IO-APIC-edge  ide0
 17:       4425          0          9          0          0          0          0          0   IO-APIC-level  aic7xxx
 18:         15          0          0          0          0          0          0          0   IO-APIC-level  aic7xxx, uhci_hcd:usb3
 21:        231          0          0          0          0          0          0          0   IO-APIC-level  uhci_hcd:usb1
 22:         26          0          0          0          0          0          0          0   IO-APIC-level  uhci_hcd:usb2
 23:          3          0          0          0          0          0          0          0   IO-APIC-level  ehci_hcd:usb4
 24:        348          0          0          0          0          0          0          0   IO-APIC-level  eth0
 25:          6        192          0          0          0          0          0          0   IO-APIC-level  eth1
NMI:          0          0          0          0          0          0          0          0
LOC:     107981     107636     108899     108698     108489     108326     108331     108254
ERR:          7
MIS:          0

Not only we see the card in the last I/O APIC, but we are not even close to
using up available IRQs, since we didn't waste any.

Signed-off-by: Natalie Protasevich <Natalie.Protasevich@unisys.com>
Acked-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:13 -07:00
Roland McGrath
0928d6ef7f [PATCH] x86_64: never block forced SIGSEGV
This is the x86_64 version of the signal fix I just posted for i386.

This problem was first noticed on PPC and has already been fixed there.
But the exact same issue applies to other platforms in the same way.  The
signal blocking for sa_mask and the handled signal takes place after the
handler setup.  When the stack is bogus, the handler setup forces a
SIGSEGV.  But then this will be blocked, and returning to user mode will
fault again and iterate.  This patch fixes the problem by checking whether
signal handler setup failed, and not doing the signal-blocking if so.  This
copies what was done in the ppc code.  I think all architectures' signal
handler setup code follows this pattern and needs the change.

Signed-off-by: Roland McGrath <roland@redhat.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:13 -07:00
john stultz
a3a00751ad [PATCH] x86_64: fix hpet for systems that don't support legacy replacement
Currently the x86-64 HPET code assumes the entire HPET implementation from
the spec is present.  This breaks on boxes that do not implement the
optional legacy timer replacement functionality portion of the spec.

This patch fixes this issue, allowing x86-64 systems that cannot use the
HPET for the timer interrupt and RTC to still use the HPET as a time
source.  I've tested this patch on a system systems without HPET, with HPET
but without legacy timer replacement, as well as HPET with legacy timer
replacement.

This version adds a minor check to cap the HPET counter value in
gettimeoffset_hpet to avoid possible time inconsistencies.  Please ignore
the A2 version I sent to you earlier.

Acked-by: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:12 -07:00
Alexander Nyberg
c0a88c9878 [PATCH] x86_64: i8259.c iso99 structure initialization
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:12 -07:00
Christoph Lameter
5912100372 [PATCH] i386: Selectable Frequency of the Timer Interrupt
Make the timer frequency selectable. The timer interrupt may cause bus
and memory contention in large NUMA systems since the interrupt occurs
on each processor HZ times per second.

Signed-off-by: Christoph Lameter <christoph@lameter.com>
Signed-off-by: Shai Fultheim <shai@scalex86.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:10 -07:00
Jan Beulich
799d19f6ec [PATCH] allow early printk to use more than 25 lines
Allow early printk code to take advantage of the full size of the screen, not
just the first 25 lines.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Acked-by: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:10 -07:00
Christoph Lameter
8c5a09082f [PATCH] x86/x86_64: pcibus_to_node
Define pcibus_to_node to be able to figure out which NUMA node contains a
given PCI device.  This defines pcibus_to_node(bus) in
include/linux/topology.h and adjusts the macros for i386 and x86_64 that
already provided a way to determine the cpumask of a pci device.

x86_64 was changed to not build an array of cpumasks anymore.  Instead an
array of nodes is build which can be used to generate the cpumask via
node_to_cpumask.

Signed-off-by: Christoph Lameter <christoph@lameter.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:08 -07:00
Venkatesh Pallipadi
8a9e1b0f56 [PATCH] Platform SMIs and their interferance with tsc based delay calibration
Issue:
Current tsc based delay_calibration can result in significant errors in
loops_per_jiffy count when the platform events like SMIs
(System Management Interrupts that are non-maskable) are present. This could
lead to potential kernel panic(). This issue is becoming more visible with 2.6
kernel (as default HZ is 1000) and on platforms with higher SMI handling
latencies. During the boot time, SMIs are mostly used by BIOS (for things
like legacy keyboard emulation).

Description:
The psuedocode for current delay calibration with tsc based delay looks like
(0) Estimate a value for loops_per_jiffy
(1) While (loops_per_jiffy estimate is accurate enough)
(2)   wait for jiffy transition (jiffy1)
(3)   Note down current tsc (tsc1)
(4)   loop until tsc becomes tsc1 + loops_per_jiffy
(5)   check whether jiffy changed since jiffy1 or not and refine
loops_per_jiffy estimate

Consider the following cases
Case 1:
If SMIs happen between (2) and (3) above, we can end up with a
loops_per_jiffy value that is too low. This results in shorted delays and
kernel can panic () during boot (Mostly at IOAPIC timer initialization
timer_irq_works() as we don't have enough timer interrupts in a specified
interval).

Case 2:
If SMIs happen between (3) and (4) above, then we can end up with a
loops_per_jiffy value that is too high. And with current i386 code, too
high lpj value (greater than 17M) can result in a overflow in
delay.c:__const_udelay() again resulting in shorter delay and panic().

Solution:
The patch below makes the calibration routine aware of asynchronous events
like SMIs. We increase the delay calibration time and also identify any
significant errors (greater than 12.5%) in the calibration and notify it to
user.

Patch below changes both i386 and x86-64 architectures to use this
new and improved calibrate_delay_direct() routine.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:08 -07:00
Ian Campbell
0f8e2d62fa [PATCH] use ${CROSS_COMPILE}installkernel in arch/*/boot/install.sh
The attached patch causes the various arch specific install.sh scripts to
look for ${CROSS_COMPILE}installkernel rather than just installkernel (in
both /sbin/ and ~/bin/ where the script already did this).  This allows you
to have e.g.  arm-linux-installkernel as a handy way to install on your
cross target.  It also prevents the script picking up on the host
/sbin/installkernel which causes the script to fall through and do the
install itself (which is what I actually use myself, with $INSTALL_PATH
set).

I don't believe it causes back-compatibility problems since calling the
host installkernel was never likely to work or be what you wanted when
cross compiling anyway.  If $CROSS_COMPILE isn't set then nothing changes.

I only use ARM and i386 myself but I figured it couldn't hurt to do the
whole lot.  I've cc'd those who I hope are the arch maintainers for files
that I've touched.

Signed-off-by: Ian Campbell <icampbell@arcom.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:07 -07:00
Matt Tolentino
bbfceef47f [PATCH] add x86-64 specific support for sparsemem
This patch adds in the necessary support for sparsemem such that x86-64
kernels may use sparsemem as an alternative to discontigmem for NUMA
kernels.  Note that this does no preclude one from continuing to build NUMA
kernels using discontigmem, but merely allows the option to build NUMA
kernels with sparsemem.

Interestingly, the use of sparsemem in lieu of discontigmem in NUMA kernels
results in reduced text size for otherwise equivalent kernels as shown in
the example builds below:

   text	   data	    bss	    dec	    hex	filename
2371036	 765884	1237108	4374028	 42be0c	vmlinux.discontig
2366549	 776484	1302772	4445805	 43d66d	vmlinux.sparse

Signed-off-by: Matt Tolentino <matthew.e.tolentino@intel.com>
Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:07 -07:00
Matt Tolentino
2b97690f4c [PATCH] reorganize x86-64 NUMA and DISCONTIGMEM config options
In order to use the alternative sparsemem implmentation for NUMA kernels,
we need to reorganize the config options.  This patch effectively abstracts
out the CONFIG_DISCONTIGMEM options to CONFIG_NUMA in most cases.  Thus,
the discontigmem implementation may be employed as always, but the
sparsemem implementation may be used alternatively.

Signed-off-by: Matt Tolentino <matthew.e.tolentino@intel.com>
Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:06 -07:00
Matt Tolentino
1035faf1b1 [PATCH] add x86-64 Kconfig options for sparsemem
Add the requisite arch specific Kconfig options to enable the use of the
sparsemem implementation for NUMA kernels on x86-64.

Signed-off-by: Matt Tolentino <matthew.e.tolentino@intel.com>
Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:06 -07:00
Matt Tolentino
073326634b [PATCH] remove direct ref to contig_page_data for x86-64
This patch pulls out all remaining direct references to contig_page_data
from arch/x86-64, thus saving an ifdef in one case.

Signed-off-by: Matt Tolentino <matthew.e.tolentino@intel.com>
Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:06 -07:00
Dave Hansen
3f22ab276b [PATCH] make each arch use mm/Kconfig
For all architectures, this just means that you'll see a "Memory Model"
choice in your architecture menu.  For those that implement DISCONTIGMEM,
you may eventually want to make your ARCH_DISCONTIGMEM_ENABLE a "def_bool
y" and make your users select DISCONTIGMEM right out of the new choice
menu.  The only disadvantage might be if you have some specific things that
you need in your help option to explain something about DISCONTIGMEM.

Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:02 -07:00
Wolfgang Wander
1363c3cd86 [PATCH] Avoiding mmap fragmentation
Ingo recently introduced a great speedup for allocating new mmaps using the
free_area_cache pointer which boosts the specweb SSL benchmark by 4-5% and
causes huge performance increases in thread creation.

The downside of this patch is that it does lead to fragmentation in the
mmap-ed areas (visible via /proc/self/maps), such that some applications
that work fine under 2.4 kernels quickly run out of memory on any 2.6
kernel.

The problem is twofold:

  1) the free_area_cache is used to continue a search for memory where
     the last search ended.  Before the change new areas were always
     searched from the base address on.

     So now new small areas are cluttering holes of all sizes
     throughout the whole mmap-able region whereas before small holes
     tended to close holes near the base leaving holes far from the base
     large and available for larger requests.

  2) the free_area_cache also is set to the location of the last
     munmap-ed area so in scenarios where we allocate e.g.  five regions of
     1K each, then free regions 4 2 3 in this order the next request for 1K
     will be placed in the position of the old region 3, whereas before we
     appended it to the still active region 1, placing it at the location
     of the old region 2.  Before we had 1 free region of 2K, now we only
     get two free regions of 1K -> fragmentation.

The patch addresses thes issues by introducing yet another cache descriptor
cached_hole_size that contains the largest known hole size below the
current free_area_cache.  If a new request comes in the size is compared
against the cached_hole_size and if the request can be filled with a hole
below free_area_cache the search is started from the base instead.

The results look promising: Whereas 2.6.12-rc4 fragments quickly and my
(earlier posted) leakme.c test program terminates after 50000+ iterations
with 96 distinct and fragmented maps in /proc/self/maps it performs nicely
(as expected) with thread creation, Ingo's test_str02 with 20000 threads
requires 0.7s system time.

Taking out Ingo's patch (un-patch available per request) by basically
deleting all mentions of free_area_cache from the kernel and starting the
search for new memory always at the respective bases we observe: leakme
terminates successfully with 11 distinctive hardly fragmented areas in
/proc/self/maps but thread creating is gringdingly slow: 30+s(!) system
time for Ingo's test_str02 with 20000 threads.

Now - drumroll ;-) the appended patch works fine with leakme: it ends with
only 7 distinct areas in /proc/self/maps and also thread creation seems
sufficiently fast with 0.71s for 20000 threads.

Signed-off-by: Wolfgang Wander <wwc@rentec.com>
Credit-to: "Richard Purdie" <rpurdie@rpsys.net>
Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
Acked-by: Ingo Molnar <mingo@elte.hu> (partly)
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-21 18:46:16 -07:00
Ingo Molnar
39c715b717 [PATCH] smp_processor_id() cleanup
This patch implements a number of smp_processor_id() cleanup ideas that
Arjan van de Ven and I came up with.

The previous __smp_processor_id/_smp_processor_id/smp_processor_id API
spaghetti was hard to follow both on the implementational and on the
usage side.

Some of the complexity arose from picking wrong names, some of the
complexity comes from the fact that not all architectures defined
__smp_processor_id.

In the new code, there are two externally visible symbols:

 - smp_processor_id(): debug variant.

 - raw_smp_processor_id(): nondebug variant. Replaces all existing
   uses of _smp_processor_id() and __smp_processor_id(). Defined
   by every SMP architecture in include/asm-*/smp.h.

There is one new internal symbol, dependent on DEBUG_PREEMPT:

 - debug_smp_processor_id(): internal debug variant, mapped to
                             smp_processor_id().

Also, i moved debug_smp_processor_id() from lib/kernel_lock.c into a new
lib/smp_processor_id.c file.  All related comments got updated and/or
clarified.

I have build/boot tested the following 8 .config combinations on x86:

 {SMP,UP} x {PREEMPT,!PREEMPT} x {DEBUG_PREEMPT,!DEBUG_PREEMPT}

I have also build/boot tested x64 on UP/PREEMPT/DEBUG_PREEMPT.  (Other
architectures are untested, but should work just fine.)

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-21 18:46:13 -07:00
Suresh Siddha
84929801e1 [PATCH] x86_64: TASK_SIZE fixes for compatibility mode processes
Appended patch will setup compatibility mode TASK_SIZE properly.  This will
fix atleast three known bugs that can be encountered while running
compatibility mode apps.

a) A malicious 32bit app can have an elf section at 0xffffe000.  During
   exec of this app, we will have a memory leak as insert_vm_struct() is
   not checking for return value in syscall32_setup_pages() and thus not
   freeing the vma allocated for the vsyscall page.  And instead of exec
   failing (as it has addresses > TASK_SIZE), we were allowing it to
   succeed previously.

b) With a 32bit app, hugetlb_get_unmapped_area/arch_get_unmapped_area
   may return addresses beyond 32bits, ultimately causing corruption
   because of wrap-around and resulting in SEGFAULT, instead of returning
   ENOMEM.

c) 32bit app doing this below mmap will now fail.

  mmap((void *)(0xFFFFE000UL), 0x10000UL, PROT_READ|PROT_WRITE,
	MAP_FIXED|MAP_PRIVATE|MAP_ANON, 0, 0);

Signed-off-by: Zou Nan hai <nanhai.zou@intel.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-21 18:46:12 -07:00
Andrew Morton
42442ed574 [PATCH] revert x86_64-use-the-e820-hole-to-map-the-iommu-agp-aperture
Martin Bligh determined that this patch is causing his test box to not boot.
Revert.

Andi Kleen <ak@muc.de>

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-08 16:21:13 -07:00
Andi Kleen
9c2be6a0fa [PATCH] x86_64 CONFIG_ACPI=n build fix
Make CONFIG_X86_PM_TIMER dependent on CONFIG_ACPI

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-31 14:54:17 -07:00
Andi Kleen
8d91640606 [PATCH] x86_64: More fixes for compilation without CONFIG_ACPI
Suggested by Alexander Nyberg

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-31 14:54:17 -07:00
Oliver Korpilla
c5924b7d97 [PATCH] x86_64: signal.c build fix
For unspecified reasons, arch/x86_64/kernel/signal.c apparently needs
ia32_unistd.h.

Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-28 11:13:59 -07:00
Alexander Nyberg
8aadff7dd5 [PATCH] Note on ACPI build fix
Even after the previous fix you can still set CONFIG_ACPI_BOOT
indirectly even without CONFIG_ACPI by choosing CONFIG_PCI and
CONFIG_PCI_MMCONFIG.

That doesn't build very well either.

This makes PCI_MMCONFIG depend on ACPI, fixing that hole.

[ I guess in theory Kconfig could follow the whole chain of dependencies
  for things that get selected, but that sounds insanely complicated, so
  we'll just fix up these things by hand.  --Linus ]

Signed-off-by: Alexander Nyberg <alexn@telia.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-27 08:15:05 -07:00
Alexander Nyberg
4f60fdf613 [PATCH] x86_64: CONFIG_BUG=n fixes
Fixes some !CONFIG_BUG warnings:
include/asm/mmu_context.h: I funktion `switch_mm':
include/asm/mmu_context.h:57: varning: implicit declaration of function `out_of_line_bug'

Signed-off-by: Alexander Nyberg <alexn@telia.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-25 15:31:28 -07:00
Andi Kleen
2df9fa3664 [PATCH] x86_64: i386/x86-64: Export cpu_core_map
Needed for the powernow k8 driver for dual core support.

Signed-off-by: Andi Kleen <ak@suse.de>
Cc: <mark.langsdorf@amd.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-20 15:48:21 -07:00
Andi Kleen
14d98cad82 [PATCH] x86_64: Add option to disable timer check
This works around the too fast timer seen on some ATI boards.

I don't feel confident enough about it yet to enable it by default, but give
users the option.

Patch and debugging from Christopher Allen Wing <wingc@engin.umich.edu>, with
minor tweaks (renamed the option and documented it)

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-20 15:48:21 -07:00
Andi Kleen
607a168583 [PATCH] x86_64: Fix 32bit system call restart
The test case at
http://cvs.sourceforge.net/viewcvs.py/posixtest/posixtestsuite/conforman
ce/interfaces/clock_nanosleep/1-5.c fails if it runs as a 32bit process on
x86_86 machines.

The root cause is the sub 32bit process fails to restart the syscall after it
is interrupted by a signal.

The syscall number of sys_restart_syscall in table sys_call_table is
__NR_restart_syscall (219) while it's __NR_ia32_restart_syscall
(0) in ia32_sys_call_table. When regs->rax==(unsigned
long)-ERESTART_RESTARTBLOCK, function do_signal doesn't distinguish if
the process is 64bit or 32bit, and always sets restart syscall number
as __NR_restart_syscall (219).

Signed-off-by: Zhang Yanmin <yanmin.zhang@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-20 15:48:21 -07:00
Andi Kleen
7856dfeb23 [PATCH] x86_64: Fixed guard page handling again in iounmap
Caused oopses again.  Also fix potential mismatch in checking if
change_page_attr was needed.

To do it without races I needed to change mm/vmalloc.c to export a
__remove_vm_area that does not take vmlist lock.

Noticed by Terence Ripperda and based on a patch of his.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-20 15:48:20 -07:00
Andi Kleen
c4d1fcf3a2 [PATCH] x86_64: Don't allow accesses below register frame in ptrace
There was a "off by one quad word" error in there.  I don't think it is
exploitable because it will only store into a unused area, but better to plug
it.

Found and fixed by John Blackwood

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-20 15:48:20 -07:00
Andi Kleen
b41e29398a [PATCH] x86_64: 386/x86-64 Further AMD dual core fixes
- Remove duplicated ifdef
- Make core_id match what Intel uses
- Initialize phys_proc_id correctly for non DC case
- Handle non power of two core numbers.

Fixes for both i386 and x86-64

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-20 15:48:20 -07:00
Andi Kleen
59db2e6ed0 [PATCH] x86_64: Update defconfig
Update defconfig

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-17 07:59:17 -07:00
Andi Kleen
18a2b64712 [PATCH] x86_64: Don't assume BSP has ID 0 in new smp bootup
This patch removes the assumption that LAPIC entries contain the BSP as its
first entry.  This is a slight improvement to the temporary fix submitted by
Suresh Siddha.

- Removes assumption that LAPIC entries contain BSP first.

- Builds x86_acpiid_to_apicid[] and bios_cpu_apicid[] properly with BSP as
  first entry.

- Made maxcpus=1 boot on these systems.  Since the parsing earlier in
  arch/x86_64/kernel/mpparse.c stopped after maxcpus entries, other entries
  were not processed, this causes kernel not to boot on these systems.

TBD: x86_acpiid_to_apicid and bios_cpu_apicid[] seem to be exactly the
     same.  This could be removed, but might need more work to cleanup.

Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-17 07:59:16 -07:00
Andi Kleen
751521149a [PATCH] x86_64: Collected NMI watchdog fixes.
Collected NMI watchdog fixes.

- Fix call of check_nmi_watchdog

- Remove earlier move of check_nmi_watchdog to later.  It does not fix the
  race it was supposed to fix fully.

- Remove unused P6 definitions

- Add support for performance counter based watchdog on P4 systems.

  This allows to run it only once per second, which saves some CPU time.
  Previously it would run at 1000Hz, which was too much.

  Code ported from i386

  Make this the default on Intel systems.

- Use check_nmi_watchdog with local APIC based nmi

- Fix race in touch_nmi_watchdog

- Fix bug that caused incorrect performance counters to be programmed in a
  few cases on K8.

- Remove useless check for local APIC

- Use local_t and per_cpu variables for per CPU data.

- Keep other CPUs busy during check_nmi_watchdog to make sure they really
  tick when in lapic mode.

- Only check CPUs that are actually online.

- Various other fixes.

- Fix fallback path when MSRs are unimplemented

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-17 07:59:16 -07:00
Andi Kleen
f3c5f5e7ee [PATCH] x86_64: Make vsyscall.c compile without CONFIG_SYSCTL
Originally from Matt Tolentino

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-17 07:59:16 -07:00
Suresh Siddha
376ec33fcb [PATCH] x86_64: Fix OEM hpet check
Use bitmap_zero instead of bitmap_empty to initialise cpu mask This makes it
actually run reliable instead of relying on stack state.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-17 07:59:16 -07:00
Andi Kleen
3b9ba4d5e2 [PATCH] x86_64: When checking vmalloc mappings don't use pte_page
The PTEs can point to ioremap mappings too, and these are often outside
mem_map.  The NUMA hash page lookup functions cannot handle out of bounds
accesses properly.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-17 07:59:15 -07:00
Andi Kleen
f6b8d4778c [PATCH] x86_64: Fix canonical checking for segment registers in ptrace
Allowed user programs to set a non canonical segment base, which would cause
oopses in the kernel later.

Credit-to: Alexander Nyberg <alexn@dsv.su.se>

 For identifying and reporting this bug.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-17 07:59:15 -07:00
Andi Kleen
d1099e8a18 [PATCH] x86_64: check if ptrace RIP is canonical
This works around an AMD Erratum.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-17 07:59:15 -07:00
Andi Kleen
312df5f1a1 [PATCH] x86_64: Add pmtimer support
There are unfortunately more and more multi processor Opteron systems which
don't have HPET timer support in the southbridge.  This covers in particular
Nvidia and VIA chipsets.  They also don't guarantee that the TSCs are
synchronized between CPUs; and especially with MP powernow the systems are
nearly unusable because the time gets very inconsistent between CPUs.

The timer code for x86-64 was originally written under the assumption that we
could fall back to the HPET timer on such systems.  But this doesn't work
there.

Another alternative is to use the ACPI PM timer as primary time source.  This
patch does that.  The kernel only uses PM timer when there is no other choice
because it has some disadvantages.

Ported over from i386.  It should be faster than the i386 version because I
dropped the "read three times" workaround, but is still considerable slower
than HPET and also does not work together with vsyscalls which have to be
disabled.

Cc: <mark.langsdorf@amd.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-17 07:59:15 -07:00
Andi Kleen
0af2be0b72 [PATCH] x86_64: Remove unique APIC/IO-APIC ID check
It is unnecessary on modern Intel or AMD systems, and that is all we support
on x86-64

Also causes problems on various systems

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-17 07:59:14 -07:00
Andi Kleen
622dcaf974 [PATCH] x86_64: Don't print the internal k8c+ flag in /proc/cpuinfo
It is not very useful to the user and more an kernel internal implementation
detail.  So hide it.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-17 07:59:14 -07:00
Andi Kleen
2942283e97 [PATCH] x86_64: Remove x86_apicid field
Remove x86_apicid field

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-17 07:59:14 -07:00
Andi Kleen
dda50e716d [PATCH] x86_64: Update TSC sync algorithm
The new TSC sync algorithm recently submitted did not work too well.

The result was that some MP machines where the TSC came up of the BIOS very
unsynchronized and that did not have HPET support were nearly unusable because
the time would jump forwards and backwards between CPUs.

After a lot of research ;-) and some more prototypes I ended up with just
using the one from IA64 which looks best.  It has some internal self tuning
that should adapt to changing interconnect latencies.  It holds up in my tests
so far.

I believe it was originally written by David Mosberger, I just ported it over
to x86-64.  See the inline comment for a description.

This cleans up the code because it uses smp_call_function for syncing instead
of having custom hooks in SMP bootup.

Please note that the cycle numbers it outputs are too optimistic because they
do not take into account the latency of WRMSR and RDTSC, which can be hundreds
of cycles.  It seems to be able to sync a dual Opteron to 200-300 cycles,
which is probably good enough.

There is a timing window during AP bootup where interrupts can see
inconsistent time before the TSC is synced.  It is hard to avoid unfortunately
because we can only do the TSC sync after some setup, and we need to enable
interrupts before that.  I just ignored it for now.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-17 07:59:14 -07:00
Andi Kleen
93ef70a217 [PATCH] x86_64: Don't look up struct page pointer of physical address in iounmap
It could be in a memory hole not mapped in mem_map and that causes the hash
lookup to go off to nirvana.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-17 07:59:14 -07:00
Andi Kleen
a158608bf4 [PATCH] x86_64/i386: fix defaults for physical/core id in /proc/cpuinfo
Last round hopefully of cpu_core_id changes hopefully fow now:

- Always initialize cpu_core_id for all CPUs, even when no dual core setup
  is detected.  This prevents funny /proc/cpuinfo output

- Do the same with phys_proc_id[] even when no HyperThreading - dito.

- Use the CPU APIC-ID from CPUID 1 instead of the linux virtual CPU number
  to identify the core for AMD dual core setups.

Patch for i386/x86-64.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-17 07:59:13 -07:00
Andi Kleen
10ffdbb8d6 [PATCH] x86_64: Readd missing tests in entry.S
Cleans up the system exit call slightly and synchronizes with my tree again.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-17 07:59:13 -07:00
Andi Kleen
ac6b931c44 [PATCH] x86_64: Reduce NMI watchdog stack usage
NR_CPUs can be quite big these days.  kmalloc the per CPU array instead of
putting it onto the stack

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-17 07:59:12 -07:00
Domen Puncer
125947f2ab [PATCH] CodingStyle: trivial whitespace fixups
When I do a "diff -Nur arch/i386 arch/x86_64" to see what is different between these two
architectures, I see some differences due to whitespace issues only. The attached patch removes
some of the noise by fixing up the following files:
- arch/i386/boot/bootsect.S
- arch/i386/boot/video.S
- arch/x86_64/boot/bootsect.S

Signed-off-by: Daniel Dickman <didickman@yahoo.com>
Signed-off-by: Domen Puncer <domen@coderock.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-05 16:36:49 -07:00
Prasanna S Panchamukhi
0b9e2cac8a [PATCH] Kprobes: Incorrect handling of probes on ret/lret instruction
Kprobes could not handle the insertion of a probe on the ret/lret
instruction and used to oops after single stepping since kprobes was
modifying eip/rip incorrectly.  Adjustment of eip/rip is not required after
single stepping in case of ret/lret instruction, because eip/rip points to
the correct location after execution of the ret/lret instruction.  This
patch fixes the above problem.

Signed-off-by: Prasanna S Panchamukhi <prasanna@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-05 16:36:39 -07:00
Paolo 'Blaisorblade' Giarrusso
0c28130b5c [PATCH] x86_64: make string func definition work as intended
In include/asm-x86_64/string.h there are such comments:

/* Use C out of line version for memcmp */
#define memcmp __builtin_memcmp
int memcmp(const void * cs,const void * ct,size_t count);

This would mean that if the compiler does not decide to use __builtin_memcmp,
it emits a call to memcmp to be satisfied by the C out-of-line version in
lib/string.c.  What happens is that after preprocessing, in lib/string.i you
may find the definition of "__builtin_strcmp".

Actually, by accident, in the object you will find the definition of strcmp
and such (maybe a trick intended to redirect calls to __builtin_memcmp to the
default memcmp when the definition is not expanded); however, this particular
case is not a documented feature as far as I can see.

Also, the EXPORT_SYMBOL does not work, so it's duplicated in the arch.

I simply added some #undef to lib/string.c and removed the (now duplicated)
exports in x86-64 and UML/x86_64 subarchs (the second ones are introduced by
another patch I just posted for -mm).

Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
CC: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-05 16:36:33 -07:00
Paolo 'Blaisorblade' Giarrusso
23352fc252 [PATCH] uml: kludgy compilation fixes for x86-64 subarch modules support
These are some trivial fixes for the x86-64 subarch module support.  The only
potential problem is that I have to modify arch/x86_64/kernel/module.c, to
avoid copying the whole of it.

I can't use it verbatim because it depends on a special vmalloc-like area for
modules, which for now (maybe that's to fix, I guess not) UML/x86-64 has not.
I went the easy way and reused the i386 vmalloc()-based allocator.

Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Acked-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-05 16:36:33 -07:00
David Woodhouse
bfd4bda097 Merge with master.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux-2.6.git 2005-05-05 13:59:37 +01:00
Al Viro
5cae841b13 [PATCH] ISA DMA Kconfig fixes - part 1
A bunch of drivers use ISA DMA helpers or their equivalents for
platforms that have ISA with different DMA controller (a lot of ARM
boxen).  Currently there is no way to put such dependency in Kconfig -
CONFIG_ISA is not it (e.g.  it is not set on platforms that have no ISA
slots, but have on-board devices that pretend to be ISA ones).

New symbol added - ISA_DMA_API.  Set when we have functional
enable_dma()/set_dma_mode()/etc.  set of helpers.  Next patches in the
series will add missing dependencies for drivers that need them.

I'm very carefully staying the hell out of the recurring flamefest on
what exactly CONFIG_ISA would mean in ideal world - added symbol has a
well-defined meaning and for now I really want to treat it as completely
independent from the mess around CONFIG_ISA.

Signed-off-by: Al Viro <viro@parcelfarce.linux.theplanet.co.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-04 07:33:13 -07:00
David Woodhouse
488f2eaca1 [AUDIT] Log correct syscall args for i386 processes on x86_64
The i386 syscall ABI uses different registers. Log those instead of the
x86_64 ones.

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2005-05-03 14:11:02 +01:00
David Woodhouse
27b030d58c Merge with master.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux-2.6.git 2005-05-03 08:14:09 +01:00
Adrian Bunk
408b664a7d [PATCH] make lots of things static
Another large rollup of various patches from Adrian which make things static
where they were needlessly exported.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-01 08:59:29 -07:00