Commit graph

1257 commits

Author SHA1 Message Date
Andi Kleen
8689b517be [PATCH] i386: Fix some warnings added by earlier patch
Signed-off-by: Andi Kleen <ak@suse.de>
2007-04-24 13:05:37 +02:00
Andi Kleen
9ce883becb [PATCH] x86: Remove noreplacement option
noreplacement is dangerous on modern systems because it will not replace the
context switch FNSAVE with SSE aware FXSAVE. But other places in the kernel still assume
SSE and do FXSAVE and the CPU will then access FXSAVE information with
FNSAVE and cause corruption.

Easiest way to avoid this is to remove the option. It was mostly for paranoia
reasons anyways and alternative()s have been stable for some time.

Thanks to Jeremy F. for reporting and helping debug it.

Signed-off-by: Andi Kleen <ak@suse.de>
2007-04-24 13:05:37 +02:00
Dave Jones
7ab77e03c1 Longhaul - Revert ACPI C3 on Longhaul ver. 2
Support for Longhaul ver.  2 broke driver for VIA C3 Eden 600MHz with
Samuel 2 core.  Processor is not able to switch frequency anymore.  I
don't know much about this issue at the moment, but until (if ever) I
will know why, this part should be reversed.

Signed-off-by: Rafal Bilski <rafalbilski@interia.pl>
Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-04-20 22:56:29 -07:00
Andi Kleen
1714f9bfc9 [PATCH] x86: Fix potential overflow in perfctr reservation
While reviewing this code again I found a potential overflow of the bitmap.
The p4 oprofile can theoretically set bits beyond the reservation bitmap for
specific configurations. Avoid that by sizing the bitmaps properly.

Signed-off-by: Andi Kleen <ak@suse.de>
2007-04-16 10:30:27 +02:00
Andi Kleen
08269c6d38 [PATCH] x86: Fix gcc 4.2 _proxy_pda workaround
Due to an over aggressive optimizer gcc 4.2 cannot optimize away _proxy_pda
in all cases (counter intuitive, but true).  This breaks loading of some
modules.

The earlier workaround to just export a dummy symbol didn't work unfortunately
because the module code ignores exports with 0 value.

Make it 1 instead.

Signed-off-by: Andi Kleen <ak@suse.de>
2007-04-16 10:30:27 +02:00
Zachary Amsden
0492c37137 Fix VMI relocation processing logic error
Fix logic error in VMI relocation processing.  NOPs would always cause
a BUG_ON to fire because the != RELOCATION_NONE in the first if clause
precluding the == VMI_RELOCATION_NOP in the second clause.  Make these
direct equality tests and just warn for unsupported relocation types
(which should never happen), falling back to native in that case.

Thanks to Anthony Liguori for noting this!

Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-04-14 21:48:36 -07:00
Andrew Morton
c2481cc4a8 [PATCH] i386: irqbalance_disable() section fix
WARNING: arch/i386/kernel/built-in.o - Section mismatch: reference to .init.text:irqbalance_disable from .text between 'quirk_intel_irqbalance' (at offset 0x80a5) and 'i8237A_suspend'

Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-04-08 19:47:55 -07:00
Zachary Amsden
49f1971051 [PATCH] Proper fix for highmem kmap_atomic functions for VMI for 2.6.21
Since lazy MMU batching mode still allows interrupts to enter, it is
possible for interrupt handlers to try to use kmap_atomic, which fails when
lazy mode is active, since the PTE update to highmem will be delayed.  The
best workaround is to issue an explicit flush in kmap_atomic_functions
case; this is the only way nested PTE updates can happen in the interrupt
handler.

Thanks to Jeremy Fitzhardinge for noting the bug and suggestions on a fix.

This patch gets reverted again when we start 2.6.22 and the bug gets fixed
differently.

Signed-off-by: Zachary Amsden <zach@vmware.com>
Cc: Andi Kleen <ak@muc.de>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-04-08 19:47:55 -07:00
Linus Torvalds
9a5ee4cc9e Merge branch 'for-linus' of git://one.firstfloor.org/home/andi/git/linux-2.6
* 'for-linus' of git://one.firstfloor.org/home/andi/git/linux-2.6:
  [PATCH] x86: Don't probe for DDC on VBE1.2
  [PATCH] x86-64: Increase NMI watchdog probing timeout
  [PATCH] x86-64: Let oprofile reserve MSR on all CPUs
  [PATCH] x86-64: Disable local APIC timer use on AMD systems with C1E
2007-04-02 11:41:55 -07:00
Rafael J. Wysocki
1d64b9cb1d [PATCH] Fix microcode-related suspend problem
Fix the regression resulting from the recent change of suspend code
ordering that causes systems based on Intel x86 CPUs using the microcode
driver to hang during the resume.

The problem occurs since the microcode driver uses request_firmware() in
its CPU hotplug notifier, which is called after tasks has been frozen and
hangs.  It can be fixed by telling the microcode driver to use the
microcode stored in memory during the resume instead of trying to load it
from disk.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Adrian Bunk <bunk@stusta.de>
Cc: Tigran Aivazian <tigran@aivazian.fsnet.co.uk>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Maxim <maximlevitsky@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-04-02 10:06:09 -07:00
Andi Kleen
0fb2ebfcb5 [PATCH] x86-64: Increase NMI watchdog probing timeout
A 4 core Opteron needs longer than 10 ticks for this.

Signed-off-by: Andi Kleen <ak@suse.de>
2007-04-02 12:14:12 +02:00
Andi Kleen
89e07569e4 [PATCH] x86-64: Let oprofile reserve MSR on all CPUs
The MSR reservation is per CPU and oprofile would only allocate them
on the CPU it was initialized on. Change this to handle all CPUs.

This also fixes a warning about unprotected use of smp_processor_id()
in preemptible kernels.

Signed-off-by: Andi Kleen <ak@suse.de>
2007-04-02 12:14:12 +02:00
Andi Kleen
3556ddfa92 [PATCH] x86-64: Disable local APIC timer use on AMD systems with C1E
AMD dual core laptops with C1E do not run the APIC timer correctly
when they go idle. Previously the code assumed this only happened
on C2 or deeper.  But not all of these systems report support C2.

Use a AMD supplied snippet to detect C1E being enabled and then disable
local apic timer use.

This supercedes an earlier workaround using DMI detection of specific systems.

Thanks to Mark Langsdorf for the detection snippet.

Signed-off-by: Andi Kleen <ak@suse.de>
2007-04-02 12:14:12 +02:00
Maxim Levitsky
399afa4fc9 [PATCH] Add suspend/resume for HPET
This adds support of suspend/resume on i386 for HPET, which fixes a
number of timer-related failures around STR.

Signed-off-by: Maxim Levitsky <maximlevitsky@gmail.com>
Acked-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Acked-by: Jeff Chua <jeff.chua.linux@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-29 10:25:32 -07:00
Thomas Gleixner
c7f6d15ff2 [PATCH] i386: Fix bogus return value in hpet_next_event()
The clockevents / tick management code expects an error value, when the
event is already expired. hpet_next_event() returns 1 in that case.

Fix it to return the proper -ETIME error code.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-27 09:08:08 -07:00
Thomas Gleixner
d9a5c0a4e0 [PATCH] i386: Prevent early access to TSC to avoid crash on TSCless systems
commit f9690982b8 removed the check for
cpu_khz from sched_clock(), which prevented early access to the TSC by
non obvious magic.

This is harmless as long as the CPU has a TSC. On TSCless systems this
results in an illegal instruction trap.

Replace tsc_disabled and tsc_unstable by tsc_enabled, which is only set
when the tsc is available and not unstable.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-24 15:45:53 -07:00
Thomas Gleixner
e585bef815 [PATCH] i386: add command line option "local_apic_timer_c2_ok"
It turned out that it is almost impossible to trust ACPI, BIOS & Co.
regarding the C states. This was the reason to switch the local apic
timer off in C2 state already. OTOH there are sane and well behaving
systems, which get punished by that decision.

Allow the user to confirm that the local apic timer is trustworthy in C2
state. This keeps the default behaviour on the safe side.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-23 10:21:02 -07:00
Ingo Molnar
4edc5db83f [PATCH] setup_boot_APIC_clock() irq-enable fix
latest -git triggers an irqtrace/lockdep warning of a leaked
irqs-off condition:

  BUG: at kernel/fork.c:1033 copy_process()

after some debugging it turns out that commit ca1b940c accidentally left
interrupts disabled - which trickled down all the way to the first time
we fork a kernel thread and triggered the warning.

the fix is to re-enable interrupts in the 'else' branch of
setup_boot_APIC_clock()'s pmtimers calibration path.

Reported-by: Michal Piotrowski <michal.k.k.piotrowski@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Thomas Gleixner <tglx@brown.paperbag.linutronix.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-22 19:42:31 -07:00
Thomas Gleixner
ad62ca2bd8 [PATCH] i386: disable local apic timer via command line or dmi quirk
The local APIC timer stops to work in deeper C-States.  This is handled by
the ACPI code and a broadcast mechanism in the clockevents / tick managment
code.

Some systems do not expose the deeper C-States to the kernel, but switch
into deeper C-States behind the kernels back.  This delays the local apic
timer interrupts for ever and makes the systems unusable.

Add a command line option to disable the local apic timer and a dmi
quirk for known broken systems.

Andi sayeth:

  While not wrong by itself i think it is still better to use some heuristic
  -- like "has battery in ACPI" With the DMI table if the problem is more wide
  spread we will just continue extending it.

  But anyways should be ok now for .21 although I'm not really happy with
  it.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: john stultz <johnstul@us.ibm.com>
Grudgingly-acked-by: Andi Kleen <ak@suse.de>
Cc: Len Brown <lenb@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-22 19:39:05 -07:00
Thomas Gleixner
6b3964cde7 [PATCH] i386: clockevents fix breakage on Geode/Cyrix PIT implementations
The PIT has no dedicated mode for shut down. The only way to disable PIT
is to put it into one shot mode. AMD implementations of PIT on Geode
(also observed on Cyrix) are confused by an "empty" transition from
CLOCK_EVT_MODE_UNUSED to CLOCK_EVT_MODE_SHUTDOWN, which puts the PIT
into one shot mode momentarily.

I realized after staring helpless at the bug report
http://bugzilla.kernel.org/show_bug.cgi?id=8027 for quite a while, that
the only change, which might influence the bogomips calibration, is the
above transition during the PIT initialization.

Avoiding the unnecessary switch to oneshot and later to periodic mode
fixes the weird bogomips value and also the resulting slowness.

The fix is confirmed on OLPC and another Geode based box.

Note: this is unrelated to the Dual Core problem discussed here:
http://lkml.org/lkml/2007/3/17/48

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-22 19:33:30 -07:00
Thomas Gleixner
ca1b940ce6 [PATCH] i386: trust the PM-Timer calibration of the local APIC timer
When PM-Timer is available for local APIC timer calibration we can skip the
verification of the calibrated time value.  The resulting error is quite
small on a bunch of evaluated platforms and is less harming than the
observed false positives.

We need to keep the verification on systems, which have no PM-Timer to
avoid bogus local APIC timer calibrations in the range of factor 2-10,
which can be observed when swicthing off the PM-timer support in the kernel
configuration.

The wrong calibration values are probably caused by SMM code trying to
emulate a PS/2 keyboard from a (maybe connected or not) USB keyboard.  This
prohibits the accurate delivery of PIT interrupts, which are used to
calibrate the local APIC timer.  Unfortunately we have no way to disable
this BIOS misfeature in the early boot process.

Add also the dropped cpu_relax() back to the wait loops.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-18 11:35:08 -07:00
Andi Kleen
92b35e910f [PATCH] x86: Export _proxy_pda for gcc 4.2
The symbol is not actually used, but the compiler unforunately generates
a (unused) reference to it. This can happen even in modules. So export it.

Signed-off-by: Andi Kleen <ak@suse.de>
2007-03-16 21:07:36 +01:00
Guillaume Chazarain
28f36f8fbf [PATCH] i386: Don't use the TSC in sched_clock if unstable
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=f9690982b8c2f9a2c65acdc113e758ec356676a3
caused a regression by letting sched_clock use the TSC even when cpufreq
disabled it. This caused scheduling weirdnesses.

Signed-off-by: Guillaume Chazarain <guichaz@yahoo.fr>
Signed-off-by: Andi Kleen <ak@suse.de>
2007-03-16 21:07:36 +01:00
Andi Kleen
302cf930cb [PATCH] i386: Enforce GPLness of VMI ROM
VMI ROMs are pretty intimate to the kernel, so enforce their GPLness.

No \0 tricks checking for now

This rules out BSD/MIT modules for now, sorry -- the trouble is those
could come without source.

Acked-by: Zachary Amsden <zach@vmware.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andi Kleen <ak@suse.de>
2007-03-16 21:07:36 +01:00
Linus Torvalds
8ce5e3e45e Disable NMI watchdog by default properly
This reverts commit 6ebf622b25 and
replaces it with one that actually works.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-14 17:53:43 -07:00
Len Brown
653351b0b9 Pull bugzilla-5966 into release branch 2007-03-09 23:18:05 -05:00
Dave Jones
d0035aef39 [PATCH] build fix for i386 earlyquirk.c
missing close bracket.

Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-08 22:39:23 -08:00
Len Brown
fe69933652 [PATCH] ACPI: repair nvidia early quirk breakage on x86_64
x86_64 nvidia_bugs() broke when we bailed out on not finding the HPET.
However, the quirk works by checking for _not_ finding the HPET...

Delete the nvidia_hpet_detected flag and simply test for
not finding the HPET, which is simple to do now that
acpi_table_parse returns 1 on failure.

Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-08 16:06:07 -08:00
Len Brown
74586fca38 ACPI: fix Thinkpad 600/600E/600X interrupts
The root cause of this bug shows that this machine
could not possibly run an ACPI-aware OS without a
model specific workaround.

http://bugzilla.kernel.org/show_bug.cgi?id=5966

Signed-off-by: Len Brown <len.brown@intel.com>
2007-03-08 02:48:30 -05:00
Ingo Molnar
d04f41e353 [PATCH] CPU hotplug: call check_tsc_sync_source() with irqs off
check_tsc_sync_source() depends on being called with irqs disabled (it
checks whether the TSC is coherent across two specific CPUs). This is
incidentally true during bootup, but not during cpu hotplug __cpu_up().
This got found via smp_processor_id() debugging.

disable irqs explicitly and remove the unconditional enabling of
interrupts. Add touch_nmi_watchdog() to the cpu_online_map busy loop.

this bug is present both on i386 and on x86_64.

Reported-by: Michal Piotrowski <michal.k.k.piotrowski@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-07 10:07:24 -08:00
Adrian Bunk
4a6753ca08 [PATCH] remove arch/i386/kernel/tsc.c:custom_sched_clock
Remove the no longer used custom_sched_clock.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Acked-by: Zachary Amsden <zach@vmware.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-06 09:30:25 -08:00
Thomas Gleixner
e585047ef9 [PATCH] Scheduled removal of SA_xxx interrupt flags fixups 3
The obsolete SA_xxx interrupt flags have been used despite the scheduled
removal.  Fixup the remaining users.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-06 09:30:24 -08:00
Adrian Bunk
8f48561223 [PATCH] arch/i386/kernel/vmi.c must #include <asm/kmap_types.h>
CC      arch/i386/kernel/vmi.o
/home/bunk/linux/kernel-2.6/linux-2.6.21-rc2-mm1/arch/i386/kernel/vmi.c: In function 'vmi_map_pt_hook':
/home/bunk/linux/kernel-2.6/linux-2.6.21-rc2-mm1/arch/i386/kernel/vmi.c:387: error: 'KM_PTE0' undeclared (first use in this function)
/home/bunk/linux/kernel-2.6/linux-2.6.21-rc2-mm1/arch/i386/kernel/vmi.c:387: error: (Each undeclared identifier is reported only once
/home/bunk/linux/kernel-2.6/linux-2.6.21-rc2-mm1/arch/i386/kernel/vmi.c:387: error: for each function it appears in.)
/home/bunk/linux/kernel-2.6/linux-2.6.21-rc2-mm1/arch/i386/kernel/vmi.c:387: error: 'KM_PTE1' undeclared (first use in this function)
make[2]: *** [arch/i386/kernel/vmi.o] Error 1

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Acked-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-05 07:57:54 -08:00
john stultz
6bb74df481 [PATCH] clocksource init adjustments (fix bug #7426)
This patch resolves the issue found here:
http://bugme.osdl.org/show_bug.cgi?id=7426

The basic summary is:
Currently we register most of i386/x86_64 clocksources at module_init
time. Then we enable clocksource selection at late_initcall time. This
causes some problems for drivers that use gettimeofday for init
calibration routines (specifically the es1968 driver in this case),
where durring module_init, the only clocksource available is the low-res
jiffies clocksource. This may cause slight calibration errors, due to
the small sampling time used.

It should be noted that drivers that require fine grained time may not
function on architectures that do not have better then jiffies
resolution timekeeping (there are a few). However, this does not
discount the reasonable need for such fine-grained timekeeping at init
time.

Thus the solution here is to register clocksources earlier (ideally when
the hardware is being initialized), and then we enable clocksource
selection at fs_initcall (before device_initcall).

This patch should probably get some testing time in -mm, since
clocksource selection is one of the most important issues for correct
timekeeping, and I've only been able to test this on a few of my own
boxes.

Signed-off-by: John Stultz <johnstul@us.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-05 07:57:53 -08:00
Thomas Gleixner
a5f5e43e2b [PATCH] fix "NMI appears to be stuck"
Testing NMI watchdog ... CPU#0: NMI appears to be stuck (54->54)!
  CPU#1: NMI appears to be stuck (0->0)!

Keep the PIT/HPET alive when nmi_watchdog = 1 is given on the command
line.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-05 07:57:53 -08:00
Zachary Amsden
c6b36e9a3c [PATCH] vmi: smp fixes
Critical fixes for SMP.

Fix a couple functions which needed to be __devinit and fix a bogus parameter
to AP startup that just so happened to work because the low virtual mapping of
memory was still established.

Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-05 07:57:53 -08:00
Zachary Amsden
772205f62e [PATCH] vmi: apic ops
Use para_fill instead of directly setting the APIC ops to the result of the
vmi_get_function call - this allows one to implement a VMI ROM without
implementing APIC functions, just using the native APIC functions.

While doing this, I realized that there is a lot more cleanup that should have
been done.  Basically, we should never assume that the ROM implements a
specific set of functions, and always allow fallback to the native
implementation.

This is critical for future compatibility.

Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
Signed-off-by: Zachary Amsden <zach@vmware.com>

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-05 07:57:52 -08:00
Zachary Amsden
a9eddc9528 [PATCH] vmi: fix nohz compile
More goo from hrtimers integration.  We do compile and run properly with NO_HZ
enabled.  There was a period when we didn't because of a missing export, but
that was since fixed.

And with the clocksource code now firmly in place, we can get rid of code that
fixes up the wallclock, since this is done in the common infrastructure.  This
actually fixes a timer bug as well, that was caused by do_settimeofday no
longer being callable with interrupts disabled due to the use of
on_each_cpu().

Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-05 07:57:52 -08:00
Zachary Amsden
e30fab3ad3 [PATCH] vmi: pit override
The time_init_hook in paravirt-ops no longer functions in the correct manner
after the integration of the hrtimers code.  The problem is that now the call
path for time initialization is:

  time_init :
       late_time_init = hpet_time_init;

  late_time_init -> hpet_time_init:
       setup_pit_timer (BAD)
       do_time_init --> (via paravirt.h)
          time_init_hook --> (via arch_hooks.h)
              time_init_hook (in SUBARCH/setup.c)

If this isn't confusing enough, the paravirt case goes through an indirect
function pointer in the paravirt-ops table.  The problem is, by the time the
paravirt hook is called, the pit timer is already enabled.

But paravirt guests have their own timer, and don't want to use the PIT.
Rather than intensify the struggle for power going on here, just make it all
nice and simple and just unconditionally do all timer setup in the
late_time_init hook.  This also has the advantage of enabling timers in the
same place in all code paths, so everyone has the same bugs and we don't have
outliers who break other code because they turn on timer too early or too
late.

So the paravirt-ops time init function is now by default hpet_time_init, which
is the time init function used for native hardware.  Paravirt guests have the
chance to override this when they setup the paravirt-ops table, and should
need no change.

Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-05 07:57:52 -08:00
Zachary Amsden
eda08b1bef [PATCH] vmi: paravirt drop udelay op
Not respecting udelay causes problems with any virtual hardware that is passed
through to real hardware.  This can be noticed by any device that interacts
with the real world in real time - like AP startup, which takes real time.  Or
keyboard LEDs, which should blink in real-time.  Or floppy drives, but only
when passed through to a real floppy controller on OSes which can't
sufficiently buffer the floppy commands to emulate a zero latency floppy.  Or
IDE drives, when connecting to a physical CDROM.

This was mostly a hack to get the kernel to boot faster, but it introduced a
number of misvirtualization bugs, and Alan and Pavel argued pretty strongly
against it.  We were the only client, and now want to clean up this cruft.

Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-05 07:57:52 -08:00
Zachary Amsden
9a1c13e91f [PATCH] vmi: fix highpte
Provide a PT map hook for HIGHPTE kernels to designate where they are mapping
page tables.  This information is required so the physical address of PTE
updates can be determined; otherwise, the mm layer would have to carry the
physical address all the way to each PTE modification callsite, which is even
more hideous that the macros required to provide the proper hooks.

So lets not mess up arch neutral code to achieve this, but keep the horror in
an #ifdef HIGHPTE in include/asm-i386/pgtable.h.  I had to use macros here
because some types are not yet defined in all the include paths for this
header.

This patch is absolutely required for HIGHPTE kernels to operate properly with
VMI.

Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-05 07:57:52 -08:00
Zachary Amsden
1182d8528b [PATCH] vmi: cpu cycles fix
In order to share the common code in tsc.c which does CPU Khz calibration, we
need to make an accurate value of CPU speed available to the tsc.c code.  This
value loses a lot of precision in a VM because of the timing differences with
real hardware, but we need it to be as precise as possible so the guest can
make accurate time calculations with the cycle counters.

Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-05 07:57:52 -08:00
Zachary Amsden
6cb9a8350a [PATCH] vmi: sched clock paravirt op fix
The custom_sched_clock hook is broken.  The result from sched_clock needs to
be in nanoseconds, not in CPU cycles.  The TSC is insufficient for this
purpose, because TSC is poorly defined in a virtual environment, and mostly
represents real world time instead of scheduled process time (which can be
interrupted without notice when a virtual machine is descheduled).

To make the scheduler consistent, we must expose a different nature of time,
that is scheduled time.  So deprecate this custom_sched_clock hack and turn it
into a paravirt-op, as it should have been all along.  This allows the tsc.c
code which converts cycles to nanoseconds to be shared by all paravirt-ops
backends.

It is unfortunate to add a new paravirt-op, but this is a very distinct
abstraction which is clearly different for all virtual machine
implementations, and it gets rid of an ugly indirect function which I
ashamedly admit I hacked in to try to get this to work earlier, and then even
got in the wrong units.

Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-05 07:57:52 -08:00
Zachary Amsden
7507ba34e8 [PATCH] vmi: timer fixes round two
Critical bugfixes for the VMI-Timer code.

1) Do not setup a one shot alarm if we are keeping the periodic alarm
   armed.  Additionally, since the periodic alarm can be run at a lower rate
   than HZ, let's fixup the guard to the no-idle-hz mode appropriately.  This
   fixes the bug where the no-idle-hz mode might have a higher interrupt rate
   than the non-idle case.

2) The interrupt handler can no longer adjust xtime due to nested lock
   acquisition.  Drop this.  We don't need to check for wallclock time at
   every tick, it can be done in userspace instead.

3) Add a bypass to disable noidle operation.  This is useful as a last
   minute workaround, or testing measure.

4) The code to skip the IO_APIC timer testing (no_timer_check) should be
   conditional on IO_APIC, not SMP, since UP kernels can have this configured
   in as well.

Signed-off-by: Dan Hecht <dhecht@vmware.com>
Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-05 07:57:52 -08:00
Yoichi Yuasa
3a0ee2ce8c [PATCH] fix memory leak in dma_declare_coherent_memory()
When it goes to free1_out, dev->dma_mem has not been freed.

Signed-off-by: Yoichi Yuasa <yoichi_yuasa@tripeaks.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-01 14:53:39 -08:00
Eric W. Biederman
2ff7354fe8 [PATCH] x86_64/i386 irq: Fix !CONFIG_SMP compilation
When removing set_native_irq I missed the fact that it was
called in a couple of places that were compiled even when
SMP support is disabled.  And since the irq_desc[].affinity
field only exists in SMP things broke.

Thanks to Simon Arlott <simon@arlott.org> for spotting this.

There are a couple of ways to fix this but the simplest one
is to just remove the assignments.  The affinity field is only
used to display a value to the user, and nothing on either i386
or x86_64 reads it or depends on it being any particlua value,
so skipping the assignment is safe.  The assignment that
is being removed is just for the initial affinity value before
the user explicitly sets it.  The irq_desc array initializes
this field to CPU_MASK_ALL so the field is initialized to
a reasonable value in the SMP case without being set.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-28 08:52:31 -08:00
Linus Torvalds
221dee285e Revert "[CPUFREQ] constify cpufreq_driver where possible."
This reverts commit aeeddc1435, which was
half-baked and broken.  It just resulted in compile errors, since
cpufreq_register_driver() still changes the 'driver_data' by setting
bits in the flags field.  So claiming it is 'const' _really_ doesn't
work.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-26 14:55:48 -08:00
Linus Torvalds
6f8c480f99 Merge master.kernel.org:/pub/scm/linux/kernel/git/davej/cpufreq
* master.kernel.org:/pub/scm/linux/kernel/git/davej/cpufreq:
  [CPUFREQ] constify some data tables.
  [CPUFREQ] constify cpufreq_driver where possible.
  {rd,wr}msr_on_cpu SMP=n optimization
  [CPUFREQ] cpufreq_ondemand.c: don't use _WORK_NAR
  rdmsr_on_cpu, wrmsr_on_cpu
  [CPUFREQ] Revert default on deprecated config X86_SPEEDSTEP_CENTRINO_ACPI
2007-02-26 14:17:50 -08:00
Eric W. Biederman
9f0a5ba550 [PATCH] irq: Remove set_native_irq_info
This patch replaces all instances of "set_native_irq_info(irq, mask)"
with "irq_desc[irq].affinity = mask".  The latter form is clearer
uses fewer abstractions, and makes access to this field uniform
accross different architectures.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-26 10:34:07 -08:00
Linus Torvalds
ea3d5226f5 Revert "[PATCH] i386: add idle notifier"
This reverts commit 2ff2d3d747.

Uwe Bugla reports that he cannot mount a floppy drive any more, and Jiri
Slaby bisected it down to this commit.

Benjamin LaHaise also points out that this is a big hot-path, and that
interrupt delivery while idle is very common and should not go through
all these expensive gyrations.

Fix up conflicts in arch/i386/kernel/apic.c and arch/i386/kernel/irq.c
due to other unrelated irq changes.

Cc: Stephane Eranian <eranian@hpl.hp.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Andrew Morton <akpm@osdl.org>
Cc: Uwe Bugla <uwe.bugla@gmx.de>
Cc: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-26 09:21:46 -08:00