Commit graph

118108 commits

Author SHA1 Message Date
Davide Libenzi
9ce209d64d epoll: avoid double-inserts in case of EFAULT
In commit f337b9c583 ("epoll: drop
unnecessary test") Thomas found that there is an unnecessary (always
true) test in ep_send_events().  The callback never inserts into
->rdllink while the send loop is performed, and also does the
~EP_PRIVATE_BITS test.  Given we're holding the mutex during this time,
the conditions tested inside the loop are always true.

HOWEVER.

The test "!ep_is_linked(&epi->rdllink)" wasn't there because we insert
into ->rdllink, but because the send-events loop might terminate before
the whole list is scanned (-EFAULT).

In such cases, when the loop terminates early, and when a (leftover)
file received an event while we're performing the lockless loop, we need
such test to avoid to double insert the epoll items.  The list_splice()
done a few steps below, will correctly re-insert the ones that were left
on "txlist".

This should fix the kenrel.org bugzilla entry 11831.

Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-26 12:09:49 -07:00
Arjan van de Ven
4d36a9e65d select: deal with math overflow from borderline valid userland data
Some userland apps seem to pass in a "0" for the seconds, and several
seconds worth of usecs to select().  The old kernels accepted this just
fine, so the new kernels must too.

However, due to the upscaling of the microseconds to nanoseconds we had
some cases where we got math overflow, and depending on the GCC version
(due to inlining decisions) that actually resulted in an -EINVAL return.

This patch fixes this by adding the excess microseconds to the seconds
field.

Also with thanks to Marcin Slusarz for spotting some implementation bugs
in the diagnostics patches.

Reported-by: Carlos R. Mafra <crmafra2@gmail.com>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-26 11:22:08 -07:00
Arjan van de Ven
44a504c405 wireless: fix regression caused by regulatory config option
The default for the regulatory compatibility option is wrong;
if you picked the default you ended up with a non-functional wifi
system (at least I did on Fedora 9 with iwl4965).
I don't think even the October 2008 releases of the various distros
has the new userland so clearly the default is wrong, and also
we can't just go about deleting this in 2.6.29...

Change the default to "y" and also adjust the config text a little to
reflect this.

This patch fixes regression #11859

With thanks to Johannes Berg for the diagnostics

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-26 10:38:52 -07:00
Stephen Rothwell
2077776641 cgroup: remove unused variable
/scratch/sfr/next/kernel/cgroup.c: In function 'cgroup_tasks_start':
/scratch/sfr/next/kernel/cgroup.c:2107: warning: unused variable 'i'

Introduced in commit cc31edceee "cgroups:
convert tasks file to use a seq_file with shared pid array".

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-26 09:38:17 -07:00
Linus Torvalds
b1cd2ee3b9 Merge branch 'hwmon-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6
* 'hwmon-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6:
  hwmon: (abituguru3) enable DMI probing feature on AW9D-MAX
  hwmon: (abituguru3) Cosmetic whitespace fixes
  hwmon: (adt7473) Fix voltage conversion routines
  hwmon: (lm90) Add support for the LM99 16 degree offset
  hwmon: (lm90) Fix handling of hysteresis value
  hwmon-vid: Add support for AMD family 10h CPUs
  hwmon: (w83781d) Fix linking when built-in
2008-10-26 09:36:19 -07:00
Francois Romieu
e383d56487 r8169: revert "read MAC address from EEPROM on init"
This reverts commit 7bf6bf4803.

The code has both a short existence and an increasing track of failures
despite some work to amend it for -rc1.  It is not just a matter of
reading the eeprom: sometimes the eeprom is read correctly, then the mac
address is not written correctly back into the mac registers.

Some chipsets seem to work reliably but it is not clear at this point if
the code can simply be made to work on a per-chipset basis and post -rc1
is not the place where I want to experiment these things.

Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-26 09:35:05 -07:00
Al Viro
1137fb6704 arm ide breakage
a) semicolon before the function body is a bad idea
b) it's const struct foo, not struct const foo
c) incidentally, it's ecard_remove_driver(), not ecard_unregister_driver()
d) compiling is occasionally useful.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-26 09:35:05 -07:00
Al Viro
ce97e13e52 fix allmodconfig breakage
If you use KCONFIG_ALLCONFIG (even with empty file) you get broken
allmodconfig/allyesconfig; CONFIG_MODULES gets turned off, with obvious
massive fallout.

Breakage had been introduced when conf_set_all_new_symbols() got used
for allmodconfig et.al.

What happens is that sym_calc_value(modules_sym) done in
conf_read_simple() sets SYMBOL_VALID on both modules_sym and MODULES.
When we get to conf_set_all_new_symbols(), we set sym->def[S_DEF_USER]
on everything, but it has no effect on sym->curr for the symbols that
already have SYMBOL_VALID - these are stuck.

Solution: use sym_clear_all_valid() in there.  Note that it makes
reevaluation of modules_sym redundant - sym_clear_all_valid() will do
that itself.

[ Fixes http://bugzilla.kernel.org/show_bug.cgi?id=11512, says Alexey ]

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Roman Zippel <zippel@linux-m68k.org>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-26 09:35:05 -07:00
Alistair John Strachan
c02d65694d hwmon: (abituguru3) enable DMI probing feature on AW9D-MAX
Switch the AW9D-MAX over from port probing to the preferred DMI
probe method.

Signed-off-by: Alistair John Strachan <alistair@devzero.co.uk>
Tested-by: Justin Piszcz <jpiszcz@lucidpixels.com>
Acked-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
2008-10-26 17:04:40 +01:00
Alistair John Strachan
4777e4e6b8 hwmon: (abituguru3) Cosmetic whitespace fixes
As the probable result of zealous copy/pasting, many supported boards
contain sensor names with trailing whitespace. Though this is not a
huge problem, it is inconsistent with other sensor names, and with
other similar hwmon drivers.

Additionally, the DMI nag message added in 2.6.27 was missing a
space between two sentence fragments -- might as well clean that up
too.

Doesn't alter any kernel text, just data.

Signed-off-by: Alistair John Strachan <alistair@devzero.co.uk>
Reported-by: Justin Piszcz <jpiszcz@lucidpixels.com>
Acked-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
2008-10-26 17:04:40 +01:00
Jean Delvare
be821b78af hwmon: (adt7473) Fix voltage conversion routines
Fix voltage conversion routines. Based on an earlier patch from
Paulius Zaleckas.

According to the datasheet voltage is scaled with resistors and
value 192 is nominal voltage. 0 is 0V.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: Paulius Zaleckas <paulius.zaleckas@teltonika.lt>
Cc: Darrick J. Wong <djwong@us.ibm.com>
2008-10-26 17:04:40 +01:00
Jean Delvare
97ae60bb38 hwmon: (lm90) Add support for the LM99 16 degree offset
The LM99 differs from the LM86, LM89 and LM90 in that it reports
remote temperatures (temp2) 16 degrees lower than they really are. So
far we have been cheating and handled this in userspace but it really
should be handled by the driver directly.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
2008-10-26 17:04:39 +01:00
Jean Delvare
ec38fa2b35 hwmon: (lm90) Fix handling of hysteresis value
There are several problems in the way the hysteresis value is handled
by the lm90 driver:

* In show_temphyst(), specific handling of the MAX6646 is missing, so
  the hysteresis is reported incorrectly if the critical temperature
  is over 127 degrees C.
* In set_temphyst(), the new hysteresis register value is written to
  the chip but data->temp_hyst isn't updated accordingly, so there is
  a short period of time (up to 2 seconds) where the old hystereris
  value will be returned while the new one is already active.
* In set_temphyst(), the critical temperature which is used as a base
  to compute the value of the hysteresis register lacks
  device-specific handling. As a result, the value of the hysteresis
  register might be incorrect for the ADT7461 and MAX6646 chips.

Fix these 3 bugs.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: Ben Hutchings <bhutchings@solarflare.com>
Cc: Nate Case <ncase@xes-inc.com>
2008-10-26 17:04:39 +01:00
Jean Delvare
1b871826b3 hwmon-vid: Add support for AMD family 10h CPUs
The AMD family 10h CPUs use the same VID decoding table as the family
0Fh CPUs.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: Rudolf Marek <r.marek@assembler.cz>
2008-10-26 17:04:39 +01:00
Geert Uytterhoeven
dd56b63895 hwmon: (w83781d) Fix linking when built-in
When w83781d is built-in, the final links fails with the following vague error
message:

`.exit.text' referenced in section `.init.text' of drivers/built-in.o: defined
in discarded section `.exit.text' of drivers/built-in.o

w83781d_isa_unregister() cannot be marked __exit, as it's also called from
sensors_w83781d_init(), which is marked __init.

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Wolfgang Grandegger <wg@grandegger.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
2008-10-26 17:04:38 +01:00
Jay Fenlason
cd1f70fdb4 firewire: fw-sbp2: fix races
1: There is a small race between queue_delayed_work() and its
   corresponding kref_get().  Do the kref_get first, and _put it again
   if the queue_delayed_work() failed, so there is no chance of the
   kref going to zero while the work is scheduled.
2: An SBP2_LOGOUT_REQUEST could be sent out with a login_id full of
   garbage.  Initialize it to an invalid value so we can tell if we
   ever got a valid login_id.
3: The node ID and generation may have changed but the new values may
   not yet have been recorded in lu and tgt when the final logout is
   attempted.  Use the latest values from the device in
   sbp2_release_target().

Signed-off-by: Jay Fenlason <fenlason@redhat.com>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
2008-10-26 10:27:01 +01:00
Stefan Richter
0dcfeb7e3c firewire: fw-sbp2: delay first login to avoid retries
This optimizes firewire-sbp2's device probe for the case that the local
node and the SBP-2 node were discovered at the same time.  In this case,
fw-core's bus management work and fw-sbp2's login and SCSI probe work
are scheduled in parallel (in the globally shared workqueue and in
fw-sbp2's workqueue, respectively).  The bus reset from fw-core may then
disturb and extremely delay the login and SCSI probe because the latter
fails with several command timeouts and retries and has to be retried
from scratch.

We avoid this particular situation of sbp2_login() and fw_card_bm_work()
running in parallel by delaying the first sbp2_login() a little bit.

This is meant to be a short-term fix for
https://bugzilla.redhat.com/show_bug.cgi?id=466679.  In the long run,
the SCSI probe, i.e. fw-sbp2's call of __scsi_add_device(), should be
parallelized with sbp2_reconnect().

Problem reported and fix tested and confirmed by Alex Kanavin.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
2008-10-26 10:27:01 +01:00
Stefan Richter
7007a0765e firewire: fw-ohci: initialization failure path fixes
Fix leaks when pci_probe fails.  Simplify error log strings.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
2008-10-26 10:27:00 +01:00
Jay Fenlason
a55709ba9d firewire: fw-ohci: don't leak dma memory on module removal
The transmit and receive context dma memory was not being freed on
module removal.  Neither was the config rom memory.  Fix that.

The ab->next assignment is pure paranoia.

Signed-off-by: Jay Fenlason <fenlason@redhat.com>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
2008-10-26 10:27:00 +01:00
Jay Fenlason
77e5571917 firewire: fix struct fw_node memory leak
With the bus_resets patch applied, it is easy to see this memory leak
by repeatedly resetting the firewire bus while running slabtop in
another window.  Just watch kmalloc-32 grow and grow...

Signed-off-by: Jay Fenlason <fenlason@redhat.com>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
2008-10-26 10:27:00 +01:00
Jay Fenlason
4f9740d4f5 firewire: Survive more than 256 bus resets
The "color" is used during the topology building after a bus reset,
hovever in "struct fw_node"s it is stored in a u8, but in struct fw_card
it is stored in an int.  When the value wraps in one struct, but not
the other, disaster strikes.

Signed-off-by: Jay Fenlason <fenlason@redhat.com>

Fixes http://bugzilla.kernel.org/show_bug.cgi?id=10922.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
2008-10-26 10:26:59 +01:00
Linus Torvalds
23cf24c0c8 Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  ext4: Fix duplicate entries returned from getdents() system call
  ext3: Fix duplicate entries returned from getdents() system call
2008-10-25 19:57:30 -07:00
Linus Torvalds
4403b406d4 Revert "Call init_workqueues before pre smp initcalls."
This reverts commit a802dd0eb5 by moving
the call to init_workqueues() back where it belongs - after SMP has been
initialized.

It also moves stop_machine_init() - which needs workqueues - to a later
phase using a core_initcall() instead of early_initcall().  That should
satisfy all ordering requirements, and was apparently the reason why
init_workqueues() was moved to be too early.

Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-25 19:53:38 -07:00
Theodore Ts'o
3c37fc86d2 ext4: Fix duplicate entries returned from getdents() system call
Fix a regression caused by commit d0156417, "ext4: fix ext4_dx_readdir
hash collision handling", where deleting files in a large directory
(requiring more than one getdents system call), results in some
filenames being returned twice.  This was caused by a failure to
update info->curr_hash and info->curr_minor_hash, so that if the
directory had gotten modified since the last getdents() system call
(as would be the case if the user is running "rm -r" or "git clean"),
a directory entry would get returned twice to the userspace.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

This patch fixes the bug reported by Markus Trippelsdorf at:
http://bugzilla.kernel.org/show_bug.cgi?id=11844

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Tested-by: Markus Trippelsdorf <markus@trippelsdorf.de>
2008-10-25 22:37:55 -04:00
Theodore Ts'o
8c9fa93d51 ext3: Fix duplicate entries returned from getdents() system call
Fix a regression caused by commit 6a897cf4, "ext3: fix ext3_dx_readdir
hash collision handling", where deleting files in a large directory
(requiring more than one getdents system call), results in some
filenames being returned twice.  This was caused by a failure to
update info->curr_hash and info->curr_minor_hash, so that if the
directory had gotten modified since the last getdents() system call
(as would be the case if the user is running "rm -r" or "git clean"),
a directory entry would get returned twice to the userspace.

This patch fixes the bug reported by Markus Trippelsdorf at:
http://bugzilla.kernel.org/show_bug.cgi?id=11844

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Tested-by: Markus Trippelsdorf <markus@trippelsdorf.de>
2008-10-25 22:37:44 -04:00
Len Brown
9fb3c5ca3d Merge branch 'i7300_idle' into release 2008-10-25 04:07:44 -04:00
Len Brown
438f8de46b leds-hp-disk: fix build warning
drivers/leds/leds-hp-disk.c:59: warning: passing argument 4 of ‘acpi_evaluate_integer’ from incompatible pointer type

Signed-off-by: Len Brown <len.brown@intel.com>
2008-10-25 04:07:14 -04:00
Rafael J. Wysocki
f8123381ba ACPI: Oops in ACPI with git latest
ACPI Warning (nseval-0168): Insufficient arguments - method [_OSC] needs 5, found 4 [20080926]
ACPI Warning (nspredef-0252): \_SB_.PCI0._OSC: Parameter count mismatch - ASL declared 5, expected 4 [20080926]
ACPI Error (nspredef-0163): \_SB_.PCI0._OSC: Missing expected return value [20080926]
BUG: unable to handle kernel NULL pointer dereference at 00000000
IP: [<c0237671>] acpi_run_osc+0xa1/0x170

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Tested-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2008-10-25 04:07:14 -04:00
Rafael J. Wysocki
92daa7b53b ACPI suspend: build fix for ACPI_SLEEP=n && XEN_SAVE_RESTORE=y.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Len Brown <len.brown@intel.com>
2008-10-25 04:07:13 -04:00
Len Brown
cab0896918 toshiba_acpi: always call input_sync() after input_report_switch()
Signed-off-by: Len Brown <len.brown@intel.com>
2008-10-25 04:07:13 -04:00
Guillem Jover
df316e9391 ACPI: Always report a sync event after a lid state change
Currently not always an EV_SYN event is reported to userland
after the EV_SW SW_LID event has been sent. This is easy to verify
by using “input-events” from input-utils and just closing and opening
the lid.

Signed-off-by: Guillem Jover <guillem.jover@nokia.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2008-10-25 04:07:12 -04:00
Miao Xie
16be87ea17 ACPI: cpufreq, processor: fix compile error in drivers/acpi/processor_perflib.c
When trying to build 2.6.28-rc1 on ia64, make aborts with:

  CC      drivers/acpi/processor_perflib.o
  drivers/acpi/processor_perflib.c:41:28: error: asm/cpufeature.h: No such file or directory
  drivers/acpi/processor_perflib.c: In function ‘acpi_processor_get_performance_info’:
  drivers/acpi/processor_perflib.c:364: error: implicit declaration of function ‘boot_cpu_has’
  drivers/acpi/processor_perflib.c:364: error: ‘X86_FEATURE_EST’ undeclared (first use in this function)
  drivers/acpi/processor_perflib.c:364: error: (Each undeclared identifier is reported only once
  drivers/acpi/processor_perflib.c:364: error: for each function it appears in.)
  make[2]: *** [drivers/acpi/processor_perflib.o] Error 1
  make[1]: *** [drivers/acpi] Error 2
  make: *** [drivers] Error 2

this patch fix it.

Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Acked-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Len Brown <len.brown@intel.com>
2008-10-25 04:07:12 -04:00
Venki Pallipadi
f371be6352 i7300_idle: Fix compile warning CONFIG_I7300_IDLE_IOAT_CHANNEL not defined
When I7300_idle driver is not configured, there is a compile time
warning about IDLE_IOAT_CHANNEL not defined. Fix it.

Reported-by: Suresh Siddha <suresh.b.siddha@intel.com>
Reported-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2008-10-24 12:59:47 -04:00
Venki Pallipadi
33093e186c i7300_idle: Cleanup based review comments
Cleanup of i7300 idle driver based on review comments from Randy Dunlap,
Andi Kleen and Len Brown.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2008-10-24 12:55:14 -04:00
Venki Pallipadi
3ad0b02e4c i7300_idle: Disable ioat channel only on platforms where ile driver can load
Based on input from Andi Kleen:
share the platform detection code with ioat_dma and disable the channel in
dma engine only for specific platforms.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2008-10-24 12:54:18 -04:00
Mark Brown
9c366452e0 mfd: Make WM8400 depend on I2C until SPI is submitted
Otherwise we could build in WM8400 but not I2C.

Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Samuel Ortiz <sameo@openedhand.com>
2008-10-24 18:34:39 +02:00
Samuel Ortiz
8e2eaabfd9 mfd: add missing Kconfig entry for da903x
This one was accidentally left out during the rc1 mfd merge.

Signed-off-by: Samuel Ortiz <sameo@openedhand.com>
2008-10-24 18:34:27 +02:00
David Vrabel
ae5d82cb8d uwb: build UWB before USB/WUSB
The WHCI-HCD driver in drivers/usb/host/ depends on the umc driver in
drivers/uwb/.

Signed-off-by: David Vrabel <david.vrabel@csr.com>
2008-10-24 16:46:22 +01:00
Jens Axboe
e013e13bf6 libata: fix bug with non-ncq devices
The recent commit 2fca5ccf97 ("libata:
switch to using block layer tagging support") to enable support for
block layer tagging in libata was broken for non-NCQ devices

The block layer initializes the tag field to -1 to detect invalid uses
of a tag, and if the libata devices does NOT support NCQ, we just used
that field to index the internal command list.  So we need to check for
-1 first and only use the tag field if it's valid.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Reported-by: Alexander Beregalov <a.beregalov@gmail.com>
Tested-by: Paul Mundt <lethal@linux-sh.org>
Tested-by: Dave Young <hidave.darkstar@gmail.com>
Tested-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-24 08:22:38 -07:00
Ingo Molnar
f17845e5d9 ftrace: warning in kernel/trace/ftrace.c
this warning:

  kernel/trace/ftrace.c:189: warning: ‘frozen_record_count’ defined but not used

triggers because frozen_record_count is only used in the KCONFIG_MARKERS
case. Move the variable it there.

Alas, this frozen-record facility seems to have little use. The
frozen_record_count variable is not used by anything, nor the flags.

So this section might need a bit of dead-code-removal care as well.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-24 12:52:44 +02:00
Peter Zijlstra
3f3a490480 sched: virtual time buddy preemption
Since we moved wakeup preemption back to virtual time, it makes sense to move
the buddy stuff back as well. The purpose of the buddy scheduling is to allow
a quickly scheduling pair of tasks to run away from the group as far as a
regular busy task would be allowed under wakeup preemption.

This has the advantage that the pair can ping-pong for a while, enjoying
cache-hotness. Without buddy scheduling other tasks would interleave destroying
the cache.

Also, it saves a word in cfs_rq.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-24 12:51:03 +02:00
Peter Zijlstra
464b75273f sched: re-instate vruntime based wakeup preemption
The advantage is that vruntime based wakeup preemption has a better
conceptual model. Here wakeup_gran = 0 means: preempt when 'fair'.
Therefore wakeup_gran is the granularity of unfairness we allow in order
to make progress.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-24 12:51:02 +02:00
Mike Galbraith
0d13033bc9 sched: weaken sync hint
Mysql+oltp and pgsql+oltp peaks are still shifted right. The below puts
the peaks back to 1 client/server pair per core.

Use the avg_overlap information to weaken the sync hint.

Signed-off-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-24 12:51:01 +02:00
Peter Zijlstra
1af5f730fc sched: more accurate min_vruntime accounting
Mike noticed the current min_vruntime tracking can go wrong and skip the
current task. If the only remaining task in the tree is a nice 19 task
with huge vruntime, new tasks will be inserted too far to the right too,
causing some interactibity issues.

min_vruntime can only change due to the leftmost entry disappearing
(dequeue_entity()), or by the leftmost entry being incremented past the
next entry, which elects a new leftmost (__update_curr())

Due to the current entry not being part of the actual tree, we have to
compare the leftmost tree entry with the current entry, and take the
leftmost of these two.

So create a update_min_vruntime() function that takes computes the
leftmost vruntime in the system (either tree of current) and increases
the cfs_rq->min_vruntime if the computed value is larger than the
previously found min_vruntime. And call this from the two sites we've
identified that can change min_vruntime.

Reported-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-24 12:51:00 +02:00
Peter Zijlstra
01c8c57d66 sched: fix a find_busiest_group buglet
In one of the group load balancer patches:

	commit 408ed066b1
	Author: Peter Zijlstra <a.p.zijlstra@chello.nl>
	Date:   Fri Jun 27 13:41:28 2008 +0200
	Subject: sched: hierarchical load vs find_busiest_group

The following change:

-               if (max_load - this_load + SCHED_LOAD_SCALE_FUZZ >=
+               if (max_load - this_load + 2*busiest_load_per_task >=
                                        busiest_load_per_task * imbn) {

made the condition always true, because imbn is [1,2].
Therefore, remove the 2*, and give the it a fair chance.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-24 12:50:59 +02:00
Ingo Molnar
8c82a17e9c Merge commit 'v2.6.28-rc1' into sched/urgent 2008-10-24 12:48:46 +02:00
Linus Torvalds
57f8f7b60d Linux 2.6.28-rc1 2008-10-23 20:06:52 -07:00
Randy Dunlap
325dcfdc81 Fix PCI hotplug printk format
Fix printk format warning:

  drivers/pci/hotplug/acpiphp_ibm.c:207: warning: format '%08lx' expects type 'long unsigned int', but argument 3 has type 'long long unsigned int'

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-23 19:48:29 -07:00
Linus Torvalds
3a2c5dad6c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
  PCI: remove unused resource assignment in pci_read_bridge_bases()
  PCI hotplug: shpchp: message refinement
  PCI hotplug: shpchp: replace printk with dev_printk
  PCI: add routines for debugging and handling lost interrupts
  PCI hotplug: pciehp: message refinement
  PCI: fix ARI code to be compatible with mixed ARI/non-ARI systems
  PCI hotplug: cpqphp: fix kernel NULL pointer dereference
2008-10-23 19:31:34 -07:00
Linus Torvalds
2242d5eff1 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (29 commits)
  tcp: Restore ordering of TCP options for the sake of inter-operability
  net: Fix disjunct computation of netdev features
  sctp: Fix to handle SHUTDOWN in SHUTDOWN_RECEIVED state
  sctp: Fix to handle SHUTDOWN in SHUTDOWN-PENDING state
  sctp: Add check for the TSN field of the SHUTDOWN chunk
  sctp: Drop ICMP packet too big message with MTU larger than current PMTU
  p54: enable 2.4/5GHz spectrum by eeprom bits.
  orinoco: reduce stack usage in firmware download path
  ath5k: fix suspend-related oops on rmmod
  [netdrvr] fec_mpc52xx: Implement polling, to make netconsole work.
  qlge: Fix MSI/legacy single interrupt bug.
  smc911x: Make the driver safer on SMP
  smc911x: Add IRQ polarity configuration
  smc911x: Allow Kconfig dependency on ARM
  sis190: add identifier for Atheros AR8021 PHY
  8139x: reduce message severity on driver overlap
  igb: add IGB_DCA instead of selecting INTEL_IOATDMA
  igb: fix tx data corruption with transition to L0s on 82575
  ehea: Fix memory hotplug support
  netdev: DM9000: remove BLACKFIN hacking in DM9000 netdev driver
  ...
2008-10-23 19:19:54 -07:00