Impact: cleanup
This patch applies some corrections suggested by Steven Rostedt.
Change the type of shed_ref into int since it is used
into a Mutex, we don't need it anymore as an atomic
variable in the sched_switch tracer.
Also change the name of the register mutex.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: enhance boot trace output with scheduling events
Use the sched_switch tracer from the boot tracer.
We also can trace schedule events inside the initcalls.
Sched tracing is disabled after the initcall has finished and
then reenabled before the next one is started.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: cleanup
When init_sched_switch_trace() is called, it has no reason to start
the sched tracer if the sched_ref is not zero.
_ If this is non-zero, the tracer is already used, but we can register it
to the tracing engine. There is already a security which avoid the tracer
probes not to be resgistered twice.
_ If this is zero, this block will not be used.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: fix race condition in sched_switch tracer
This patch fixes a race condition in the sched_switch tracer. If
several tasks (IE: concurrent initcalls) are playing with
tracing_start_cmdline_record() and tracing_stop_cmdline_record(), the
following situation could happen:
_ Task A and B are using the same tracepoint probe. Task A holds it.
Task B is sleeping and doesn't hold it.
_ Task A frees the sched tracer, then sched_ref is decremented to 0.
_ Task A is preempted and hadn't yet unregistered its tracepoint
probe, then B runs.
_ B increments sched_ref, sees it's 1 and then guess it has to
register its probe. But it has not been freed by task A.
_ A lot of bad things can happen after that...
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: modify boot tracer
We used to disable the initcall tracing at a specified time (IE: end
of builtin initcalls). But we don't need it anymore. It will be
stopped when initcalls are finished.
However we want two things:
_Start this tracing only after pre-smp initcalls are finished.
_Since we are planning to trace sched_switches at the same time, we
want to enable them only during the initcall execution.
For this purpose, this patch introduce two functions to enable/disable
the sched_switch tracing during boot.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2.6.28-rc tightened up the ELF architecture checks on ARM. For
non-EABI it only allows VFP if the hardware supports it. However,
the kernel fails to also inspect the soft-float flag, so it
incorrectly rejects binaries using soft-VFP.
The fix is simple: also check that EF_ARM_SOFT_FLOAT isn't set
before rejecting VFP binaries on non-VFP hardware.
Acked-by: Mikael Pettersson <mikpe@it.uu.se>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Impact: fix sysctl name typo
Steve must have needed more coffee ;-)
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: add SysRq-z support to dump trace buffers
Allows one to force an ftrace dump from sysrq
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: Documentation update only
Update the version that the ftrace document was written for.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: Documentation update only
A lot of changes have gone into ftrace. This patch updates
the ftrace.txt document.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: disable interrupts during trace entry creation (as opposed to preempt)
To help with performance, I set the ftracer to not disable interrupts,
and only to disable preemption. If an interrupt occurred, it would not
be traced, because the function tracer protects itself from recursion.
This may be faster, but the trace output might miss some traces.
This patch makes the fuction trace disable interrupts, but it also
adds a runtime feature to disable preemption instead. It does this by
having two different tracer functions. When the function tracer is
enabled, it will check to see which version is requested (irqs disabled
or preemption disabled). Then it will use the corresponding function
as the tracer.
Irq disabling is the default behaviour, but if the user wants better
performance, with the chance of missing traces, then they can choose
the preempt disabled version.
Running hackbench 3 times with the irqs disabled and 3 times with
the preempt disabled function tracer yielded:
tracing type times entries recorded
------------ -------- ----------------
irq disabled 43.393 166433066
43.282 166172618
43.298 166256704
preempt disabled 38.969 159871710
38.943 159972935
39.325 161056510
Average:
irqs disabled: 43.324 166287462
preempt disabled: 39.079 160300385
preempt is 10.8 percent faster than irqs disabled.
I wrote a patch to count function trace recursion and reran hackbench.
With irq disabled: 1,150 times the function tracer did not trace due to
recursion.
with preempt disabled: 5,117,718 times.
The thousand times with irq disabled could be due to NMIs, or simply a case
where it called a function that was not protected by notrace.
But we also see that a large amount of the trace is lost with the
preempt version.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: use new, consolidated APIs in ftrace plugins
This patch replaces the schedule safe preempt disable code with the
ftrace_preempt_disable() and ftrace_preempt_enable() safe functions.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: add new ftrace-plugin internal APIs
Parts of the tracer needs to be careful about schedule recursion.
If the NEED_RESCHED flag is set, a preempt_enable will call schedule.
Inside the schedule function, the NEED_RESCHED flag is cleared.
The problem arises when a trace happens in the schedule function but before
NEED_RESCHED is cleared. The race is as follows:
schedule()
>> tracer called
trace_function()
preempt_disable()
[ record trace ]
preempt_enable() <<- here's the issue.
[check NEED_RESCHED]
schedule()
[ Repeat the above, over and over again ]
The naive approach is simply to use preempt_enable_no_schedule instead.
The problem with that approach is that, although we solve the schedule
recursion issue, we now might lose a preemption check when not in the
schedule function.
trace_function()
preempt_disable()
[ record trace ]
[Interrupt comes in and sets NEED_RESCHED]
preempt_enable_no_resched()
[continue without scheduling]
The way ftrace handles this problem is with the following approach:
int resched;
resched = need_resched();
preempt_disable_notrace();
[record trace]
if (resched)
preempt_enable_no_sched_notrace();
else
preempt_enable_notrace();
This may seem like the opposite of what we want. If resched is set
then we call the "no_sched" version?? The reason we do this is because
if NEED_RESCHED is set before we disable preemption, there's two reasons
for that:
1) we are in an atomic code path
2) we are already on our way to the schedule function, and maybe even
in the schedule function, but have yet to clear the flag.
Both the above cases we do not want to schedule.
This solution has already been implemented within the ftrace infrastructure.
But the problem is that it has been implemented several times. This patch
encapsulates this code to two nice functions.
resched = ftrace_preempt_disable();
[ record trace]
ftrace_preempt_enable(resched);
This way the tracers do not need to worry about getting it right.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: fix udelay when "notsc" boot parameter is passed
With notsc passed on commandline, tsc may not be used for
udelays, make sure that we do not use tsc_khz to calculate
the lpj value in such cases.
Reported-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Alok N Kataria <akataria@vmware.com>
Cc: <stable@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
libata restores SControl on detach; however, trying to restore
non-zero DET can cause undeterministic behavior including PMP device
going offline till power cycling. Mask off DET when restoring
SControl.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
libata always uses PIO for ATAPI commands when the number of bytes to
transfer isn't multiple of 16 but quantum DAT72 chokes on odd bytes
PIO transfers. Implement a horkage to skip the mod16 check and apply
it to the quantum device.
This is reported by John Clark in the following thread.
http://thread.gmane.org/gmane.linux.ide/34748
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: John Clark <clarkjc@runbox.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Peter Moulder has pointed out that there is a slight chance that a
negative value might be passed to jiffies_to_msecs() in
ata_scsi_park_show(). This is fixed by saving the value of jiffies in a
local variable, thus also reducing code since the volatile variable
jiffies is accessed only once.
Signed-off-by: Elias Oltmanns <eo@nebensachen.de>
Signed-off-by: Tejun Heo <tj.kernel.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
All three flavors of sata_nv's are different in how their hardreset
behaves.
* generic: Hardreset is not reliable. Link often doesn't come online
after hardreset.
* nf2/3: A little bit better - link comes online with longer debounce
timing. However, nf2/3 can't reliable wait for the first D2H
Register FIS, so it can't wait for device readiness or classify the
device after hardreset. Follow-up SRST required.
* ck804: Hardreset finally works.
The core layer change to prefer hardreset and follow up changes
exposed the above issues and caused various detection regressions for
all three flavors. This patch, hopefully, fixes all the known issues
and should make sata_nv error handling more reliable.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
commit b9d5b89b48 (sata_via: fix support
for 5287) accidently (?) removed vt*_prepare_host error handling - restore it
catched by gcc:
drivers/ata/sata_via.c: In function 'svia_init_one':
drivers/ata/sata_via.c:567: warning: 'host' may be used uninitialized in this function
Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Joseph Chan <JosephChan@via.com.tw>
Cc: Jeff Garzik <jgarzik@redhat.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Promise ATA engines need to be reset when errors occur.
That's currently done for errors detected by sata_promise itself,
but it's not done for errors like timeouts detected outside of
the low-level driver.
The effect of this omission is that a timeout tends to result
in a sequence of failed COMRESETs after which libata EH gives
up and disables the port. At that point the port's ATA engine
hangs and even reloading the driver will not resume it.
To fix this, make sata_promise override ->hardreset on SATA
ports with code which calls pdc_reset_port() on the port in
question before calling libata's hardreset. PATA ports don't
use ->hardreset, so for those we override ->softreset instead.
Signed-off-by: Mikael Pettersson <mikpe@it.uu.se>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
From: Alexey Dobriyan <adobriyan@gmail.com>
Based upon a lockdep trace by Simon Arlott.
xfrm_policy_kill() can be called from both BH and
non-BH contexts, so we have to grab xfrm_policy_gc_lock
with BH disabling.
Signed-off-by: David S. Miller <davem@davemloft.net>
netif_carrier_off was called too early at the probe. In case of failure
or simply bad timing, this can cause a fatal error since linkwatch_event
might run too soon.
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The current code read nothing but zeros on big-endian (wrong part of the
32bits). This caused poor performance on big-endian machines. Though this
issue did not cause the system to crash, the performance is significantly
better with the fix so I view it as critical bug fix.
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When the PMF flag is set, the driver can access the HW freely. When the
driver is unloaded, it should not access the HW. The problem caused fatal
errors when "ethtool -i" was called after the calling instance was unloaded
and another instance was already loaded
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In ext4_sync_fs, we only wait for a commit to finish if we started it,
but there may be one already in progress which will not be synced.
In the case of a data=ordered umount with pending long symlinks which
are delayed due to a long list of other I/O on the backing block
device, this causes the buffer associated with the long symlinks to
not be moved to the inode dirty list in the second phase of
fsync_super. Then, before they can be dirtied again, kjournald exits,
seeing the UMOUNT flag and the dirty pages are never written to the
backing block device, causing long symlink corruption and exposing new
or previously freed block data to userspace.
To ensure all commits are synced, we flush all journal commits now
when sync_fs'ing ext4.
Signed-off-by: Arthur Jones <ajones@riverbed.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: Eric Sandeen <sandeen@redhat.com>
Cc: <linux-ext4@vger.kernel.org>
Use le16_to_cpu to read the s_reserved_gdt_blocks values
from super block.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
If we try to free a block which is already freed, the code was
returning without first unlocking the group.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
pci_mmap_fits() returns the wrong answer if the sysfs resource file size
is not a multiple of the page size. vm_end and vm_start are already
page-aligned, so size - start < nr, causing mmap() to return EINVAL.
Signed-off-by: Ed Swierk <eswierk@aristanetworks.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Fix pci/rom.c kernel-doc function notation:
Warning(drivers/pci/rom.c:110): Excess function parameter or struct member 'return' description in 'pci_map_rom'
Warning(drivers/pci/rom.c:177): Excess function parameter or struct member 'return' description in 'pci_map_rom_copy'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Was missing from the initial patch.
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
VPD quirks need to be called after the VPD capability is initialized.
Since VPD initialization now runs after pci_fixup_header (due to the
capabilities consolidation), VPD quirks should be done at
pci_fixup_final stage correspondingly.
Tested-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: Yu Zhao <yu.zhao@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
The mv643xx_eth mii bus implementation uses wait_event_timeout() to
wait for SMI completion interrupts.
If wait_event_timeout() would return zero, mv643xx_eth would conclude
that the SMI access timed out, but this is not necessarily true --
wait_event_timeout() can also return zero in the case where the SMI
completion interrupt did happen in time but where it took longer than
the requested timeout for the process performing the SMI access to be
scheduled again. This would lead to occasional SMI access timeouts
when the system would be under heavy load.
The fix is to ignore the return value of wait_event_timeout(), and
to re-check the SMI done bit after wait_event_timeout() returns to
determine whether or not the SMI access timed out.
Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
The bool kconfig option added to ixgbe and myri10ge for DCA is ambigous,
so this patch adds a description to the kconfig option.
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
cifs: fix renaming one hardlink on top of another
[CIFS] fix error in smb_send2
[CIFS] Reduce number of socket retries in large write path
cifs: fix renaming one hardlink on top of another
POSIX says that renaming one hardlink on top of another to the same
inode is a no-op. We had the logic mostly right, but forgot to clear
the return code.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
* 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
tracing, ring-buffer: add paranoid checks for loops
ftrace: use kretprobe trampoline name to test in output
tracing, alpha: undefined reference to `save_stack_trace'
* 'io-mappings-for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
io mapping: clean up #ifdefs
io mapping: improve documentation
i915: use io-mapping interfaces instead of a variety of mapping kludges
resources: add io-mapping functions to dynamically map large device apertures
x86: add iomap_atomic*()/iounmap_atomic() on 32-bit using fixmaps
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6:
ALSA: hda: make a STAC_DELL_EQ option
ALSA: emu10k1 - Add more invert_shared_spdif flag to Audigy models
ALSA: hda - Add a quirk for another Acer Aspire (1025:0090)
ALSA: remove direct access of dev->bus_id in sound/isa/*
sound: struct device - replace bus_id with dev_name(), dev_set_name()
ALSA: Fix PIT lockup on some chipsets when using the PC-Speaker
ALSA: rawmidi - Add open check in rawmidi callbacks
ALSA: hda - Add digital-mic for ALC269 auto-probe mode
ALSA: hda - Disable broken mic auto-muting in Realtek codes
* 'drm-next' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
i915: Add GEM ioctl to get available aperture size.
drm/radeon: fixup further bus mastering confusion.
build fix: CONFIG_DRM_I915=y && CONFIG_ACPI=n
Impact: cleanup
clean up ifdefs: change #ifdef CONFIG_X86_32/64 to
CONFIG_HAVE_ATOMIC_IOMAP.
flip around the #ifdef sections to clean up the structure.
Signed-off-by: Keith Packard <keithp@keithp.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Add support for explicitly enabling the EQ distortion hack for
systems without software biquad support.
Signed-off-by: Matthew Ranostay <mranostay@embeddedalley.com>
Cc: stable@kernel.org
Signed-off-by: Takashi Iwai <tiwai@suse.de>
While writing a new tracer, I had a bug where I caused the ring-buffer
to recurse in a bad way. The bug was with the tracer I was writing
and not the ring-buffer itself. But it took a long time to find the
problem.
This patch adds paranoid checks into the ring-buffer infrastructure
that will catch bugs of this nature.
Note: I put the bug back in the tracer and this patch showed the error
nicely and prevented the lockup.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>