Meaning receive multiple messages, reducing the number of syscalls and
net stack entry/exit operations.
Next patches will introduce mechanisms where protocols that want to
optimize this operation will provide an unlocked_recvmsg operation.
This takes into account comments made by:
. Paul Moore: sock_recvmsg is called only for the first datagram,
sock_recvmsg_nosec is used for the rest.
. Caitlin Bestler: recvmmsg now has a struct timespec timeout, that
works in the same fashion as the ppoll one.
If the underlying protocol returns a datagram with MSG_OOB set, this
will make recvmmsg return right away with as many datagrams (+ the OOB
one) it has received so far.
. Rémi Denis-Courmont & Steven Whitehouse: If we receive N < vlen
datagrams and then recvmsg returns an error, recvmmsg will return
the successfully received datagrams, store the error and return it
in the next call.
This paves the way for a subsequent optimization, sk_prot->unlocked_recvmsg,
where we will be able to acquire the lock only at batch start and end, not at
every underlying recvmsg call.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Bill Gatliff & David Brownell pointed out we were missing some
copyrights, and licensing terms in some of the files in
./arch/blackfin, so this fixes things, and cleans them up.
It also removes:
- verbose GPL text(refer to the top level ./COPYING file)
- file names (you are looking at the file)
- bug url (it's in the ./MAINTAINERS file)
- "or later" on GPL-2, when we did not have that right
It also allows some Blackfin-specific assembly files to be under a BSD
like license (for people to use them outside of Linux).
Signed-off-by: Robin Getz <robin.getz@analog.com>
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Bye-bye Performance Counters, welcome Performance Events!
In the past few months the perfcounters subsystem has grown out its
initial role of counting hardware events, and has become (and is
becoming) a much broader generic event enumeration, reporting, logging,
monitoring, analysis facility.
Naming its core object 'perf_counter' and naming the subsystem
'perfcounters' has become more and more of a misnomer. With pending
code like hw-breakpoints support the 'counter' name is less and
less appropriate.
All in one, we've decided to rename the subsystem to 'performance
events' and to propagate this rename through all fields, variables
and API names. (in an ABI compatible fashion)
The word 'event' is also a bit shorter than 'counter' - which makes
it slightly more convenient to write/handle as well.
Thanks goes to Stephane Eranian who first observed this misnomer and
suggested a rename.
User-space tooling and ABI compatibility is not affected - this patch
should be function-invariant. (Also, defconfigs were not touched to
keep the size down.)
This patch has been generated via the following script:
FILES=$(find * -type f | grep -vE 'oprofile|[^K]config')
sed -i \
-e 's/PERF_EVENT_/PERF_RECORD_/g' \
-e 's/PERF_COUNTER/PERF_EVENT/g' \
-e 's/perf_counter/perf_event/g' \
-e 's/nb_counters/nb_events/g' \
-e 's/swcounter/swevent/g' \
-e 's/tpcounter_event/tp_event/g' \
$FILES
for N in $(find . -name perf_counter.[ch]); do
M=$(echo $N | sed 's/perf_counter/perf_event/g')
mv $N $M
done
FILES=$(find . -name perf_event.*)
sed -i \
-e 's/COUNTER_MASK/REG_MASK/g' \
-e 's/COUNTER/EVENT/g' \
-e 's/\<event\>/event_id/g' \
-e 's/counter/event/g' \
-e 's/Counter/Event/g' \
$FILES
... to keep it as correct as possible. This script can also be
used by anyone who has pending perfcounters patches - it converts
a Linux kernel tree over to the new naming. We tried to time this
change to the point in time where the amount of pending patches
is the smallest: the end of the merge window.
Namespace clashes were fixed up in a preparatory patch - and some
stylistic fallout will be fixed up in a subsequent patch.
( NOTE: 'counters' are still the proper terminology when we deal
with hardware registers - and these sed scripts are a bit
over-eager in renaming them. I've undone some of that, but
in case there's something left where 'counter' would be
better than 'event' we can undo that on an individual basis
instead of touching an otherwise nicely automated patch. )
Suggested-by: Stephane Eranian <eranian@google.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Paul Mackerras <paulus@samba.org>
Reviewed-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: <linux-arch@vger.kernel.org>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Make sure our interrupt entry code with exact hardware errors handles
anomaly 05000283 (infinite stall in system MMR kill) so we don't stall
while under load.
Signed-off-by: Robin Getz <robin.getz@analog.com>
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
The majority of the time we are returning to user space, it is not in the
fixed atomic code region. So rather than branch to a function where we
check the PC and return, do the check inline and branch only when needed.
Also, tweak some of the fixed code handling based on assumptions we are
aware of but cannot be expressed in C.
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
The handling of updating the [DI]MEM_CONTROL MMRs does not follow proper
sync procedures as laid out in the Blackfin programming manual. So rather
than audit/fix every call location, create helper functions that do the
right things in order to safely update these MMRs. Then convert all call
sites to use these new helper functions.
While we're fixing the code, drop the workaround for anomaly 05000125 as
that anomaly applies to old versions of silicon that we do not support.
Signed-off-by: Yi Li <yi.li@analog.com>
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Since the hardware only provides reporting for the last exception handled,
and the values are valid only when executing the exception handler, we
need to save the context for reporting at a later point. While we do this
for one exception, it doesn't work properly when handling a second one as
the original exception is clobbered by the double fault. So when double
fault debugging is enabled, create a dedicated shadow of these values and
save/restore out of there. Now the crash report properly displays the
first exception as well as the second one.
Signed-off-by: Graf Yang <graf.yang@analog.com>
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Cleanup is performed in two ways:
- remove extraneous updates of IPEND[4] w/ CONFIG_IPIPE,
and document remaining use.
- substitute pop-reg-from-stack instructions with plain SP fixups in
all save-RETI-then-discard patterns.
Signed-off-by: Philippe Gerum <rpm@xenomai.org>
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
The purpose of the EVT14 handler may depend on whether CONFIG_IPIPE is
enabled, albeit its implementation can be the same in both cases. When
the interrupt pipeline is enabled, EVT14 can be used to raise the core
priority level for the running code; when CONFIG_IPIPE is off, EVT14
can be used to lower this level before running softirq handlers.
Rename evt14_softirq to evt_evt14 to pick an identifier that fits
both, which allows to reuse the same vector setup code as well.
Signed-off-by: Philippe Gerum <rpm@xenomai.org>
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
We handle many exceptions at EVT5 (hardware error level) so that we can
catch exceptions in our exception handling code. Today - if the global
interrupt enable bit (IPEND[4]) is set (interrupts disabled) our trap
handling code goes into a infinite loop, since we need interrupts to be
on to defer things to EVT5.
Normal kernel code should not trigger this for any reason as IPEND[4] gets
cleared early (when doing an interrupt context save) and the kernel stack
there should be sane (or something much worse is happening in the system).
But there have been a few times where this has happened, so this change
makes sure we dump a proper crash message even when things have gone south.
Signed-off-by: Robin Getz <robin.getz@analog.com>
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Improve the assembly with a few explanatory comments and use symbolic
defines rather than numeric values for bit positions.
Signed-off-by: Robin Getz <robin.getz@analog.com>
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
We don't need to handle CPLB protection violations unless we are running
with the MPU on. Fix the entry code to call common trap_c, and remove the
code which is never run. This allows the traps test suite to run on older
boards with the MPU disabled.
URL: http://blackfin.uclinux.org/gf/tracker/5129
Signed-off-by: Robin Getz <robin.getz@analog.com>
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Hardware errors on the Blackfin architecture are queued by nature of the
hardware design. Things that could generate a hardware level queue up at
the system interface and might not process until much later, at which
point the system would send a notification back to the core.
As such, it is possible for user space code to do something that would
trigger a hardware error, but have it delay long enough for the process
context to switch. So when the hardware error does signal, we mistakenly
evaluate it as a different process or as kernel context and panic (erp!).
This makes it pretty difficult to find the offending context. But wait,
there is good news somewhere.
By forcing a SSYNC in the interrupt entry, we force all pending queues at
the system level to be processed and all hardware errors to be signaled.
Then we check the current interrupt state to see if the hardware error is
now signaled. If so, we re-queue the current interrupt and return thus
allowing the higher priority hardware error interrupt to process properly.
Since we haven't done any other context processing yet, the right context
will be selected and killed. There is still the possibility that the
exact offending instruction will be unknown, but at least we'll have a
much better idea of where to look.
The downside of course is that this causes system-wide syncs at every
interrupt point which results in significant performance degradation.
Since this situation should not occur in any properly configured system
(as hardware errors are triggered by things like bad pointers), make it a
debug configuration option and disable it by default.
Signed-off-by: Robin Getz <robin.getz@analog.com>
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Move exception stack mess from entry.S to init.c to fix link failure when
CONFIG_EXCEPTION_L1_SCRATCH is in use.
Signed-off-by: Graf Yang <graf.yang@analog.com>
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Bryan Wu <cooloney@kernel.org>
- unable to single step over emuexcpt instruction
- gdbproxy goes into infinite loop when doing gdb does "next" over
"emuexcpt"
Don't decrement PC after software breakpoint.
Signed-off-by: Jie Zhang <jie.zhang@analog.com>
Signed-off-by: Bryan Wu <cooloney@kernel.org>
This is a mixture ofcMichael McTernan's patch and the existing cplb-mpu code.
We ditch the old cplb-nompu implementation, which is a good example of
why a good algorithm in a HLL is preferrable to a bad algorithm written in
assembly. Rather than try to construct a table of all posible CPLBs and
search it, we just create a (smaller) table of memory regions and
their attributes. Some of the data structures are now unified for both
the mpu and nompu cases. A lot of needless complexity in cplbinit.c is
removed.
Further optimizations:
* compile cplbmgr.c with a lot of -ffixed-reg options, and omit saving
these registers on the stack when entering a CPLB exception.
* lose cli/nop/nop/sti sequences for some workarounds - these don't
* make
sense in an exception context
Additional code unification should be possible after this.
[Mike Frysinger <vapier.adi@gmail.com>:
- convert CPP if statements to C if statements
- remove redundant statements
- use a do...while loop rather than a for loop to get slightly better
optimization and to avoid gcc "may be used uninitialized" warnings ...
we know that the [id]cplb_nr_bounds variables will never be 0, so this
is OK
- the no-mpu code was the last user of MAX_MEM_SIZE and with that rewritten,
we can punt it
- add some BUG_ON() checks to make sure we dont overflow the small
cplb_bounds array
- add i/d cplb entries for the bootrom because there is functions/data in
there we want to access
- we do not need a NULL trailing entry as any time we access the bounds
arrays, we use the nr_bounds variable
]
Signed-off-by: Michael McTernan <mmcternan@airvana.com>
Signed-off-by: Mike Frysinger <vapier.adi@gmail.com>
Signed-off-by: Bernd Schmidt <bernds_cb1@t-online.de>
Signed-off-by: Bryan Wu <cooloney@kernel.org>
arch/blackfin/mach-common/entry.S:465: Error: pcrel too far
BFD_RELOC_BFIN_10
Signed-off-by: Mike Frysinger <vapier.adi@gmail.com>
Signed-off-by: Bryan Wu <cooloney@kernel.org>
rename irq_flags to bfin_irq_flags to avoid namespace
collision with common code
Signed-off-by: Mike Frysinger <vapier.adi@gmail.com>
Signed-off-by: Bryan Wu <cooloney@kernel.org>
Blackfin dual core BF561 processor can support SMP like features.
https://docs.blackfin.uclinux.org/doku.php?id=linux-kernel:smp-like
In this patch, we provide SMP extend to Blackfin header files
and machine common code
Signed-off-by: Graf Yang <graf.yang@analog.com>
Signed-off-by: Bryan Wu <cooloney@kernel.org>
The original code defined _exception_stack but not alloc space for the exception
stack. In exception, this area is over written by exception stack. Common kernel
luckly boot up, but SMP kernel stuck.
Signed-off-by: Graf Yang <graf.yang@analog.com>
Signed-off-by: Bryan Wu <cooloney@kernel.org>
allow people to stick exception stack into L1 scratch
and make sure it gets placed into .bss sections rather than .data
Signed-off-by: Mike Frysinger <vapier.adi@gmail.com>
Signed-off-by: Bryan Wu <cooloney@kernel.org>
a global _sys_trace will cause the assembler to fail, it should be fixed in toolchain side firstly.
Signed-off-by: Mike Frysinger <vapier.adi@gmail.com>
Signed-off-by: Bryan Wu <cooloney@kernel.org>
Apply ANOMALY_05000283 & ANOMALY_05000315
Workaround also to the EXCEPTION path.
Cover evt_ivhw also with ANOMALY_05000315
The Workaround needs to be prior to accesses (either read or write) to
any system MMR.
Signed-off-by: Michael Hennerich <michael.hennerich@analog.com>
Signed-off-by: Bryan Wu <cooloney@kernel.org>
- Skip single step if global interrupt disable bit is set.
- Extend bernds' patch r4673 to skip single step in any interrupt entry
that interrupts the code which is under single stepping. Bernds' patch
only allow user space single stepping.
Singed-off-by: Sonic Zhang <sonic.zhang@analog.com>
Signed-off-by: Bryan Wu <cooloney@kernel.org>
Skip single step if event priority of current instruction is higher than
that of the first instruction, from which gdb starts single step.
Signed-off-by: Sonic Zhang <sonic.zhang@analog.com>
Signed-off-by: Bryan Wu <cooloney@kernel.org>
When transferring to IRQ5 from an exception, save SYSCFG in memory across the
transfer and clear the trace bit.
When we get a single step exception, check whether we can safely clear the
trace bit in SYSCFG. We can (and should) clear it after the first instruction
of the interrupt handler; the first insn saves SYSCFG to the stack in all
handlers.
Signed-off-by: Bernd Schmidt <bernds_cb1@t-online.de>
Signed-off-by: Bryan Wu <cooloney@kernel.org>
In the double fault handler, set up the PT_RETI slot so that
we print out the correct return address in the dumping code.
Signed-off-by: Bernd Schmidt <bernds_cb1@t-online.de>
Signed-off-by: Bryan Wu <cooloney@kernel.org>
Remove the circular buffering mechanism for exceptions. Instead, point RETX
at a safe location from which to fetch three NOPs.
This safe location is now in the fixed code area, and also used for certain
anomaly workarounds, to ensure that user space can find a valid ICPLB when
things are built with CONFIG_MPU.
Also, save I/DCPLB_FAULT_ADDRESS when lowering to level 5, since the hardware
reg is valid only at exception level.
Signed-off-by: Bernd Schmidt <bernds_cb1@t-online.de>
Signed-off-by: Bryan Wu <cooloney@kernel.org>
There were a couple of problems with the way the trace buffer state
is saved/restored in assembly. The DEBUG_HWTRACE_SAVE/RESTORE macros
save a value to the stack, which is not immediately obvious; the CPLB
exception code needed changes to load the correct value of the stack
pointer. The other problem is that the SAVE/RESTORE macros weren't
pushing and popping the value downwards on the stack, but rather moving
it _upwards_, which is of course completely broken.
We also need to make sure there's a matching DEBUG_HWTRACE_RESTORE in
the error case of the CPLB handler.
Signed-off-by: Bernd Schmidt <bernds_cb1@t-online.de>
Signed-off-by: Bryan Wu <cooloney@kernel.org>
To save/restore the trace buffer control so that if we take an exception
after turning off the trace buffer at a higher level we dont inadvertently
turn the trace buffer back on
Signed-off-by: Mike Frysinger <vapier.adi@gmail.com>
Signed-off-by: Bryan Wu <bryan.wu@analog.com>
This is the new timerfd API as it is implemented by the following patch:
int timerfd_create(int clockid, int flags);
int timerfd_settime(int ufd, int flags,
const struct itimerspec *utmr,
struct itimerspec *otmr);
int timerfd_gettime(int ufd, struct itimerspec *otmr);
The timerfd_create() API creates an un-programmed timerfd fd. The "clockid"
parameter can be either CLOCK_MONOTONIC or CLOCK_REALTIME.
The timerfd_settime() API give new settings by the timerfd fd, by optionally
retrieving the previous expiration time (in case the "otmr" parameter is not
NULL).
The time value specified in "utmr" is absolute, if the TFD_TIMER_ABSTIME bit
is set in the "flags" parameter. Otherwise it's a relative time.
The timerfd_gettime() API returns the next expiration time of the timer, or
{0, 0} if the timerfd has not been set yet.
Like the previous timerfd API implementation, read(2) and poll(2) are
supported (with the same interface). Here's a simple test program I used to
exercise the new timerfd APIs:
http://www.xmailserver.org/timerfd-test2.c
[akpm@linux-foundation.org: coding-style cleanups]
[akpm@linux-foundation.org: fix ia64 build]
[akpm@linux-foundation.org: fix m68k build]
[akpm@linux-foundation.org: fix mips build]
[akpm@linux-foundation.org: fix alpha, arm, blackfin, cris, m68k, s390, sparc and sparc64 builds]
[heiko.carstens@de.ibm.com: fix s390]
[akpm@linux-foundation.org: fix powerpc build]
[akpm@linux-foundation.org: fix sparc64 more]
Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
Cc: Michael Kerrisk <mtk-manpages@gmx.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Davide Libenzi <davidel@xmailserver.org>
Cc: Michael Kerrisk <mtk-manpages@gmx.net>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Davide Libenzi <davidel@xmailserver.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>