Change memory region to ohci "middle address space". This effectively
reduces the number of packets by 50%.
[Stefan R.:] This eliminates 1394 ack packets and improved throughput
by a few percent in some tests with an S400a connection with and without
gap count optimization. Since firewire-net taxes the AR-req DMA unit of
a FireWire controller much more than firewire-sbp2 (which uses the
middle address space with PCI posted writes too), this commit also
changes a related error printk into a ratelimited one as a precaution.
Side note: The IPv4-over-1394 drivers of Mac OS X 10.4, Windows XP SP3,
and the Thesycon 1394 bus driver for Windows all use the middle address
space too.
Signed-off-by: Stephan Gatzka <stephan@gatzka.org>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Use kernel.h's convenience macros. Also omit a printk that should never
happen and won't matter much if it ever happened.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Fixing a deprecation, replacing __attribute__((packed)) with __packed.
It was deprecated for portability, specifically to avoid GCC specific
code. See commit 82ddcb0405.
Signed-off-by: August Lilleaas <august@augustl.com>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> (added include compiler.h)
When queueing iso packets, the run time is dominated by the two
MMIO accesses that set the DMA context's wake bit. Because most
drivers submit packets in batches, we can save much time by
removing all but the last wakeup.
The internal kernel API is changed to require a call to
fw_iso_context_queue_flush() after a batch of queued packets.
The user space API does not change, so one call to
FW_CDEV_IOC_QUEUE_ISO must specify multiple packets to take
advantage of this optimization.
In my measurements, this patch reduces the time needed to queue
fifty skip packets from userspace to one sixth on a 2.5 GHz CPU,
or to one third at 800 MHz.
Signed-off-by: Clemens Ladisch <clemens@ladisch.de>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
This makes it possible to resume communication with a node that dropped
off the bus for a brief period. Otherwise communication will only be
possible after ARP cache entry timeouts.
Signed-off-by: Maxim Levitsky <maximlevitsky@gmail.com>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> (rebased)
At ifup, carrier status would be shown on even if it actually was off.
Also add an include for ethtool_ops rather than to rely on the one from
netdevice.h.
Note, we can alas not use fwnet_device_mutex to serialize access to
dev->peer_count (as I originally wanted). This would cause a lock
inversion:
- fwnet_probe | takes fwnet_device_mutex
+ register_netdev | takes rtnl_mutex
- devinet_ioctl | takes rtnl_mutex
+ fwnet_open | ...must not take fwnet_device_mutex
Hence use the dev->lock spinlock for serialization.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
To make userland, e.g. NetworkManager work with firewire, we need to
detect whether cable is plugged or not. Simple and correct way of doing
that is just counting number of peers. No peers - no link and vice
versa.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Unfortunately its easy to trigger such error messages by removing the
cable while sending streams of data over the link.
Such errors are normal, and therefore this patch stops firewire-net from
flooding the kernel log with these errors, by combining series of same
errors together.
Signed-off-by: Maxim Levitsky <maximlevitsky@gmail.com>
(Stefan R:) Eventually we should remove this logging when firewire-net
and related firewire-ohci facilities have been stabilized.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
This prevents firewire-net from submitting write requests in fast
succession until failure due to all 64 transaction labels were used up
for unfinished split transactions. The netif_stop/wake_queue API is
used for this purpose.
Without this stop/wake mechanism, datagrams were simply lost whenever
the tlabel pool was exhausted. Plus, tlabel exhaustion by firewire-net
also prevented other unrelated outbound transactions to be initiated.
The chosen queue depth was checked by me to hit the maximum possible
throughput with an OS X peer whose receive DMA is good enough to never
reject requests due to busy inbound request FIFO. Current Linux peers
show a mixed picture of -5%...+15% change in bandwidth; their current
bottleneck are RCODE_BUSY situations (fewer or more, depending on TX
queue depth) due to too small AR buffer in firewire-ohci.
Maxim Levitsky tested this change with similar watermarks with a Linux
peer and some pending firewire-ohci improvements that address the
RCODE_BUSY problem and confirmed that these TX queue limits are good.
Note: This removes some netif_wake_queue from reception code paths.
They were apparently copy&paste artefacts from a nonsensical
netif_wake_queue use in the older eth1394 driver. This belongs only
into the transmit path.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Tested-by: Maxim Levitsky <maximlevitsky@gmail.com>
The current transmit code does not at all make use of
- fwnet_device.packet_list
and only very limited use of
- fwnet_device.broadcasted_list,
- fwnet_device.queued_packets.
Their current function is to track whether the TX soft-IRQ finished
dealing with an skb when the AT-req tasklet takes over, and to discard
pending tx datagrams (if there are any) when the local node is removed.
The latter does actually contain a race condition bug with TX soft-IRQ
and AT-req tasklet.
Instead of these lists and the corresponding link in fwnet_packet_task,
- a flag in fwnet_packet_task to track whether fwnet_tx is done,
- a counter of queued datagrams in fwnet_device
do the job as well.
The above mentioned theoretic race condition is resolved by letting
fwnet_remove sleep until all datagrams were flushed. It may sleep
almost arbitrarily long since fwnet_remove is executed in the context of
a multithreaded (concurrency managed) workqueue.
The type of max_payload is changed to u16 here to avoid waste in struct
fwnet_packet_task. This value cannot exceed 4096 per IEEE 1394:2008
table 16-18 (or 32678 per specification of packet headers, if there is
ever going to be something else than beta mode).
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
a) fwnet_transmit_packet_done used to poison ptask->pt_link by list_del.
If fwnet_send_packet checked later whether it was responsible to clean
up (in the border case that the TX soft IRQ was outpaced by the AT-req
tasklet on another CPU), it missed this because ptask->pt_link was no
longer shown as empty.
b) If fwnet_write_complete got an rcode other than RCODE_COMPLETE, we
missed to free the skb and ptask entirely.
Also, count stats.tx_dropped and stats.tx_errors when rcode != 0.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
The incoming request hander fwnet_receive_packet() expects subsequent
datagram handling code to return non-zero on errors. However, almost
none of the failure paths did so. Fix them all.
(This error reporting is used to send and RCODE_CONFLICT_ERROR to the
sender node in such failure cases. Two modes of failure exist: Out of
memory, or firewire-net is unaware of any peer node to which a fragment
or an ARP packet belongs. However, it is unclear whether a sender can
actually make use of such information. A Linux peer apparently can't.
Maybe it should all be simplified to void functions.)
Reported-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
The driver name and bus address for a net_device can normally be found
through the driver model now. Instead of requiring drivers to provide
this information redundantly through the ethtool_ops::get_drvinfo
operation, use the driver model to do so if the driver does not define
the operation. Since ETHTOOL_GDRVINFO no longer requires the driver
to implement any operations, do not require net_device::ethtool_ops to
be set either.
Remove implementations of get_drvinfo and ethtool_ops that provide
only this information.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Conflicts:
drivers/firewire/core-card.c
drivers/firewire/core-cdev.c
and forgotten #include <linux/time.h> in drivers/firewire/ohci.c
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
void (*fw_address_callback_t)(..., int speed, ...) is the speed that a
remote node chose to transmit a request to us. In case of split
transactions, firewire-core will transmit the response at that speed.
Upper layer drivers on the other hand (firewire-net, -sbp2, firedtv, and
userspace drivers) cannot do anything useful with that speed datum,
except log it for debug purposes. But data that is merely potentially
(not even actually) used for debug purposes does not belong into the API.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.
percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability. As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.
http://userweb.kernel.org/~tj/misc/slabh-sweep.py
The script does the followings.
* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.
* When the script inserts a new include, it looks at the include
blocks and try to put the new include such that its order conforms
to its surrounding. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.
* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.
The conversion was done in the following steps.
1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.
2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.
3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.
4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.
5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
wildly available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.
6. percpu.h was updated not to include slab.h.
7. Build test were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).
* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig
8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.
Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.
Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
In the transmit path of firewire-net (IPv4 over 1394), the following
race condition may occur:
- The networking soft IRQ inserts a datagram into the 1394 async
request transmit DMA.
- The 1394 async transmit completion tasklet runs to finish cleaning
up (unlink datagram from list of pending ones, release skb and
outbound 1394 transaction object) --- before the networking soft IRQ
had a chance to proceed and add the datagram to the list of pending
datagrams.
This caused a panic in the 1394 async transmit completion tasklet when
it dereferenced unitialized list heads:
http://bugzilla.kernel.org/show_bug.cgi?id=15077
The fix is to add checks in the tx soft IRQ and in the tasklet to
determine which of these two is the last referrer to the transaction
object. Then handle the cleanup of the object by the last referrer
rather than assuming that the tasklet is always the last one.
There is another similar race: Between said tasklet and fwnet_close,
i.e. at ifdown. However, that race is much less likely to occur in
practice and shall be fixed in a separate update.
Reported-by: Илья Басин <basinilya@gmail.com>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
The generic packet receive code takes care of setting
netdev->last_rx when necessary, for the sake of the
bonding ARP monitor.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Neil Horman <nhorman@txudriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
No need to put ethtool_ops in data, they should be const.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
These are all drivers that don't touch real hardware.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The AR req handler should not check the generation; higher level code
is the better place to handle bus generation changes. The target node
ID just needs to be checked for not being the "all nodes" address; in
this case don't handle the request and don't respond.
Use Address_Error and Type_Error rcodes as appropriate.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Fix some problems from "firewire: net: allow for unordered unit
discovery":
- fwnet_remove was missing a list_del, causing fwnet_probe to crash if
called after fwnet_remove, e.g. if firewire-ohci was unloaded and
reloaded.
- fwnet_probe should set its new_netdev flag only if it actually
allocated a net_device.
- Use dev_set_drvdata and dev_get_drvdata instead of deprecated direct
access to device.driver_data.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
The .ndo_tx_timeout callback is currently without function; delete it.
Give .watchdog_timeo a proper time value; lower it to 2 seconds.
Decrease the .tx_queue_len from 1000 (as in Ethernet card drivers) to 10
because we have only 64 transaction labels available, and responders
might have further limits of their AR req contexts.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Decouple the creation and destruction of the net_device from the order
of discovery and removal of nodes with RFC 2734 unit directories since
there is no reliable order. The net_device is now created when the
first RFC 2734 unit on a card is discovered, and destroyed when the last
RFC 2734 unit on a card went away. This includes all remote units as
well as the local unit, which is therefore tracked as a peer now too.
Also, locking around the list of peers is slightly extended to guard
against peer removal. As a side effect, fwnet_peer.pdg_lock has become
superfluous and is deleted.
Peer data (max_rec, speed, node ID, generation) are updated more
carefully.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
The driver is now called firewire-net. It might implement the transport
of other networking protocols in the future, notably IPv6 per RFC 3146.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
2009-06-14 14:26:29 +02:00
Renamed from drivers/firewire/fw-ipv4.c (Browse further)