Merge commit 'v2.6.34-rc1' into perf/urgent

Conflicts:
	tools/perf/util/probe-event.c

Merge reason: Pick up -rc1 and resolve the conflict as well.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar 2010-03-09 17:11:53 +01:00
commit 548b841669
6404 changed files with 404980 additions and 172153 deletions

.gitignore

@ -36,6 +36,7 @@ modules.builtin
#
tags
TAGS
linux
vmlinux
vmlinuz
System.map

@ -0,0 +1,7 @@
What: /sys/devices/system/node/nodeX
Date: October 2002
Contact: Linux Memory Management list <linux-mm@kvack.org>
Description:
When CONFIG_NUMA is enabled, this is a directory containing
information on node X such as what CPUs are local to the
node.

@ -128,3 +128,17 @@ Description:
preferred request size for workloads where sustained
throughput is desired. If no optimal I/O size is
reported this file contains 0.
What: /sys/block/<disk>/queue/nomerges
Date: January 2010
Contact:
Description:
Standard I/O elevator operations include attempts to
merge contiguous I/Os. For known random I/O loads these
attempts will always fail and result in extra cycles
being spent in the kernel. This allows one to turn off
this behavior in one of two ways: When set to 1, complex
merge checks are disabled, but the simple one-shot merges
with the previous I/O request are enabled. When set to 2,
all merge tries are disabled. The default value is 0 -
which enables all types of merge tries.
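
A minimal sketch of how user space might select one of the three nomerges settings described above. The helper name set_nomerges and the example queue directory are illustrative assumptions, not part of the kernel documentation; writing to the real sysfs file normally requires root.

```python
import os

# Meaning of each accepted value, as described in the entry above.
NOMERGES_MODES = {
    0: "all merge attempts enabled (default)",
    1: "only simple one-shot merges with the previous request",
    2: "all merge attempts disabled",
}


def set_nomerges(queue_dir, mode):
    """Write a validated nomerges value (hypothetical helper).

    queue_dir is assumed to be a directory such as /sys/block/sda/queue.
    Returns a short description of the mode that was written.
    """
    if mode not in NOMERGES_MODES:
        raise ValueError("nomerges must be 0, 1 or 2, got %r" % (mode,))
    with open(os.path.join(queue_dir, "nomerges"), "w") as f:
        f.write("%d\n" % mode)
    return NOMERGES_MODES[mode]
```

Validating the value before writing keeps a typo from reaching the kernel, which would otherwise be left to accept or reject the input on its own.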

@ -159,3 +159,14 @@ Description:
device. This is useful to ensure auto probing won't
match the driver to the device. For example:
# echo "046d c315" > /sys/bus/usb/drivers/foo/remove_id
What: /sys/bus/usb/device/.../avoid_reset
Date: December 2009
Contact: Oliver Neukum <oliver@neukum.org>
Description:
Writing 1 to this file tells the kernel that this
device will morph into another mode when it is reset.
Drivers will not use reset for error handling for
such devices.
Users:
usb_modeswitch

@ -0,0 +1,79 @@
What: /sys/devices/.../power/
Date: January 2009
Contact: Rafael J. Wysocki <rjw@sisk.pl>
Description:
The /sys/devices/.../power directory contains attributes
allowing the user space to check and modify some power
management related properties of given device.
What: /sys/devices/.../power/wakeup
Date: January 2009
Contact: Rafael J. Wysocki <rjw@sisk.pl>
Description:
The /sys/devices/.../power/wakeup attribute allows the user
space to check if the device is enabled to wake up the system
from sleep states, such as the memory sleep state (suspend to
RAM) and hibernation (suspend to disk), and to enable or disable
it to do that as desired.
Some devices support "wakeup" events, which are hardware signals
used to activate the system from a sleep state. Such devices
have one of the following two values for the sysfs power/wakeup
file:
+ "enabled\n" to issue the events;
+ "disabled\n" not to do so;
In those cases the user space can change the setting represented
by the contents of this file by writing either "enabled", or
"disabled" to it.
For the devices that are not capable of generating system wakeup
events this file contains "\n". In those cases the user space
cannot modify the contents of this file and the device cannot be
enabled to wake up the system.
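
A minimal sketch of reading and changing the power/wakeup attribute as described above. The helper names read_wakeup and set_wakeup are hypothetical, and power_dir is assumed to be a device's power directory under /sys/devices; changing the real attribute normally requires root.

```python
import os


def read_wakeup(power_dir):
    """Return "enabled", "disabled", or None (hypothetical helper).

    None stands for the bare "\n" that the entry above says is reported
    by devices that cannot generate system wakeup events.
    """
    with open(os.path.join(power_dir, "wakeup")) as f:
        state = f.read().strip()
    return state or None


def set_wakeup(power_dir, enable):
    """Write "enabled" or "disabled", refusing wakeup-incapable devices."""
    if read_wakeup(power_dir) is None:
        raise RuntimeError("device cannot generate system wakeup events")
    with open(os.path.join(power_dir, "wakeup"), "w") as f:
        f.write("enabled" if enable else "disabled")
```

Checking the current contents first mirrors the rule above that the setting cannot be modified for devices whose file contains only a newline.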
What: /sys/devices/.../power/control
Date: January 2009
Contact: Rafael J. Wysocki <rjw@sisk.pl>
Description:
The /sys/devices/.../power/control attribute allows the user
space to control the run-time power management of the device.
All devices have one of the following two values for the
power/control file:
+ "auto\n" to allow the device to be power managed at run time;
+ "on\n" to prevent the device from being power managed;
The default for all devices is "auto", which means that they may
be subject to automatic power management, depending on their
drivers. Changing this attribute to "on" prevents the driver
from power managing the device at run time. Doing that while
the device is suspended causes it to be woken up.
What: /sys/devices/.../power/async
Date: January 2009
Contact: Rafael J. Wysocki <rjw@sisk.pl>
Description:
The /sys/devices/.../power/async attribute allows the user space to
enable or disable the device's suspend and resume callbacks to
be executed asynchronously (i.e. in separate threads, in parallel
with the main suspend/resume thread) during system-wide power
transitions (e.g. suspend to RAM, hibernation).
All devices have one of the following two values for the
power/async file:
+ "enabled\n" to permit the asynchronous suspend/resume;
+ "disabled\n" to forbid it;
The value of this attribute may be changed by writing either
"enabled", or "disabled" to it.
It generally is unsafe to permit the asynchronous suspend/resume
of a device unless it is certain that all of the PM dependencies
of the device are known to the PM core. However, for some
devices this attribute is set to "enabled" by bus type code or
device drivers and in those cases it should be safe to leave the
default value.

@ -1,4 +1,4 @@
What: /sys/devices/platform/asus-laptop/display
What: /sys/devices/platform/asus_laptop/display
Date: January 2007
KernelVersion: 2.6.20
Contact: "Corentin Chary" <corentincj@iksaif.net>
@ -13,7 +13,7 @@ Description:
Ex: - 0 (0000b) means no display
- 3 (0011b) CRT+LCD.
What: /sys/devices/platform/asus-laptop/gps
What: /sys/devices/platform/asus_laptop/gps
Date: January 2007
KernelVersion: 2.6.20
Contact: "Corentin Chary" <corentincj@iksaif.net>
@ -21,7 +21,7 @@ Description:
Control the gps device. 1 means on, 0 means off.
Users: Lapsus
What: /sys/devices/platform/asus-laptop/ledd
What: /sys/devices/platform/asus_laptop/ledd
Date: January 2007
KernelVersion: 2.6.20
Contact: "Corentin Chary" <corentincj@iksaif.net>
@ -29,11 +29,11 @@ Description:
Some models like the W1N have a LED display that can be
used to display various information.
To control the LED display, use the following:
echo 0x0T000DDD > /sys/devices/platform/asus-laptop/
echo 0x0T000DDD > /sys/devices/platform/asus_laptop/
where T controls the 3-letter display, and DDD the 3-digit display.
The DDD table can be found in Documentation/laptops/asus-laptop.txt
What: /sys/devices/platform/asus-laptop/bluetooth
What: /sys/devices/platform/asus_laptop/bluetooth
Date: January 2007
KernelVersion: 2.6.20
Contact: "Corentin Chary" <corentincj@iksaif.net>
@ -42,7 +42,7 @@ Description:
This may control the led, the device or both.
Users: Lapsus
What: /sys/devices/platform/asus-laptop/wlan
What: /sys/devices/platform/asus_laptop/wlan
Date: January 2007
KernelVersion: 2.6.20
Contact: "Corentin Chary" <corentincj@iksaif.net>

@ -1,4 +1,4 @@
What: /sys/devices/platform/eeepc-laptop/disp
What: /sys/devices/platform/eeepc/disp
Date: May 2008
KernelVersion: 2.6.26
Contact: "Corentin Chary" <corentincj@iksaif.net>
@ -9,21 +9,21 @@ Description:
- 3 = LCD+CRT
If you run X11, you should use xrandr instead.
What: /sys/devices/platform/eeepc-laptop/camera
What: /sys/devices/platform/eeepc/camera
Date: May 2008
KernelVersion: 2.6.26
Contact: "Corentin Chary" <corentincj@iksaif.net>
Description:
Control the camera. 1 means on, 0 means off.
What: /sys/devices/platform/eeepc-laptop/cardr
What: /sys/devices/platform/eeepc/cardr
Date: May 2008
KernelVersion: 2.6.26
Contact: "Corentin Chary" <corentincj@iksaif.net>
Description:
Control the card reader. 1 means on, 0 means off.
What: /sys/devices/platform/eeepc-laptop/cpufv
What: /sys/devices/platform/eeepc/cpufv
Date: Jun 2009
KernelVersion: 2.6.31
Contact: "Corentin Chary" <corentincj@iksaif.net>
@ -42,7 +42,7 @@ Description:
`------------ Availables modes
For example, 0x301 means: mode 1 selected, 3 available modes.
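
A minimal sketch of decoding the raw cpufv value, assuming the layout implied by the example above: the low 8 bits hold the selected mode and the next 8 bits hold the number of available modes. The function name decode_cpufv is hypothetical.

```python
def decode_cpufv(raw):
    """Split a raw cpufv reading into (selected_mode, available_modes).

    raw is the hexadecimal text read from the cpufv file, e.g. "0x301".
    The byte layout is an assumption inferred from the 0x301 example.
    """
    value = int(raw.strip(), 16)
    selected = value & 0xFF           # low byte: currently selected mode
    available = (value >> 8) & 0xFF   # next byte: number of available modes
    return selected, available
```

With the documented example, decode_cpufv("0x301") yields mode 1 selected out of 3 available modes.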
What: /sys/devices/platform/eeepc-laptop/available_cpufv
What: /sys/devices/platform/eeepc/available_cpufv
Date: Jun 2009
KernelVersion: 2.6.31
Contact: "Corentin Chary" <corentincj@iksaif.net>

@ -101,3 +101,16 @@ Description:
CAUTION: Using it will cause your machine's real-time (CMOS)
clock to be set to a random invalid time after a resume.
What: /sys/power/pm_async
Date: January 2009
Contact: Rafael J. Wysocki <rjw@sisk.pl>
Description:
The /sys/power/pm_async file controls the switch allowing the
user space to enable or disable asynchronous suspend and resume
of devices. If enabled, this feature will cause some device
drivers' suspend and resume callbacks to be executed in parallel
with each other and with the main suspend thread. It is enabled
if this file contains "1", which is the default. It may be
disabled by writing "0" to this file, in which case all devices
will be suspended and resumed synchronously.

@ -45,7 +45,7 @@
</sect1>
<sect1><title>Atomic and pointer manipulation</title>
!Iarch/x86/include/asm/atomic_32.h
!Iarch/x86/include/asm/atomic.h
!Iarch/x86/include/asm/unaligned.h
</sect1>

@ -316,7 +316,7 @@ CPU B: spin_unlock_irqrestore(&amp;dev_lock, flags)
<chapter id="pubfunctions">
<title>Public Functions Provided</title>
!Iarch/x86/include/asm/io_32.h
!Iarch/x86/include/asm/io.h
!Elib/iomap.c
</chapter>

@ -144,7 +144,7 @@ usage should require reading the full document.
this though and the recommendation to allow only a single
interface in STA mode at first!
</para>
!Finclude/net/mac80211.h ieee80211_if_init_conf
!Finclude/net/mac80211.h ieee80211_vif
</chapter>
<chapter id="rx-tx">
@ -234,7 +234,6 @@ usage should require reading the full document.
<title>Multiple queues and QoS support</title>
<para>TBD</para>
!Finclude/net/mac80211.h ieee80211_tx_queue_params
!Finclude/net/mac80211.h ieee80211_tx_queue_stats
</chapter>
<chapter id="AP">

@ -589,7 +589,8 @@ number of a video input as in &v4l2-input; field
<entry></entry>
<entry>A place holder for future extensions and custom
(driver defined) buffer types
<constant>V4L2_BUF_TYPE_PRIVATE</constant> and higher.</entry>
<constant>V4L2_BUF_TYPE_PRIVATE</constant> and higher. Applications
should set this to 0.</entry>
</row>
</tbody>
</tgroup>

@ -54,12 +54,10 @@ to enqueue an empty (capturing) or filled (output) buffer in the
driver's incoming queue. The semantics depend on the selected I/O
method.</para>
<para>To enqueue a <link linkend="mmap">memory mapped</link>
buffer applications set the <structfield>type</structfield> field of a
&v4l2-buffer; to the same buffer type as previously &v4l2-format;
<structfield>type</structfield> and &v4l2-requestbuffers;
<structfield>type</structfield>, the <structfield>memory</structfield>
field to <constant>V4L2_MEMORY_MMAP</constant> and the
<para>To enqueue a buffer applications set the <structfield>type</structfield>
field of a &v4l2-buffer; to the same buffer type as was previously used
with &v4l2-format; <structfield>type</structfield> and &v4l2-requestbuffers;
<structfield>type</structfield>. Applications must also set the
<structfield>index</structfield> field. Valid index numbers range from
zero to the number of buffers allocated with &VIDIOC-REQBUFS;
(&v4l2-requestbuffers; <structfield>count</structfield>) minus one. The
@ -70,8 +68,19 @@ intended for output (<structfield>type</structfield> is
<constant>V4L2_BUF_TYPE_VBI_OUTPUT</constant>) applications must also
initialize the <structfield>bytesused</structfield>,
<structfield>field</structfield> and
<structfield>timestamp</structfield> fields. See <xref
linkend="buffer" /> for details. When
<structfield>timestamp</structfield> fields, see <xref
linkend="buffer" /> for details.
Applications must also set <structfield>flags</structfield> to 0. If a driver
supports capturing from specific video inputs and you want to specify a video
input, then <structfield>flags</structfield> should be set to
<constant>V4L2_BUF_FLAG_INPUT</constant> and the field
<structfield>input</structfield> must be initialized to the desired input.
The <structfield>reserved</structfield> field must be set to 0.
</para>
<para>To enqueue a <link linkend="mmap">memory mapped</link>
buffer applications set the <structfield>memory</structfield>
field to <constant>V4L2_MEMORY_MMAP</constant>. When
<constant>VIDIOC_QBUF</constant> is called with a pointer to this
structure the driver sets the
<constant>V4L2_BUF_FLAG_MAPPED</constant> and
@ -81,14 +90,10 @@ structure the driver sets the
&EINVAL;.</para>
<para>To enqueue a <link linkend="userp">user pointer</link>
buffer applications set the <structfield>type</structfield> field of a
&v4l2-buffer; to the same buffer type as previously &v4l2-format;
<structfield>type</structfield> and &v4l2-requestbuffers;
<structfield>type</structfield>, the <structfield>memory</structfield>
field to <constant>V4L2_MEMORY_USERPTR</constant> and the
buffer applications set the <structfield>memory</structfield>
field to <constant>V4L2_MEMORY_USERPTR</constant>, the
<structfield>m.userptr</structfield> field to the address of the
buffer and <structfield>length</structfield> to its size. When the
buffer is intended for output additional fields must be set as above.
buffer and <structfield>length</structfield> to its size.
When <constant>VIDIOC_QBUF</constant> is called with a pointer to this
structure the driver sets the <constant>V4L2_BUF_FLAG_QUEUED</constant>
flag and clears the <constant>V4L2_BUF_FLAG_MAPPED</constant> and
@ -96,13 +101,14 @@ flag and clears the <constant>V4L2_BUF_FLAG_MAPPED</constant> and
<structfield>flags</structfield> field, or it returns an error code.
This ioctl locks the memory pages of the buffer in physical memory,
they cannot be swapped out to disk. Buffers remain locked until
dequeued, until the &VIDIOC-STREAMOFF; or &VIDIOC-REQBUFS; ioctl are
dequeued, until the &VIDIOC-STREAMOFF; or &VIDIOC-REQBUFS; ioctl is
called, or until the device is closed.</para>
<para>Applications call the <constant>VIDIOC_DQBUF</constant>
ioctl to dequeue a filled (capturing) or displayed (output) buffer
from the driver's outgoing queue. They just set the
<structfield>type</structfield> and <structfield>memory</structfield>
<structfield>type</structfield>, <structfield>memory</structfield>
and <structfield>reserved</structfield>
fields of a &v4l2-buffer; as above, when <constant>VIDIOC_DQBUF</constant>
is called with a pointer to this structure the driver fills the
remaining fields or returns an error code.</para>

@ -54,12 +54,13 @@ buffer at any time after buffers have been allocated with the
&VIDIOC-REQBUFS; ioctl.</para>
<para>Applications set the <structfield>type</structfield> field
of a &v4l2-buffer; to the same buffer type as previously
of a &v4l2-buffer; to the same buffer type as was previously used with
&v4l2-format; <structfield>type</structfield> and &v4l2-requestbuffers;
<structfield>type</structfield>, and the <structfield>index</structfield>
field. Valid index numbers range from zero
to the number of buffers allocated with &VIDIOC-REQBUFS;
(&v4l2-requestbuffers; <structfield>count</structfield>) minus one.
The <structfield>reserved</structfield> field should be set to 0.
After calling <constant>VIDIOC_QUERYBUF</constant> with a pointer to
this structure drivers return an error code or fill the rest of
the structure.</para>
@ -68,8 +69,8 @@ the structure.</para>
<constant>V4L2_BUF_FLAG_MAPPED</constant>,
<constant>V4L2_BUF_FLAG_QUEUED</constant> and
<constant>V4L2_BUF_FLAG_DONE</constant> flags will be valid. The
<structfield>memory</structfield> field will be set to
<constant>V4L2_MEMORY_MMAP</constant>, the <structfield>m.offset</structfield>
<structfield>memory</structfield> field will be set to the current
I/O method, the <structfield>m.offset</structfield>
contains the offset of the buffer from the start of the device memory,
the <structfield>length</structfield> field its size. The driver may
or may not set the remaining fields and flags, they are meaningless in

@ -54,23 +54,23 @@ I/O. Memory mapped buffers are located in device memory and must be
allocated with this ioctl before they can be mapped into the
application's address space. User buffers are allocated by
applications themselves, and this ioctl is merely used to switch the
driver into user pointer I/O mode.</para>
driver into user pointer I/O mode and to set up some internal structures.</para>
<para>To allocate device buffers applications initialize three
fields of a <structname>v4l2_requestbuffers</structname> structure.
<para>To allocate device buffers applications initialize all
fields of the <structname>v4l2_requestbuffers</structname> structure.
They set the <structfield>type</structfield> field to the respective
stream or buffer type, the <structfield>count</structfield> field to
the desired number of buffers, and <structfield>memory</structfield>
must be set to <constant>V4L2_MEMORY_MMAP</constant>. When the ioctl
is called with a pointer to this structure the driver attempts to
allocate the requested number of buffers and stores the actual number
the desired number of buffers, <structfield>memory</structfield>
must be set to the requested I/O method and the reserved array
must be zeroed. When the ioctl
is called with a pointer to this structure the driver will attempt to allocate
the requested number of buffers and it stores the actual number
allocated in the <structfield>count</structfield> field. It can be
smaller than the number requested, even zero, when the driver runs out
of free memory. A larger number is possible when the driver requires
more buffers to function correctly.<footnote>
<para>For example video output requires at least two buffers,
of free memory. A larger number is also possible when the driver requires
more buffers to function correctly. For example video output requires at least two buffers,
one displayed and one filled by the application.</para>
</footnote> When memory mapping I/O is not supported the ioctl
<para>When the I/O method is not supported the ioctl
returns an &EINVAL;.</para>
<para>Applications can call <constant>VIDIOC_REQBUFS</constant>
@ -81,14 +81,6 @@ in progress, an implicit &VIDIOC-STREAMOFF;. <!-- mhs: I see no
reason why munmap()ping one or even all buffers must imply
streamoff.--></para>
<para>To negotiate user pointer I/O, applications initialize only
the <structfield>type</structfield> field and set
<structfield>memory</structfield> to
<constant>V4L2_MEMORY_USERPTR</constant>. When the ioctl is called
with a pointer to this structure the driver prepares for user pointer
I/O, when this I/O method is not supported the ioctl returns an
&EINVAL;.</para>
<table pgwide="1" frame="none" id="v4l2-requestbuffers">
<title>struct <structname>v4l2_requestbuffers</structname></title>
<tgroup cols="3">
@ -97,9 +89,7 @@ I/O, when this I/O method is not supported the ioctl returns an
<row>
<entry>__u32</entry>
<entry><structfield>count</structfield></entry>
<entry>The number of buffers requested or granted. This
field is only used when <structfield>memory</structfield> is set to
<constant>V4L2_MEMORY_MMAP</constant>.</entry>
<entry>The number of buffers requested or granted.</entry>
</row>
<row>
<entry>&v4l2-buf-type;</entry>
@ -120,7 +110,7 @@ as the &v4l2-format; <structfield>type</structfield> field. See <xref
<entry><structfield>reserved</structfield>[2]</entry>
<entry>A place holder for future extensions and custom
(driver defined) buffer types <constant>V4L2_BUF_TYPE_PRIVATE</constant> and
higher.</entry>
higher. This array should be zeroed by applications.</entry>
</row>
</tbody>
</tgroup>

@ -221,8 +221,8 @@ branches. These different branches are:
- main 2.6.x kernel tree
- 2.6.x.y -stable kernel tree
- 2.6.x -git kernel patches
- 2.6.x -mm kernel patches
- subsystem specific kernel trees and patches
- the 2.6.x -next kernel tree for integration tests
2.6.x kernel tree
-----------------
@ -232,7 +232,7 @@ process is as follows:
- As soon as a new kernel is released a two weeks window is open,
during this period of time maintainers can submit big diffs to
Linus, usually the patches that have already been included in the
-mm kernel for a few weeks. The preferred way to submit big changes
-next kernel for a few weeks. The preferred way to submit big changes
is using git (the kernel's source management tool, more information
can be found at http://git.or.cz/) but plain patches are also just
fine.
@ -293,84 +293,43 @@ daily and represent the current state of Linus' tree. They are more
experimental than -rc kernels since they are generated automatically
without even a cursory glance to see if they are sane.
2.6.x -mm kernel patches
------------------------
These are experimental kernel patches released by Andrew Morton. Andrew
takes all of the different subsystem kernel trees and patches and mushes
them together, along with a lot of patches that have been plucked from
the linux-kernel mailing list. This tree serves as a proving ground for
new features and patches. Once a patch has proved its worth in -mm for
a while Andrew or the subsystem maintainer pushes it on to Linus for
inclusion in mainline.
It is heavily encouraged that all new patches get tested in the -mm tree
before they are sent to Linus for inclusion in the main kernel tree. Code
which does not make an appearance in -mm before the opening of the merge
window will prove hard to merge into the mainline.
These kernels are not appropriate for use on systems that are supposed
to be stable and they are more risky to run than any of the other
branches.
If you wish to help out with the kernel development process, please test
and use these kernel releases and provide feedback to the linux-kernel
mailing list if you have any problems, and if everything works properly.
In addition to all the other experimental patches, these kernels usually
also contain any changes in the mainline -git kernels available at the
time of release.
The -mm kernels are not released on a fixed schedule, but usually a few
-mm kernels are released in between each -rc kernel (1 to 3 is common).
Subsystem Specific kernel trees and patches
-------------------------------------------
A number of the different kernel subsystem developers expose their
development trees so that others can see what is happening in the
different areas of the kernel. These trees are pulled into the -mm
kernel releases as described above.
The maintainers of the various kernel subsystems --- and also many
kernel subsystem developers --- expose their current state of
development in source repositories. That way, others can see what is
happening in the different areas of the kernel. In areas where
development is rapid, a developer may be asked to base his submissions
onto such a subsystem kernel tree so that conflicts between the
submission and other already ongoing work are avoided.
Here is a list of some of the different kernel trees available:
git trees:
- Kbuild development tree, Sam Ravnborg <sam@ravnborg.org>
git.kernel.org:/pub/scm/linux/kernel/git/sam/kbuild.git
Most of these repositories are git trees, but there are also other SCMs
in use, or patch queues being published as quilt series. Addresses of
these subsystem repositories are listed in the MAINTAINERS file. Many
of them can be browsed at http://git.kernel.org/.
- ACPI development tree, Len Brown <len.brown@intel.com>
git.kernel.org:/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6.git
Before a proposed patch is committed to such a subsystem tree, it is
subject to review which primarily happens on mailing lists (see the
respective section below). For several kernel subsystems, this review
process is tracked with the tool patchwork. Patchwork offers a web
interface which shows patch postings, any comments on a patch or
revisions to it, and maintainers can mark patches as under review,
accepted, or rejected. Most of these patchwork sites are listed at
http://patchwork.kernel.org/ or http://patchwork.ozlabs.org/.
- Block development tree, Jens Axboe <jens.axboe@oracle.com>
git.kernel.org:/pub/scm/linux/kernel/git/axboe/linux-2.6-block.git
2.6.x -next kernel tree for integration tests
---------------------------------------------
Before updates from subsystem trees are merged into the mainline 2.6.x
tree, they need to be integration-tested. For this purpose, a special
testing repository exists into which virtually all subsystem trees are
pulled on an almost daily basis:
http://git.kernel.org/?p=linux/kernel/git/sfr/linux-next.git
http://linux.f-seidel.de/linux-next/pmwiki/
- DRM development tree, Dave Airlie <airlied@linux.ie>
git.kernel.org:/pub/scm/linux/kernel/git/airlied/drm-2.6.git
This way, the -next kernel gives a summary outlook onto what will be
expected to go into the mainline kernel at the next merge period.
Adventurous testers are very welcome to runtime-test the -next kernel.
- ia64 development tree, Tony Luck <tony.luck@intel.com>
git.kernel.org:/pub/scm/linux/kernel/git/aegl/linux-2.6.git
- infiniband, Roland Dreier <rolandd@cisco.com>
git.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband.git
- libata, Jeff Garzik <jgarzik@pobox.com>
git.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev.git
- network drivers, Jeff Garzik <jgarzik@pobox.com>
git.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6.git
- pcmcia, Dominik Brodowski <linux@dominikbrodowski.net>
git.kernel.org:/pub/scm/linux/kernel/git/brodo/pcmcia-2.6.git
- SCSI, James Bottomley <James.Bottomley@hansenpartnership.com>
git.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6.git
- x86, Ingo Molnar <mingo@elte.hu>
git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86.git
quilt trees:
- USB, Driver Core, and I2C, Greg Kroah-Hartman <gregkh@suse.de>
kernel.org/pub/linux/kernel/people/gregkh/gregkh-2.6/
Other kernel trees can be found listed at http://git.kernel.org/ and in
the MAINTAINERS file.
Bug Reporting
-------------

@ -6,16 +6,22 @@ checklist.txt
- Review Checklist for RCU Patches
listRCU.txt
- Using RCU to Protect Read-Mostly Linked Lists
lockdep.txt
- RCU and lockdep checking
NMI-RCU.txt
- Using RCU to Protect Dynamic NMI Handlers
rcubarrier.txt
- RCU and Unloadable Modules
rculist_nulls.txt
- RCU list primitives for use with SLAB_DESTROY_BY_RCU
rcuref.txt
- Reference-count design for elements of lists/arrays protected by RCU
rcu.txt
- RCU Concepts
rcubarrier.txt
- Unloading modules that use RCU callbacks
RTFP.txt
- List of RCU papers (bibliography) going back to 1980.
stallwarn.txt
- RCU CPU stall warnings (CONFIG_RCU_CPU_STALL_DETECTOR)
torture.txt
- RCU Torture Test Operation (CONFIG_RCU_TORTURE_TEST)
trace.txt

@ -25,10 +25,10 @@ to be referencing the data structure. However, this mechanism was not
optimized for modern computer systems, which is not surprising given
that these overheads were not so expensive in the mid-80s. Nonetheless,
passive serialization appears to be the first deferred-destruction
mechanism to be used in production. Furthermore, the relevant patent has
lapsed, so this approach may be used in non-GPL software, if desired.
(In contrast, use of RCU is permitted only in software licensed under
GPL. Sorry!!!)
mechanism to be used in production. Furthermore, the relevant patent
has lapsed, so this approach may be used in non-GPL software, if desired.
(In contrast, implementation of RCU is permitted only in software licensed
under either GPL or LGPL. Sorry!!!)
In 1990, Pugh [Pugh90] noted that explicitly tracking which threads
were reading a given data structure permitted deferred free to operate
@ -150,6 +150,18 @@ preemptible RCU [PaulEMcKenney2007PreemptibleRCU], and the three-part
LWN "What is RCU?" series [PaulEMcKenney2007WhatIsRCUFundamentally,
PaulEMcKenney2008WhatIsRCUUsage, and PaulEMcKenney2008WhatIsRCUAPI].
2008 saw a journal paper on real-time RCU [DinakarGuniguntala2008IBMSysJ],
a history of how Linux changed RCU more than RCU changed Linux
[PaulEMcKenney2008RCUOSR], and a design overview of hierarchical RCU
[PaulEMcKenney2008HierarchicalRCU].
2009 introduced user-level RCU algorithms [PaulEMcKenney2009MaliciousURCU],
which Mathieu Desnoyers is now maintaining [MathieuDesnoyers2009URCU]
[MathieuDesnoyersPhD]. TINY_RCU [PaulEMcKenney2009BloatWatchRCU] made
its appearance, as did expedited RCU [PaulEMcKenney2009expeditedRCU].
The problem of resizeable RCU-protected hash tables may now be on a path
to a solution [JoshTriplett2009RPHash].
Bibtex Entries
@article{Kung80
@ -730,6 +742,11 @@ Revised:
"
}
#
# "What is RCU?" LWN series.
#
########################################################################
@article{DinakarGuniguntala2008IBMSysJ
,author="D. Guniguntala and P. E. McKenney and J. Triplett and J. Walpole"
,title="The read-copy-update mechanism for supporting real-time applications on shared-memory multiprocessor systems with {Linux}"
@ -820,3 +837,39 @@ Revised:
Uniprocessor assumptions allow simplified RCU implementation.
"
}
@unpublished{PaulEMcKenney2009expeditedRCU
,Author="Paul E. McKenney"
,Title="[{PATCH} -tip 0/3] expedited 'big hammer' {RCU} grace periods"
,month="June"
,day="25"
,year="2009"
,note="Available:
\url{http://lkml.org/lkml/2009/6/25/306}
[Viewed August 16, 2009]"
,annotation="
First posting of expedited RCU to be accepted into -tip.
"
}
@unpublished{JoshTriplett2009RPHash
,Author="Josh Triplett"
,Title="Scalable concurrent hash tables via relativistic programming"
,month="September"
,year="2009"
,note="Linux Plumbers Conference presentation"
,annotation="
RP fun with hash tables.
"
}
@phdthesis{MathieuDesnoyersPhD
, title = "Low-Impact Operating System Tracing"
, author = "Mathieu Desnoyers"
, school = "Ecole Polytechnique de Montr\'{e}al"
, month = "December"
, year = 2009
,note="Available:
\url{http://www.lttng.org/pub/thesis/desnoyers-dissertation-2009-12.pdf}
[Viewed December 9, 2009]"
}

@ -8,13 +8,12 @@ would cause. This list is based on experiences reviewing such patches
over a rather long period of time, but improvements are always welcome!
0. Is RCU being applied to a read-mostly situation? If the data
structure is updated more than about 10% of the time, then
you should strongly consider some other approach, unless
detailed performance measurements show that RCU is nonetheless
the right tool for the job. Yes, you might think of RCU
as simply cutting overhead off of the readers and imposing it
on the writers. That is exactly why normal uses of RCU will
do much more reading than updating.
structure is updated more than about 10% of the time, then you
should strongly consider some other approach, unless detailed
performance measurements show that RCU is nonetheless the right
tool for the job. Yes, RCU does reduce read-side overhead by
increasing write-side overhead, which is exactly why normal uses
of RCU will do much more reading than updating.
Another exception is where performance is not an issue, and RCU
provides a simpler implementation. An example of this situation
@ -35,13 +34,13 @@ over a rather long period of time, but improvements are always welcome!
If you choose #b, be prepared to describe how you have handled
memory barriers on weakly ordered machines (pretty much all of
them -- even x86 allows reads to be reordered), and be prepared
to explain why this added complexity is worthwhile. If you
choose #c, be prepared to explain how this single task does not
become a major bottleneck on big multiprocessor machines (for
example, if the task is updating information relating to itself
that other tasks can read, there by definition can be no
bottleneck).
them -- even x86 allows later loads to be reordered to precede
earlier stores), and be prepared to explain why this added
complexity is worthwhile. If you choose #c, be prepared to
explain how this single task does not become a major bottleneck on
big multiprocessor machines (for example, if the task is updating
information relating to itself that other tasks can read, there
by definition can be no bottleneck).
2. Do the RCU read-side critical sections make proper use of
rcu_read_lock() and friends? These primitives are needed
@ -51,8 +50,10 @@ over a rather long period of time, but improvements are always welcome!
actuarial risk of your kernel.
As a rough rule of thumb, any dereference of an RCU-protected
pointer must be covered by rcu_read_lock() or rcu_read_lock_bh()
or by the appropriate update-side lock.
pointer must be covered by rcu_read_lock(), rcu_read_lock_bh(),
rcu_read_lock_sched(), or by the appropriate update-side lock.
Disabling of preemption can serve as rcu_read_lock_sched(), but
is less readable.
3. Does the update code tolerate concurrent accesses?
@ -62,25 +63,27 @@ over a rather long period of time, but improvements are always welcome!
of ways to handle this concurrency, depending on the situation:
a. Use the RCU variants of the list and hlist update
primitives to add, remove, and replace elements on an
RCU-protected list. Alternatively, use the RCU-protected
trees that have been added to the Linux kernel.
primitives to add, remove, and replace elements on
an RCU-protected list. Alternatively, use the other
RCU-protected data structures that have been added to
the Linux kernel.
This is almost always the best approach.
b. Proceed as in (a) above, but also maintain per-element
locks (that are acquired by both readers and writers)
that guard per-element state. Of course, fields that
the readers refrain from accessing can be guarded by the
update-side lock.
the readers refrain from accessing can be guarded by
some other lock acquired only by updaters, if desired.
This works quite well, also.
c. Make updates appear atomic to readers. For example,
pointer updates to properly aligned fields will appear
atomic, as will individual atomic primitives. Operations
performed under a lock and sequences of multiple atomic
primitives will -not- appear to be atomic.
pointer updates to properly aligned fields will
appear atomic, as will individual atomic primitives.
Sequences of operations performed under a lock will -not-
appear to be atomic to RCU readers, nor will sequences
of multiple atomic primitives.
This can work, but is starting to get a bit tricky.
@ -98,9 +101,9 @@ over a rather long period of time, but improvements are always welcome!
a new structure containing updated values.
4. Weakly ordered CPUs pose special challenges. Almost all CPUs
are weakly ordered -- even i386 CPUs allow reads to be reordered.
RCU code must take all of the following measures to prevent
memory-corruption problems:
are weakly ordered -- even x86 CPUs allow later loads to be
reordered to precede earlier stores. RCU code must take all of
the following measures to prevent memory-corruption problems:
a. Readers must maintain proper ordering of their memory
accesses. The rcu_dereference() primitive ensures that
@ -113,14 +116,25 @@ over a rather long period of time, but improvements are always welcome!
The rcu_dereference() primitive is also an excellent
documentation aid, letting the person reading the code
know exactly which pointers are protected by RCU.
Please note that compilers can also reorder code, and
they are becoming increasingly aggressive about doing
just that. The rcu_dereference() primitive therefore
also prevents destructive compiler optimizations.
The rcu_dereference() primitive is used by the various
"_rcu()" list-traversal primitives, such as the
list_for_each_entry_rcu(). Note that it is perfectly
legal (if redundant) for update-side code to use
rcu_dereference() and the "_rcu()" list-traversal
primitives. This is particularly useful in code
that is common to readers and updaters.
The rcu_dereference() primitive is used by the
various "_rcu()" list-traversal primitives, such
as the list_for_each_entry_rcu(). Note that it is
perfectly legal (if redundant) for update-side code to
use rcu_dereference() and the "_rcu()" list-traversal
primitives. This is particularly useful in code that
is common to readers and updaters. However, lockdep
will complain if you access rcu_dereference() outside
of an RCU read-side critical section. See lockdep.txt
to learn what to do about this.
Of course, neither rcu_dereference() nor the "_rcu()"
list-traversal primitives can substitute for a good
concurrency design coordinating among multiple updaters.
b. If the list macros are being used, the list_add_tail_rcu()
and list_add_rcu() primitives must be used in order
@ -135,11 +149,14 @@ over a rather long period of time, but improvements are always welcome!
readers. Similarly, if the hlist macros are being used,
the hlist_del_rcu() primitive is required.
The list_replace_rcu() primitive may be used to
replace an old structure with a new one in an
RCU-protected list.
The list_replace_rcu() and hlist_replace_rcu() primitives
may be used to replace an old structure with a new one
in their respective types of RCU-protected lists.
d. Updates must ensure that initialization of a given
d. Rules similar to (4b) and (4c) apply to the "hlist_nulls"
type of RCU-protected linked lists.
e. Updates must ensure that initialization of a given
structure happens before pointers to that structure are
publicized. Use the rcu_assign_pointer() primitive
when publicizing a pointer to a structure that can
@ -151,16 +168,31 @@ over a rather long period of time, but improvements are always welcome!
it cannot block.
6. Since synchronize_rcu() can block, it cannot be called from
any sort of irq context. Ditto for synchronize_sched() and
synchronize_srcu().
any sort of irq context. The same rule applies for
synchronize_rcu_bh(), synchronize_sched(), synchronize_srcu(),
synchronize_rcu_expedited(), synchronize_rcu_bh_expedited(),
synchronize_sched_expedited(), and synchronize_srcu_expedited().
7. If the updater uses call_rcu(), then the corresponding readers
must use rcu_read_lock() and rcu_read_unlock(). If the updater
uses call_rcu_bh(), then the corresponding readers must use
rcu_read_lock_bh() and rcu_read_unlock_bh(). If the updater
uses call_rcu_sched(), then the corresponding readers must
disable preemption. Mixing things up will result in confusion
and broken kernels.
The expedited forms of these primitives have the same semantics
as the non-expedited forms, but expediting is both expensive
and unfriendly to real-time workloads. Use of the expedited
primitives should be restricted to rare configuration-change
operations that would not normally be undertaken while a real-time
workload is running.
7. If the updater uses call_rcu() or synchronize_rcu(), then the
corresponding readers must use rcu_read_lock() and
rcu_read_unlock(). If the updater uses call_rcu_bh() or
synchronize_rcu_bh(), then the corresponding readers must
use rcu_read_lock_bh() and rcu_read_unlock_bh(). If the
updater uses call_rcu_sched() or synchronize_sched(), then
the corresponding readers must disable preemption, possibly
by calling rcu_read_lock_sched() and rcu_read_unlock_sched().
If the updater uses synchronize_srcu(), then the corresponding
readers must use srcu_read_lock() and srcu_read_unlock(),
and with the same srcu_struct. The rules for the expedited
primitives are the same as for their non-expedited counterparts.
Mixing things up will result in confusion and broken kernels.
One exception to this rule: rcu_read_lock() and rcu_read_unlock()
may be substituted for rcu_read_lock_bh() and rcu_read_unlock_bh()
@ -212,6 +244,8 @@ over a rather long period of time, but improvements are always welcome!
e. Periodically invoke synchronize_rcu(), permitting a limited
number of updates per grace period.
The same cautions apply to call_rcu_bh() and call_rcu_sched().
9. All RCU list-traversal primitives, which include
rcu_dereference(), list_for_each_entry_rcu(),
list_for_each_continue_rcu(), and list_for_each_safe_rcu(),
@ -219,7 +253,9 @@ over a rather long period of time, but improvements are always welcome!
must be protected by appropriate update-side locks. RCU
read-side critical sections are delimited by rcu_read_lock()
and rcu_read_unlock(), or by similar primitives such as
rcu_read_lock_bh() and rcu_read_unlock_bh().
rcu_read_lock_bh() and rcu_read_unlock_bh(), in which case
the matching rcu_dereference() primitive must be used in order
to keep lockdep happy, in this case, rcu_dereference_bh().
The reason that it is permissible to use RCU list-traversal
primitives when the update-side lock is held is that doing so
@ -229,7 +265,8 @@ over a rather long period of time, but improvements are always welcome!
10. Conversely, if you are in an RCU read-side critical section,
and you don't hold the appropriate update-side lock, you -must-
use the "_rcu()" variants of the list macros. Failing to do so
will break Alpha and confuse people reading your code.
will break Alpha, cause aggressive compilers to generate bad code,
and confuse people trying to read your code.
11. Note that synchronize_rcu() -only- guarantees to wait until
all currently executing rcu_read_lock()-protected RCU read-side
@ -239,15 +276,21 @@ over a rather long period of time, but improvements are always welcome!
rcu_read_lock()-protected read-side critical sections, do -not-
use synchronize_rcu().
If you want to wait for some of these other things, you might
instead need to use synchronize_irq() or synchronize_sched().
Similarly, disabling preemption is not an acceptable substitute
for rcu_read_lock(). Code that attempts to use preemption
disabling where it should be using rcu_read_lock() will break
in real-time kernel builds.
If you want to wait for interrupt handlers, NMI handlers, and
code under the influence of preempt_disable(), you instead
need to use synchronize_irq() or synchronize_sched().
12. Any lock acquired by an RCU callback must be acquired elsewhere
with softirq disabled, e.g., via spin_lock_irqsave(),
spin_lock_bh(), etc. Failing to disable irq on a given
acquisition of that lock will result in deadlock as soon as the
RCU callback happens to interrupt that acquisition's critical
section.
acquisition of that lock will result in deadlock as soon as
the RCU softirq handler happens to run your RCU callback while
interrupting that acquisition's critical section.
13. RCU callbacks can be and are executed in parallel. In many cases,
the callback code is simply a wrapper around kfree(), so that this
@ -265,29 +308,30 @@ over a rather long period of time, but improvements are always welcome!
not the case, a self-spawning RCU callback would prevent the
victim CPU from ever going offline.)
14. SRCU (srcu_read_lock(), srcu_read_unlock(), and synchronize_srcu())
may only be invoked from process context. Unlike other forms of
RCU, it -is- permissible to block in an SRCU read-side critical
section (demarked by srcu_read_lock() and srcu_read_unlock()),
hence the "SRCU": "sleepable RCU". Please note that if you
don't need to sleep in read-side critical sections, you should
be using RCU rather than SRCU, because RCU is almost always
faster and easier to use than is SRCU.
14. SRCU (srcu_read_lock(), srcu_read_unlock(), srcu_dereference(),
synchronize_srcu(), and synchronize_srcu_expedited()) may only
be invoked from process context. Unlike other forms of RCU, it
-is- permissible to block in an SRCU read-side critical section
(demarked by srcu_read_lock() and srcu_read_unlock()), hence the
"SRCU": "sleepable RCU". Please note that if you don't need
to sleep in read-side critical sections, you should be using
RCU rather than SRCU, because RCU is almost always faster and
easier to use than is SRCU.
Also unlike other forms of RCU, explicit initialization
and cleanup is required via init_srcu_struct() and
cleanup_srcu_struct(). These are passed a "struct srcu_struct"
that defines the scope of a given SRCU domain. Once initialized,
the srcu_struct is passed to srcu_read_lock(), srcu_read_unlock()
and synchronize_srcu(). A given synchronize_srcu() waits only
for SRCU read-side critical sections governed by srcu_read_lock()
and srcu_read_unlock() calls that have been passd the same
srcu_struct. This property is what makes sleeping read-side
critical sections tolerable -- a given subsystem delays only
its own updates, not those of other subsystems using SRCU.
Therefore, SRCU is less prone to OOM the system than RCU would
be if RCU's read-side critical sections were permitted to
sleep.
synchronize_srcu(), and synchronize_srcu_expedited(). A given
synchronize_srcu() waits only for SRCU read-side critical
sections governed by srcu_read_lock() and srcu_read_unlock()
calls that have been passed the same srcu_struct. This property
is what makes sleeping read-side critical sections tolerable --
a given subsystem delays only its own updates, not those of other
subsystems using SRCU. Therefore, SRCU is less prone to OOM the
system than RCU would be if RCU's read-side critical sections
were permitted to sleep.
The ability to sleep in read-side critical sections does not
come for free. First, corresponding srcu_read_lock() and
@ -311,12 +355,12 @@ over a rather long period of time, but improvements are always welcome!
destructive operation, and -only- -then- invoke call_rcu(),
synchronize_rcu(), or friends.
Because these primitives only wait for pre-existing readers,
it is the caller's responsibility to guarantee safety to
any subsequent readers.
Because these primitives only wait for pre-existing readers, it
is the caller's responsibility to guarantee that any subsequent
readers will execute safely.
16. The various RCU read-side primitives do -not- contain memory
barriers. The CPU (and in some cases, the compiler) is free
to reorder code into and out of RCU read-side critical sections.
It is the responsibility of the RCU update-side primitives to
deal with this.
16. The various RCU read-side primitives do -not- necessarily contain
memory barriers. You should therefore plan for the CPU
and the compiler to freely reorder code into and out of RCU
read-side critical sections. It is the responsibility of the
RCU update-side primitives to deal with this.
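As a rough userspace analogy for items 4a and 4e of the checklist above
(an illustration only, not kernel code), the publish/subscribe ordering
can be sketched with C11 atomics. The names "struct foo", "gp",
publish_foo() and read_foo_sum() are hypothetical. A release store stands
in for rcu_assign_pointer() and an acquire load stands in for
rcu_dereference(); grace periods and reclamation are not modeled at all.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct foo {
    int a;
    int b;
};

/* Hypothetical global standing in for an RCU-protected pointer. */
static _Atomic(struct foo *) gp;

/*
 * Updater: fully initialize the structure, then publish the pointer.
 * The release store plays the role of rcu_assign_pointer().
 */
void publish_foo(struct foo *p, int a, int b)
{
    assert(p != NULL);
    p->a = a;
    p->b = b;
    atomic_store_explicit(&gp, p, memory_order_release);
}

/*
 * Reader: fetch the pointer once, then use only that local copy.
 * The acquire load plays the role of rcu_dereference().
 * Returns -1 if nothing has been published yet.
 */
int read_foo_sum(void)
{
    struct foo *p = atomic_load_explicit(&gp, memory_order_acquire);

    return p ? p->a + p->b : -1;
}
```

A caller would invoke publish_foo() from the single updater and
read_foo_sum() from any number of readers. The point of the sketch is
only that initialization precedes publication and that the reader
dereferences the pointer it fetched, not the global.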


@ -0,0 +1,67 @@
RCU and lockdep checking
All flavors of RCU have lockdep checking available, so that lockdep is
aware of when each task enters and leaves any flavor of RCU read-side
critical section. Each flavor of RCU is tracked separately (but note
that this is not the case in 2.6.32 and earlier). This allows lockdep's
tracking to include RCU state, which can sometimes help when debugging
deadlocks and the like.
In addition, RCU provides the following primitives that check lockdep's
state:
rcu_read_lock_held() for normal RCU.
rcu_read_lock_bh_held() for RCU-bh.
rcu_read_lock_sched_held() for RCU-sched.
srcu_read_lock_held() for SRCU.
These functions are conservative, and will therefore return 1 if they
aren't certain (for example, if CONFIG_DEBUG_LOCK_ALLOC is not set).
This prevents things like WARN_ON(!rcu_read_lock_held()) from giving false
positives when lockdep is disabled.
In addition, a separate kernel config parameter CONFIG_PROVE_RCU enables
checking of rcu_dereference() primitives:
rcu_dereference(p):
Check for RCU read-side critical section.
rcu_dereference_bh(p):
Check for RCU-bh read-side critical section.
rcu_dereference_sched(p):
Check for RCU-sched read-side critical section.
srcu_dereference(p, sp):
Check for SRCU read-side critical section.
rcu_dereference_check(p, c):
Use explicit check expression "c".
rcu_dereference_raw(p)
Don't check. (Use sparingly, if at all.)
The rcu_dereference_check() check expression can be any boolean
expression, but would normally include one of the rcu_read_lock_held()
family of functions and a lockdep expression. However, any boolean
expression can be used. For a moderately ornate example, consider
the following:
file = rcu_dereference_check(fdt->fd[fd],
rcu_read_lock_held() ||
lockdep_is_held(&files->file_lock) ||
atomic_read(&files->count) == 1);
This expression picks up the pointer "fdt->fd[fd]" in an RCU-safe manner,
and, if CONFIG_PROVE_RCU is configured, verifies that this expression
is used in:
1. An RCU read-side critical section, or
2. with files->file_lock held, or
3. on an unshared files_struct.
In case (1), the pointer is picked up in an RCU-safe manner for vanilla
RCU read-side critical sections, in case (2) the ->file_lock prevents
any change from taking place, and finally, in case (3) the current task
is the only task accessing the files_struct, again preventing any change
from taking place.
There are currently only "universal" versions of the rcu_assign_pointer()
and RCU list-/tree-traversal primitives, which do not (yet) check for
being in an RCU read-side critical section. In the future, separate
versions of these primitives might be created.
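The way the check expression combines the three cases can be imitated in
userspace. The following is a minimal sketch under stated assumptions:
"struct mock_ctx", mock_dereference_check() and mock_fd_lookup() are
hypothetical names, the lockdep state is supplied by hand rather than
tracked by the kernel, and a failed check merely bumps a counter instead
of producing a lockdep report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for state that lockdep tracks in the kernel. */
struct mock_ctx {
    bool in_rcu_reader;   /* models rcu_read_lock_held() */
    bool file_lock_held;  /* models lockdep_is_held(&files->file_lock) */
    int users;            /* models atomic_read(&files->count) */
    int splats;           /* number of failed checks recorded */
};

/* Yields p unchanged; records a "splat" when check expression c is false. */
#define mock_dereference_check(ctx, p, c) \
    (((c) ? (void)0 : (void)((ctx)->splats++)), (p))

/* Mirrors the shape of the fdt->fd[fd] example above. */
int *mock_fd_lookup(struct mock_ctx *ctx, int **fdtable, int fd)
{
    assert(ctx != NULL && fdtable != NULL && fd >= 0);
    return mock_dereference_check(ctx, fdtable[fd],
                                  ctx->in_rcu_reader ||
                                  ctx->file_lock_held ||
                                  ctx->users == 1);
}
```

The pointer is returned in every case; only the diagnostic differs. That
matches the description above, where the check is a debugging aid that
is active only when CONFIG_PROVE_RCU is configured.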


@ -75,6 +75,8 @@ o I hear that RCU is patented? What is with that?
search for the string "Patent" in RTFP.txt to find them.
Of these, one was allowed to lapse by the assignee, and the
others have been contributed to the Linux kernel under GPL.
There are now also LGPL implementations of user-level RCU
available (http://lttng.org/?q=node/18).
o I hear that RCU needs work in order to support realtime kernels?
@ -91,48 +93,4 @@ o Where can I find more information on RCU?
o What are all these files in this directory?
NMI-RCU.txt
Describes how to use RCU to implement dynamic
NMI handlers, which can be revectored on the fly,
without rebooting.
RTFP.txt
List of RCU-related publications and web sites.
UP.txt
Discussion of RCU usage in UP kernels.
arrayRCU.txt
Describes how to use RCU to protect arrays, with
resizeable arrays whose elements reference other
data structures being of the most interest.
checklist.txt
Lists things to check for when inspecting code that
uses RCU.
listRCU.txt
Describes how to use RCU to protect linked lists.
This is the simplest and most common use of RCU
in the Linux kernel.
rcu.txt
You are reading it!
rcuref.txt
Describes how to combine use of reference counts
with RCU.
whatisRCU.txt
Overview of how the RCU implementation works. Along
the way, presents a conceptual view of RCU.
See 00-INDEX for the list.


@ -0,0 +1,58 @@
Using RCU's CPU Stall Detector
The CONFIG_RCU_CPU_STALL_DETECTOR kernel config parameter enables
RCU's CPU stall detector, which detects conditions that unduly delay
RCU grace periods. The stall detector's idea of what constitutes
"unduly delayed" is controlled by a set of C preprocessor macros:
RCU_SECONDS_TILL_STALL_CHECK
This macro defines the period of time that RCU will wait from
the beginning of a grace period until it issues an RCU CPU
stall warning. It is normally ten seconds.
RCU_SECONDS_TILL_STALL_RECHECK
This macro defines the period of time that RCU will wait after
issuing a stall warning until it issues another stall warning.
It is normally set to thirty seconds.
RCU_STALL_RAT_DELAY
The CPU stall detector tries to make the offending CPU rat on itself,
as this often gives better-quality stack traces. However, if
the offending CPU does not detect its own stall in the number
of jiffies specified by RCU_STALL_RAT_DELAY, then other CPUs will
complain. This is normally set to two jiffies.
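As a worked example of the two timing macros, the approximate time of
the nth warning during one extended stall can be computed as follows.
This is a sketch assuming the "normal" values quoted above (ten and
thirty seconds); STALL_CHECK_SECS, STALL_RECHECK_SECS and
stall_warning_time() are hypothetical names, and the RCU_STALL_RAT_DELAY
slack of a couple of jiffies is ignored.

```c
#include <assert.h>

/* Values quoted in the text above; an actual kernel may differ. */
#define STALL_CHECK_SECS    10
#define STALL_RECHECK_SECS  30

/*
 * Approximate seconds from the start of the grace period to the
 * nth stall warning (n >= 1) of a single extended stall.
 */
int stall_warning_time(int n)
{
    assert(n >= 1);
    return STALL_CHECK_SECS + (n - 1) * STALL_RECHECK_SECS;
}
```

With these values the first warning appears after roughly 10 seconds,
the second after roughly 40, and the third after roughly 70.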
The following problems can result in an RCU CPU stall warning:
o A CPU looping in an RCU read-side critical section.
o A CPU looping with interrupts disabled.
o A CPU looping with preemption disabled.
o For !CONFIG_PREEMPT kernels, a CPU looping anywhere in the kernel
without invoking schedule().
o A bug in the RCU implementation.
o A hardware failure. This is quite unlikely, but has occurred
at least once in a former life. A CPU failed in a running system,
becoming unresponsive, but not causing an immediate crash.
This resulted in a series of RCU CPU stall warnings, eventually
leading to the realization that the CPU had failed.
The RCU, RCU-sched, and RCU-bh implementations have CPU stall warnings.
SRCU does not do so directly, but its calls to synchronize_sched() will
result in RCU-sched detecting any CPU stalls that might be occurring.
To diagnose the cause of the stall, inspect the stack traces. The offending
function will usually be near the top of the stack. If you have a series
of stall warnings from a single extended stall, comparing the stack traces
can often help determine where the stall is occurring, which will usually
be in the function nearest the top of the stack that stays the same from
trace to trace.
RCU bugs can often be debugged with the help of CONFIG_RCU_TRACE.


@ -30,6 +30,18 @@ MODULE PARAMETERS
This module has the following parameters:
fqs_duration Duration (in microseconds) of artificially induced bursts
of force_quiescent_state() invocations. In RCU
implementations having force_quiescent_state(), these
bursts help force races between forcing a given grace
period and that grace period ending on its own.
fqs_holdoff Holdoff time (in microseconds) between consecutive calls
to force_quiescent_state() within a burst.
fqs_stutter Wait time (in seconds) between consecutive bursts
of calls to force_quiescent_state().
irqreaders Says to invoke RCU readers from irq level. This is currently
done via timers. Defaults to "1" for variants of RCU that
permit this. (Or, more accurately, variants of RCU that do


@ -323,14 +323,17 @@ used as follows:
Defer Protect
a. synchronize_rcu() rcu_read_lock() / rcu_read_unlock()
call_rcu()
call_rcu() rcu_dereference()
b. call_rcu_bh() rcu_read_lock_bh() / rcu_read_unlock_bh()
rcu_dereference_bh()
c. synchronize_sched() preempt_disable() / preempt_enable()
c. synchronize_sched() rcu_read_lock_sched() / rcu_read_unlock_sched()
preempt_disable() / preempt_enable()
local_irq_save() / local_irq_restore()
hardirq enter / hardirq exit
NMI enter / NMI exit
rcu_dereference_sched()
These three mechanisms are used as follows:
@ -780,9 +783,8 @@ Linux-kernel source code, but it helps to have a full list of the
APIs, since there does not appear to be a way to categorize them
in docbook. Here is the list, by category.
RCU pointer/list traversal:
RCU list traversal:
rcu_dereference
list_for_each_entry_rcu
hlist_for_each_entry_rcu
hlist_nulls_for_each_entry_rcu
@ -808,7 +810,7 @@ RCU: Critical sections Grace period Barrier
rcu_read_lock synchronize_net rcu_barrier
rcu_read_unlock synchronize_rcu
synchronize_rcu_expedited
rcu_dereference synchronize_rcu_expedited
call_rcu
@ -816,7 +818,7 @@ bh: Critical sections Grace period Barrier
rcu_read_lock_bh call_rcu_bh rcu_barrier_bh
rcu_read_unlock_bh synchronize_rcu_bh
synchronize_rcu_bh_expedited
rcu_dereference_bh synchronize_rcu_bh_expedited
sched: Critical sections Grace period Barrier
@ -825,12 +827,14 @@ sched: Critical sections Grace period Barrier
rcu_read_unlock_sched call_rcu_sched
[preempt_disable] synchronize_sched_expedited
[and friends]
rcu_dereference_sched
SRCU: Critical sections Grace period Barrier
srcu_read_lock synchronize_srcu N/A
srcu_read_unlock synchronize_srcu_expedited
srcu_dereference
SRCU: Initialization/cleanup
init_srcu_struct


@ -59,7 +59,11 @@ PAGE_OFFSET high_memory-1 Kernel direct-mapped RAM region.
This maps the platforms RAM, and typically
maps all platform RAM in a 1:1 relationship.
TASK_SIZE PAGE_OFFSET-1 Kernel module space
PKMAP_BASE PAGE_OFFSET-1 Permanent kernel mappings
One way of mapping HIGHMEM pages into kernel
space.
MODULES_VADDR MODULES_END-1 Kernel module space
Kernel modules inserted via insmod are
placed here using dynamic mappings.


@ -25,11 +25,11 @@ size allowed by the hardware.
nomerges (RW)
-------------
This enables the user to disable the lookup logic involved with IO merging
requests in the block layer. Merging may still occur through a direct
1-hit cache, since that comes for (almost) free. The IO scheduler will not
waste cycles doing tree/hash lookups for merges if nomerges is 1. Defaults
to 0, enabling all merges.
This enables the user to disable the lookup logic involved with IO
merging requests in the block layer. By default (0) all merges are
enabled. When set to 1 only simple one-hit merges will be tried. When
set to 2 no merge algorithms will be tried (including one-hit or more
complex tree/hash lookups).
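A small userspace helper for setting this file might look like the
following. This is a sketch only: nomerges_set() is a hypothetical name,
the caller supplies the full path, and the accepted range 0..2 is taken
from the description above.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Write a nomerges value to the given sysfs file.
 *   0 = all merges enabled (default)
 *   1 = only simple one-hit merges
 *   2 = no merges at all
 * Returns 0 on success, -1 on an out-of-range value or an I/O error.
 */
int nomerges_set(const char *path, int value)
{
    FILE *f;
    int ok;

    assert(path != NULL);
    if (value < 0 || value > 2)
        return -1;

    f = fopen(path, "w");
    if (!f)
        return -1;

    ok = fprintf(f, "%d\n", value) > 0;
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

For example, nomerges_set("/sys/block/sda/queue/nomerges", 2) would
disable all merge attempts on a disk named sda (the disk name is an
assumption here), given sufficient privileges.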
nr_requests (RW)
----------------


@ -88,12 +88,12 @@ changes occur:
This is used primarily during fault processing.
5) void update_mmu_cache(struct vm_area_struct *vma,
unsigned long address, pte_t pte)
unsigned long address, pte_t *ptep)
At the end of every page fault, this routine is invoked to
tell the architecture specific code that a translation
described by "pte" now exists at virtual address "address"
for address space "vma->vm_mm", in the software page tables.
now exists at virtual address "address" for address space
"vma->vm_mm", in the software page tables.
A port may use this information in any way it so chooses.
For example, it could use this event to pre-load TLB
@ -377,3 +377,27 @@ maps this page at its virtual address.
All the functionality of flush_icache_page can be implemented in
flush_dcache_page and update_mmu_cache. In 2.7 the hope is to
remove this interface completely.
The final category of APIs is for I/O to deliberately aliased address
ranges inside the kernel. Such aliases are set up by use of the
vmap/vmalloc API. Since kernel I/O goes via physical pages, the I/O
subsystem assumes that the user mapping and kernel offset mapping are
the only aliases. This isn't true for vmap aliases, so anything in
the kernel trying to do I/O to vmap areas must manually manage
coherency. It must do this by flushing the vmap range before doing
I/O and invalidating it after the I/O returns.
void flush_kernel_vmap_range(void *vaddr, int size)
flushes the kernel cache for a given virtual address range in
the vmap area. This is to make sure that any data the kernel
modified in the vmap range is made visible to the physical
page. The design is to make this area safe to perform I/O on.
Note that this API does *not* also flush the offset map alias
of the area.
void invalidate_kernel_vmap_range(void *vaddr, int size) invalidates
the cache for a given virtual address range in the vmap area
which prevents the processor from making the cache stale by
speculatively reading data while the I/O was occurring to the
physical pages. This is only necessary for data reads into the
vmap area.
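The required ordering of these calls around an I/O can be sketched in
userspace with stubs that only record the call sequence. Everything
prefixed "stub_" below, along with vmap_io_sequence() and call_trace, is
hypothetical; the real flush and invalidate routines are supplied by the
architecture code inside the kernel and do actual cache maintenance.

```c
#include <assert.h>
#include <string.h>

/* Records the order of calls: 'F' = flush, 'X' = I/O, 'I' = invalidate. */
static char call_trace[8];

static void trace_add(char c)
{
    size_t n = strlen(call_trace);

    assert(n + 1 < sizeof(call_trace));
    call_trace[n] = c;
    call_trace[n + 1] = '\0';
}

/* Stub standing in for flush_kernel_vmap_range(). */
static void stub_flush_kernel_vmap_range(void *vaddr, int size)
{
    (void)vaddr;
    (void)size;
    trace_add('F');
}

/* Stub standing in for invalidate_kernel_vmap_range(). */
static void stub_invalidate_kernel_vmap_range(void *vaddr, int size)
{
    (void)vaddr;
    (void)size;
    trace_add('I');
}

/* Stub standing in for submitting the I/O and waiting for completion. */
static void stub_submit_io_and_wait(void *vaddr, int size)
{
    (void)vaddr;
    (void)size;
    trace_add('X');
}

/*
 * Flush before the I/O; invalidate after it returns, which is only
 * needed when the I/O reads data into the vmap area.
 */
const char *vmap_io_sequence(void *vaddr, int size, int is_read)
{
    call_trace[0] = '\0';
    stub_flush_kernel_vmap_range(vaddr, size);
    stub_submit_io_and_wait(vaddr, size);
    if (is_read)
        stub_invalidate_kernel_vmap_range(vaddr, size);
    return call_trace;
}
```

A read into the area therefore yields the sequence flush, I/O,
invalidate, while a write from the area needs only flush followed by
the I/O.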


@ -159,42 +159,7 @@ two arguments: the CDROM device, and the slot number to which you wish
to change. If the slot number is -1, the drive is unloaded.
4. Compilation options
----------------------
There are a few additional options which can be set when compiling the
driver. Most people should not need to mess with any of these; they
are listed here simply for completeness. A compilation option can be
enabled by adding a line of the form `#define <option> 1' to the top
of ide-cd.c. All these options are disabled by default.
VERBOSE_IDE_CD_ERRORS
If this is set, ATAPI error codes will be translated into textual
descriptions. In addition, a dump is made of the command which
provoked the error. This is off by default to save the memory used
by the (somewhat long) table of error descriptions.
STANDARD_ATAPI
If this is set, the code needed to deal with certain drives which do
not properly implement the ATAPI spec will be disabled. If you know
your drive implements ATAPI properly, you can turn this on to get a
slightly smaller kernel.
NO_DOOR_LOCKING
If this is set, the driver will never attempt to lock the door of
the drive.
CDROM_NBLOCKS_BUFFER
This sets the size of the buffer to be used for a CDROMREADAUDIO
ioctl. The default is 8.
TEST
This currently enables an additional ioctl which enables a user-mode
program to execute an arbitrary packet command. See the source for
details. This should be left off unless you know what you're doing.
5. Common problems
4. Common problems
------------------
This section discusses some common problems encountered when trying to
@ -371,7 +336,7 @@ f. Data corruption.
expense of low system performance.
6. cdchange.c
5. cdchange.c
-------------
/*


@ -0,0 +1,207 @@
/*
* pcc-cpufreq.txt - PCC interface documentation
*
* Copyright (C) 2009 Red Hat, Matthew Garrett <mjg@redhat.com>
* Copyright (C) 2009 Hewlett-Packard Development Company, L.P.
* Nagananda Chumbalkar <nagananda.chumbalkar@hp.com>
*
* ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; version 2 of the License.
*
* This program is distributed in the hope that it will be useful, but
* WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE, GOOD TITLE or NON
* INFRINGEMENT. See the GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License along
* with this program; if not, write to the Free Software Foundation, Inc.,
* 675 Mass Ave, Cambridge, MA 02139, USA.
*
* ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
*/
Processor Clocking Control Driver
---------------------------------
Contents:
---------
1. Introduction
1.1 PCC interface
1.1.1 Get Average Frequency
1.1.2 Set Desired Frequency
1.2 Platforms affected
2. Driver and /sys details
2.1 scaling_available_frequencies
2.2 cpuinfo_transition_latency
2.3 cpuinfo_cur_freq
2.4 related_cpus
3. Caveats
1. Introduction:
----------------
Processor Clocking Control (PCC) is an interface between the platform
firmware and OSPM. It is a mechanism for coordinating processor
performance (ie: frequency) between the platform firmware and the OS.
The PCC driver (pcc-cpufreq) allows OSPM to take advantage of the PCC
interface.
The OS utilizes the PCC interface to inform platform firmware what frequency the
OS wants for a logical processor. The platform firmware attempts to achieve
the requested frequency. If the request for the target frequency could not be
satisfied by platform firmware, then it usually means that power budget
conditions are in place, and "power capping" is taking place.
1.1 PCC interface:
------------------
The complete PCC specification is available here:
http://www.acpica.org/download/Processor-Clocking-Control-v1p0.pdf
PCC relies on a shared memory region that provides a channel for communication
between the OS and platform firmware. PCC also implements a "doorbell" that
is used by the OS to inform the platform firmware that a command has been
sent.
The ACPI PCCH() method is used to discover the location of the PCC shared
memory region. The shared memory region header contains the "command" and
"status" interface. PCCH() also contains details on how to access the platform
doorbell.
The following commands are supported by the PCC interface:
* Get Average Frequency
* Set Desired Frequency
The ACPI PCCP() method is implemented for each logical processor and is
used to discover the offsets for the input and output buffers in the shared
memory region.
When PCC mode is enabled, the platform will not expose processor performance
or throttle states (_PSS, _TSS and related ACPI objects) to OSPM. Therefore,
the native P-state driver (such as acpi-cpufreq for Intel, powernow-k8 for
AMD) will not load.
However, OSPM remains in control of policy. The governor (eg: "ondemand")
computes the required performance for each processor based on server workload.
The PCC driver fills in the command interface, and the input buffer and
communicates the request to the platform firmware. The platform firmware is
responsible for delivering the requested performance.
Each PCC command is "global" in scope and can affect all the logical CPUs in
the system. Therefore, PCC is capable of performing "group" updates. With PCC
the OS is capable of getting/setting the frequency of all the logical CPUs in
the system with a single call to the BIOS.
1.1.1 Get Average Frequency:
----------------------------
This command is used by the OSPM to query the running frequency of the
processor since the last time this command was completed. The output buffer
indicates the average unhalted frequency of the logical processor expressed as
a percentage of the nominal (ie: maximum) CPU frequency. The output buffer
also signifies if the CPU frequency is limited by a power budget condition.
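Converting the reported percentage back into a frequency is simple
arithmetic, sketched below. The function name pcc_pct_to_mhz() is
hypothetical and the layout of the output buffer is not modeled; the
sketch only assumes that the caller already has the nominal frequency
in MHz and the percentage value.

```c
#include <assert.h>

/*
 * Convert an average frequency, given as a percentage of the nominal
 * (maximum) frequency, into MHz. Integer division truncates.
 */
unsigned int pcc_pct_to_mhz(unsigned int nominal_mhz, unsigned int pct)
{
    assert(nominal_mhz > 0);
    return (unsigned int)((unsigned long long)nominal_mhz * pct / 100);
}
```

For a nominal frequency of 2933 MHz, a reported value of 50 percent
corresponds to about 1466 MHz.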
1.1.2 Set Desired Frequency:
----------------------------
This command is used by the OSPM to communicate to the platform firmware the
desired frequency for a logical processor. The output buffer is currently
ignored by OSPM. The next invocation of "Get Average Frequency" will inform
OSPM if the desired frequency was achieved or not.
1.2 Platforms affected:
-----------------------
The PCC driver will load on any system where the platform firmware:
* supports the PCC interface, and the associated PCCH() and PCCP() methods
* assumes responsibility for managing the hardware clocking controls in order
to deliver the requested processor performance
Currently, certain HP ProLiant platforms implement the PCC interface. On those
platforms PCC is the "default" choice.
However, it is possible to disable this interface via a BIOS setting. In
such an instance, as is also the case on platforms where the PCC interface
is not implemented, the PCC driver will fail to load silently.
2. Driver and /sys details:
---------------------------
When the driver loads, it merely prints the lowest and the highest CPU
frequencies supported by the platform firmware.
The PCC driver loads with a message such as:
pcc-cpufreq: (v1.00.00) driver loaded with frequency limits: 1600 MHz, 2933 MHz
This means that the OSPM can request the CPU to run at any frequency
between the limits (1600 MHz and 2933 MHz) specified in the message.
Internally, there is no need for the driver to convert the "target" frequency
to a corresponding P-state.
The VERSION number for the driver will be of the format v.xy.ab.
eg: 1.00.02
    ----- --
      |    |
      |    -- this will increase with bug fixes/enhancements to the driver
      |-- this is the version of the PCC specification the driver adheres to
The following is a brief discussion on some of the fields exported via the
/sys filesystem and how their values are affected by the PCC driver:
2.1 scaling_available_frequencies:
----------------------------------
scaling_available_frequencies is not created in /sys. No intermediate
frequencies need to be listed because the BIOS will try to achieve any
frequency, within limits, requested by the governor. A frequency does not have
to be strictly associated with a P-state.
2.2 cpuinfo_transition_latency:
-------------------------------
The cpuinfo_transition_latency field is 0. The PCC specification does
not include a field to expose this value currently.
2.3 cpuinfo_cur_freq:
---------------------
A) Often cpuinfo_cur_freq will show a value different than what is declared
in the scaling_available_frequencies or scaling_cur_freq, or scaling_max_freq.
This is due to "turbo boost" available on recent Intel processors. If certain
conditions are met the BIOS can achieve a slightly higher speed than requested
by OSPM. An example:
scaling_cur_freq : 2933000
cpuinfo_cur_freq : 3196000
B) There is a round-off error associated with the cpuinfo_cur_freq value.
Since the driver obtains the current frequency as a "percentage" (%) of the
nominal frequency from the BIOS, sometimes, the values displayed by
scaling_cur_freq and cpuinfo_cur_freq may not match. An example:
scaling_cur_freq : 1600000
cpuinfo_cur_freq : 1583000
In this example, the nominal frequency is 2933 MHz. The driver obtains the
current frequency, cpuinfo_cur_freq, as 54% of the nominal frequency:
54% of 2933 MHz = 1583 MHz
Nominal frequency is the maximum frequency of the processor, and it usually
corresponds to the frequency of the P0 P-state.
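The round-off in example B can be reproduced with a few lines of Python. This
is only a sketch: the function name is hypothetical, and it assumes the value
is truncated to whole MHz before being scaled to kHz, which matches the
1583000 shown above but is not stated by the text.

```python
def pcc_cur_freq_khz(nominal_mhz: int, percent: int) -> int:
    """Frequency in kHz derived from a whole-number percentage of nominal."""
    # Assumption: integer division drops the fractional MHz (the round-off
    # described above); the result is then scaled from MHz to kHz.
    return (nominal_mhz * percent // 100) * 1000


if __name__ == "__main__":
    # 54% of a 2933 MHz nominal frequency, as in example B.
    print(pcc_cur_freq_khz(2933, 54))
```

Because the percentage is a whole number, a requested 1600 MHz can be reported
as 1583 MHz even though nothing is wrong.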
2.4 related_cpus:
-----------------
The related_cpus field is identical to affected_cpus.
affected_cpus : 4
related_cpus : 4
Currently, the PCC driver does not evaluate _PSD. The platforms that support
PCC do not implement SW_ALL. So OSPM doesn't need to perform any coordination
to ensure that the same frequency is requested of all dependent CPUs.
3. Caveats:
-----------
The "cpufreq_stats" module in its present form cannot be loaded and
expected to work with the PCC driver. Since the "cpufreq_stats" module
provides information wrt each P-state, it is not applicable to the PCC driver.

View file

@ -122,3 +122,47 @@ volumeGroup-base: 0 2097152 snapshot-merge 254:11 254:12 P 16
brw------- 1 root root 254, 11 29 ago 18:15 /dev/mapper/volumeGroup-base-real
brw------- 1 root root 254, 12 29 ago 18:16 /dev/mapper/volumeGroup-base-cow
brw------- 1 root root 254, 10 29 ago 18:16 /dev/mapper/volumeGroup-base
How to determine when a merging is complete
===========================================
The snapshot-merge and snapshot status lines end with:
<sectors_allocated>/<total_sectors> <metadata_sectors>
Both <sectors_allocated> and <total_sectors> include both data and metadata.
During merging, the number of sectors allocated gets smaller and
smaller. Merging has finished when the number of sectors holding data
is zero, in other words <sectors_allocated> == <metadata_sectors>.
Here is a practical example (using a hybrid of lvm and dmsetup commands):
# lvs
LV VG Attr LSize Origin Snap% Move Log Copy% Convert
base volumeGroup owi-a- 4.00g
snap volumeGroup swi-a- 1.00g base 18.97
# dmsetup status volumeGroup-snap
0 8388608 snapshot 397896/2097152 1560
^^^^ metadata sectors
# lvconvert --merge -b volumeGroup/snap
Merging of volume snap started.
# lvs volumeGroup/snap
LV VG Attr LSize Origin Snap% Move Log Copy% Convert
base volumeGroup Owi-a- 4.00g 17.23
# dmsetup status volumeGroup-base
0 8388608 snapshot-merge 281688/2097152 1104
# dmsetup status volumeGroup-base
0 8388608 snapshot-merge 180480/2097152 712
# dmsetup status volumeGroup-base
0 8388608 snapshot-merge 16/2097152 16
Merging has finished.
# lvs
LV VG Attr LSize Origin Snap% Move Log Copy% Convert
base volumeGroup owi-a- 4.00g
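The completion check described above (<sectors_allocated> equal to
<metadata_sectors>) can be scripted. The following Python sketch parses one
line of "dmsetup status" output; the function name is hypothetical and the
code assumes the status line ends with the two fields documented above.

```python
def merge_finished(status_line: str) -> bool:
    """Return True when a snapshot-merge status line shows no data sectors left."""
    fields = status_line.split()
    # The last two fields are "<sectors_allocated>/<total_sectors>" and
    # "<metadata_sectors>".
    usage, metadata_sectors = fields[-2], fields[-1]
    sectors_allocated, _total_sectors = usage.split("/")
    # Merging is done once only metadata remains allocated.
    return int(sectors_allocated) == int(metadata_sectors)


if __name__ == "__main__":
    for line in (
        "0 8388608 snapshot-merge 281688/2097152 1104",
        "0 8388608 snapshot-merge 16/2097152 16",
    ):
        print(line, "->", merge_finished(line))
```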

View file

@ -69,7 +69,6 @@ av_permissions.h
bbootsect
bin2c
binkernel.spec
binoffset
bootsect
bounds.h
bsetup

View file

@ -26,7 +26,7 @@ use IO::Handle;
"dec3000s", "vp7041", "dibusb", "nxt2002", "nxt2004",
"or51211", "or51132_qam", "or51132_vsb", "bluebird",
"opera1", "cx231xx", "cx18", "cx23885", "pvrusb2", "mpc718",
"af9015");
"af9015", "ngene");
# Check args
syntax() if (scalar(@ARGV) != 1);
@ -39,7 +39,7 @@ for ($i=0; $i < scalar(@components); $i++) {
die $@ if $@;
print STDERR <<EOF;
Firmware(s) $outfile extracted successfully.
Now copy it(they) to either /usr/lib/hotplug/firmware or /lib/firmware
Now copy it(them) to either /usr/lib/hotplug/firmware or /lib/firmware
(depending on configuration of firmware hotplug).
EOF
exit(0);
@ -549,6 +549,24 @@ sub af9015 {
close INFILE;
}
sub ngene {
my $url = "http://www.digitaldevices.de/download/";
my $file1 = "ngene_15.fw";
my $hash1 = "d798d5a757121174f0dbc5f2833c0c85";
my $file2 = "ngene_17.fw";
my $hash2 = "26b687136e127b8ac24b81e0eeafc20b";
checkstandard();
wgetfile($file1, $url . $file1);
verify($file1, $hash1);
wgetfile($file2, $url . $file2);
verify($file2, $hash2);
"$file1, $file2";
}
# ---------------------------------------------------------------
# Utilities
@ -667,6 +685,7 @@ sub delzero{
sub syntax() {
print STDERR "syntax: get_dvb_firmware <component>\n";
print STDERR "Supported components:\n";
@components = sort @components;
for($i=0; $i < scalar(@components); $i++) {
print STDERR "\t" . $components[$i] . "\n";
}

View file

@ -0,0 +1,38 @@
The lkdtm module provides an interface to crash or injure the kernel at
predefined crashpoints to evaluate the reliability of crash dumps obtained
using different dumping solutions. The module uses KPROBEs to instrument
crashing points, but can also crash the kernel directly without KPROBE
support.
You can specify the crash point and the action to take either through module
arguments when inserting the module, or through a debugfs interface.
Usage: insmod lkdtm.ko [recur_count={>0}] cpoint_name=<> cpoint_type=<>
[cpoint_count={>0}]
recur_count : Recursion level for the stack overflow test. Default is 10.
cpoint_name : Crash point where the kernel is to be crashed. It can be
one of INT_HARDWARE_ENTRY, INT_HW_IRQ_EN, INT_TASKLET_ENTRY,
FS_DEVRW, MEM_SWAPOUT, TIMERADD, SCSI_DISPATCH_CMD,
IDE_CORE_CP, DIRECT
cpoint_type : Indicates the action to be taken on hitting the crash point.
It can be one of PANIC, BUG, EXCEPTION, LOOP, OVERFLOW,
CORRUPT_STACK, UNALIGNED_LOAD_STORE_WRITE, OVERWRITE_ALLOCATION,
WRITE_AFTER_FREE.
cpoint_count : Indicates the number of times the crash point is to be hit
to trigger an action. The default is 10.
You can also induce failures by mounting debugfs and writing the type to
<mountpoint>/provoke-crash/<crashpoint>. E.g.,
mount -t debugfs debugfs /mnt
echo EXCEPTION > /mnt/provoke-crash/INT_HARDWARE_ENTRY
A special file is `DIRECT' which will induce the crash directly without
KPROBE instrumentation. This mode is the only one available when the module
is built on a kernel without KPROBEs support.
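As an illustration of the debugfs interface, the Python sketch below only
builds the file path and the string to write; it does not touch the
filesystem, since performing the write would crash the machine. The helper
name is hypothetical, and the name lists are copied from the usage text above.

```python
CRASH_POINTS = {
    "INT_HARDWARE_ENTRY", "INT_HW_IRQ_EN", "INT_TASKLET_ENTRY", "FS_DEVRW",
    "MEM_SWAPOUT", "TIMERADD", "SCSI_DISPATCH_CMD", "IDE_CORE_CP", "DIRECT",
}
CRASH_TYPES = {
    "PANIC", "BUG", "EXCEPTION", "LOOP", "OVERFLOW", "CORRUPT_STACK",
    "UNALIGNED_LOAD_STORE_WRITE", "OVERWRITE_ALLOCATION", "WRITE_AFTER_FREE",
}


def provoke_crash_target(mountpoint: str, crashpoint: str, crash_type: str):
    """Return (path, payload) for <mountpoint>/provoke-crash/<crashpoint>."""
    if crashpoint not in CRASH_POINTS:
        raise ValueError("unknown crash point: " + crashpoint)
    if crash_type not in CRASH_TYPES:
        raise ValueError("unknown crash type: " + crash_type)
    path = mountpoint.rstrip("/") + "/provoke-crash/" + crashpoint
    return path, crash_type


if __name__ == "__main__":
    # Same request as the echo example above, with debugfs mounted on /mnt.
    print(provoke_crash_target("/mnt", "INT_HARDWARE_ENTRY", "EXCEPTION"))
```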

View file

@ -6,21 +6,6 @@ be removed from this file.
---------------------------
What: USER_SCHED
When: 2.6.34
Why: USER_SCHED was implemented as a proof of concept for group scheduling.
The effect of USER_SCHED can already be achieved from userspace with
the help of libcgroup. The removal of USER_SCHED will also simplify
the scheduler code with the removal of one major ifdef. There are also
issues USER_SCHED has with USER_NS. A decision was taken not to fix
those and instead remove USER_SCHED. Also new group scheduling
features will not be implemented for USER_SCHED.
Who: Dhaval Giani <dhaval@linux.vnet.ibm.com>
---------------------------
What: PRISM54
When: 2.6.34
@ -64,6 +49,17 @@ Who: Robin Getz <rgetz@blackfin.uclinux.org> & Matt Mackall <mpm@selenic.com>
---------------------------
What: Deprecated snapshot ioctls
When: 2.6.36
Why: The ioctls in kernel/power/user.c were marked as deprecated a long time
ago. They now warn users so that userspace can be updated to stop
using them. After some more time, they will be removed completely.
Who: Jiri Slaby <jirislaby@gmail.com>
---------------------------
What: The ieee80211_regdom module parameter
When: March 2010 / desktop catchup
@ -88,27 +84,6 @@ Who: Luis R. Rodriguez <lrodriguez@atheros.com>
---------------------------
What: CONFIG_WIRELESS_OLD_REGULATORY - old static regulatory information
When: March 2010 / desktop catchup
Why: The old regulatory infrastructure has been replaced with a new one
which does not require statically defined regulatory domains. We do
not want to keep static regulatory domains in the kernel due to the
the dynamic nature of regulatory law and localization. We kept around
the old static definitions for the regulatory domains of:
* US
* JP
* EU
and used by default the US when CONFIG_WIRELESS_OLD_REGULATORY was
set. We will remove this option once the standard Linux desktop catches
up with the new userspace APIs we have implemented.
Who: Luis R. Rodriguez <lrodriguez@atheros.com>
---------------------------
What: dev->power.power_state
When: July 2007
Why: Broken design for runtime control over driver power states, confusing
@ -142,19 +117,25 @@ Who: Mauro Carvalho Chehab <mchehab@infradead.org>
---------------------------
What: PCMCIA control ioctl (needed for pcmcia-cs [cardmgr, cardctl])
When: November 2005
When: 2.6.35/2.6.36
Files: drivers/pcmcia/: pcmcia_ioctl.c
Why: With the 16-bit PCMCIA subsystem now behaving (almost) like a
normal hotpluggable bus, and with it using the default kernel
infrastructure (hotplug, driver core, sysfs) keeping the PCMCIA
control ioctl needed by cardmgr and cardctl from pcmcia-cs is
unnecessary, and makes further cleanups and integration of the
unnecessary and potentially harmful (it does not provide for
proper locking), and makes further cleanups and integration of the
PCMCIA subsystem into the Linux kernel device driver model more
difficult. The features provided by cardmgr and cardctl are either
handled by the kernel itself now or are available in the new
pcmciautils package available at
http://kernel.org/pub/linux/utils/kernel/pcmcia/
Who: Dominik Brodowski <linux@brodo.de>
For all architectures except ARM, the associated config symbol
has been removed from kernel 2.6.34; for ARM, it will likely
be removed from kernel 2.6.35. The actual code will then likely
be removed from kernel 2.6.36.
Who: Dominik Brodowski <linux@dominikbrodowski.net>
---------------------------
@ -468,12 +449,6 @@ Who: Alok N Kataria <akataria@vmware.com>
----------------------------
What: adt7473 hardware monitoring driver
When: February 2010
Why: Obsoleted by the adt7475 driver.
Who: Jean Delvare <khali@linux-fr.org>
---------------------------
What: Support for lcd_switch and display_get in asus-laptop driver
When: March 2010
Why: These two features use non-standard interfaces. There are the
@ -542,3 +517,68 @@ Why: Duplicate functionality with the gspca_zc3xx driver, zc0301 only
sensors) which are also supported by the gspca_zc3xx driver
(which supports 53 USB-ID's in total)
Who: Hans de Goede <hdegoede@redhat.com>
----------------------------
What: corgikbd, spitzkbd, tosakbd driver
When: 2.6.35
Files: drivers/input/keyboard/{corgi,spitz,tosa}kbd.c
Why: We now have a generic GPIO based matrix keyboard driver that
is fully capable of handling all the keys on these devices.
The original drivers manipulate the GPIO registers directly
and so are difficult to maintain.
Who: Eric Miao <eric.y.miao@gmail.com>
----------------------------
What: corgi_ssp and corgi_ts driver
When: 2.6.35
Files: arch/arm/mach-pxa/corgi_ssp.c, drivers/input/touchscreen/corgi_ts.c
Why: The corgi touchscreen is now deprecated in favour of the generic
ads7846.c driver. The noise reduction technique used in corgi_ts.c,
which is to wait until vsync before ADC sampling, is now also integrated
into the ads7846 driver. Given that the original driver is not generic
and is difficult to maintain, it will be removed later.
Who: Eric Miao <eric.y.miao@gmail.com>
----------------------------
What: capifs
When: February 2011
Files: drivers/isdn/capi/capifs.*
Why: udev fully replaces this special file system that only contains CAPI
NCCI TTY device nodes. User space (pppdcapiplugin) works without
noticing the difference.
Who: Jan Kiszka <jan.kiszka@web.de>
----------------------------
What: KVM memory aliases support
When: July 2010
Why: Memory aliasing support is used for speeding up guest vga access
through the vga windows.
Modern userspace no longer uses this feature, so it's just bitrotted
code and can be removed with no impact.
Who: Avi Kivity <avi@redhat.com>
----------------------------
What: KVM kernel-allocated memory slots
When: July 2010
Why: Since 2.6.25, kvm supports user-allocated memory slots, which are
much more flexible than kernel-allocated slots. All current userspace
supports the newer interface and this code can be removed with no
impact.
Who: Avi Kivity <avi@redhat.com>
----------------------------
What: KVM paravirt mmu host support
When: January 2011
Why: The paravirt mmu host support is slower than non-paravirt mmu, both
on newer and older hardware. It is already not exposed to the guest,
and kept only for live migration purposes.
Who: Avi Kivity <avi@redhat.com>
----------------------------

View file

@ -62,6 +62,8 @@ jfs.txt
- info and mount options for the JFS filesystem.
locks.txt
- info on file locking implementations, flock() vs. fcntl(), etc.
logfs.txt
- info on the LogFS flash filesystem.
mandatory-locking.txt
- info on the Linux implementation of Sys V mandatory file locking.
ncpfs.txt

View file

@ -460,13 +460,6 @@ in sys_read() and friends.
--------------------------- dquot_operations -------------------------------
prototypes:
int (*initialize) (struct inode *, int);
int (*drop) (struct inode *);
int (*alloc_space) (struct inode *, qsize_t, int);
int (*alloc_inode) (const struct inode *, unsigned long);
int (*free_space) (struct inode *, qsize_t);
int (*free_inode) (const struct inode *, unsigned long);
int (*transfer) (struct inode *, struct iattr *);
int (*write_dquot) (struct dquot *);
int (*acquire_dquot) (struct dquot *);
int (*release_dquot) (struct dquot *);
@ -479,13 +472,6 @@ a proper locking wrt the filesystem and call the generic quota operations.
What filesystem should expect from the generic quota functions:
FS recursion Held locks when called
initialize: yes maybe dqonoff_sem
drop: yes -
alloc_space: ->mark_dirty() -
alloc_inode: ->mark_dirty() -
free_space: ->mark_dirty() -
free_inode: ->mark_dirty() -
transfer: yes -
write_dquot: yes dqonoff_sem or dqptr_sem
acquire_dquot: yes dqonoff_sem or dqptr_sem
release_dquot: yes dqonoff_sem or dqptr_sem
@ -495,10 +481,6 @@ write_info: yes dqonoff_sem
FS recursion means calling ->quota_read() and ->quota_write() from superblock
operations.
->alloc_space(), ->alloc_inode(), ->free_space(), ->free_inode() are called
only directly by the filesystem and do not call any fs functions only
the ->mark_dirty() operation.
More details about quota locking can be found in fs/dquot.c.
--------------------------- vm_operations_struct -----------------------------

View file

@ -62,7 +62,8 @@ changes are :
2. Insertion of a dentry into the hash table is done using
hlist_add_head_rcu() which take care of ordering the writes - the
writes to the dentry must be visible before the dentry is
inserted. This works in conjunction with hlist_for_each_rcu() while
inserted. This works in conjunction with hlist_for_each_rcu(),
which has since been replaced by hlist_for_each_entry_rcu(), while
walking the hash chain. The only requirement is that all
initialization to the dentry must be done before
hlist_add_head_rcu() since we don't have dcache_lock protection

View file

@ -0,0 +1,241 @@
The LogFS Flash Filesystem
==========================
Specification
=============
Superblocks
-----------
Two superblocks exist at the beginning and end of the filesystem.
Each superblock is 256 Bytes large, with another 3840 Bytes reserved
for future purposes, making a total of 4096 Bytes.
Superblock locations may differ for MTD and block devices. On MTD the
first non-bad block contains a superblock in the first 4096 Bytes and
the last non-bad block contains a superblock in the last 4096 Bytes.
On block devices, the first 4096 Bytes of the device contain the first
superblock and the last aligned 4096 Byte-block contains the second
superblock.
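For block devices, the two superblock locations can be computed as in the
Python sketch below. It is an interpretation of the text above, not the
filesystem's own code: "last aligned 4096 Byte-block" is read as the last
complete 4096-byte block that starts on a 4096-byte boundary, and the
function name is hypothetical.

```python
SB_AREA = 4096  # 256-byte superblock plus 3840 reserved bytes


def bdev_superblock_offsets(device_size: int):
    """Return byte offsets (first, second) of the superblock areas."""
    if device_size < 2 * SB_AREA:
        raise ValueError("device too small for two superblock areas")
    first = 0
    # Round the size down to a 4096-byte boundary, then step back one block.
    second = (device_size // SB_AREA - 1) * SB_AREA
    return first, second


if __name__ == "__main__":
    print(bdev_superblock_offsets(1 << 20))   # 1 MiB device
    print(bdev_superblock_offsets(10000))     # size not a multiple of 4096
```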
For the most part, the superblocks can be considered read-only. They
are written only to correct errors detected within the superblocks,
move the journal and change the filesystem parameters through tunefs.
As a result, the superblock does not contain any fields that require
constant updates, like the amount of free space, etc.
Segments
--------
The space in the device is split up into equal-sized segments.
Segments are the primary write unit of LogFS. Within each segment,
writes happen from front (low addresses) to back (high addresses). If
only a partial segment has been written, the segment number, the
current position within and optionally a write buffer are stored in
the journal.
Segments are erased as a whole. Therefore Garbage Collection may be
required to completely free a segment before doing so.
Journal
--------
The journal contains all global information about the filesystem that
is subject to frequent change. At mount time, it has to be scanned
for the most recent commit entry, which contains a list of pointers to
all currently valid entries.
Object Store
------------
All space except for the superblocks and journal is part of the object
store. Each segment contains a segment header and a number of
objects, each consisting of the object header and the payload.
Objects are either inodes, directory entries (dentries), file data
blocks or indirect blocks.
Levels
------
Garbage collection (GC) may fail if all data is written
indiscriminately. One requirement of GC is that data is separated
roughly according to the distance between the tree root and the data.
Effectively that means all file data is on level 0, indirect blocks
are on levels 1, 2, 3, 4 or 5 for 1x, 2x, 3x, 4x or 5x indirect blocks,
respectively. Inode file data is on level 6 for the inodes and 7-11
for indirect blocks.
Each segment contains objects of a single level only. As a result,
each level requires its own separate segment to be open for writing.
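The level assignment described above amounts to a small mapping, sketched
here in Python. The function name and its parameters are hypothetical; only
the level numbers come from the text.

```python
def logfs_level(indirection: int, inode_file: bool = False) -> int:
    """Map a block to its level.

    indirection: 0 for data (or inodes, in the inode file),
                 1..5 for 1x..5x indirect blocks.
    inode_file:  True when the block belongs to the inode file.
    """
    if not 0 <= indirection <= 5:
        raise ValueError("indirection must be between 0 and 5")
    # Regular files use levels 0-5; the inode file uses levels 6-11.
    return indirection + (6 if inode_file else 0)


if __name__ == "__main__":
    print(logfs_level(0))                    # file data
    print(logfs_level(3))                    # 3x indirect block
    print(logfs_level(0, inode_file=True))   # inodes
    print(logfs_level(5, inode_file=True))   # 5x indirect block of the inode file
```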
Inode File
----------
All inodes are stored in a special file, the inode file. The single
exception is the inode file's inode (master inode) which for obvious
reasons is stored in the journal instead. Instead of data blocks, the
leaf nodes of the inode files are inodes.
Aliases
-------
Writes in LogFS are done by means of a wandering tree. A naïve
implementation would require that for each write of a block, all
parent blocks are written as well, since the block pointers have
changed. Such an implementation would not be very efficient.
In LogFS, the block pointer changes are cached in the journal by means
of alias entries. Each alias consists of its logical address - inode
number, block index, level and child number (index into block) - and
the changed data. Any 8-byte word can be changed in this manner.
Currently aliases are used for block pointers, file size, file used
bytes and the height of an inode's indirect tree.
Segment Aliases
---------------
Related to regular aliases, these are used to handle bad blocks.
Initially, bad blocks are handled by moving the affected segment
content to a spare segment and noting this move in the journal with a
segment alias, a simple (to, from) tuple. GC will later empty this
segment and the alias can be removed again. This is used on MTD only.
Vim
---
By cleverly predicting the life time of data, it is possible to
separate long-living data from short-living data and thereby reduce
the GC overhead later. Each type of distinct life expectancy (vim) can
have a separate segment open for writing. Each (level, vim) tuple can
be open just once. If an open segment with unknown vim is encountered
at mount time, it is closed and ignored henceforth.
Indirect Tree
-------------
Inodes in LogFS are similar to FFS-style filesystems with direct and
indirect block pointers. One difference is that LogFS uses a single
indirect pointer that can be either a 1x, 2x, etc. indirect pointer.
A height field in the inode defines the height of the indirect tree
and thereby the indirection of the pointer.
Another difference is the addressing of indirect blocks. In LogFS,
the first 16 pointers in the first indirect block are left empty,
corresponding to the 16 direct pointers in the inode. In ext2 (maybe
others as well) the first pointer in the first indirect block
corresponds to logical block 12, skipping the 12 direct pointers.
So where ext2 is using arithmetic to better utilize space, LogFS keeps
arithmetic simple and uses compression to save space.
Compression
-----------
Both file data and metadata can be compressed. Compression for file
data can be enabled with chattr +c and disabled with chattr -c. Doing
so has no effect on existing data, but new data will be stored
accordingly. New inodes will inherit the compression flag of the
parent directory.
Metadata is always compressed. However, the space accounting ignores
this and charges for the uncompressed size. Failing to do so could
result in GC failures when, after moving some data, indirect blocks
compress worse than previously. Even on a 100% full medium, GC may
not consume any extra space, so the compression gains are lost space
to the user.
However, they are not lost space to the filesystem internals. By
cheating the user for those bytes, the filesystem gained some slack
space and GC will run less often and faster.
Garbage Collection and Wear Leveling
------------------------------------
Garbage collection is invoked whenever the number of free segments
falls below a threshold. The best (known) candidate is picked based
on the least amount of valid data contained in the segment. All
remaining valid data is copied elsewhere, thereby invalidating it.
The GC code also checks for aliases and writes them back if their
number gets too large.
Wear leveling is done by occasionally picking a suboptimal segment for
garbage collection. If a stale segment's erase count is significantly
lower than the active segments' erase counts, it will be picked. Wear
leveling is rate limited, so it will never monopolize the device for
more than one segment worth at a time.
Values for "occasionally" and "significantly lower" are compile time
constants.
Hashed directories
------------------
To satisfy efficient lookup(), directory entries are hashed and
located based on the hash. In order to both support large directories
and not be overly inefficient for small directories, several hash
tables of increasing size are used. For each table, the hash value
modulo the table size gives the table index.
Table sizes are chosen to limit the number of indirect blocks with a
fully populated table to 0, 1, 2 or 3 respectively. So the first
table contains 16 entries, the second 512-16, etc.
The last table is special in several ways. First its size depends on
the effective 32bit limit on telldir/seekdir cookies. Since logfs
uses the upper half of the address space for indirect blocks, the size
is limited to 2^31. Secondly the table contains hash buckets with 16
entries each.
Using single-entry buckets would result in birthday "attacks". At
just 2^16 used entries, hash collisions would be likely (P >= 0.5).
My math skills are insufficient to do the combinatorics for the 17x
collisions necessary to overflow a bucket, but testing showed that in
10,000 runs the lowest directory fill before a bucket overflow was
188,057,130 entries with an average of 315,149,915 entries. So for
directory sizes of up to a million, bucket overflows should be
virtually impossible under normal circumstances.
With carefully chosen filenames, it is obviously possible to cause an
overflow with just 21 entries (4 higher tables + 16 entries + 1). So
there may be a security concern if a malicious user has write access
to a directory.
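The birthday estimate quoted above (collisions likely at 2^16 entries in a
table of 2^31 single-entry buckets) can be checked with the usual
approximation P = 1 - exp(-n(n-1)/(2N)). The Python sketch below assumes
uniformly distributed hash values; the function name is hypothetical.

```python
import math


def collision_probability(entries: int, buckets: int) -> float:
    """Approximate probability that at least two entries share a bucket."""
    # Standard birthday-problem approximation for uniformly random hashes.
    return 1.0 - math.exp(-entries * (entries - 1) / (2.0 * buckets))


if __name__ == "__main__":
    p = collision_probability(2 ** 16, 2 ** 31)
    print("P(collision) at 2^16 entries in 2^31 buckets: %.2f" % p)
```

The result is about 0.63, which is consistent with the P >= 0.5 claim.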
Open For Discussion
===================
Device Address Space
--------------------
A device address space is used for caching. Both block devices and
MTD provide functions to either read a single page or write a segment.
Partial segments may be written for data integrity, but where possible
complete segments are written for performance on simple block device
flash media.
Meta Inodes
-----------
Inodes are stored in the inode file, which is just a regular file for
most purposes. At umount time, however, the inode file needs to
remain open until all dirty inodes are written. So
generic_shutdown_super() may not close this inode, but shouldn't
complain about remaining inodes due to the inode file either. The same
goes for the mapping inode of the device address space.
Currently logfs uses a hack that essentially copies part of fs/inode.c
code over. A general solution would be preferred.
Indirect block mapping
----------------------
With compression, the block device (or mapping inode) cannot be used
to cache indirect blocks. Some other place is required. Currently
logfs uses the top half of each inode's address space. The low 8TB
(on 32bit) are filled with file data, the high 8TB are used for
indirect blocks.
One problem is that 16TB files created on 64bit systems actually have
data in the top 8TB. But files >16TB would cause problems anyway, so
only the limit has changed.
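The 8TB figures follow from simple arithmetic, shown in the Python sketch
below. It assumes a 32-bit page index and a 4096-byte page size; neither
value is stated in the text above, but together they give the 16TB address
space that is being split in half.

```python
PAGE_SIZE = 4096   # assumption: 4 KiB pages
INDEX_BITS = 32    # assumption: 32-bit page index on 32bit systems
TIB = 1 << 40

# Total bytes addressable through one inode's address space.
address_space = (1 << INDEX_BITS) * PAGE_SIZE
# The low half holds file data, the high half holds indirect blocks.
half = address_space // 2

if __name__ == "__main__":
    print("address space: %d TiB" % (address_space // TIB))
    print("each half:     %d TiB" % (half // TIB))
```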

View file

@ -17,8 +17,7 @@ kernels must turn 4.1 on or off *before* turning support for version 4
on or off; rpc.nfsd does this correctly.)
The NFSv4 minorversion 1 (NFSv4.1) implementation in nfsd is based
on the latest NFSv4.1 Internet Draft:
http://tools.ietf.org/html/draft-ietf-nfsv4-minorversion1-29
on RFC 5661.
From the many new features in NFSv4.1 the current implementation
focuses on the mandatory-to-implement NFSv4.1 Sessions, providing
@ -44,7 +43,7 @@ interoperability problems with future clients. Known issues:
trunking, but this is a mandatory feature, and its use is
recommended to clients in a number of places. (E.g. to ensure
timely renewal in case an existing connection's retry timeouts
have gotten too long; see section 8.3 of the draft.)
have gotten too long; see section 8.3 of the RFC.)
Therefore, lack of this feature may cause future clients to
fail.
- Incomplete backchannel support: incomplete backchannel gss

View file

@ -74,6 +74,9 @@ norecovery Disable recovery of the filesystem on mount.
This disables every write access on the device for
read-only mounts or snapshots. This option will fail
for r/w mounts on an unclean volume.
discard Issue discard/TRIM commands to the underlying block
device when blocks are freed. This is useful for SSD
devices and sparse/thinly-provisioned LUNs.
NILFS2 usage
============

View file

@ -164,6 +164,7 @@ read the file /proc/PID/status:
VmExe: 68 kB
VmLib: 1412 kB
VmPTE: 20 kb
VmSwap: 0 kB
Threads: 1
SigQ: 0/28578
SigPnd: 0000000000000000
@ -188,6 +189,12 @@ memory usage. Its seven fields are explained in Table 1-3. The stat file
contains detailed information about the process itself. Its fields are
explained in Table 1-4.
(for SMP CONFIG users)
To keep accounting scalable, RSS-related information is updated
asynchronously and the value may not be very precise. To get a precise
snapshot at a given moment, read the /proc/<pid>/smaps file, which scans the
page tables. It is slow but very precise.
Table 1-2: Contents of the statm files (as of 2.6.30-rc7)
..............................................................................
Field Content
@ -213,6 +220,7 @@ Table 1-2: Contents of the statm files (as of 2.6.30-rc7)
VmExe size of text segment
VmLib size of shared library code
VmPTE size of page table entries
VmSwap size of swap usage (the number of referred swapents)
Threads number of threads
SigQ number of signals queued/max. number for queue
SigPnd bitmap of pending signals for the thread
@ -430,6 +438,7 @@ Table 1-5: Kernel info in /proc
modules List of loaded modules
mounts Mounted filesystems
net Networking info (see text)
pagetypeinfo Additional page allocator information (see text) (2.5)
partitions Table of partitions known to the system
pci Deprecated info of PCI bus (new way -> /proc/bus/pci/,
decoupled by lspci (2.4)
@ -584,7 +593,7 @@ Node 0, zone DMA 0 4 5 4 4 3 ...
Node 0, zone Normal 1 0 0 1 101 8 ...
Node 0, zone HighMem 2 0 0 1 1 0 ...
Memory fragmentation is a problem under some workloads, and buddyinfo is a
External fragmentation is a problem under some workloads, and buddyinfo is a
useful tool for helping diagnose these problems. Buddyinfo will give you a
clue as to how big an area you can safely allocate, or why a previous
allocation failed.
@ -594,6 +603,48 @@ available. In this case, there are 0 chunks of 2^0*PAGE_SIZE available in
ZONE_DMA, 4 chunks of 2^1*PAGE_SIZE in ZONE_DMA, 101 chunks of 2^4*PAGE_SIZE
available in ZONE_NORMAL, etc...
More information relevant to external fragmentation can be found in
pagetypeinfo.
> cat /proc/pagetypeinfo
Page block order: 9
Pages per block: 512
Free pages count per migrate type at order 0 1 2 3 4 5 6 7 8 9 10
Node 0, zone DMA, type Unmovable 0 0 0 1 1 1 1 1 1 1 0
Node 0, zone DMA, type Reclaimable 0 0 0 0 0 0 0 0 0 0 0
Node 0, zone DMA, type Movable 1 1 2 1 2 1 1 0 1 0 2
Node 0, zone DMA, type Reserve 0 0 0 0 0 0 0 0 0 1 0
Node 0, zone DMA, type Isolate 0 0 0 0 0 0 0 0 0 0 0
Node 0, zone DMA32, type Unmovable 103 54 77 1 1 1 11 8 7 1 9
Node 0, zone DMA32, type Reclaimable 0 0 2 1 0 0 0 0 1 0 0
Node 0, zone DMA32, type Movable 169 152 113 91 77 54 39 13 6 1 452
Node 0, zone DMA32, type Reserve 1 2 2 2 2 0 1 1 1 1 0
Node 0, zone DMA32, type Isolate 0 0 0 0 0 0 0 0 0 0 0
Number of blocks type Unmovable Reclaimable Movable Reserve Isolate
Node 0, zone DMA 2 0 5 1 0
Node 0, zone DMA32 41 6 967 2 0
Fragmentation avoidance in the kernel works by grouping pages of different
migrate types into the same contiguous regions of memory called page blocks.
A page block is typically the size of the default hugepage size e.g. 2MB on
X86-64. By keeping pages grouped based on their ability to move, the kernel
can reclaim pages within a page block to satisfy a high-order allocation.
The pagetypeinfo output begins with information on the size of a page block. It
then gives the same type of information as buddyinfo except broken down
by migrate-type and finishes with details on how many page blocks of each
type exist.
If min_free_kbytes has been tuned correctly (recommendations made by hugeadm
from libhugetlbfs http://sourceforge.net/projects/libhugetlbfs/), one can
make an estimate of the likely number of huge pages that can be allocated
at a given point in time. All the "Movable" blocks should be allocatable
unless memory has been mlock()'d. Some of the Reclaimable blocks should
also be allocatable although a lot of filesystem metadata may have to be
reclaimed to achieve this.
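The arithmetic behind these figures can be sketched in a few lines of C.
This is an illustrative user-space sketch, not kernel code: the function
names are invented for this example, and the page size is passed in by the
caller rather than queried from the system.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: bytes covered by one free chunk of the given order. */
unsigned long long chunk_bytes(unsigned int order, unsigned long long page_size)
{
	assert(order < 32);	/* keep the shift well defined */
	return (1ULL << order) * page_size;
}

/*
 * Hypothetical helper: total free pages described by one buddyinfo or
 * pagetypeinfo row, where counts[i] is the number of free chunks of order i.
 */
unsigned long long row_free_pages(const unsigned long long *counts,
				  size_t norders)
{
	unsigned long long pages = 0;
	size_t order;

	assert(norders <= 32);
	for (order = 0; order < norders; order++)
		pages += counts[order] << order;
	return pages;
}
```

Assuming 4096-byte pages, chunk_bytes(9, 4096) yields 2097152 bytes, which
matches the 2MB page block reported as "Page block order: 9" above.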
..............................................................................
meminfo:


@ -837,6 +837,9 @@ replicas continue to be exactly same.
individual lists does not affect propagation or the way propagation
tree is modified by operations.
All vfsmounts in a peer group have the same ->mnt_master. If it is
non-NULL, they form a contiguous (ordered) segment of slave list.
An example propagation tree looks as shown in the figure below.
[ NOTE: Though it looks like a forest, if we consider all the shared
mounts as a conceptual entity called 'pnode', it becomes a tree]
@ -874,8 +877,19 @@ replicas continue to be exactly same.
NOTE: The propagation tree is orthogonal to the mount tree.
8B Locking:
8B Algorithm:
->mnt_share, ->mnt_slave, ->mnt_slave_list, ->mnt_master are protected
by namespace_sem (exclusive for modifications, shared for reading).
Normally we have ->mnt_flags modifications serialized by vfsmount_lock.
There are two exceptions: do_add_mount() and clone_mnt().
The former modifies a vfsmount that has not been visible in any shared
data structures yet.
The latter holds namespace_sem and the only references to vfsmount
are in lists that can't be traversed without namespace_sem.
8C Algorithm:
The crux of the implementation resides in rbind/move operation.


@ -253,6 +253,70 @@ pin setup (e.g. controlling which pin the GPIO uses, pullup/pulldown).
Also note that it's your responsibility to have stopped using a GPIO
before you free it.
Because GPIOs are in most cases configured right after they are claimed,
three additional calls are defined:
/* request a single GPIO, with initial configuration specified by
* 'flags', identical to gpio_request() wrt other arguments and
* return value
*/
int gpio_request_one(unsigned gpio, unsigned long flags, const char *label);
/* request multiple GPIOs in a single call
*/
int gpio_request_array(struct gpio *array, size_t num);
/* release multiple GPIOs in a single call
*/
void gpio_free_array(struct gpio *array, size_t num);
where 'flags' is currently defined to specify the following properties:
* GPIOF_DIR_IN - to configure direction as input
* GPIOF_DIR_OUT - to configure direction as output
* GPIOF_INIT_LOW - as output, set initial level to LOW
* GPIOF_INIT_HIGH - as output, set initial level to HIGH
Since GPIOF_INIT_* are only valid when the GPIO is configured as an output,
the valid combinations are grouped as:
* GPIOF_IN - configure as input
* GPIOF_OUT_INIT_LOW - configured as output, initial level LOW
* GPIOF_OUT_INIT_HIGH - configured as output, initial level HIGH
In the future, these flags can be extended to support more properties such
as open-drain status.
Furthermore, to ease the claim/release of multiple GPIOs, 'struct gpio' is
introduced to encapsulate all three fields as:
struct gpio {
unsigned gpio;
unsigned long flags;
const char *label;
};
A typical example of usage:
static struct gpio leds_gpios[] = {
{ 32, GPIOF_OUT_INIT_HIGH, "Power LED" }, /* default to ON */
{ 33, GPIOF_OUT_INIT_LOW, "Green LED" }, /* default to OFF */
{ 34, GPIOF_OUT_INIT_LOW, "Red LED" }, /* default to OFF */
{ 35, GPIOF_OUT_INIT_LOW, "Blue LED" }, /* default to OFF */
{ ... },
};
err = gpio_request_one(31, GPIOF_IN, "Reset Button");
if (err)
...
err = gpio_request_array(leds_gpios, ARRAY_SIZE(leds_gpios));
if (err)
...
gpio_free_array(leds_gpios, ARRAY_SIZE(leds_gpios));
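To make the request/release pairing concrete, here is a user-space mock of
the array calls. It is only a sketch under stated assumptions: every sk_*
name is hypothetical, the flag bit values are assumed for illustration (the
real ones live in the kernel's GPIO header), and the all-or-nothing rollback
on failure is an assumption of this sketch rather than something the text
above states.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed flag encoding for this sketch only. */
#define SK_GPIOF_DIR_IN		(1UL << 0)
#define SK_GPIOF_INIT_HIGH	(1UL << 1)

#define SK_NGPIO 64

struct sk_gpio {
	unsigned gpio;
	unsigned long flags;
	const char *label;
};

static int sk_claimed[SK_NGPIO];	/* non-zero while a line is in use */
static int sk_level[SK_NGPIO];		/* last level driven on an output */

/* Claim one line and apply its initial configuration; -1 on failure. */
int sk_request_one(unsigned gpio, unsigned long flags, const char *label)
{
	(void)label;	/* a real implementation would record the label */
	if (gpio >= SK_NGPIO || sk_claimed[gpio])
		return -1;
	sk_claimed[gpio] = 1;
	if (!(flags & SK_GPIOF_DIR_IN))
		sk_level[gpio] = (flags & SK_GPIOF_INIT_HIGH) ? 1 : 0;
	return 0;
}

/* Release one line. */
void sk_free(unsigned gpio)
{
	assert(gpio < SK_NGPIO);
	sk_claimed[gpio] = 0;
}

/* Claim every entry or none: release earlier entries if a later one fails. */
int sk_request_array(const struct sk_gpio *array, size_t num)
{
	size_t i;

	for (i = 0; i < num; i++) {
		int err = sk_request_one(array[i].gpio, array[i].flags,
					 array[i].label);
		if (err) {
			while (i--)
				sk_free(array[i].gpio);
			return err;
		}
	}
	return 0;
}

/* Release every entry of the array. */
void sk_free_array(const struct sk_gpio *array, size_t num)
{
	while (num--)
		sk_free(array[num].gpio);
}

/* Report whether a line is currently claimed. */
int sk_is_claimed(unsigned gpio)
{
	return gpio < SK_NGPIO && sk_claimed[gpio];
}
```

In this mock, a failed sk_request_array() leaves no entry of the array
claimed, so the caller does not need to track how far the request got.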
GPIOs mapped to IRQs
--------------------


@ -0,0 +1,42 @@
Kernel driver adt7411
=====================
Supported chips:
* Analog Devices ADT7411
Prefix: 'adt7411'
Addresses scanned: 0x48, 0x4a, 0x4b
Datasheet: Publicly available at the Analog Devices website
Author: Wolfram Sang (based on adt7470 by Darrick J. Wong)
Description
-----------
This driver implements support for the Analog Devices ADT7411 chip. There may
be other chips that implement this interface.
The ADT7411 can use an I2C/SMBus compatible 2-wire interface or an
SPI-compatible 4-wire interface. It provides a 10-bit analog to digital
converter which measures 1 temperature, vdd and 8 input voltages. It has an
internal temperature sensor, but an external one can also be connected (one
loses 2 inputs then). There are high- and low-limit registers for all inputs.
Check the datasheet for details.
sysfs-Interface
---------------
in0_input - vdd voltage input
in[1-8]_input - analog 1-8 input
temp1_input - temperature input
Besides standard interfaces, this driver adds (0 = off, 1 = on):
adc_ref_vdd - Use vdd as reference instead of 2.25 V
fast_sampling - Sample at 22.5 kHz instead of 1.4 kHz, but drop filters
no_average - Turn off averaging over 16 samples
Notes
-----
SPI, external temperature sensor and limit registers are not supported yet.


@ -1,74 +0,0 @@
Kernel driver adt7473
======================
Supported chips:
* Analog Devices ADT7473
Prefix: 'adt7473'
Addresses scanned: I2C 0x2C, 0x2D, 0x2E
Datasheet: Publicly available at the Analog Devices website
Author: Darrick J. Wong
This driver is deprecated, please use the adt7475 driver instead.
Description
-----------
This driver implements support for the Analog Devices ADT7473 chip family.
The ADT7473 uses the 2-wire interface compatible with the SMBUS 2.0
specification. Using an analog to digital converter it measures three (3)
temperatures and two (2) voltages. It has four (4) 16-bit counters for
measuring fan speed. There are three (3) PWM outputs that can be used
to control fan speed.
A sophisticated control system for the PWM outputs is designed into the
ADT7473 that allows fan speed to be adjusted automatically based on any of the
three temperature sensors. Each PWM output is individually adjustable and
programmable. Once configured, the ADT7473 will adjust the PWM outputs in
response to the measured temperatures without further host intervention.
This feature can also be disabled for manual control of the PWM's.
Each of the measured inputs (voltage, temperature, fan speed) has
corresponding high/low limit values. The ADT7473 will signal an ALARM if
any measured value exceeds either limit.
The ADT7473 samples all inputs continuously. The driver will not read
the registers more often than once every other second. Further,
configuration data is only read once per minute.
Special Features
----------------
The ADT7473 has a 10-bit ADC and can therefore measure temperatures
with 0.25 degC resolution. Temperature readings can be configured either
for twos complement format or "Offset 64" format, wherein 63 is subtracted
from the raw value to get the temperature value.
The Analog Devices datasheet is very detailed and describes a procedure for
determining an optimal configuration for the automatic PWM control.
Configuration Notes
-------------------
Besides standard interfaces driver adds the following:
* PWM Control
* pwm#_auto_point1_pwm and temp#_auto_point1_temp and
* pwm#_auto_point2_pwm and temp#_auto_point2_temp -
point1: Set the pwm speed at a lower temperature bound.
point2: Set the pwm speed at a higher temperature bound.
The ADT7473 will scale the pwm between the lower and higher pwm speed when
the temperature is between the two temperature boundaries. PWM values range
from 0 (off) to 255 (full speed). Fan speed will be set to maximum when the
temperature sensor associated with the PWM control exceeds temp#_max.
Notes
-----
The NVIDIA binary driver presents an ADT7473 chip via an on-card i2c bus.
Unfortunately, they fail to set the i2c adapter class, so this driver may
fail to find the chip until the nvidia driver is patched.

Documentation/hwmon/asc7621 Normal file

@ -0,0 +1,296 @@
Kernel driver asc7621
=====================
Supported chips:
Andigilog aSC7621 and aSC7621a
Prefix: 'asc7621'
Addresses scanned: I2C 0x2c, 0x2d, 0x2e
Datasheet: http://www.fairview5.com/linux/asc7621/asc7621.pdf
Author:
George Joseph
Description provided by Dave Pivin @ Andigilog:
Andigilog has both the PECI and pre-PECI versions of the Heceta-6, as
Intel calls them. Heceta-6e has high frequency PWM and Heceta-6p has
added PECI and a 4th thermal zone. The Andigilog aSC7611 is the
Heceta-6e part and aSC7621 is the Heceta-6p part. They are both in
volume production, shipping to Intel and their subs.
We have enhanced both parts relative to the governing Intel
specification. First enhancement is temperature reading resolution. We
have used registers below 20h for vendor-specific functions in addition
to those in the Intel-specified vendor range.
Our conversion process produces a result that is reported as two bytes.
The fan speed control uses this finer value to produce a "step-less" fan
PWM output. These two bytes are "read-locked" to guarantee that once a
high or low byte is read, the other byte is locked-in until after the
next read of any register. So to get an atomic reading, read high or low
byte, then the very next read should be the opposite byte. Our data
sheet says 10-bits of resolution, although you may find the lower bits
are active, they are not necessarily reliable or useful externally. We
chose not to mask them.
We employ significant filtering that is user tunable as described in the
data sheet. Our temperature reports and fan PWM outputs are very smooth
when compared to the competition, in addition to the higher resolution
temperature reports. The smoother PWM output does not require user
intervention.
We offer GPIO features on the former VID pins. These are open-drain
outputs or inputs and may be used as general purpose I/O or as alarm
outputs that are based on temperature limits. These are in 19h and 1Ah.
We offer flexible mapping of temperature readings to thermal zones. Any
temperature may be mapped to any zone, which has a default assignment
that follows Intel's specs.
Since there is a fan to zone assignment that allows for the "hotter" of
a set of zones to control the PWM of an individual fan, but there is no
indication to the user, we have added an indicator that shows which zone
is currently controlling the PWM for a given fan. This is in register
00h.
Both remote diode temperature readings may be given an offset value such
that the reported reading as well as the temperature used to determine
PWM may be offset for system calibration purposes.
PECI Extended configuration allows for having more than two domains per
PECI address and also provides an enabling function for each PECI
address. One could use our flexible zone assignment to have a zone
assigned to up to 4 PECI addresses. This is not possible in the default
Intel configuration. This would be useful in multi-CPU systems with
individual fans on each that would benefit from individual fan control.
This is in register 0Eh.
The tachometer measurement system is flexible and able to adapt to many
fan types. We can also support pulse-stretched PWM so that 3-wire fans
may be used. These characteristics are in registers 04h to 07h.
Finally, we have added a tach disable function that turns off the tach
measurement system for individual tachs in order to save power. That is
in register 75h.
--
aSC7621 Product Description
The aSC7621 has a two wire digital interface compatible with SMBus 2.0.
Using a 10-bit ADC, the aSC7621 measures the temperature of two remote diode
connected transistors as well as its own die. Support for Platform
Environmental Control Interface (PECI) is included.
Using temperature information from these four zones, an automatic fan speed
control algorithm is employed to minimize acoustic impact while achieving
recommended CPU temperature under varying operational loads.
To set fan speed, the aSC7621 has three independent pulse width modulation
(PWM) outputs that are controlled by one, or a combination of three,
temperature zones. Both high- and low-frequency PWM ranges are supported.
The aSC7621 also includes a digital filter that can be invoked to smooth
temperature readings for better control of fan speed and minimum acoustic
impact.
The aSC7621 has tachometer inputs to measure fan speed on up to four fans.
Limit and status registers for all measured values are included to alert
the system host that any measurements are outside of programmed limits
via status registers.
System voltages of VCCP, 2.5V, 3.3V, 5.0V, and 12V motherboard power are
monitored efficiently with internal scaling resistors.
Features
- Supports PECI interface and monitors internal and remote thermal diodes
- 2-wire, SMBus 2.0 compliant, serial interface
- 10-bit ADC
- Monitors VCCP, 2.5V, 3.3V, 5.0V, and 12V motherboard/processor supplies
- Programmable autonomous fan control based on temperature readings
- Noise filtering of temperature reading for fan speed control
- 0.25C digital temperature sensor resolution
- 3 PWM fan speed control outputs for 2-, 3- or 4-wire fans and up to 4 fan
tachometer inputs
- Enhanced measured temperature to Temperature Zone assignment.
- Provides high and low PWM frequency ranges
- 3 GPIO pins for custom use
- 24-Lead QSOP package
Configuration Notes
===================
Except where noted below, the sysfs entries created by this driver follow
the standards defined in "sysfs-interface".
temp1_source
0 (default) peci_legacy = 0, Remote 1 Temperature
peci_legacy = 1, PECI Processor Temperature 0
1 Remote 1 Temperature
2 Remote 2 Temperature
3 Internal Temperature
4 PECI Processor Temperature 0
5 PECI Processor Temperature 1
6 PECI Processor Temperature 2
7 PECI Processor Temperature 3
temp2_source
0 (default) Internal Temperature
1 Remote 1 Temperature
2 Remote 2 Temperature
3 Internal Temperature
4 PECI Processor Temperature 0
5 PECI Processor Temperature 1
6 PECI Processor Temperature 2
7 PECI Processor Temperature 3
temp3_source
0 (default) Remote 2 Temperature
1 Remote 1 Temperature
2 Remote 2 Temperature
3 Internal Temperature
4 PECI Processor Temperature 0
5 PECI Processor Temperature 1
6 PECI Processor Temperature 2
7 PECI Processor Temperature 3
temp4_source
0 (default) peci_legacy = 0, PECI Processor Temperature 0
peci_legacy = 1, Remote 1 Temperature
1 Remote 1 Temperature
2 Remote 2 Temperature
3 Internal Temperature
4 PECI Processor Temperature 0
5 PECI Processor Temperature 1
6 PECI Processor Temperature 2
7 PECI Processor Temperature 3
temp[1-4]_smoothing_enable
temp[1-4]_smoothing_time
Smooths spikes in temp readings caused by noise.
Valid values in milliseconds are:
35000
17600
11800
7000
4400
3000
1600
800
temp[1-4]_crit
When the corresponding zone temperature reaches this value,
ALL pwm outputs will go to 100%.
temp[5-8]_input
temp[5-8]_enable
The aSC7621 can also read temperatures provided by the processor
via the PECI bus. Usually these are "core" temps and are relative
to the point where the automatic thermal control circuit starts
throttling. This means that these are usually negative numbers.
pwm[1-3]_enable
0 Fan off.
1 Fan on manual control.
2 Fan on automatic control and will run at the minimum pwm
if the temperature for the zone is below the minimum.
3 Fan on automatic control but will be off if the temperature
for the zone is below the minimum.
4-254 Ignored.
255 Fan on full.
pwm[1-3]_auto_channels
Bitmap as described in sysfs-interface with the following
exceptions...
Only the following combinations of zones (and their corresponding masks)
are valid:
1
2
3
2,3
1,2,3
4
1,2,3,4
Special values:
0 Disabled.
16 Fan on manual control.
31 Fan on full.
pwm[1-3]_invert
When set, inverts the meaning of pwm[1-3].
i.e. when pwm = 0, the fan will be on full and
when pwm = 255 the fan will be off.
pwm[1-3]_freq
PWM frequency in Hz
Valid values in Hz are:
10
15
23
30 (default)
38
47
62
94
23000
24000
25000
26000
27000
28000
29000
30000
Setting any other value will be ignored.
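Because only the listed frequencies are accepted, a caller may want to check
a value before writing it. The following is a minimal sketch of such a check;
the table simply repeats the values listed above, and the function name is
hypothetical.

```c
#include <stddef.h>

/* Frequencies in Hz, copied from the list of valid pwm[1-3]_freq values. */
static const unsigned long valid_pwm_freq_hz[] = {
	10, 15, 23, 30, 38, 47, 62, 94,
	23000, 24000, 25000, 26000, 27000, 28000, 29000, 30000,
};

/* Returns 1 if hz is one of the listed frequencies, 0 otherwise. */
int pwm_freq_is_valid(unsigned long hz)
{
	size_t i;
	size_t n = sizeof(valid_pwm_freq_hz) / sizeof(valid_pwm_freq_hz[0]);

	for (i = 0; i < n; i++)
		if (valid_pwm_freq_hz[i] == hz)
			return 1;
	return 0;
}
```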
peci_enable
Enables or disables PECI
peci_avg
Input filter average time.
0 0 Sec. (no Smoothing) (default)
1 0.25 Sec.
2 0.5 Sec.
3 1.0 Sec.
4 2.0 Sec.
5 4.0 Sec.
6 8.0 Sec.
7 0.0 Sec.
peci_legacy
0 Standard Mode (default)
Remote Diode 1 reading is associated with
Temperature Zone 1, PECI is associated with
Zone 4
1 Legacy Mode
PECI is associated with Temperature Zone 1,
Remote Diode 1 is associated with Zone 4
peci_diode
Diode filter
0 0.25 Sec.
1 1.1 Sec.
2 2.4 Sec. (default)
3 3.4 Sec.
4 5.0 Sec.
5 6.8 Sec.
6 10.2 Sec.
7 16.4 Sec.
peci_4domain
Four domain enable
0 1 or 2 Domains for enabled processors (default)
1 3 or 4 Domains for enabled processors
peci_domain
Domain
0 Processor contains a single domain (0) (default)
1 Processor contains two domains (0,1)


@ -5,31 +5,23 @@ Supported chips:
* IT8705F
Prefix: 'it87'
Addresses scanned: from Super I/O config space (8 I/O ports)
Datasheet: Publicly available at the ITE website
http://www.ite.com.tw/product_info/file/pc/IT8705F_V.0.4.1.pdf
Datasheet: Once publicly available at the ITE website, but no longer
* IT8712F
Prefix: 'it8712'
Addresses scanned: from Super I/O config space (8 I/O ports)
Datasheet: Publicly available at the ITE website
http://www.ite.com.tw/product_info/file/pc/IT8712F_V0.9.1.pdf
http://www.ite.com.tw/product_info/file/pc/Errata%20V0.1%20for%20IT8712F%20V0.9.1.pdf
http://www.ite.com.tw/product_info/file/pc/IT8712F_V0.9.3.pdf
Datasheet: Once publicly available at the ITE website, but no longer
* IT8716F/IT8726F
Prefix: 'it8716'
Addresses scanned: from Super I/O config space (8 I/O ports)
Datasheet: Publicly available at the ITE website
http://www.ite.com.tw/product_info/file/pc/IT8716F_V0.3.ZIP
http://www.ite.com.tw/product_info/file/pc/IT8726F_V0.3.pdf
Datasheet: Once publicly available at the ITE website, but no longer
* IT8718F
Prefix: 'it8718'
Addresses scanned: from Super I/O config space (8 I/O ports)
Datasheet: Publicly available at the ITE website
http://www.ite.com.tw/product_info/file/pc/IT8718F_V0.2.zip
http://www.ite.com.tw/product_info/file/pc/IT8718F_V0%203_(for%20C%20version).zip
Datasheet: Once publicly available at the ITE website, but no longer
* IT8720F
Prefix: 'it8720'
Addresses scanned: from Super I/O config space (8 I/O ports)
Datasheet: Not yet publicly available.
Datasheet: Not publicly available
* SiS950 [clone of IT8705F]
Prefix: 'it87'
Addresses scanned: from Super I/O config space (8 I/O ports)
@ -136,6 +128,10 @@ registers are read whenever any data is read (unless it is less than 1.5
seconds since the last update). This means that you can easily miss
once-only alarms.
Out-of-limit readings can also result in beeping, if the chip is properly
wired and configured. Beeping can be enabled or disabled per sensor type
(temperatures, voltages and fans.)
The IT87xx only updates its values each 1.5 seconds; reading it more often
will do no harm, but will return 'old' values.
@ -150,11 +146,38 @@ Fan speed control
-----------------
The fan speed control features are limited to manual PWM mode. Automatic
"Smart Guardian" mode control handling is not implemented. However
if you want to go for "manual mode" just write 1 to pwmN_enable.
"Smart Guardian" mode control handling is only implemented for older chips
(see below.) However if you want to go for "manual mode" just write 1 to
pwmN_enable.
If you are only able to control the fan speed with very small PWM values,
try lowering the PWM base frequency (pwm1_freq). Depending on the fan,
it may give you a somewhat greater control range. The same frequency is
used to drive all fan outputs, which is why pwm2_freq and pwm3_freq are
read-only.
Automatic fan speed control (old interface)
-------------------------------------------
The driver supports the old interface to automatic fan speed control
which is implemented by IT8705F chips up to revision F and IT8712F
chips up to revision G.
This interface implements 4 temperature vs. PWM output trip points.
The PWM output of trip point 4 is always the maximum value (fan running
at full speed) while the PWM output of the other 3 trip points can be
freely chosen. The temperature of all 4 trip points can be freely chosen.
Additionally, trip point 1 has a hysteresis temperature attached, to
prevent fast switching between fan on and off.
The chip automatically computes the PWM output value from the input
temperature, using this simple rule: if the temperature value is
between trip point N and trip point N+1 then the PWM output value is
the one of trip point N. The automatic control mode is less flexible
than the manual control mode, but it reacts faster, is more robust and
doesn't use CPU cycles.
Trip points must be set properly before switching to automatic fan speed
control mode. The driver will perform basic integrity checks before
actually switching to automatic control mode.
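The trip point rule described above can be modelled in a few lines. This is
a sketch of the rule only, not of the chip or the driver: the function name
and the full-scale value of 255 are assumptions, the fan is treated as off
below trip point 1, and the hysteresis on trip point 1 is left out.

```c
#define NTRIP	4
#define PWM_MAX	255	/* assumed full-scale PWM value for this sketch */

/*
 * trip_temp[] holds the four trip point temperatures in ascending order.
 * pwm[0..2] are the freely chosen outputs of trip points 1-3; trip point 4
 * always yields PWM_MAX.
 */
int auto_pwm(int temp, const int trip_temp[NTRIP], const int pwm[NTRIP - 1])
{
	int n;

	if (temp >= trip_temp[NTRIP - 1])
		return PWM_MAX;
	/* Find the highest trip point at or below the temperature. */
	for (n = NTRIP - 2; n >= 0; n--)
		if (temp >= trip_temp[n])
			return pwm[n];
	return 0;	/* below trip point 1: fan off in this sketch */
}
```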


@ -84,6 +84,10 @@ Supported chips:
Addresses scanned: I2C 0x4c
Datasheet: Publicly available at the Maxim website
http://www.maxim-ic.com/quick_view2.cfm/qv_pk/3500
* Winbond/Nuvoton W83L771AWG/ASG
Prefix: 'w83l771'
Addresses scanned: I2C 0x4c
Datasheet: Not publicly available, can be requested from Nuvoton
Author: Jean Delvare <khali@linux-fr.org>
@ -147,6 +151,12 @@ MAX6680 and MAX6681:
* Selectable address
* Remote sensor type selection
W83L771AWG/ASG
* The AWG and ASG variants only differ in package format.
* Filter and alert configuration register at 0xBF
* Diode ideality factor configuration (remote sensor) at 0xE3
* Moving average (depending on conversion rate)
All temperature values are given in degrees Celsius. Resolution
is 1.0 degree for the local temperature, 0.125 degree for the remote
temperature, except for the MAX6657, MAX6658 and MAX6659 which have a
@ -163,6 +173,18 @@ The lm90 driver will not update its values more frequently than every
other second; reading them more often will do no harm, but will return
'old' values.
SMBus Alert Support
-------------------
This driver has basic support for SMBus alert. When an alert is received,
the status register is read and the faulty temperature channel is logged.
The Analog Devices chips (ADM1032 and ADT7461) do not implement the SMBus
alert protocol properly so additional care is needed: the ALERT output is
disabled when an alert is received, and is re-enabled only when the alarm
is gone. Otherwise the chip would block alerts from other chips in the bus
as long as the alarm is active.
PEC Support
-----------


@ -15,7 +15,8 @@ Supported adapters:
* Intel 82801I (ICH9)
* Intel EP80579 (Tolapai)
* Intel 82801JI (ICH10)
* Intel PCH
* Intel 3400/5 Series (PCH)
* Intel Cougar Point (PCH)
Datasheets: Publicly available at the Intel website
Authors:


@ -29,6 +29,9 @@ can be easily added when needed.
Earlier kernels defaulted to type=0 (Philips). But now, if the type
parameter is missing, the driver will simply fail to initialize.
SMBus alert support is available on adapters which have this line properly
connected to the parallel port's interrupt pin.
Building your own adapter
-------------------------


@ -9,3 +9,14 @@ parport handling is not an option. The drawback is a reduced portability
and the impossibility to daisy-chain other parallel port devices.
Please see i2c-parport for documentation.
Module parameters:
* type: type of adapter (see i2c-parport or modinfo)
* base: base I/O address
Default is 0x378 which is fairly common for parallel ports, at least on PC.
* irq: optional IRQ
This must be passed if you want SMBus alert support, assuming your adapter
actually supports this.


@ -185,6 +185,22 @@ the protocol. All ARP communications use slave address 0x61 and
require PEC checksums.
SMBus Alert
===========
SMBus Alert was introduced in Revision 1.0 of the specification.
The SMBus alert protocol allows several SMBus slave devices to share a
single interrupt pin on the SMBus master, while still allowing the master
to know which slave triggered the interrupt.
This is implemented the following way in the Linux kernel:
* I2C bus drivers which support SMBus alert should call
i2c_setup_smbus_alert() to set up SMBus alert support.
* I2C drivers for devices which can trigger SMBus alerts should implement
the optional alert() callback.
I2C Block Transactions
======================


@ -318,8 +318,9 @@ Plain I2C communication
These routines read and write some bytes from/to a client. The client
contains the i2c address, so you do not have to include it. The second
parameter contains the bytes to read/write, the third the number of bytes
to read/write (must be less than the length of the buffer.) Returned is
the actual number of bytes read/written.
to read/write (must be less than the length of the buffer, also should be
less than 64k since msg.len is u16.) Returned is the actual number of bytes
read/written.
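The two length constraints can be expressed as a small pre-check. This is a
hedged sketch: the helper name is hypothetical, and it reads the buffer
constraint as "must not exceed the buffer length".

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical pre-check before a plain I2C read or write: the byte count
 * has to fit in a 16-bit length field and must not overrun the buffer.
 */
int i2c_count_ok(size_t count, size_t buf_len)
{
	return count <= UINT16_MAX && count <= buf_len;
}
```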
int i2c_transfer(struct i2c_adapter *adap, struct i2c_msg *msg,
int num);

Documentation/init.txt Normal file

@ -0,0 +1,49 @@
Explaining the dreaded "No init found." boot hang message
=========================================================
OK, so you've got this pretty unintuitive message (currently located
in init/main.c) and are wondering what the H*** went wrong.
Some high-level reasons for failure (listed roughly in order of execution)
to load the init binary are:
A) Unable to mount root FS
B) init binary doesn't exist on rootfs
C) broken console device
D) binary exists but dependencies not available
E) binary cannot be loaded
Detailed explanations:
0) Set "debug" kernel parameter (in bootloader config file or CONFIG_CMDLINE)
to get more detailed kernel messages.
A) make sure you have the correct root FS type
(and root= kernel parameter points to the correct partition),
required drivers such as storage hardware (such as SCSI or USB!)
and filesystem (ext3, jffs2 etc.) are builtin (alternatively as modules,
to be pre-loaded by an initrd)
C) Possibly a conflict in console= setup --> initial console unavailable.
E.g. some serial consoles are unreliable due to serial IRQ issues (e.g.
missing interrupt-based configuration).
Try using a different console= device or e.g. netconsole= .
D) e.g. required library dependencies of the init binary such as
/lib/ld-linux.so.2 missing or broken. Use readelf -d <INIT>|grep NEEDED
to find out which libraries are required.
E) make sure the binary's architecture matches your hardware.
E.g. i386 vs. x86_64 mismatch, or trying to load x86 on ARM hardware.
In case you tried loading a non-binary file here (shell script?),
you should make sure that the script specifies an interpreter in its shebang
header line (#!/...) that is fully working (including its library
dependencies). And before tackling scripts, better first test a simple
non-script binary such as /bin/sh and confirm its successful execution.
To find out more, add code to init/main.c to display kernel_execve()'s
return values.
Please extend this explanation whenever you find new failure causes
(after all loading the init binary is a CRITICAL and hard transition step
which needs to be made as painless as possible), then submit a patch to LKML.
Further TODOs:
- Implement the various run_init_process() invocations via a struct array
which can then store the kernel_execve() result value and on failure
log it all by iterating over _all_ results (very important usability fix).
- try to make the implementation itself more helpful in general,
e.g. by providing additional error messages at affected places.
Andreas Mohr <andi at lisas period de>


@ -1,5 +1,5 @@
Copyright (C) 2002-2008 Sentelic Corporation.
Last update: Oct-31-2008
Copyright (C) 2002-2010 Sentelic Corporation.
Last update: Jan-13-2010
==============================================================================
* Finger Sensing Pad Intellimouse Mode(scrolling wheel, 4th and 5th buttons)
@ -44,7 +44,7 @@ B) MSID 6: Horizontal and Vertical scrolling.
Packet 1
Bit 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0
BYTE |---------------|BYTE |---------------|BYTE|---------------|BYTE|---------------|
1 |Y|X|y|x|1|M|R|L| 2 |X|X|X|X|X|X|X|X| 3 |Y|Y|Y|Y|Y|Y|Y|Y| 4 | | |B|F|l|r|u|d|
1 |Y|X|y|x|1|M|R|L| 2 |X|X|X|X|X|X|X|X| 3 |Y|Y|Y|Y|Y|Y|Y|Y| 4 | | |B|F|r|l|u|d|
|---------------| |---------------| |---------------| |---------------|
Byte 1: Bit7 => Y overflow
@ -59,15 +59,15 @@ Byte 2: X Movement(9-bit 2's complement integers)
Byte 3: Y Movement(9-bit 2's complement integers)
Byte 4: Bit0 => the Vertical scrolling movement downward.
Bit1 => the Vertical scrolling movement upward.
Bit2 => the Vertical scrolling movement rightward.
Bit3 => the Vertical scrolling movement leftward.
Bit2 => the Horizontal scrolling movement leftward.
Bit3 => the Horizontal scrolling movement rightward.
Bit4 => 1 = 4th mouse button is pressed, Forward one page.
0 = 4th mouse button is not pressed.
Bit5 => 1 = 5th mouse button is pressed, Backward one page.
0 = 5th mouse button is not pressed.
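As an illustration of the byte 4 layout for MSID 6, the sketch below unpacks
the bits into named fields. It follows the updated assignment (Bit2 is
leftward, Bit3 is rightward); the struct and function names are hypothetical.

```c
#include <stdint.h>

/* Decoded view of byte 4 of an MSID 6 packet; each field is 0 or 1. */
struct msid6_byte4 {
	int scroll_down;	/* Bit0 */
	int scroll_up;		/* Bit1 */
	int scroll_left;	/* Bit2 */
	int scroll_right;	/* Bit3 */
	int button4_forward;	/* Bit4 */
	int button5_backward;	/* Bit5 */
};

/* Unpack the six documented bits; Bit6 and Bit7 are ignored. */
struct msid6_byte4 decode_msid6_byte4(uint8_t b)
{
	struct msid6_byte4 d;

	d.scroll_down = (b >> 0) & 1;
	d.scroll_up = (b >> 1) & 1;
	d.scroll_left = (b >> 2) & 1;
	d.scroll_right = (b >> 3) & 1;
	d.button4_forward = (b >> 4) & 1;
	d.button5_backward = (b >> 5) & 1;
	return d;
}
```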
C) MSID 7:
# FSP uses 2 packets(8 Bytes) data to represent Absolute Position
# FSP uses 2 packets (8 Bytes) to represent Absolute Position.
so we have PACKET NUMBER to identify packets.
If PACKET NUMBER is 0, the packet is Packet 1.
If PACKET NUMBER is 1, the packet is Packet 2.
@ -129,7 +129,7 @@ Byte 3: Message Type => 0x00 (Disabled)
Byte 4: Bit7~Bit0 => Don't Care
==============================================================================
* Absolute position for STL3888-A0.
* Absolute position for STL3888-Ax.
==============================================================================
Packet 1 (ABSOLUTE POSITION)
Bit 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0
@ -179,14 +179,14 @@ Byte 4: Bit1~Bit0 => Y coordinate (xpos[1:0])
Bit5~Bit4 => y2_g
Bit7~Bit6 => x2_g
Notify Packet for STL3888-A0
Notify Packet for STL3888-Ax
Bit 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0
BYTE |---------------|BYTE |---------------|BYTE|---------------|BYTE|---------------|
1 |1|0|1|P|1|M|R|L| 2 |C|C|C|C|C|C|C|C| 3 |0|0|F|F|0|0|0|i| 4 |r|l|d|u|0|0|0|0|
|---------------| |---------------| |---------------| |---------------|
Byte 1: Bit7~Bit6 => 00, Normal data packet
=> 01, Absolute coordination packet
=> 01, Absolute coordinates packet
=> 10, Notify packet
Bit5 => 1
Bit4 => when in absolute coordinates mode (valid when EN_PKT_GO is 1):
@ -205,15 +205,106 @@ Byte 4: Bit7 => scroll right button
Bit6 => scroll left button
Bit5 => scroll down button
Bit4 => scroll up button
* Note that if gesture and additional button (Bit4~Bit7)
happen at the same time, the button information will not
be sent.
* Note that if gesture and additional button (Bit4~Bit7)
happen at the same time, the button information will not
be sent.
Bit3~Bit0 => Reserved
Sample sequence of Multi-finger, Multi-coordinate mode:
notify packet (valid bit == 1), abs pkt 1, abs pkt 2, abs pkt 1,
abs pkt 2, ..., notify packet(valid bit == 0)
abs pkt 2, ..., notify packet (valid bit == 0)
==============================================================================
* Absolute position for STL3888-B0.
==============================================================================
Packet 1 (ABSOLUTE POSITION)
Bit 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0
BYTE |---------------|BYTE |---------------|BYTE|---------------|BYTE|---------------|
1 |0|1|V|F|1|0|R|L| 2 |X|X|X|X|X|X|X|X| 3 |Y|Y|Y|Y|Y|Y|Y|Y| 4 |r|l|u|d|X|X|Y|Y|
|---------------| |---------------| |---------------| |---------------|
Byte 1: Bit7~Bit6 => 00, Normal data packet
=> 01, Absolute coordinates packet
=> 10, Notify packet
Bit5 => Valid bit, 0 means that the coordinate is invalid or finger up.
When both fingers are up, the last two reports have zero valid
bit.
Bit4 => finger up/down information. 1: finger down, 0: finger up.
Bit3 => 1
Bit2 => finger index, 0 is the first finger, 1 is the second finger.
Bit1 => Right Button, 1 is pressed, 0 is not pressed.
Bit0 => Left Button, 1 is pressed, 0 is not pressed.
Byte 2: X coordinate (xpos[9:2])
Byte 3: Y coordinate (ypos[9:2])
Byte 4: Bit1~Bit0 => Y coordinate (ypos[1:0])
Bit3~Bit2 => X coordinate (xpos[1:0])
Bit4 => scroll down button
Bit5 => scroll up button
Bit6 => scroll left button
Bit7 => scroll right button
Packet 2 (ABSOLUTE POSITION)
Bit 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0
BYTE |---------------|BYTE |---------------|BYTE|---------------|BYTE|---------------|
1 |0|1|V|F|1|1|R|L| 2 |X|X|X|X|X|X|X|X| 3 |Y|Y|Y|Y|Y|Y|Y|Y| 4 |r|l|u|d|X|X|Y|Y|
|---------------| |---------------| |---------------| |---------------|
Byte 1: Bit7~Bit6 => 00, Normal data packet
=> 01, Absolute coordinates packet
=> 10, Notify packet
Bit5 => Valid bit, 0 means that the coordinate is invalid or finger up.
When both fingers are up, the last two reports have zero valid
bit.
Bit4 => finger up/down information. 1: finger down, 0: finger up.
Bit3 => 1
Bit2 => finger index, 0 is the first finger, 1 is the second finger.
Bit1 => Right Button, 1 is pressed, 0 is not pressed.
Bit0 => Left Button, 1 is pressed, 0 is not pressed.
Byte 2: X coordinate (xpos[9:2])
Byte 3: Y coordinate (ypos[9:2])
Byte 4: Bit1~Bit0 => Y coordinate (ypos[1:0])
Bit3~Bit2 => X coordinate (xpos[1:0])
Bit4 => scroll down button
Bit5 => scroll up button
Bit6 => scroll left button
Bit7 => scroll right button
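The two absolute position packets above share one layout, so a single decoder can handle both. The following Python sketch is only an illustration of the bit fields; the function name decode_abs_packet and the dictionary keys are made up here, and the low coordinate bits are taken from the Byte 4 diagram (|r|l|u|d|X|X|Y|Y|), i.e. X in Bit3~Bit2 and Y in Bit1~Bit0.

```python
def decode_abs_packet(pkt):
    """Decode one 4-byte STL3888-B0 absolute position packet (sketch).

    pkt is a bytes-like object of length 4. Field names are hypothetical;
    the bit positions follow the packet description above.
    """
    if len(pkt) != 4:
        raise ValueError("expected exactly 4 bytes")
    b1, b2, b3, b4 = bytearray(pkt)
    if (b1 >> 6) != 0b01:
        raise ValueError("not an absolute coordinates packet")
    return {
        "valid": bool(b1 & 0x20),          # Byte 1 Bit5
        "finger_down": bool(b1 & 0x10),    # Byte 1 Bit4
        "finger_index": (b1 >> 2) & 0x01,  # Byte 1 Bit2
        "right": bool(b1 & 0x02),          # Byte 1 Bit1
        "left": bool(b1 & 0x01),           # Byte 1 Bit0
        # 10-bit coordinates: high 8 bits in Byte 2/3, low 2 bits in Byte 4
        "x": (b2 << 2) | ((b4 >> 2) & 0x03),
        "y": (b3 << 2) | (b4 & 0x03),
        "scroll_right": bool(b4 & 0x80),
        "scroll_left": bool(b4 & 0x40),
        "scroll_up": bool(b4 & 0x20),
        "scroll_down": bool(b4 & 0x10),
    }
```

Packet 1 and packet 2 differ only in the finger index bit, so a caller can pair them up by looking at "finger_index".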
Notify Packet for STL3888-B0
Bit 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0
BYTE |---------------|BYTE |---------------|BYTE|---------------|BYTE|---------------|
1 |1|0|1|P|1|M|R|L| 2 |C|C|C|C|C|C|C|C| 3 |0|0|F|F|0|0|0|i| 4 |r|l|u|d|0|0|0|0|
|---------------| |---------------| |---------------| |---------------|
Byte 1: Bit7~Bit6 => 00, Normal data packet
=> 01, Absolute coordinates packet
=> 10, Notify packet
Bit5 => 1
Bit4 => when in absolute coordinate mode (valid when EN_PKT_GO is 1):
0: left button is generated by the on-pad command
1: left button is generated by the external button
Bit3 => 1
Bit2 => Middle Button, 1 is pressed, 0 is not pressed.
Bit1 => Right Button, 1 is pressed, 0 is not pressed.
Bit0 => Left Button, 1 is pressed, 0 is not pressed.
Byte 2: Message Type => 0xB7 (Multi Finger, Multi Coordinate mode)
Byte 3: Bit7~Bit6 => Don't care
Bit5~Bit4 => Number of fingers
Bit3~Bit1 => Reserved
Bit0 => 1: enter gesture mode; 0: leaving gesture mode
Byte 4: Bit7 => scroll right button
Bit6 => scroll left button
Bit5 => scroll up button
Bit4 => scroll down button
* Note that if gesture and additional button (Bit4~Bit7)
happen at the same time, the button information will not
be sent.
Bit3~Bit0 => Reserved
Sample sequence of Multi-finger, Multi-coordinate mode:
notify packet (valid bit == 1), abs pkt 1, abs pkt 2, abs pkt 1,
abs pkt 2, ..., notify packet (valid bit == 0)
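As a rough sketch of how the STL3888-B0 notify packet described above could be picked apart, the following Python function extracts the fields named in the text. The function name, the key names and the MSG_MULTI_FINGER constant name are assumptions made for this example; only the bit positions and the 0xB7 value come from the description.

```python
MSG_MULTI_FINGER = 0xB7  # Byte 2 message type listed in the text above


def decode_notify_packet(pkt):
    """Decode one 4-byte STL3888-B0 notify packet (illustrative sketch)."""
    if len(pkt) != 4:
        raise ValueError("expected exactly 4 bytes")
    b1, b2, b3, b4 = bytearray(pkt)
    if (b1 >> 6) != 0b10:
        raise ValueError("not a notify packet")
    return {
        "external_left": bool(b1 & 0x10),    # Byte 1 Bit4
        "middle": bool(b1 & 0x04),
        "right": bool(b1 & 0x02),
        "left": bool(b1 & 0x01),
        "message_type": b2,
        "multi_finger": b2 == MSG_MULTI_FINGER,
        "fingers": (b3 >> 4) & 0x03,         # Byte 3 Bit5~Bit4
        "gesture_active": bool(b3 & 0x01),   # 1: enter, 0: leaving gesture mode
        "scroll_right": bool(b4 & 0x80),
        "scroll_left": bool(b4 & 0x40),
        "scroll_up": bool(b4 & 0x20),        # B0 ordering: Bit5 is up
        "scroll_down": bool(b4 & 0x10),      # B0 ordering: Bit4 is down
    }
```

In the sample sequence, the first notify packet would decode with "gesture_active" set and the closing one with it cleared.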
==============================================================================
* FSP Enable/Disable packet
@ -409,7 +500,8 @@ offset width default r/w name
0: read only, 1: read/write enable
(Note that the following registers do not require clock gating being
enabled prior to write: 05 06 07 08 09 0c 0f 10 11 12 16 17 18 23 2e
40 41 42 43.)
40 41 42 43. In addition to that, this bit must be 1 when gesture
mode is enabled)
0x31 RW on-pad command detection
bit7 0 RW on-pad command left button down tag
@ -463,6 +555,10 @@ offset width default r/w name
absolute coordinates; otherwise, host only receives packets with
relative coordinate.)
bit7 0 RW EN_PS2_F2: PS/2 gesture mode 2nd
finger packet enable
0: disable, 1: enable
0x43 RW on-pad control
bit0 0 RW on-pad control enable
0: disable, 1: enable


@ -139,7 +139,6 @@ Code Seq#(hex) Include File Comments
'K' all linux/kd.h
'L' 00-1F linux/loop.h conflict!
'L' 10-1F drivers/scsi/mpt2sas/mpt2sas_ctl.h conflict!
'L' 20-2F linux/usb/vstusb.h
'L' E0-FF linux/ppdd.h encrypted disk device driver
<http://linux01.gwdg.de/~alatham/ppdd.html>
'M' all linux/soundcard.h conflict!


@ -149,10 +149,11 @@ char *(*procinfo)(struct capi_ctr *ctrlr)
pointer to a callback function returning the entry for the device in
the CAPI controller info table, /proc/capi/controller
read_proc_t *ctr_read_proc
pointer to the read_proc callback function for the device's proc file
system entry, /proc/capi/controllers/<n>; will be called with a
pointer to the device's capi_ctr structure as the last (data) argument
const struct file_operations *proc_fops
pointers to callback functions for the device's proc file
system entry, /proc/capi/controllers/<n>; pointer to the device's
capi_ctr structure is available from struct proc_dir_entry::data
which is available from struct inode.
Note: Callback functions except send_message() are never called in interrupt
context.


@ -292,10 +292,10 @@ GigaSet 307x Device Driver
to /etc/modprobe.d/gigaset, /etc/modprobe.conf.local or a similar file.
Problem:
Your isdn script aborts with a message about isdnlog.
The isdnlog program emits error messages or just doesn't work.
Solution:
Try deactivating (or commenting out) isdnlog. This driver does not
support it.
Isdnlog supports only the HiSax driver. Do not attempt to use it with
other drivers such as Gigaset.
Problem:
You have two or more DECT data adapters (M101/M105) and only the
@ -321,8 +321,8 @@ GigaSet 307x Device Driver
writing an appropriate value to /sys/module/gigaset/parameters/debug, e.g.
echo 0 > /sys/module/gigaset/parameters/debug
switches off debugging output completely,
echo 0x10a020 > /sys/module/gigaset/parameters/debug
enables the standard set of debugging output messages. These values are
echo 0x302020 > /sys/module/gigaset/parameters/debug
enables a reasonable set of debugging output messages. These values are
bit patterns where every bit controls a certain type of debugging output.
See the constants DEBUG_* in the source file gigaset.h for details.
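To illustrate the bit-pattern idea, the short Python sketch below lists which bit positions are set in a debug value. It only reports positions; mapping a position to a particular DEBUG_* constant still requires gigaset.h, and the helper name set_bits is invented for this example.

```python
def set_bits(mask):
    """Return the positions (0 = least significant) of the bits set in mask."""
    return [pos for pos in range(mask.bit_length()) if mask & (1 << pos)]


# The three values mentioned in the text above.
for value in (0x0, 0x10a020, 0x302020):
    print("0x%x -> bits %s" % (value, set_bits(value)))
```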


@ -54,6 +54,7 @@ parameter is applicable:
IMA Integrity measurement architecture is enabled.
IOSCHED More than one I/O scheduler is enabled.
IP_PNP IP DHCP, BOOTP, or RARP is enabled.
IPV6 IPv6 support is enabled.
ISAPNP ISA PnP code is enabled.
ISDN Appropriate ISDN support is enabled.
JOY Appropriate joystick support is enabled.
@ -356,6 +357,9 @@ and is between 256 and 4096 characters. It is defined in the file
Change the amount of debugging information output
when initialising the APIC and IO-APIC components.
autoconf= [IPV6]
See Documentation/networking/ipv6.txt.
show_lapic= [APIC,X86] Advanced Programmable Interrupt Controller
Limit apic dumping. The parameter defines the maximal
number of local apics being dumped. Also it is possible
@ -638,6 +642,12 @@ and is between 256 and 4096 characters. It is defined in the file
See drivers/char/README.epca and
Documentation/serial/digiepca.txt.
disable= [IPV6]
See Documentation/networking/ipv6.txt.
disable_ipv6= [IPV6]
See Documentation/networking/ipv6.txt.
disable_mtrr_cleanup [X86]
The kernel tries to adjust MTRR layout from continuous
to discrete, to make X server driver able to add WB
@ -1738,6 +1748,9 @@ and is between 256 and 4096 characters. It is defined in the file
nomfgpt [X86-32] Disable Multi-Function General Purpose
Timer usage (for AMD Geode machines).
nopat [X86] Disable PAT (page attribute table extension of
pagetables) support.
norandmaps Don't use address space randomization. Equivalent to
echo 0 > /proc/sys/kernel/randomize_va_space
@ -1781,6 +1794,12 @@ and is between 256 and 4096 characters. It is defined in the file
purges which is reported from either PAL_VM_SUMMARY or
SAL PALO.
nr_cpus= [SMP] Maximum number of processors that an SMP kernel
could support. nr_cpus=n (n >= 1) limits the kernel to
supporting 'n' processors. At runtime you cannot use the
CPU hotplug feature to bring additional CPUs online, just
as if the kernel had been compiled with NR_CPUS=n.
nr_uarts= [SERIAL] maximum number of UARTs to be registered.
numa_zonelist_order= [KNL, BOOT] Select zonelist order for NUMA.
@ -1948,8 +1967,12 @@ and is between 256 and 4096 characters. It is defined in the file
IRQ routing is enabled.
noacpi [X86] Do not use ACPI for IRQ routing
or for PCI scanning.
use_crs [X86] Use _CRS for PCI resource
allocation.
use_crs [X86] Use PCI host bridge window information
from ACPI. On BIOSes from 2008 or later, this
is enabled by default. If you need to use this,
please report a bug.
nocrs [X86] Ignore PCI host bridge windows from ACPI.
If you need to use this, please report a bug.
routeirq Do IRQ routing for all PCI devices.
This is normally done in pci_enable_device(),
so this option is a temporary workaround
@ -1998,6 +2021,14 @@ and is between 256 and 4096 characters. It is defined in the file
force Enable ASPM even on devices that claim not to support it.
WARNING: Forcing ASPM on may cause system lockups.
pcie_pme= [PCIE,PM] Native PCIe PME signaling options:
off Do not use native PCIe PME signaling.
force Use native PCIe PME signaling even if the BIOS refuses
to allow the kernel to control the relevant PCIe config
registers.
nomsi Do not use MSI for native PCIe PME signaling (this makes
all PCIe root ports use INTx for everything).
pcmv= [HW,PCMCIA] BadgePAD 4
pd. [PARIDE]
@ -2703,6 +2734,13 @@ and is between 256 and 4096 characters. It is defined in the file
medium is write-protected).
Example: quirks=0419:aaf5:rl,0421:0433:rc
userpte=
[X86] Flags controlling user PTE allocations.
nohigh = do not allocate PTE pages in
HIGHMEM regardless of setting
of CONFIG_HIGHPTE.
vdso= [X86,SH]
vdso=2: enable compat VDSO (default with COMPAT_VDSO)
vdso=1: enable VDSO (default)
@ -2796,6 +2834,12 @@ and is between 256 and 4096 characters. It is defined in the file
default x2apic cluster mode on platforms
supporting x2apic.
x86_mrst_timer= [X86-32,APBT]
Choose timer option for x86 Moorestown MID platform.
Two valid options are apbt timer only and lapic timer
plus one apbt timer for broadcast timer.
x86_mrst_timer=apbt_only | lapic_and_apbt
xd= [HW,XT] Original XT pre-IDE (RLL encoded) disks.
xd_geo= See header of drivers/block/xd.c.


@ -266,7 +266,7 @@ kobj_type:
struct kobj_type {
void (*release)(struct kobject *);
struct sysfs_ops *sysfs_ops;
const struct sysfs_ops *sysfs_ops;
struct attribute **default_attrs;
};


@ -1,6 +1,7 @@
Title : Kernel Probes (Kprobes)
Authors : Jim Keniston <jkenisto@us.ibm.com>
: Prasanna S Panchamukhi <prasanna@in.ibm.com>
: Prasanna S Panchamukhi <prasanna.panchamukhi@gmail.com>
: Masami Hiramatsu <mhiramat@redhat.com>
CONTENTS
@ -15,6 +16,7 @@ CONTENTS
9. Jprobes Example
10. Kretprobes Example
Appendix A: The kprobes debugfs interface
Appendix B: The kprobes sysctl interface
1. Concepts: Kprobes, Jprobes, Return Probes
@ -42,13 +44,13 @@ registration/unregistration of a group of *probes. These functions
can speed up unregistration process when you have to unregister
a lot of probes at once.
The next three subsections explain how the different types of
probes work. They explain certain things that you'll need to
know in order to make the best use of Kprobes -- e.g., the
difference between a pre_handler and a post_handler, and how
to use the maxactive and nmissed fields of a kretprobe. But
if you're in a hurry to start using Kprobes, you can skip ahead
to section 2.
The next four subsections explain how the different types of
probes work and how jump optimization works. They explain certain
things that you'll need to know in order to make the best use of
Kprobes -- e.g., the difference between a pre_handler and
a post_handler, and how to use the maxactive and nmissed fields of
a kretprobe. But if you're in a hurry to start using Kprobes, you
can skip ahead to section 2.
1.1 How Does a Kprobe Work?
@ -161,13 +163,125 @@ In case probed function is entered but there is no kretprobe_instance
object available, then in addition to incrementing the nmissed count,
the user entry_handler invocation is also skipped.
1.4 How Does Jump Optimization Work?
If you configured your kernel with CONFIG_OPTPROBES=y (currently
this option is supported on x86/x86-64, non-preemptive kernel) and
the "debug.kprobes-optimization" kernel parameter is set to 1 (see
sysctl(8)), Kprobes tries to reduce probe-hit overhead by using a jump
instruction instead of a breakpoint instruction at each probepoint.
1.4.1 Init a Kprobe
When a probe is registered, before attempting this optimization,
Kprobes inserts an ordinary, breakpoint-based kprobe at the specified
address. So, even if it's not possible to optimize this particular
probepoint, there'll be a probe there.
1.4.2 Safety Check
Before optimizing a probe, Kprobes performs the following safety checks:
- Kprobes verifies that the region that will be replaced by the jump
instruction (the "optimized region") lies entirely within one function.
(A jump instruction is multiple bytes, and so may overlay multiple
instructions.)
- Kprobes analyzes the entire function and verifies that there is no
jump into the optimized region. Specifically:
- the function contains no indirect jump;
- the function contains no instruction that causes an exception (since
the fixup code triggered by the exception could jump back into the
optimized region -- Kprobes checks the exception tables to verify this);
and
- there is no near jump to the optimized region (other than to the first
byte).
- For each instruction in the optimized region, Kprobes verifies that
the instruction can be executed out of line.
1.4.3 Preparing Detour Buffer
Next, Kprobes prepares a "detour" buffer, which contains the following
instruction sequence:
- code to push the CPU's registers (emulating a breakpoint trap)
- a call to the trampoline code which calls user's probe handlers.
- code to restore registers
- the instructions from the optimized region
- a jump back to the original execution path.
1.4.4 Pre-optimization
After preparing the detour buffer, Kprobes verifies that none of the
following situations exist:
- The probe has either a break_handler (i.e., it's a jprobe) or a
post_handler.
- Other instructions in the optimized region are probed.
- The probe is disabled.
In any of the above cases, Kprobes won't start optimizing the probe.
Since these are temporary situations, Kprobes tries to start
optimizing it again if the situation is changed.
If the kprobe can be optimized, Kprobes enqueues the kprobe to an
optimizing list, and kicks the kprobe-optimizer workqueue to optimize
it. If the to-be-optimized probepoint is hit before being optimized,
Kprobes returns control to the original instruction path by setting
the CPU's instruction pointer to the copied code in the detour buffer
-- thus at least avoiding the single-step.
1.4.5 Optimization
The Kprobe-optimizer doesn't insert the jump instruction immediately;
rather, it calls synchronize_sched() for safety first, because it's
possible for a CPU to be interrupted in the middle of executing the
optimized region(*). As you know, synchronize_sched() can ensure
that all interruptions that were active when synchronize_sched()
was called are done, but only if CONFIG_PREEMPT=n. So, this version
of kprobe optimization supports only kernels with CONFIG_PREEMPT=n.(**)
After that, the Kprobe-optimizer calls stop_machine() to replace
the optimized region with a jump instruction to the detour buffer,
using text_poke_smp().
1.4.6 Unoptimization
When an optimized kprobe is unregistered, disabled, or blocked by
another kprobe, it will be unoptimized. If this happens before
the optimization is complete, the kprobe is just dequeued from the
optimized list. If the optimization has been done, the jump is
replaced with the original code (except for an int3 breakpoint in
the first byte) by using text_poke_smp().
(*)Please imagine that the 2nd instruction is interrupted and then
the optimizer replaces the 2nd instruction with the jump *address*
while the interrupt handler is running. When the interrupt
returns to original address, there is no valid instruction,
and it causes an unexpected result.
(**)This optimization-safety checking may be replaced with the
stop-machine method that ksplice uses for supporting a CONFIG_PREEMPT=y
kernel.
NOTE for geeks:
The jump optimization changes the kprobe's pre_handler behavior.
Without optimization, the pre_handler can change the kernel's execution
path by changing regs->ip and returning 1. However, when the probe
is optimized, that modification is ignored. Thus, if you want to
tweak the kernel's execution path, you need to suppress optimization,
using one of the following techniques:
- Specify an empty function for the kprobe's post_handler or break_handler.
or
- Configure the kernel with CONFIG_OPTPROBES=n.
or
- Execute 'sysctl -w debug.kprobes-optimization=0'
2. Architectures Supported
Kprobes, jprobes, and return probes are implemented on the following
architectures:
- i386
- x86_64 (AMD-64, EM64T)
- i386 (Supports jump optimization)
- x86_64 (AMD-64, EM64T) (Supports jump optimization)
- ppc64
- ia64 (Does not support probes on instruction slot1.)
- sparc64 (Return probes not yet implemented.)
@ -193,6 +307,10 @@ it useful to "Compile the kernel with debug info" (CONFIG_DEBUG_INFO),
so you can use "objdump -d -l vmlinux" to see the source-to-object
code mapping.
If you want to reduce probing overhead, set "Kprobes jump optimization
support" (CONFIG_OPTPROBES) to "y". You can find this option under the
"Kprobes" line.
4. API Reference
The Kprobes API includes a "register" function and an "unregister"
@ -389,7 +507,10 @@ the probe which has been registered.
Kprobes allows multiple probes at the same address. Currently,
however, there cannot be multiple jprobes on the same function at
the same time.
the same time. Also, a probepoint for which there is a jprobe or
a post_handler cannot be optimized. So if you install a jprobe,
or a kprobe with a post_handler, at an optimized probepoint, the
probepoint will be unoptimized automatically.
In general, you can install a probe anywhere in the kernel.
In particular, you can probe interrupt handlers. Known exceptions
@ -453,6 +574,38 @@ reason, Kprobes doesn't support return probes (or kprobes or jprobes)
on the x86_64 version of __switch_to(); the registration functions
return -EINVAL.
On x86/x86-64, since the Jump Optimization of Kprobes modifies
instructions widely, there are some limitations to optimization. To
explain it, we introduce some terminology. Imagine a 3-instruction
sequence consisting of two 2-byte instructions and one 3-byte
instruction.
        IA
         |
[-2][-1][0][1][2][3][4][5][6][7]
        [ins1][ins2][  ins3 ]
        [<-     DCR       ->]
           [<- JTPR ->]
[<- JTPR ->]
ins1: 1st Instruction
ins2: 2nd Instruction
ins3: 3rd Instruction
IA: Insertion Address
JTPR: Jump Target Prohibition Region
DCR: Detoured Code Region
The instructions in DCR are copied to the out-of-line buffer
of the kprobe, because the bytes in DCR are replaced by
a 5-byte jump instruction. So there are several limitations.
a) The instructions in DCR must be relocatable.
b) The instructions in DCR must not include a call instruction.
c) JTPR must not be targeted by any jump or call instruction.
d) DCR must not straddle the border between functions.
Anyway, these limitations are checked by the in-kernel instruction
decoder, so you don't need to worry about that.
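The terminology above can be made concrete with a small calculation. The Python sketch below is not the in-kernel decoder; it only assumes that the instruction lengths starting at IA are already known, that the jump is 5 bytes long as stated above, and that JTPR covers the bytes overwritten by the jump other than the first one. The names JUMP_SIZE and optimized_regions are made up for this example.

```python
JUMP_SIZE = 5  # length of the jump instruction, as stated in the text above


def optimized_regions(insn_lengths):
    """Return (dcr, jtpr) as half-open (start, end) byte offsets from IA.

    insn_lengths lists the lengths of consecutive instructions starting at
    the insertion address. DCR must consist of whole instructions and be
    at least JUMP_SIZE bytes long.
    """
    covered = 0
    for length in insn_lengths:
        if length <= 0:
            raise ValueError("instruction length must be positive")
        covered += length
        if covered >= JUMP_SIZE:
            return (0, covered), (1, JUMP_SIZE)
    raise ValueError("instructions do not cover the %d-byte jump" % JUMP_SIZE)


# The example from the text: two 2-byte instructions and one 3-byte one.
dcr, jtpr = optimized_regions([2, 2, 3])
print("DCR bytes %d..%d, JTPR bytes %d..%d"
      % (dcr[0], dcr[1] - 1, jtpr[0], jtpr[1] - 1))
```

For the three-instruction example this gives a DCR of bytes 0 to 6 and a JTPR of bytes 1 to 4.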
6. Probe Overhead
On a typical CPU in use in 2005, a kprobe hit takes 0.5 to 1.0
@ -476,6 +629,19 @@ k = 0.49 usec; j = 0.76; r = 0.80; kr = 0.82; jr = 1.07
ppc64: POWER5 (gr), 1656 MHz (SMT disabled, 1 virtual CPU per physical CPU)
k = 0.77 usec; j = 1.31; r = 1.26; kr = 1.45; jr = 1.99
6.1 Optimized Probe Overhead
Typically, an optimized kprobe hit takes 0.07 to 0.1 microseconds to
process. Here are sample overhead figures (in usec) for x86 architectures.
k = unoptimized kprobe, b = boosted (single-step skipped), o = optimized kprobe,
r = unoptimized kretprobe, rb = boosted kretprobe, ro = optimized kretprobe.
i386: Intel(R) Xeon(R) E5410, 2.33GHz, 4656.90 bogomips
k = 0.80 usec; b = 0.33; o = 0.05; r = 1.10; rb = 0.61; ro = 0.33
x86-64: Intel(R) Xeon(R) E5410, 2.33GHz, 4656.90 bogomips
k = 0.99 usec; b = 0.43; o = 0.06; r = 1.24; rb = 0.68; ro = 0.30
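To put the sample figures in perspective, the following Python sketch computes how many times cheaper an optimized probe hit is than an unoptimized one. The numbers are copied from the two rows above; the helper name speedup and the table layout are just for this example.

```python
# Sample overhead figures from above, in usec per probe hit.
FIGURES = {
    "i386":   {"k": 0.80, "b": 0.33, "o": 0.05, "r": 1.10, "rb": 0.61, "ro": 0.33},
    "x86-64": {"k": 0.99, "b": 0.43, "o": 0.06, "r": 1.24, "rb": 0.68, "ro": 0.30},
}


def speedup(arch, base, optimized):
    """Ratio of the base overhead to the optimized overhead for one arch."""
    row = FIGURES[arch]
    return row[base] / row[optimized]


for arch in sorted(FIGURES):
    print("%s: kprobe %.1fx, kretprobe %.1fx"
          % (arch, speedup(arch, "k", "o"), speedup(arch, "r", "ro")))
```

The kretprobe ratio is smaller than the kprobe ratio on both machines in these samples.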
7. TODO
a. SystemTap (http://sourceware.org/systemtap): Provides a simplified
@ -523,7 +689,8 @@ is also specified. Following columns show probe status. If the probe is on
a virtual address that is no longer valid (module init sections, module
virtual addresses that correspond to modules that've been unloaded),
such probes are marked with [GONE]. If the probe is temporarily disabled,
such probes are marked with [DISABLED].
such probes are marked with [DISABLED]. If the probe is optimized, it is
marked with [OPTIMIZED].
/sys/kernel/debug/kprobes/enabled: Turn kprobes ON/OFF forcibly.
@ -533,3 +700,19 @@ registered probes will be disarmed, till such time a "1" is echoed to this
file. Note that this knob just disarms and arms all kprobes and doesn't
change each probe's disabling state. This means that disabled kprobes (marked
[DISABLED]) will not be enabled if you turn ON all kprobes by this knob.
Appendix B: The kprobes sysctl interface
/proc/sys/debug/kprobes-optimization: Turn kprobes optimization ON/OFF.
When CONFIG_OPTPROBES=y, this sysctl interface appears and it provides
a knob to globally and forcibly turn jump optimization (see section
1.4) ON or OFF. By default, jump optimization is allowed (ON).
If you echo "0" to this file or set "debug.kprobes-optimization" to
0 via sysctl, all optimized probes will be unoptimized, and any new
probes registered after that will not be optimized. Note that this
knob *changes* the optimized state. This means that optimized probes
(marked [OPTIMIZED]) will be unoptimized ([OPTIMIZED] tag will be
removed). If the knob is turned on, they will be optimized again.
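A minimal sketch of driving this knob from a script, assuming the file path given above and that the file holds "0" or "1". The function names are invented for this example, writing normally requires root, and the path parameter exists only so the helpers can be tried against an ordinary file.

```python
KNOB = "/proc/sys/debug/kprobes-optimization"  # path from the text above


def get_optimization(path=KNOB):
    """Return True if jump optimization is currently allowed."""
    with open(path) as knob:
        return knob.read().strip() == "1"


def set_optimization(enabled, path=KNOB):
    """Write 1 or 0 to the knob to allow or forbid jump optimization."""
    with open(path, "w") as knob:
        knob.write("1\n" if enabled else "0\n")
```

After set_optimization(False), probes that were listed as [OPTIMIZED] in the debugfs list should lose that tag, as described above.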


@ -23,12 +23,12 @@ of a virtual machine. The ioctls belong to three classes
Only run vcpu ioctls from the same thread that was used to create the
vcpu.
2. File descritpors
2. File descriptors
The kvm API is centered around file descriptors. An initial
open("/dev/kvm") obtains a handle to the kvm subsystem; this handle
can be used to issue system ioctls. A KVM_CREATE_VM ioctl on this
handle will create a VM file descripror which can be used to issue VM
handle will create a VM file descriptor which can be used to issue VM
ioctls. A KVM_CREATE_VCPU ioctl on a VM fd will create a virtual cpu
and return a file descriptor pointing to it. Finally, ioctls on a vcpu
fd can be used to control the vcpu, including the important task of
@ -643,7 +643,7 @@ Type: vm ioctl
Parameters: struct kvm_clock_data (in)
Returns: 0 on success, -1 on error
Sets the current timestamp of kvmclock to the valued specific in its parameter.
Sets the current timestamp of kvmclock to the value specified in its parameter.
In conjunction with KVM_GET_CLOCK, it is used to ensure monotonicity on scenarios
such as migration.
@ -795,11 +795,11 @@ Unused.
__u64 data_offset; /* relative to kvm_run start */
} io;
If exit_reason is KVM_EXIT_IO_IN or KVM_EXIT_IO_OUT, then the vcpu has
If exit_reason is KVM_EXIT_IO, then the vcpu has
executed a port I/O instruction which could not be satisfied by kvm.
data_offset describes where the data is located (KVM_EXIT_IO_OUT) or
where kvm expects application code to place the data for the next
KVM_RUN invocation (KVM_EXIT_IO_IN). Data format is a patcked array.
KVM_RUN invocation (KVM_EXIT_IO_IN). Data format is a packed array.
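As an illustration of the packed array, the Python sketch below pulls the values out of a copy of the kvm_run area. It assumes that the element size and count come from the same io structure (those fields are not shown in this hunk), that size is 1, 2 or 4 bytes, and that the data is little-endian as on x86. The function name io_exit_data is made up for this example.

```python
def io_exit_data(run_area, data_offset, size, count):
    """Return `count` integers of `size` bytes stored at data_offset.

    run_area is a bytes-like copy of the kvm_run mapping; data_offset is
    relative to its start, as noted in the structure comment above.
    """
    if size not in (1, 2, 4):
        raise ValueError("unexpected port I/O size")
    end = data_offset + size * count
    if data_offset < 0 or end > len(run_area):
        raise ValueError("data lies outside the kvm_run area")
    values = []
    for off in range(data_offset, end, size):
        chunk = bytearray(run_area[off:off + size])
        value = 0
        for shift, byte in enumerate(chunk):  # little-endian assembly
            value |= byte << (8 * shift)
        values.append(value)
    return values
```

For an IN exit the application would write into the same location instead of reading from it before the next KVM_RUN.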
struct {
struct kvm_debug_exit_arch arch;
@ -815,7 +815,7 @@ Unused.
__u8 is_write;
} mmio;
If exit_reason is KVM_EXIT_MMIO or KVM_EXIT_IO_OUT, then the vcpu has
If exit_reason is KVM_EXIT_MMIO, then the vcpu has
executed a memory-mapped I/O instruction which could not be satisfied
by kvm. The 'data' member contains the written data if 'is_write' is
true, and should be filled by application code otherwise.


@ -650,6 +650,10 @@ LCD, CRT or DVI (if available). The following commands are available:
echo expand_toggle > /proc/acpi/ibm/video
echo video_switch > /proc/acpi/ibm/video
NOTE: Access to this feature is restricted to processes owning the
CAP_SYS_ADMIN capability for safety reasons, as it can interact badly
enough with some versions of X.org to crash it.
Each video output device can be enabled or disabled individually.
Reading /proc/acpi/ibm/video shows the status of each device.


@ -34,7 +34,6 @@
#include <sys/uio.h>
#include <termios.h>
#include <getopt.h>
#include <zlib.h>
#include <assert.h>
#include <sched.h>
#include <limits.h>


@ -32,6 +32,8 @@ cs89x0.txt
- the Crystal LAN (CS8900/20-based) Ethernet ISA adapter driver
cxacru.txt
- Conexant AccessRunner USB ADSL Modem
cxacru-cf.py
- Conexant AccessRunner USB ADSL Modem configuration file parser
de4x5.txt
- the Digital EtherWORKS DE4?? and DE5?? PCI Ethernet driver
decnet.txt


@ -0,0 +1,48 @@
#!/usr/bin/env python
# Copyright 2009 Simon Arlott
#
# This program is free software; you can redistribute it and/or modify it
# under the terms of the GNU General Public License as published by the Free
# Software Foundation; either version 2 of the License, or (at your option)
# any later version.
#
# This program is distributed in the hope that it will be useful, but WITHOUT
# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
# FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
# more details.
#
# You should have received a copy of the GNU General Public License along with
# this program; if not, write to the Free Software Foundation, Inc., 59
# Temple Place - Suite 330, Boston, MA 02111-1307, USA.
#
# Usage: cxacru-cf.py < cxacru-cf.bin
# Output: values string suitable for the sysfs adsl_config attribute
#
# Warning: cxacru-cf.bin with MD5 hash cdbac2689969d5ed5d4850f117702110
# contains mis-aligned values which will stop the modem from being able
# to make a connection. If the first and last two bytes are removed then
# the values become valid, but the modulation will be forced to ANSI
# T1.413 only which may not be appropriate.
#
# The original binary format is a packed list of le32 values.
import sys
import struct
# Read raw bytes: on Python 3 sys.stdin yields text, so use its binary buffer.
stdin = getattr(sys.stdin, "buffer", sys.stdin)

i = 0
while True:
    buf = stdin.read(4)

    if len(buf) == 0:
        break
    elif len(buf) != 4:
        sys.stdout.write("\n")
        sys.stderr.write("Error: read {0} not 4 bytes\n".format(len(buf)))
        sys.exit(1)

    if i > 0:
        sys.stdout.write(" ")
    sys.stdout.write("{0:x}={1}".format(i, struct.unpack("<I", buf)[0]))
    i += 1

sys.stdout.write("\n")


@ -4,6 +4,12 @@ While it is capable of managing/maintaining the ADSL connection without the
module loaded, the device will sometimes stop responding after unloading the
driver and it is necessary to unplug/remove power to the device to fix this.
Note: support for cxacru-cf.bin has been removed. It was not loaded correctly
so it had no effect on the device configuration. Fixing it could have stopped
existing devices working when an invalid configuration is supplied.
There is a script cxacru-cf.py to convert an existing file to the sysfs form.
Detected devices will appear as ATM devices named "cxacru". In /sys/class/atm/
these are directories named cxacruN where N is the device number. A symlink
named device points to the USB interface device's directory which contains
@ -15,6 +21,15 @@ several sysfs attribute files for retrieving device statistics:
* adsl_headend_environment
Information about the remote headend.
* adsl_config
Configuration writing interface.
Write parameters in hexadecimal format <index>=<value>,
separated by whitespace, e.g.:
"1=0 a=5"
Up to 7 parameters at a time will be sent and the modem will restart
the ADSL connection when any value is set. These are logged for future
reference.
* downstream_attenuation (dB)
* downstream_bits_per_frame
* downstream_rate (kbps)
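For the adsl_config attribute described above, a string in the <index>=<value> form could be built as in the following Python sketch. It assumes that both index and value are written in hexadecimal, as the description says. Splitting the list into groups of at most 7 is a conservative choice made here, not something the text requires, and the names format_adsl_config and MAX_PER_WRITE are hypothetical.

```python
MAX_PER_WRITE = 7  # "Up to 7 parameters at a time" per the description above


def format_adsl_config(params):
    """Turn (index, value) pairs into strings suitable for adsl_config.

    Returns a list of strings, each holding at most MAX_PER_WRITE
    whitespace-separated <index>=<value> entries in hexadecimal.
    """
    pairs = ["{0:x}={1:x}".format(index, value) for index, value in params]
    return [" ".join(pairs[start:start + MAX_PER_WRITE])
            for start in range(0, len(pairs), MAX_PER_WRITE)]


# Reproduces the example string from the description.
print(format_adsl_config([(0x1, 0x0), (0xa, 0x5)])[0])
```

Each returned string would then be written to the adsl_config file of the device in question; keep in mind that the modem restarts the ADSL connection whenever a value is set.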
@ -61,6 +76,7 @@ several sysfs attribute files for retrieving device statistics:
* mac_address
* modulation
"" (when not connected)
"ANSI T1.413"
"ITU-T G.992.1 (G.DMT)"
"ITU-T G.992.2 (G.LITE)"


@ -58,8 +58,10 @@ DCCP_SOCKOPT_GET_CUR_MPS is read-only and retrieves the current maximum packet
size (application payload size) in bytes, see RFC 4340, section 14.
DCCP_SOCKOPT_AVAILABLE_CCIDS is also read-only and returns the list of CCIDs
supported by the endpoint (see include/linux/dccp.h for symbolic constants).
The caller needs to provide a sufficiently large (> 2) array of type uint8_t.
supported by the endpoint. The option value is an array of type uint8_t whose
size is passed as option length. The minimum array size is 4 elements, the
value returned in the optlen argument always reflects the true number of
built-in CCIDs.
DCCP_SOCKOPT_CCID is write-only and sets both the TX and RX CCIDs at the same
time, combining the operation of the next two socket options. This option is


@ -487,6 +487,30 @@ tcp_dma_copybreak - INTEGER
and CONFIG_NET_DMA is enabled.
Default: 4096
tcp_thin_linear_timeouts - BOOLEAN
Enable dynamic triggering of linear timeouts for thin streams.
If set, a check is performed upon retransmission by timeout to
determine if the stream is thin (less than 4 packets in flight).
As long as the stream is found to be thin, up to 6 linear
timeouts may be performed before exponential backoff mode is
initiated. This improves retransmission latency for
non-aggressive thin streams, often found to be time-dependent.
For more information on thin streams, see
Documentation/networking/tcp-thin.txt
Default: 0
tcp_thin_dupack - BOOLEAN
Enable dynamic triggering of retransmissions after one dupACK
for thin streams. If set, a check is performed upon reception
of a dupACK to determine if the stream is thin (less than 4
packets in flight). As long as the stream is found to be thin,
data is retransmitted on the first received dupACK. This
improves retransmission latency for non-aggressive thin
streams, often found to be time-dependent.
For more information on thin streams, see
Documentation/networking/tcp-thin.txt
Default: 0
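The effect of tcp_thin_linear_timeouts on retransmission timing can be sketched with a simplified model. The Python code below is not the kernel algorithm: it assumes the stream stays thin for the whole sequence, ignores the upper bound on the RTO, and uses invented names (is_thin, timeout_schedule). Only the two thresholds come from the text above.

```python
THIN_LIMIT = 4        # "less than 4 packets in flight"
LINEAR_TIMEOUTS = 6   # "up to 6 linear timeouts"


def is_thin(packets_in_flight):
    """A stream counts as thin while fewer than 4 packets are in flight."""
    return packets_in_flight < THIN_LIMIT


def timeout_schedule(base_rto, retries, packets_in_flight,
                     thin_linear_timeouts=True):
    """Return the successive timeout intervals for `retries` timeouts.

    With the option on and a thin stream, the first LINEAR_TIMEOUTS
    intervals stay at base_rto; after that (or otherwise) each interval
    doubles, i.e. ordinary exponential backoff.
    """
    linear = 0
    if thin_linear_timeouts and is_thin(packets_in_flight):
        linear = LINEAR_TIMEOUTS
    schedule = []
    rto = base_rto
    for attempt in range(retries):
        schedule.append(rto)
        if attempt + 1 >= linear:
            rto *= 2
    return schedule


print(timeout_schedule(1.0, 8, packets_in_flight=2))
print(timeout_schedule(1.0, 8, packets_in_flight=10))
```

In this model, the time until the eighth retransmission of a thin stream is a small fraction of what plain exponential backoff would need.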
UDP variables:
udp_mem - vector of 3 INTEGERs: min, pressure, max
@ -692,6 +716,25 @@ proxy_arp - BOOLEAN
conf/{all,interface}/proxy_arp is set to TRUE,
it will be disabled otherwise
proxy_arp_pvlan - BOOLEAN
Private VLAN proxy arp.
Basically allow proxy arp replies back to the same interface
(from which the ARP request/solicitation was received).
This is done to support (ethernet) switch features, like RFC
3069, where the individual ports are NOT allowed to
communicate with each other, but they are allowed to talk to
the upstream router. As described in RFC 3069, it is possible
to allow these hosts to communicate through the upstream
router by proxy_arp'ing. This option does not need to be used
together with proxy_arp.
This technology is known by different names:
In RFC 3069 it is called VLAN Aggregation.
Cisco and Allied Telesyn call it Private VLAN.
Hewlett-Packard call it Source-Port filtering or port-isolation.
Ericsson call it MAC-Forced Forwarding (RFC Draft).
shared_media - BOOLEAN
Send(router) or accept(host) RFC1620 shared media redirects.
Overrides ip_secure_redirects.
@ -833,9 +876,18 @@ arp_notify - BOOLEAN
or hardware address changes.
arp_accept - BOOLEAN
Define behavior when gratuitous arp replies are received:
0 - drop gratuitous arp frames
1 - accept gratuitous arp frames
Define behavior for gratuitous ARP frames whose IP is not
already present in the ARP table:
0 - don't create new entries in the ARP table
1 - create new entries in the ARP table
Both reply-type and request-type gratuitous ARP frames will
trigger the ARP table to be updated if this setting is on.
If the ARP table already contains the IP address of the
gratuitous ARP frame, the ARP table will be updated regardless
of whether this setting is on or off.
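The arp_accept rules above amount to a small decision table. The following Python sketch models the ARP table as a plain dictionary purely for illustration; the function name and the return strings are invented and do not correspond to kernel code.

```python
def handle_gratuitous_arp(arp_table, ip, mac, arp_accept):
    """Apply the arp_accept rules to one gratuitous ARP frame.

    arp_table maps IP address strings to MAC address strings. Returns a
    string describing what happened to the table.
    """
    if ip in arp_table:
        # Existing entries are refreshed regardless of the setting.
        arp_table[ip] = mac
        return "updated"
    if arp_accept:
        # Unknown IP: a new entry is created only when arp_accept is 1.
        arp_table[ip] = mac
        return "created"
    return "ignored"


table = {"192.0.2.1": "00:00:5e:00:53:01"}
print(handle_gratuitous_arp(table, "192.0.2.1", "00:00:5e:00:53:02", 0))
print(handle_gratuitous_arp(table, "192.0.2.9", "00:00:5e:00:53:09", 0))
print(handle_gratuitous_arp(table, "192.0.2.9", "00:00:5e:00:53:09", 1))
```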
app_solicit - INTEGER
The maximum number of probes to send to the user space ARP daemon


@ -0,0 +1,90 @@
Linux* Base Driver for Intel(R) Network Connection
==================================================
November 24, 2009
Contents
========
- In This Release
- Identifying Your Adapter
- Known Issues/Troubleshooting
- Support
In This Release
===============
This file describes the ixgbevf Linux* Base Driver for Intel Network
Connection.
The ixgbevf driver supports 82599-based virtual function devices that can only
be activated on kernels with CONFIG_PCI_IOV enabled.
The ixgbevf driver supports virtual functions generated by the ixgbe driver
with a max_vfs value of 1 or greater.
The guest OS loading the ixgbevf driver must support MSI-X interrupts.
VLANs: There is a limit of a total of 32 shared VLANs to 1 or more VFs.
Identifying Your Adapter
========================
For more information on how to identify your adapter, go to the Adapter &
Driver ID Guide at:
http://support.intel.com/support/network/sb/CS-008441.htm
Known Issues/Troubleshooting
============================
Unloading Physical Function (PF) Driver Causes System Reboots When VM is
Running and VF is Loaded on the VM
------------------------------------------------------------------------
Do not unload the PF driver (ixgbe) while VFs are assigned to guests.
Support
=======
For general information, go to the Intel support website at:
http://support.intel.com
or the Intel Wired Networking project hosted by Sourceforge at:
http://sourceforge.net/projects/e1000
If an issue is identified with the released source code on the supported
kernel with a supported adapter, email the specific information related
to the issue to e1000-devel@lists.sf.net
License
=======
Intel 10 Gigabit Linux driver.
Copyright(c) 1999 - 2009 Intel Corporation.
This program is free software; you can redistribute it and/or modify it
under the terms and conditions of the GNU General Public License,
version 2, as published by the Free Software Foundation.
This program is distributed in the hope it will be useful, but WITHOUT
ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
more details.
You should have received a copy of the GNU General Public License along with
this program; if not, write to the Free Software Foundation, Inc.,
51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA.
The full GNU General Public License is included in this distribution in
the file called "COPYING".
Trademarks
==========
Intel, Itanium, and Pentium are trademarks or registered trademarks of
Intel Corporation or its subsidiaries in the United States and other
countries.
* Other names and brands may be claimed as the property of others.


@ -2,7 +2,7 @@
+ ABSTRACT
--------------------------------------------------------------------------------
This file documents the CONFIG_PACKET_MMAP option available with the PACKET
This file documents the mmap() facility available with the PACKET
socket interface on 2.4 and 2.6 kernels. This type of socket is used to
capture network traffic with utilities like tcpdump or any other that needs
raw access to the network interface.
@ -44,7 +44,7 @@ enabled. For transmission, check the MTU (Maximum Transmission Unit) used and
supported by devices of your network.
--------------------------------------------------------------------------------
+ How to use CONFIG_PACKET_MMAP to improve capture process
+ How to use mmap() to improve capture process
--------------------------------------------------------------------------------
From the user standpoint, you should use the higher level libpcap library, which
@ -64,7 +64,7 @@ the low level details or want to improve libpcap by including PACKET_MMAP
support.
--------------------------------------------------------------------------------
+ How to use CONFIG_PACKET_MMAP directly to improve capture process
+ How to use mmap() directly to improve capture process
--------------------------------------------------------------------------------
From the system calls stand point, the use of PACKET_MMAP involves
@ -105,7 +105,7 @@ also the mapping of the circular buffer in the user process and
the use of this buffer.
--------------------------------------------------------------------------------
+ How to use CONFIG_PACKET_MMAP directly to improve transmission process
+ How to use mmap() directly to improve transmission process
--------------------------------------------------------------------------------
Transmission process is similar to capture as shown below.


@ -188,3 +188,27 @@ Then in some part of your code after your wiphy has been registered:
&mydriver_jp_regdom.reg_rules[i],
sizeof(struct ieee80211_reg_rule));
regulatory_struct_hint(rd);
Statically compiled regulatory database
---------------------------------------
In most situations the userland solution using CRDA as described
above is the preferred solution. However in some cases a set of
rules built into the kernel itself may be desirable. To account
for this situation, a configuration option has been provided
(i.e. CONFIG_CFG80211_INTERNAL_REGDB). With this option enabled,
the wireless database information contained in net/wireless/db.txt is
used to generate a data structure encoded in net/wireless/regdb.c.
That option also enables code in net/wireless/reg.c which queries
the data in regdb.c as an alternative to using CRDA.
The file net/wireless/db.txt should be kept up-to-date with the db.txt
file available in the git repository here:
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-regdb.git
Again, most users in most situations should be using the CRDA package
provided with their distribution, and in most other situations users
should be building and using CRDA on their own rather than using
this option. If you are not absolutely sure that you should be using
CONFIG_CFG80211_INTERNAL_REGDB then _DO_NOT_USE_IT_.


@ -0,0 +1,47 @@
Thin-streams and TCP
====================
A wide range of Internet-based services that use reliable transport
protocols display what we call thin-stream properties. This means
that the application sends data with such a low rate that the
retransmission mechanisms of the transport protocol are not fully
effective. In time-dependent scenarios (like online games, control
systems, stock trading etc.) where the user experience depends
on the data delivery latency, packet loss can be devastating for
the service quality. Extreme latencies are caused by TCP's
dependency on the arrival of new data from the application to trigger
retransmissions effectively through fast retransmit instead of
waiting for long timeouts.
After analysing a large number of time-dependent interactive
applications, we have seen that they often produce thin streams
and also stay with this traffic pattern throughout their entire
lifespan. The combination of time-dependency and the fact that the
streams provoke high latencies when using TCP is unfortunate.
In order to reduce application-layer latency when packets are lost,
a set of mechanisms has been made, which address these latency issues
for thin streams. In short, if the kernel detects a thin stream,
the retransmission mechanisms are modified in the following manner:
1) If the stream is thin, fast retransmit on the first dupACK.
2) If the stream is thin, do not apply exponential backoff.
These enhancements are applied only if the stream is detected as
thin. This is accomplished by defining a threshold for the number
of packets in flight. If there are fewer than 4 packets in flight,
fast retransmissions cannot be triggered, and the stream is prone
to experience high retransmission latencies.
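The detection threshold and the two modifications can be modelled in a few lines of C. This is a simplified sketch under stated assumptions, not the kernel implementation: the function names are hypothetical, the value 3 is the standard fast-retransmit dupACK threshold, and a real stack also bounds the timeout, which is omitted here.

```c
#include <stdbool.h>

/* Threshold from the text: fewer than 4 packets in flight means "thin". */
#define THIN_STREAM_MAX_IN_FLIGHT 4

static bool stream_is_thin(unsigned int packets_in_flight)
{
    return packets_in_flight < THIN_STREAM_MAX_IN_FLIGHT;
}

/* dupACKs needed before fast retransmit: 1 when thin, 3 otherwise. */
static unsigned int dupack_threshold(unsigned int packets_in_flight)
{
    return stream_is_thin(packets_in_flight) ? 1 : 3;
}

/* Retransmission timeout after 'retries' consecutive timeouts. */
static unsigned long next_rto_ms(unsigned long base_rto_ms,
                                 unsigned int retries,
                                 unsigned int packets_in_flight)
{
    if (stream_is_thin(packets_in_flight))
        return base_rto_ms;          /* linear: no exponential backoff */
    if (retries > 16)
        retries = 16;                /* keep the shift well defined */
    return base_rto_ms << retries;   /* doubles with every timeout */
}
```

With a 200 ms base timeout and three timeouts in a row, the model yields 200 ms for a thin stream and 1600 ms for a regular one, which is the latency gap the mechanisms aim to close.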
Since these mechanisms are targeted at time-dependent applications,
they must be specifically activated by the application using the
TCP_THIN_LINEAR_TIMEOUTS and TCP_THIN_DUPACK socket options or the
tcp_thin_linear_timeouts and tcp_thin_dupack sysctls. Both
modifications are turned off by default.
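For a single connection, the two per-socket switches named above are set with setsockopt() at the IPPROTO_TCP level. The helper below is a minimal sketch; enable_thin_stream_mode() is a hypothetical name, and the fallback defines are only there for C library headers that predate these options.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Fallback values for older C library headers; they match linux/tcp.h. */
#ifndef TCP_THIN_LINEAR_TIMEOUTS
#define TCP_THIN_LINEAR_TIMEOUTS 16
#endif
#ifndef TCP_THIN_DUPACK
#define TCP_THIN_DUPACK 17
#endif

/*
 * Turn on both thin-stream modifications for one TCP socket.
 * Returns 0 on success, -1 on failure with errno set by setsockopt().
 */
static int enable_thin_stream_mode(int fd)
{
    int on = 1;

    if (setsockopt(fd, IPPROTO_TCP, TCP_THIN_LINEAR_TIMEOUTS,
                   &on, sizeof(on)) < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_THIN_DUPACK,
                   &on, sizeof(on)) < 0)
        return -1;
    return 0;
}
```

An application would call this once on a connected or listening TCP socket; the sysctls change the default for all sockets instead.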
References
==========
More information on the modifications, as well as a wide range of
experimental data can be found here:
"Improving latency for interactive, thin-stream applications over
reliable transport"
http://simula.no/research/nd/publications/Simula.nd.477/simula_pdf_file


@ -0,0 +1,118 @@
This file explains the locking and exclusion scheme used in the PCCARD
and PCMCIA subsystems.
A) Overview, Locking Hierarchy:
===============================
pcmcia_socket_list_rwsem - protects only the list of sockets
- skt_mutex - serializes card insert / ejection
- ops_mutex - serializes socket operation
B) Exclusion
============
The following functions and callbacks to struct pcmcia_socket must
be called with "skt_mutex" held:
socket_detect_change()
send_event()
socket_reset()
socket_shutdown()
socket_setup()
socket_remove()
socket_insert()
socket_early_resume()
socket_late_resume()
socket_resume()
socket_suspend()
struct pcmcia_callback *callback
The following functions and callbacks to struct pcmcia_socket must
be called with "ops_mutex" held:
socket_reset()
socket_setup()
struct pccard_operations *ops
struct pccard_resource_ops *resource_ops;
Note that send_event() and struct pcmcia_callback *callback must not be
called with "ops_mutex" held.
C) Protection
=============
1. Global Data:
---------------
struct list_head pcmcia_socket_list;
protected by pcmcia_socket_list_rwsem;
2. Per-Socket Data:
-------------------
The resource_ops and their data are protected by ops_mutex.
The "main" struct pcmcia_socket is protected as follows (read-only fields
or single-use fields not mentioned):
- by pcmcia_socket_list_rwsem:
struct list_head socket_list;
- by thread_lock:
unsigned int thread_events;
- by skt_mutex:
u_int suspended_state;
void (*tune_bridge);
struct pcmcia_callback *callback;
int resume_status;
- by ops_mutex:
socket_state_t socket;
u_int state;
u_short lock_count;
pccard_mem_map cis_mem;
void __iomem *cis_virt;
struct { } irq;
io_window_t io[];
pccard_mem_map win[];
struct list_head cis_cache;
size_t fake_cis_len;
u8 *fake_cis;
u_int irq_mask;
void (*zoom_video);
int (*power_hook);
u8 resource...;
struct list_head devices_list;
u8 device_count;
struct pcmcia_state;
3. Per PCMCIA-device Data:
--------------------------
The "main" struct pcmcia_device is protected as follows (read-only fields
or single-use fields not mentioned):
- by pcmcia_socket->ops_mutex:
struct list_head socket_device_list;
struct config_t *function_config;
u16 _irq:1;
u16 _io:1;
u16 _win:4;
u16 _locked:1;
u16 allow_func_id_match:1;
u16 suspended:1;
u16 _removed:1;
- by the PCMCIA driver:
io_req_t io;
irq_req_t irq;
config_req_t conf;
window_handle_t win;


@ -224,6 +224,12 @@ defined in include/linux/pm.h:
RPM_SUSPENDED, which means that each device is initially regarded by the
PM core as 'suspended', regardless of its real hardware status
unsigned int runtime_auto;
- if set, indicates that the user space has allowed the device driver to
power manage the device at run time via the /sys/devices/.../power/control
interface; it may only be modified with the help of the pm_runtime_allow()
and pm_runtime_forbid() helper functions
All of the above fields are members of the 'power' member of 'struct device'.
4. Run-time PM Device Helper Functions
@ -329,6 +335,20 @@ drivers/base/power/runtime.c and include/linux/pm_runtime.h:
'power.runtime_error' is set or 'power.disable_depth' is greater than
zero)
bool pm_runtime_suspended(struct device *dev);
- return true if the device's runtime PM status is 'suspended', or false
otherwise
void pm_runtime_allow(struct device *dev);
- set the power.runtime_auto flag for the device and decrease its usage
counter (used by the /sys/devices/.../power/control interface to
effectively allow the device to be power managed at run time)
void pm_runtime_forbid(struct device *dev);
- unset the power.runtime_auto flag for the device and increase its usage
counter (used by the /sys/devices/.../power/control interface to
effectively prevent the device from being power managed at run time)
It is safe to execute the following helper functions from interrupt context:
pm_request_idle()
@ -382,6 +402,18 @@ may be desirable to suspend the device as soon as ->probe() or ->remove() has
finished, so the PM core uses pm_runtime_idle_sync() to invoke the
subsystem-level idle callback for the device at that time.
User space can effectively prevent the driver of the device from power managing
it at run time by changing the value of its /sys/devices/.../power/control
attribute to "on", which causes pm_runtime_forbid() to be called. In principle,
this mechanism may also be used by the driver to effectively turn off the
run-time power management of the device until the user space turns it on.
Namely, during the initialization the driver can make sure that the run-time PM
status of the device is 'active' and call pm_runtime_forbid(). It should be
noted, however, that if the user space has already intentionally changed the
value of /sys/devices/.../power/control to "auto" to allow the driver to power
manage the device at run time, the driver may confuse it by using
pm_runtime_forbid() this way.
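From user space, the control attribute described above is an ordinary text file. The following sketch writes "auto" or "on" to it; set_runtime_pm_control() is a hypothetical helper, and the caller must supply the full path of the attribute because the device part of the path is specific to each system.

```c
#include <stdio.h>

/*
 * Write "auto" (allow run-time PM) or "on" (forbid it) to a device's
 * power/control attribute. 'control_path' is supplied by the caller.
 * Returns 0 on success, -1 on failure.
 */
static int set_runtime_pm_control(const char *control_path, int allow)
{
    FILE *f = fopen(control_path, "w");
    int rc = 0;

    if (!f)
        return -1;
    if (fputs(allow ? "auto" : "on", f) == EOF)
        rc = -1;
    if (fclose(f) == EOF)   /* write errors may only show up here */
        rc = -1;
    return rc;
}
```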
6. Run-time PM and System Sleep
Run-time PM and system sleep (i.e., system suspend and hibernation, also known
@ -431,3 +463,64 @@ The PM core always increments the run-time usage counter before calling the
->prepare() callback and decrements it after calling the ->complete() callback.
Hence disabling run-time PM temporarily like this will not cause any run-time
suspend callbacks to be lost.
7. Generic subsystem callbacks
Subsystems may wish to conserve code space by using the set of generic power
management callbacks provided by the PM core, defined in
drivers/base/power/generic_ops.c:
int pm_generic_runtime_idle(struct device *dev);
- invoke the ->runtime_idle() callback provided by the driver of this
device, if defined, and call pm_runtime_suspend() for this device if the
return value is 0 or the callback is not defined
int pm_generic_runtime_suspend(struct device *dev);
- invoke the ->runtime_suspend() callback provided by the driver of this
device and return its result, or return -EINVAL if not defined
int pm_generic_runtime_resume(struct device *dev);
- invoke the ->runtime_resume() callback provided by the driver of this
device and return its result, or return -EINVAL if not defined
int pm_generic_suspend(struct device *dev);
- if the device has not been suspended at run time, invoke the ->suspend()
callback provided by its driver and return its result, or return 0 if not
defined
int pm_generic_resume(struct device *dev);
- invoke the ->resume() callback provided by the driver of this device and,
if successful, change the device's runtime PM status to 'active'
int pm_generic_freeze(struct device *dev);
- if the device has not been suspended at run time, invoke the ->freeze()
callback provided by its driver and return its result, or return 0 if not
defined
int pm_generic_thaw(struct device *dev);
- if the device has not been suspended at run time, invoke the ->thaw()
callback provided by its driver and return its result, or return 0 if not
defined
int pm_generic_poweroff(struct device *dev);
- if the device has not been suspended at run time, invoke the ->poweroff()
callback provided by its driver and return its result, or return 0 if not
defined
int pm_generic_restore(struct device *dev);
- invoke the ->restore() callback provided by the driver of this device and,
if successful, change the device's runtime PM status to 'active'
These functions can be assigned to the ->runtime_idle(), ->runtime_suspend(),
->runtime_resume(), ->suspend(), ->resume(), ->freeze(), ->thaw(), ->poweroff(),
or ->restore() callback pointers in the subsystem-level dev_pm_ops structures.
If a subsystem wishes to use all of them at the same time, it can simply assign
the GENERIC_SUBSYS_PM_OPS macro, defined in include/linux/pm.h, to its
dev_pm_ops structure pointer.
Device drivers that wish to use the same function as a system suspend, freeze,
poweroff and run-time suspend callback, and similarly for system resume, thaw,
restore, and run-time resume, can achieve this with the help of the
UNIVERSAL_DEV_PM_OPS macro defined in include/linux/pm.h (possibly setting its
last argument to NULL).
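The control flow documented for pm_generic_runtime_idle() can be illustrated with a user-space model. Everything below is a mock: struct mock_device, mock_pm_runtime_suspend() and the two sample driver callbacks are hypothetical stand-ins, not the kernel's types or functions. The sketch only shows that a missing callback or a zero return leads to a suspend attempt, while a non-zero return is passed back unchanged.

```c
#include <errno.h>
#include <stddef.h>

/* Mock stand-in for struct device; only what the sketch needs. */
struct mock_device {
    int (*runtime_idle)(struct mock_device *dev); /* may be NULL */
    int suspend_calls;                            /* mock suspends seen */
};

/* Stand-in for pm_runtime_suspend(). */
static void mock_pm_runtime_suspend(struct mock_device *dev)
{
    dev->suspend_calls++;
}

/* Mirrors the documented behavior of pm_generic_runtime_idle(). */
static int mock_generic_runtime_idle(struct mock_device *dev)
{
    if (dev->runtime_idle) {
        int ret = dev->runtime_idle(dev);

        if (ret)
            return ret;          /* driver vetoed the suspend */
    }
    mock_pm_runtime_suspend(dev);
    return 0;
}

/* Sample driver callbacks used to exercise the model. */
static int sample_idle_ok(struct mock_device *dev)
{
    (void)dev;
    return 0;
}

static int sample_idle_busy(struct mock_device *dev)
{
    (void)dev;
    return -EBUSY;
}
```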


@ -0,0 +1,53 @@
CAN Device Tree Bindings
------------------------
(c) 2006-2009 Secret Lab Technologies Ltd
Grant Likely <grant.likely@secretlab.ca>
fsl,mpc5200-mscan nodes
-----------------------
In addition to the required compatible-, reg- and interrupt-properties, you can
also specify which clock source shall be used for the controller:
- fsl,mscan-clock-source : a string describing the clock source. Valid values
are: "ip" for ip bus clock
"ref" for reference clock (XTAL)
"ref" is default in case this property is not
present.
fsl,mpc5121-mscan nodes
-----------------------
In addition to the required compatible-, reg- and interrupt-properties, you can
also specify which clock source and divider shall be used for the controller:
- fsl,mscan-clock-source : a string describing the clock source. Valid values
are: "ip" for ip bus clock
"ref" for reference clock
"sys" for system clock
If this property is not present, an optimal CAN
clock source and frequency based on the system
clock will be selected. If this is not possible,
the reference clock will be used.
- fsl,mscan-clock-divider: for the reference and system clock, an additional
clock divider can be specified. By default, a
value of 1 is used.
Note that the MPC5121 Rev. 1 processor is not supported.
Examples:
can@1300 {
compatible = "fsl,mpc5121-mscan";
interrupts = <12 0x8>;
interrupt-parent = <&ipic>;
reg = <0x1300 0x80>;
};
can@1380 {
compatible = "fsl,mpc5121-mscan";
interrupts = <13 0x8>;
interrupt-parent = <&ipic>;
reg = <0x1380 0x80>;
fsl,mscan-clock-source = "ref";
fsl,mscan-clock-divider = <3>;
};


@ -44,21 +44,29 @@ Example:
compatible = "fsl,mpc8349-dma-channel", "fsl,elo-dma-channel";
cell-index = <0>;
reg = <0 0x80>;
interrupt-parent = <&ipic>;
interrupts = <71 8>;
};
dma-channel@80 {
compatible = "fsl,mpc8349-dma-channel", "fsl,elo-dma-channel";
cell-index = <1>;
reg = <0x80 0x80>;
interrupt-parent = <&ipic>;
interrupts = <71 8>;
};
dma-channel@100 {
compatible = "fsl,mpc8349-dma-channel", "fsl,elo-dma-channel";
cell-index = <2>;
reg = <0x100 0x80>;
interrupt-parent = <&ipic>;
interrupts = <71 8>;
};
dma-channel@180 {
compatible = "fsl,mpc8349-dma-channel", "fsl,elo-dma-channel";
cell-index = <3>;
reg = <0x180 0x80>;
interrupt-parent = <&ipic>;
interrupts = <71 8>;
};
};


@ -2,15 +2,14 @@
Required properties :
- device_type : Should be "i2c"
- reg : Offset and length of the register set for the device
- compatible : should be "fsl,CHIP-i2c" where CHIP is the name of a
compatible processor, e.g. mpc8313, mpc8543, mpc8544, mpc5121,
mpc5200 or mpc5200b. For the mpc5121, an additional node
"fsl,mpc5121-i2c-ctrl" is required as shown in the example below.
Recommended properties :
- compatible : compatibility list with 2 entries, the first should
be "fsl,CHIP-i2c" where CHIP is the name of a compatible processor,
e.g. mpc8313, mpc8543, mpc8544, mpc5200 or mpc5200b. The second one
should be "fsl-i2c".
- interrupts : <a b> where a is the interrupt number and b is a
field that represents an encoding of the sense and level
information for the interrupt. This should be encoded based on
@ -24,25 +23,40 @@ Recommended properties :
Examples :
/* MPC5121 based board */
i2c@1740 {
#address-cells = <1>;
#size-cells = <0>;
compatible = "fsl,mpc5121-i2c", "fsl-i2c";
reg = <0x1740 0x20>;
interrupts = <11 0x8>;
interrupt-parent = <&ipic>;
clock-frequency = <100000>;
};
i2ccontrol@1760 {
compatible = "fsl,mpc5121-i2c-ctrl";
reg = <0x1760 0x8>;
};
/* MPC5200B based board */
i2c@3d00 {
#address-cells = <1>;
#size-cells = <0>;
compatible = "fsl,mpc5200b-i2c","fsl,mpc5200-i2c","fsl-i2c";
cell-index = <0>;
reg = <0x3d00 0x40>;
interrupts = <2 15 0>;
interrupt-parent = <&mpc5200_pic>;
fsl,preserve-clocking;
};
/* MPC8544 based board */
i2c@3100 {
#address-cells = <1>;
#size-cells = <0>;
cell-index = <1>;
compatible = "fsl,mpc8544-i2c", "fsl-i2c";
reg = <0x3100 0x100>;
interrupts = <43 2>;
interrupt-parent = <&mpic>;
clock-frequency = <400000>;
};


@ -0,0 +1,70 @@
MPC5121 PSC Device Tree Bindings
PSC in UART mode
----------------
For PSC in UART mode the needed PSC serial devices
are specified by fsl,mpc5121-psc-uart nodes in the
fsl,mpc5121-immr SoC node. Additionally the PSC FIFO
Controller node fsl,mpc5121-psc-fifo is required there:
fsl,mpc5121-psc-uart nodes
--------------------------
Required properties :
- compatible : Should contain "fsl,mpc5121-psc-uart" and "fsl,mpc5121-psc"
- cell-index : Index of the PSC in hardware
- reg : Offset and length of the register set for the PSC device
- interrupts : <a b> where a is the interrupt number of the
PSC FIFO Controller and b is a field that represents an
encoding of the sense and level information for the interrupt.
- interrupt-parent : the phandle for the interrupt controller that
services interrupts for this device.
Recommended properties :
- fsl,rx-fifo-size : the size of the RX fifo slice (a multiple of 4)
- fsl,tx-fifo-size : the size of the TX fifo slice (a multiple of 4)
fsl,mpc5121-psc-fifo node
-------------------------
Required properties :
- compatible : Should be "fsl,mpc5121-psc-fifo"
- reg : Offset and length of the register set for the PSC
FIFO Controller
- interrupts : <a b> where a is the interrupt number of the
PSC FIFO Controller and b is a field that represents an
encoding of the sense and level information for the interrupt.
- interrupt-parent : the phandle for the interrupt controller that
services interrupts for this device.
Example for a board using PSC0 and PSC1 devices in serial mode:
serial@11000 {
compatible = "fsl,mpc5121-psc-uart", "fsl,mpc5121-psc";
cell-index = <0>;
reg = <0x11000 0x100>;
interrupts = <40 0x8>;
interrupt-parent = < &ipic >;
fsl,rx-fifo-size = <16>;
fsl,tx-fifo-size = <16>;
};
serial@11100 {
compatible = "fsl,mpc5121-psc-uart", "fsl,mpc5121-psc";
cell-index = <1>;
reg = <0x11100 0x100>;
interrupts = <40 0x8>;
interrupt-parent = < &ipic >;
fsl,rx-fifo-size = <16>;
fsl,tx-fifo-size = <16>;
};
pscfifo@11f00 {
compatible = "fsl,mpc5121-psc-fifo";
reg = <0x11f00 0x100>;
interrupts = <40 0x8>;
interrupt-parent = < &ipic >;
};


@ -195,11 +195,4 @@ External interrupts:
fsl,mpc5200-mscan nodes
-----------------------
In addition to the required compatible-, reg- and interrupt-properites, you can
also specify which clock source shall be used for the controller:
- fsl,mscan-clock-source- a string describing the clock source. Valid values
are: "ip" for ip bus clock
"ref" for reference clock (XTAL)
"ref" is default in case this property is not
present.
See file can.txt in this directory.


@ -13,6 +13,11 @@ Required properties:
- interrupt-parent : the phandle for the interrupt controller that
services interrupts for this device.
Optional properties:
- gpios : specifies the gpio pins to be used for chipselects.
The gpios will be referred to as reg = <index> in the SPI child nodes.
If unspecified, a single SPI device without a chip select can be used.
Example:
spi@4c0 {
cell-index = <0>;
@ -21,4 +26,6 @@ Example:
interrupts = <82 0>;
interrupt-parent = <700>;
mode = "cpu";
gpios = <&gpio 18 1 // device reg=<0>
&gpio 19 1>; // device reg=<1>
};


@ -0,0 +1,134 @@
GDB intends to support the following hardware debug features of BookE
processors:
4 hardware breakpoints (IAC)
2 hardware watchpoints (read, write and read-write) (DAC)
2 value conditions for the hardware watchpoints (DVC)
For that, we need to extend ptrace so that GDB can query and set these
resources. Since we're extending, we're trying to create an interface
that's extendable and that covers both BookE and server processors, so
that GDB doesn't need to special-case each of them. We added the
following 3 new ptrace requests.
1. PTRACE_PPC_GETHWDEBUGINFO
Query for GDB to discover the hardware debug features. The main info to
be returned here is the minimum alignment for the hardware watchpoints.
BookE processors don't have restrictions here, but server processors have
an 8-byte alignment restriction for hardware watchpoints. We'd like to avoid
adding special cases to GDB based on what it sees in AUXV.
Since we're at it, we added other useful info that the kernel can return to
GDB: this query will return the number of hardware breakpoints, hardware
watchpoints and whether it supports a range of addresses and a condition.
The query will fill the following structure provided by the requesting process:
struct ppc_debug_info {
uint32_t version;
uint32_t num_instruction_bps;
uint32_t num_data_bps;
uint32_t num_condition_regs;
uint32_t data_bp_alignment;
uint32_t sizeof_condition; /* size of the DVC register */
uint64_t features; /* bitmask of the individual flags */
};
features will have bits indicating whether there is support for:
#define PPC_DEBUG_FEATURE_INSN_BP_RANGE 0x1
#define PPC_DEBUG_FEATURE_INSN_BP_MASK 0x2
#define PPC_DEBUG_FEATURE_DATA_BP_RANGE 0x4
#define PPC_DEBUG_FEATURE_DATA_BP_MASK 0x8
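A debugger would typically inspect the returned structure before deciding what to request. The sketch below repeats the structure and flags from this section so that it compiles on its own; the helpers supports_data_range() and watch_addr_aligned() are hypothetical, and treating an alignment of 0 or 1 as "no restriction" is an assumption, not something the text specifies.

```c
#include <stdbool.h>
#include <stdint.h>

/* Structure and flags as listed in this section. */
struct ppc_debug_info {
    uint32_t version;
    uint32_t num_instruction_bps;
    uint32_t num_data_bps;
    uint32_t num_condition_regs;
    uint32_t data_bp_alignment;
    uint32_t sizeof_condition;  /* size of the DVC register */
    uint64_t features;          /* bitmask of the individual flags */
};

#define PPC_DEBUG_FEATURE_INSN_BP_RANGE 0x1
#define PPC_DEBUG_FEATURE_INSN_BP_MASK  0x2
#define PPC_DEBUG_FEATURE_DATA_BP_RANGE 0x4
#define PPC_DEBUG_FEATURE_DATA_BP_MASK  0x8

/* True if the kernel reported support for ranged data watchpoints. */
static bool supports_data_range(const struct ppc_debug_info *info)
{
    return (info->features & PPC_DEBUG_FEATURE_DATA_BP_RANGE) != 0;
}

/*
 * True if 'addr' satisfies the reported watchpoint alignment.
 * Assumption: an alignment of 0 or 1 means "no restriction".
 */
static bool watch_addr_aligned(const struct ppc_debug_info *info,
                               uint64_t addr)
{
    if (info->data_bp_alignment <= 1)
        return true;
    return (addr % info->data_bp_alignment) == 0;
}
```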
2. PTRACE_SETHWDEBUG
Sets a hardware breakpoint or watchpoint, according to the provided structure:
struct ppc_hw_breakpoint {
uint32_t version;
#define PPC_BREAKPOINT_TRIGGER_EXECUTE 0x1
#define PPC_BREAKPOINT_TRIGGER_READ 0x2
#define PPC_BREAKPOINT_TRIGGER_WRITE 0x4
uint32_t trigger_type; /* only some combinations allowed */
#define PPC_BREAKPOINT_MODE_EXACT 0x0
#define PPC_BREAKPOINT_MODE_RANGE_INCLUSIVE 0x1
#define PPC_BREAKPOINT_MODE_RANGE_EXCLUSIVE 0x2
#define PPC_BREAKPOINT_MODE_MASK 0x3
uint32_t addr_mode; /* address match mode */
#define PPC_BREAKPOINT_CONDITION_MODE 0x3
#define PPC_BREAKPOINT_CONDITION_NONE 0x0
#define PPC_BREAKPOINT_CONDITION_AND 0x1
#define PPC_BREAKPOINT_CONDITION_EXACT 0x1 /* different name for the same thing as above */
#define PPC_BREAKPOINT_CONDITION_OR 0x2
#define PPC_BREAKPOINT_CONDITION_AND_OR 0x3
#define PPC_BREAKPOINT_CONDITION_BE_ALL 0x00ff0000 /* byte enable bits */
#define PPC_BREAKPOINT_CONDITION_BE(n) (1<<((n)+16))
uint32_t condition_mode; /* break/watchpoint condition flags */
uint64_t addr;
uint64_t addr2;
uint64_t condition_value;
};
A request specifies one event, not necessarily just one register to be set.
For instance, if the request is for a watchpoint with a condition, both the
DAC and DVC registers will be set in the same request.
With this GDB can ask for all kinds of hardware breakpoints and watchpoints
that the BookE supports. COMEFROM breakpoints available in server processors
are not contemplated, but that is out of the scope of this work.
ptrace will return an integer (handle) uniquely identifying the breakpoint or
watchpoint just created. This integer will be used in the PTRACE_DELHWDEBUG
request to ask for its removal. ptrace returns -ENOSPC if the requested
breakpoint can't be allocated on the registers.
Some examples of using the structure to:
- set a breakpoint in the first breakpoint register
p.version = PPC_DEBUG_CURRENT_VERSION;
p.trigger_type = PPC_BREAKPOINT_TRIGGER_EXECUTE;
p.addr_mode = PPC_BREAKPOINT_MODE_EXACT;
p.condition_mode = PPC_BREAKPOINT_CONDITION_NONE;
p.addr = (uint64_t) address;
p.addr2 = 0;
p.condition_value = 0;
- set a watchpoint which triggers on reads in the second watchpoint register
p.version = PPC_DEBUG_CURRENT_VERSION;
p.trigger_type = PPC_BREAKPOINT_TRIGGER_READ;
p.addr_mode = PPC_BREAKPOINT_MODE_EXACT;
p.condition_mode = PPC_BREAKPOINT_CONDITION_NONE;
p.addr = (uint64_t) address;
p.addr2 = 0;
p.condition_value = 0;
- set a watchpoint which triggers only with a specific value
p.version = PPC_DEBUG_CURRENT_VERSION;
p.trigger_type = PPC_BREAKPOINT_TRIGGER_READ;
p.addr_mode = PPC_BREAKPOINT_MODE_EXACT;
p.condition_mode = PPC_BREAKPOINT_CONDITION_AND | PPC_BREAKPOINT_CONDITION_BE_ALL;
p.addr = (uint64_t) address;
p.addr2 = 0;
p.condition_value = (uint64_t) condition;
- set a ranged hardware breakpoint
p.version = PPC_DEBUG_CURRENT_VERSION;
p.trigger_type = PPC_BREAKPOINT_TRIGGER_EXECUTE;
p.addr_mode = PPC_BREAKPOINT_MODE_RANGE_INCLUSIVE;
p.condition_mode = PPC_BREAKPOINT_CONDITION_NONE;
p.addr = (uint64_t) begin_range;
p.addr2 = (uint64_t) end_range;
p.condition_value = 0;
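The value-condition example above enables all byte lanes with PPC_BREAKPOINT_CONDITION_BE_ALL. When only some bytes should take part in the comparison, the condition_mode field can be assembled from individual PPC_BREAKPOINT_CONDITION_BE(n) bits. The helper condition_mode_for_bytes() below is a hypothetical sketch; which byte of the value each enable bit selects is not stated in this document and is left open here.

```c
#include <stdint.h>

/* Flags as listed in the ppc_hw_breakpoint description. */
#define PPC_BREAKPOINT_CONDITION_AND    0x1
#define PPC_BREAKPOINT_CONDITION_BE_ALL 0x00ff0000 /* byte enable bits */
#define PPC_BREAKPOINT_CONDITION_BE(n)  (1<<((n)+16))

/*
 * Build an AND-type condition_mode with byte-enable bits 0..nbytes-1 set.
 * Values above 8 are clamped, since only 8 enable bits exist.
 */
static uint32_t condition_mode_for_bytes(unsigned int nbytes)
{
    uint32_t mode = PPC_BREAKPOINT_CONDITION_AND;
    unsigned int n;

    for (n = 0; n < nbytes && n < 8; n++)
        mode |= (uint32_t)PPC_BREAKPOINT_CONDITION_BE(n);
    return mode;
}
```

Passing 8 gives the same value as PPC_BREAKPOINT_CONDITION_AND | PPC_BREAKPOINT_CONDITION_BE_ALL used in the example above.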
3. PTRACE_DELHWDEBUG
Takes an integer which identifies an existing breakpoint or watchpoint
(i.e., the value returned from PTRACE_SETHWDEBUG), and deletes the
corresponding breakpoint or watchpoint.


@ -87,6 +87,12 @@ Command line parameters
compatibility, by the device number in hexadecimal (0xabcd or abcd). Device
numbers given as 0xabcd will be interpreted as 0.0.abcd.
* /proc/cio_settle
A write request to this file is blocked until all queued cio actions are
handled. This will allow userspace to wait for pending work affecting
device availability after changing cio_ignore or the hardware configuration.
* For some of the information present in the /proc filesystem in 2.4 (namely,
/proc/subchannels and /proc/chpids), see driver-model.txt.
Information formerly in /proc/irq_count is now in /proc/interrupts.
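Waiting on /proc/cio_settle from a program amounts to a single blocking write. The sketch below is a minimal illustration: wait_for_cio_settle() is a hypothetical helper, the path is passed in by the caller, and the byte written is an assumption, since the text only states that a write request blocks until the queued actions are handled.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Block until pending cio work is done by writing to the settle file.
 * 'settle_path' would normally be "/proc/cio_settle".
 * Returns 0 on success, -1 on failure (errno set by the failing call).
 */
static int wait_for_cio_settle(const char *settle_path)
{
    int fd = open(settle_path, O_WRONLY);
    int rc = 0;

    if (fd < 0)
        return -1;
    /* The written content is an assumption; the write itself blocks. */
    if (write(fd, "1", 1) < 0)
        rc = -1;
    if (close(fd) < 0)
        rc = -1;
    return rc;
}
```

A tool that has just changed cio_ignore could call this before scanning for the affected devices.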


@ -223,8 +223,8 @@ touched by the driver - it should use the ccwgroup device's driver_data for its
private data.
To implement a ccwgroup driver, please refer to include/asm/ccwgroup.h. Keep in
mind that most drivers will need to implement both a ccwgroup and a ccw driver
(unless you have a meta ccw driver, like cu3088 for lcs and ctc).
mind that most drivers will need to implement both a ccwgroup and a ccw
driver.
2. Channel paths


@ -1,3 +1,19 @@
1 Release Date : Thur. Oct 29, 2009 09:12:45 PST 2009 -
(emaild-id:megaraidlinux@lsi.com)
Bo Yang
2 Current Version : 00.00.04.17.1-rc1
3 Older Version : 00.00.04.12
1. Set pad_0 in the mfi frame structure to 0 to fix the issue
with context values larger than 32 bits.
2. Add the logical drive list to the driver. The driver keeps
the logical drive list internally after driver load.
3. Fix the device update issue after receiving the AEN
PD delete/add and LD add/delete events from FW.
1 Release Date : Tues. July 28, 2009 10:12:45 PST 2009 -
(emaild-id:megaraidlinux@lsi.com)
Bo Yang


@ -482,6 +482,9 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed.
reference_rate - reference sample rate, 44100 or 48000 (default)
multiple - multiple to ref. sample rate, 1 or 2 (default)
subsystem - override the PCI SSID for probing; the value
consists of SSVID << 16 | SSDID. The default is
zero, which means no override.
This module supports multiple cards.
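The value for the subsystem option above packs two 16-bit IDs into one 32-bit number. A small sketch of the arithmetic follows; pci_ssid_override() is a hypothetical helper and the IDs in the example are made up.

```c
#include <stdint.h>

/* Combine subsystem vendor ID (SSVID) and subsystem device ID (SSDID). */
static uint32_t pci_ssid_override(uint16_t ssvid, uint16_t ssdid)
{
    return ((uint32_t)ssvid << 16) | ssdid;
}
```

For instance, SSVID 0x1234 and SSDID 0xabcd combine to 0x1234abcd, and a result of zero corresponds to the default of no override.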
@ -1123,6 +1126,21 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed.
This module supports multiple cards, autoprobe and ISA PnP.
Module snd-jazz16
-------------------
Module for Media Vision Jazz16 chipset. The chipset consists of 3 chips:
MVD1216 + MVA416 + MVA514.
port - port # for SB DSP chip (0x210,0x220,0x230,0x240,0x250,0x260)
irq - IRQ # for SB DSP chip (3,5,7,9,10,15)
dma8 - DMA # for SB DSP chip (1,3)
dma16 - DMA # for SB DSP chip (5,7)
mpu_port - MPU-401 port # (0x300,0x310,0x320,0x330)
mpu_irq - MPU-401 irq # (2,3,5,7)
This module supports multiple cards.
Module snd-korg1212
-------------------
@ -1791,6 +1809,13 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed.
The power-management is supported.
Module snd-ua101
----------------
Module for the Edirol UA-101/UA-1000 audio/MIDI interfaces.
This module supports multiple devices, autoprobe and hotplugging.
Module snd-usb-audio
--------------------
@ -1923,7 +1948,7 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed.
-------------------
Module for sound cards based on the Asus AV100/AV200 chips,
i.e., Xonar D1, DX, D2, D2X, HDAV1.3 (Deluxe), Essence ST
i.e., Xonar D1, DX, D2, D2X, DS, HDAV1.3 (Deluxe), Essence ST
(Deluxe) and Essence STX.
This module supports autoprobe and multiple cards.


@ -124,6 +124,8 @@ ALC882/883/885/888/889
asus-a7m ASUS A7M
macpro MacPro support
mb5 Macbook 5,1
macmini3 Macmini 3,1
mba21 Macbook Air 2,1
mbp3 Macbook Pro rev3
imac24 iMac 24'' with jack detection
imac91 iMac 9,1
@ -279,13 +281,16 @@ Conexant 5051
laptop Basic Laptop config (default)
hp HP Spartan laptop
hp-dv6736 HP dv6736
hp-f700 HP Compaq Presario F700
lenovo-x200 Lenovo X200 laptop
toshiba Toshiba Satellite M300
Conexant 5066
=============
laptop Basic Laptop config (default)
dell-laptop Dell laptops
olpc-xo-1_5 OLPC XO 1.5
ideapad Lenovo IdeaPad U150
STAC9200
========

View file

@ -452,6 +452,33 @@ Similarly, the lines after `[verb]` are parsed as `init_verbs`
sysfs entries, and the lines after `[hint]` are parsed as `hints`
sysfs entries, respectively.
Another example, overriding the codec vendor id from 0x12345678 to
0xdeadbeef, looks like this:
------------------------------------------------------------------------
[codec]
0x12345678 0xabcd1234 2
[vendor_id]
0xdeadbeef
------------------------------------------------------------------------
In a similar way, you can override the codec subsystem_id via a
`[subsystem_id]` line and the revision id via a `[revision_id]` line.
Also, the codec chip name can be rewritten via a `[chip_name]` line.
------------------------------------------------------------------------
[codec]
0x12345678 0xabcd1234 2
[subsystem_id]
0xffff1111
[revision_id]
0x10
[chip_name]
My-own NEWS-0002
------------------------------------------------------------------------
The hd-audio driver reads the file via request_firmware(). Thus,
a patch file has to be located on the appropriate firmware path,
typically, /lib/firmware. For example, when you pass the option

View file

@ -238,11 +238,10 @@ HAVE_SYSCALL_TRACEPOINTS
You need very few things to get the syscalls tracing in an arch.
- Support HAVE_ARCH_TRACEHOOK (see arch/Kconfig).
- Have a NR_syscalls variable in <asm/unistd.h> that provides the number
of syscalls supported by the arch.
- Implement arch_syscall_addr() that resolves a syscall address from a
syscall number.
- Support the TIF_SYSCALL_TRACEPOINT thread flags
- Support the TIF_SYSCALL_TRACEPOINT thread flags.
- Put the trace_sys_enter() and trace_sys_exit() tracepoints calls from ptrace
in the ptrace syscalls tracing path.
- Tag this arch as HAVE_SYSCALL_TRACEPOINTS.

View file

@ -41,8 +41,8 @@ USB-specific:
-EFBIG Host controller driver can't schedule that many ISO frames.
-EPIPE Specified endpoint is stalled. For non-control endpoints,
reset this status with usb_clear_halt().
-EPIPE The pipe type specified in the URB doesn't match the
endpoint's actual type.
-EMSGSIZE (a) endpoint maxpacket size is zero; it is not usable
in the current interface altsetting.
@ -60,6 +60,8 @@ USB-specific:
-EHOSTUNREACH URB was rejected because the device is suspended.
-ENOEXEC A control URB doesn't contain a Setup packet.
**************************************************************************
* Error codes returned by in urb->status *

View file

@ -2,7 +2,7 @@
Alan Stern <stern@rowland.harvard.edu>
November 10, 2009
December 11, 2009
@ -29,9 +29,9 @@ covered to some extent (see Documentation/power/*.txt for more
information about system PM).
Note: Dynamic PM support for USB is present only if the kernel was
built with CONFIG_USB_SUSPEND enabled. System PM support is present
only if the kernel was built with CONFIG_SUSPEND or CONFIG_HIBERNATION
enabled.
built with CONFIG_USB_SUSPEND enabled (which depends on
CONFIG_PM_RUNTIME). System PM support is present only if the kernel
was built with CONFIG_SUSPEND or CONFIG_HIBERNATION enabled.
What is Remote Wakeup?
@ -229,6 +229,11 @@ necessary operations by hand or add them to a udev script. You can
also change the idle-delay time; 2 seconds is not the best choice for
every device.
If a driver knows that its device has proper suspend/resume support,
it can enable autosuspend all by itself. For example, the video
driver for a laptop's webcam might do this, since these devices are
rarely used and so should normally be autosuspended.
Sometimes it turns out that even when a device does work okay with
autosuspend there are still problems. For example, there are
experimental patches adding autosuspend support to the usbhid driver,
@ -321,69 +326,81 @@ driver does so by calling these six functions:
void usb_autopm_get_interface_no_resume(struct usb_interface *intf);
void usb_autopm_put_interface_no_suspend(struct usb_interface *intf);
The functions work by maintaining a counter in the usb_interface
structure. When intf->pm_usage_count is > 0 then the interface is
deemed to be busy, and the kernel will not autosuspend the interface's
device. When intf->pm_usage_count is <= 0 then the interface is
considered to be idle, and the kernel may autosuspend the device.
The functions work by maintaining a usage counter in the
usb_interface's embedded device structure. When the counter is > 0
then the interface is deemed to be busy, and the kernel will not
autosuspend the interface's device. When the usage counter is = 0
then the interface is considered to be idle, and the kernel may
autosuspend the device.
(There is a similar pm_usage_count field in struct usb_device,
(There is a similar usage counter field in struct usb_device,
associated with the device itself rather than any of its interfaces.
This field is used only by the USB core.)
This counter is used only by the USB core.)
Drivers must not modify intf->pm_usage_count directly; its value
should be changed only by using the functions listed above. Drivers
are responsible for ensuring that the overall change to pm_usage_count
during their lifetime balances out to 0 (it may be necessary for the
disconnect method to call usb_autopm_put_interface() one or more times
to fulfill this requirement). The first two routines use the PM mutex
in struct usb_device for mutual exclusion; drivers using the async
routines are responsible for their own synchronization and mutual
exclusion.
Drivers need not be concerned about balancing changes to the usage
counter; the USB core will undo any remaining "get"s when a driver
is unbound from its interface. As a corollary, drivers must not call
any of the usb_autopm_* functions after their disconnect() routine has
returned.
usb_autopm_get_interface() increments pm_usage_count and
attempts an autoresume if the new value is > 0 and the
device is suspended.
Drivers using the async routines are responsible for their own
synchronization and mutual exclusion.
usb_autopm_put_interface() decrements pm_usage_count and
attempts an autosuspend if the new value is <= 0 and the
device isn't suspended.
usb_autopm_get_interface() increments the usage counter and
does an autoresume if the device is suspended. If the
autoresume fails, the counter is decremented back.
usb_autopm_put_interface() decrements the usage counter and
attempts an autosuspend if the new value is = 0.
usb_autopm_get_interface_async() and
usb_autopm_put_interface_async() do almost the same things as
their non-async counterparts. The differences are: they do
not acquire the PM mutex, and they use a workqueue to do their
their non-async counterparts. The big difference is that they
use a workqueue to do the resume or suspend part of their
jobs. As a result they can be called in an atomic context,
such as an URB's completion handler, but when they return the
device will not generally not yet be in the desired state.
device will generally not yet be in the desired state.
usb_autopm_get_interface_no_resume() and
usb_autopm_put_interface_no_suspend() merely increment or
decrement the pm_usage_count value; they do not attempt to
carry out an autoresume or an autosuspend. Hence they can be
called in an atomic context.
decrement the usage counter; they do not attempt to carry out
an autoresume or an autosuspend. Hence they can be called in
an atomic context.
The conventional usage pattern is that a driver calls
The simplest usage pattern is that a driver calls
usb_autopm_get_interface() in its open routine and
usb_autopm_put_interface() in its close or release routine. But
other patterns are possible.
usb_autopm_put_interface() in its close or release routine. But other
patterns are possible.
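
The counter rules described above can be pictured with a small userspace model. This is only a toy sketch, not the kernel implementation: the struct and the demo_get()/demo_put() names are invented here, and the idle-delay is ignored, so a put that reaches zero suspends immediately.

```c
#include <stdbool.h>

/* Toy userspace model of the usage-counter rules; NOT the kernel code. */
struct demo_intf {
	int usage;          /* models the interface's usage counter */
	bool suspended;     /* models the device's suspend state */
	bool resume_fails;  /* set to simulate a failing autoresume */
};

/* Models usb_autopm_get_interface(): increment, then resume if needed.
 * If the resume fails, the increment is undone and an error is returned. */
static int demo_get(struct demo_intf *i)
{
	i->usage++;
	if (i->suspended) {
		if (i->resume_fails) {
			i->usage--;
			return -1;
		}
		i->suspended = false;
	}
	return 0;
}

/* Models usb_autopm_put_interface(): decrement, then attempt a suspend
 * when the counter reaches 0 (the idle-delay is ignored in this model). */
static void demo_put(struct demo_intf *i)
{
	i->usage--;
	if (i->usage == 0)
		i->suspended = true;
}
```

In the simplest pattern, a driver's open routine would correspond to demo_get() and its close or release routine to demo_put().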
The autosuspend attempts mentioned above will often fail for one
reason or another. For example, the power/level attribute might be
set to "on", or another interface in the same device might not be
idle. This is perfectly normal. If the reason for failure was that
the device hasn't been idle for long enough, a delayed workqueue
routine is automatically set up to carry out the operation when the
autosuspend idle-delay has expired.
the device hasn't been idle for long enough, a timer is scheduled to
carry out the operation automatically when the autosuspend idle-delay
has expired.
Autoresume attempts also can fail, although failure would mean that
the device is no longer present or operating properly. Unlike
autosuspend, there's no delay for an autoresume.
autosuspend, there's no idle-delay for an autoresume.
Other parts of the driver interface
-----------------------------------
Drivers can enable autosuspend for their devices by calling
usb_enable_autosuspend(struct usb_device *udev);
in their probe() routine, if they know that the device is capable of
suspending and resuming correctly. This is exactly equivalent to
writing "auto" to the device's power/level attribute. Likewise,
drivers can disable autosuspend by calling
usb_disable_autosuspend(struct usb_device *udev);
This is exactly the same as writing "on" to the power/level attribute.
Sometimes a driver needs to make sure that remote wakeup is enabled
during autosuspend. For example, there's not much point
autosuspending a keyboard if the user can't cause the keyboard to do a
@ -395,26 +412,27 @@ though, setting this flag won't cause the kernel to autoresume it.
Normally a driver would set this flag in its probe method, at which
time the device is guaranteed not to be autosuspended.)
The synchronous usb_autopm_* routines have to run in a sleepable
process context; they must not be called from an interrupt handler or
while holding a spinlock. In fact, the entire autosuspend mechanism
is not well geared toward interrupt-driven operation. However there
is one thing a driver can do in an interrupt handler:
If a driver does its I/O asynchronously in interrupt context, it
should call usb_autopm_get_interface_async() before starting output and
usb_autopm_put_interface_async() when the output queue drains. When
it receives an input event, it should call
usb_mark_last_busy(struct usb_device *udev);
This sets udev->last_busy to the current time. udev->last_busy is the
field used for idle-delay calculations; updating it will cause any
pending autosuspend to be moved back. The usb_autopm_* routines will
also set the last_busy field to the current time.
in the event handler. This sets udev->last_busy to the current time.
udev->last_busy is the field used for idle-delay calculations;
updating it will cause any pending autosuspend to be moved back. Most
of the usb_autopm_* routines will also set the last_busy field to the
current time.
Calling usb_mark_last_busy() from within an URB completion handler is
subject to races: The kernel may have just finished deciding the
device has been idle for long enough but not yet gotten around to
calling the driver's suspend method. The driver would have to be
responsible for synchronizing its suspend method with its URB
completion handler and causing the autosuspend to fail with -EBUSY if
an URB had completed too recently.
Asynchronous operation is always subject to races. For example, a
driver may call one of the usb_autopm_*_interface_async() routines at
a time when the core has just finished deciding the device has been
idle for long enough but not yet gotten around to calling the driver's
suspend method. The suspend method must be responsible for
synchronizing with the output request routine and the URB completion
handler; it should cause autosuspends to fail with -EBUSY if the
driver needs to use the device.
External suspend calls should never be allowed to fail in this way,
only autosuspend calls. The driver can tell them apart by checking
@ -422,75 +440,23 @@ the PM_EVENT_AUTO bit in the message.event argument to the suspend
method; this bit will be set for internal PM events (autosuspend) and
clear for external PM events.
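
A minimal sketch of that check, written as a standalone helper so it can be compiled outside the kernel. DEMO_PM_EVENT_AUTO is a stand-in for the kernel's PM_EVENT_AUTO bit with an illustrative value, and demo_suspend_check() is a hypothetical name; a real suspend method would test message.event directly.

```c
#include <errno.h>
#include <stdbool.h>

/* Stand-in for the kernel's PM_EVENT_AUTO bit; the value is illustrative. */
#define DEMO_PM_EVENT_AUTO 0x0400

/* Hypothetical decision helper for a driver's suspend method: only an
 * autosuspend may be refused, and only while the driver is busy.
 * External suspend requests are never failed this way. */
static int demo_suspend_check(int event, bool driver_busy)
{
	if ((event & DEMO_PM_EVENT_AUTO) && driver_busy)
		return -EBUSY;
	return 0;
}
```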
Many of the ingredients in the autosuspend framework are oriented
towards interfaces: The usb_interface structure contains the
pm_usage_cnt field, and the usb_autopm_* routines take an interface
pointer as their argument. But somewhat confusingly, a few of the
pieces (i.e., usb_mark_last_busy()) use the usb_device structure
instead. Drivers need to keep this straight; they can call
interface_to_usbdev() to find the device structure for a given
interface.
Mutual exclusion
----------------
Locking requirements
--------------------
All three suspend/resume methods are always called while holding the
usb_device's PM mutex. For external events -- but not necessarily for
autosuspend or autoresume -- the device semaphore (udev->dev.sem) will
also be held. This implies that external suspend/resume events are
mutually exclusive with calls to probe, disconnect, pre_reset, and
post_reset; the USB core guarantees that this is true of internal
suspend/resume events as well.
For external events -- but not necessarily for autosuspend or
autoresume -- the device semaphore (udev->dev.sem) will be held when a
suspend or resume method is called. This implies that external
suspend/resume events are mutually exclusive with calls to probe,
disconnect, pre_reset, and post_reset; the USB core guarantees that
this is true of autosuspend/autoresume events as well.
If a driver wants to block all suspend/resume calls during some
critical section, it can simply acquire udev->pm_mutex. Note that
calls to resume may be triggered indirectly. Block IO due to memory
allocations can make the vm subsystem resume a device. Thus while
holding this lock you must not allocate memory with GFP_KERNEL or
GFP_NOFS.
Alternatively, if the critical section might call some of the
usb_autopm_* routines, the driver can avoid deadlock by doing:
down(&udev->dev.sem);
rc = usb_autopm_get_interface(intf);
and at the end of the critical section:
if (!rc)
usb_autopm_put_interface(intf);
up(&udev->dev.sem);
Holding the device semaphore will block all external PM calls, and the
usb_autopm_get_interface() will prevent any internal PM calls, even if
it fails. (Exercise: Why?)
The rules for locking order are:
Never acquire any device semaphore while holding any PM mutex.
Never acquire udev->pm_mutex while holding the PM mutex for
a device that isn't a descendant of udev.
In other words, PM mutexes should only be acquired going up the device
tree, and they should be acquired only after locking all the device
semaphores you need to hold. These rules don't matter to drivers very
much; they usually affect just the USB core.
Still, drivers do need to be careful. For example, many drivers use a
private mutex to synchronize their normal I/O activities with their
disconnect method. Now if the driver supports autosuspend then it
must call usb_autopm_put_interface() from somewhere -- maybe from its
close method. It should make the call while holding the private mutex,
since a driver shouldn't call any of the usb_autopm_* functions for an
interface from which it has been unbound.
But the usb_autopm_* routines always acquire the device's PM mutex, and
consequently the locking order has to be: private mutex first, PM
mutex second. Since the suspend method is always called with the PM
mutex held, it mustn't try to acquire the private mutex. It has to
synchronize with the driver's I/O activities in some other way.
critical section, the best way is to lock the device and call
usb_autopm_get_interface() (and do the reverse at the end of the
critical section). Holding the device semaphore will block all
external PM calls, and the usb_autopm_get_interface() will prevent any
internal PM calls, even if it fails. (Exercise: Why?)
Interaction between dynamic PM and system PM
@ -499,22 +465,11 @@ synchronize with the driver's I/O activities in some other way.
Dynamic power management and system power management can interact in
a couple of ways.
Firstly, a device may already be manually suspended or autosuspended
when a system suspend occurs. Since system suspends are supposed to
be as transparent as possible, the device should remain suspended
following the system resume. The 2.6.23 kernel obeys this principle
for manually suspended devices but not for autosuspended devices; they
do get resumed when the system wakes up. (Presumably they will be
autosuspended again after their idle-delay time expires.) In later
kernels this behavior will be fixed.
(There is an exception. If a device would undergo a reset-resume
instead of a normal resume, and the device is enabled for remote
wakeup, then the reset-resume takes place even if the device was
already suspended when the system suspend began. The justification is
that a reset-resume is a kind of remote-wakeup event. Or to put it
another way, a device which needs a reset won't be able to generate
normal remote-wakeup signals, so it ought to be resumed immediately.)
Firstly, a device may already be autosuspended when a system suspend
occurs. Since system suspends are supposed to be as transparent as
possible, the device should remain suspended following the system
resume. But this theory may not work out well in practice; over time
the kernel's behavior in this regard has changed.
Secondly, a dynamic power-management event may occur as a system
suspend is underway. The window for this is short, since system

View file

@ -26,3 +26,4 @@
25 -> Compro VideoMate E800 [1858:e800]
26 -> Hauppauge WinTV-HVR1290 [0070:8551]
27 -> Mygica X8558 PRO DMB-TH [14f1:8578]
28 -> LEADTEK WinFast PxTV1200 [107d:6f22]

View file

@ -174,3 +174,4 @@
173 -> Zolid Hybrid TV Tuner PCI [1131:2004]
174 -> Asus Europa Hybrid OEM [1043:4847]
175 -> Leadtek Winfast DTV1000S [107d:6655]
176 -> Beholder BeholdTV 505 RDS [0000:5051]

View file

@ -81,3 +81,4 @@ tuner=80 - Philips FQ1216LME MK3 PAL/SECAM w/active loopthrough
tuner=81 - Partsnic (Daewoo) PTI-5NF05
tuner=82 - Philips CU1216L
tuner=83 - NXP TDA18271
tuner=84 - Sony BTF-Pxn01Z

View file

@ -0,0 +1,47 @@
tlg2300 release notes
====================
This is a v4l2/dvb device driver for the tlg2300 chip.
current status
==============
video
- Supports mmap and read() (no overlay).
audio
- The driver will register an ALSA card for the audio input.
vbi
- Works for almost all TV norms.
dvb-t
- works for DVB-T
FM
- Works for radio.
---------------------------------------------------------------------------
TESTED APPLICATIONS:
- VLC 1.0.4: tested with video and DVB. The GUI is easy to use.
- MPlayer: tested with video.
- MPlayer: tested with FM radio. MPlayer should be compiled with
--enable-radio and --enable-radio-capture.
The command looks like this (the ALSA audio device registers as card 1):
#mplayer radio://103.7/capture/ -radio adevice=hw=1,0:arate=48000 \
-rawaudio rate=48000:channels=2
---------------------------------------------------------------------------
KNOWN PROBLEMS:
about preemphasis:
You can set the preemphasis for radio by the following command:
#v4l2-ctl -d /dev/radio0 --set-ctrl=pre_emphasis_settings=1
"pre_emphasis_settings=1" selects 50us pre-emphasis. To select 75us,
use "pre_emphasis_settings=2".

View file

@ -42,6 +42,7 @@ ov519 041e:4064 Creative Live! VISTA VF0420
ov519 041e:4067 Creative Live! Cam Video IM (VF0350)
ov519 041e:4068 Creative Live! VISTA VF0470
spca561 0458:7004 Genius VideoCAM Express V2
sn9c2028 0458:7005 Genius Smart 300, version 2
sunplus 0458:7006 Genius Dsc 1.3 Smart
zc3xx 0458:7007 Genius VideoCam V2
zc3xx 0458:700c Genius VideoCam V3
@ -109,6 +110,7 @@ sunplus 04a5:3003 Benq DC 1300
sunplus 04a5:3008 Benq DC 1500
sunplus 04a5:300a Benq DC 3410
spca500 04a5:300c Benq DC 1016
benq 04a5:3035 Benq DC E300
finepix 04cb:0104 Fujifilm FinePix 4800
finepix 04cb:0109 Fujifilm FinePix A202
finepix 04cb:010b Fujifilm FinePix A203
@ -142,6 +144,7 @@ sunplus 04fc:5360 Sunplus Generic
spca500 04fc:7333 PalmPixDC85
sunplus 04fc:ffff Pure DigitalDakota
spca501 0506:00df 3Com HomeConnect Lite
sunplus 052b:1507 Megapixel 5 Pretec DC-1007
sunplus 052b:1513 Megapix V4
sunplus 052b:1803 MegaImage VI
tv8532 0545:808b Veo Stingray
@ -151,6 +154,7 @@ sunplus 0546:3191 Polaroid Ion 80
sunplus 0546:3273 Polaroid PDC2030
ov519 054c:0154 Sonny toy4
ov519 054c:0155 Sonny toy5
cpia1 0553:0002 CPIA CPiA (version1) based cameras
zc3xx 055f:c005 Mustek Wcam300A
spca500 055f:c200 Mustek Gsmart 300
sunplus 055f:c211 Kowa Bs888e Microcamera
@ -188,8 +192,7 @@ spca500 06bd:0404 Agfa CL20
spca500 06be:0800 Optimedia
sunplus 06d6:0031 Trust 610 LCD PowerC@m Zoom
spca506 06e1:a190 ADS Instant VCD
ov534 06f8:3002 Hercules Blog Webcam
ov534 06f8:3003 Hercules Dualpix HD Weblog
ov534_9 06f8:3003 Hercules Dualpix HD Weblog
sonixj 06f8:3004 Hercules Classic Silver
sonixj 06f8:3008 Hercules Deluxe Optical Glass
pac7302 06f8:3009 Hercules Classic Link
@ -204,6 +207,7 @@ sunplus 0733:2221 Mercury Digital Pro 3.1p
sunplus 0733:3261 Concord 3045 spca536a
sunplus 0733:3281 Cyberpix S550V
spca506 0734:043b 3DeMon USB Capture aka
cpia1 0813:0001 QX3 camera
ov519 0813:0002 Dual Mode USB Camera Plus
spca500 084d:0003 D-Link DSC-350
spca500 08ca:0103 Aiptek PocketDV
@ -225,7 +229,8 @@ sunplus 08ca:2050 Medion MD 41437
sunplus 08ca:2060 Aiptek PocketDV5300
tv8532 0923:010f ICM532 cams
mars 093a:050f Mars-Semi Pc-Camera
mr97310a 093a:010f Sakar Digital no. 77379
mr97310a 093a:010e All known CIF cams with this ID
mr97310a 093a:010f All known VGA cams with this ID
pac207 093a:2460 Qtec Webcam 100
pac207 093a:2461 HP Webcam
pac207 093a:2463 Philips SPC 220 NC
@ -302,6 +307,7 @@ sonixj 0c45:613b Surfer SN-206
sonixj 0c45:613c Sonix Pccam168
sonixj 0c45:6143 Sonix Pccam168
sonixj 0c45:6148 Digitus DA-70811/ZSMC USB PC Camera ZS211/Microdia
sonixj 0c45:614a Frontech E-Ccam (JIL-2225)
sn9c20x 0c45:6240 PC Camera (SN9C201 + MT9M001)
sn9c20x 0c45:6242 PC Camera (SN9C201 + MT9M111)
sn9c20x 0c45:6248 PC Camera (SN9C201 + OV9655)
@ -324,6 +330,10 @@ sn9c20x 0c45:62b0 PC Camera (SN9C202 + MT9V011/MT9V111/MT9V112)
sn9c20x 0c45:62b3 PC Camera (SN9C202 + OV9655)
sn9c20x 0c45:62bb PC Camera (SN9C202 + OV7660)
sn9c20x 0c45:62bc PC Camera (SN9C202 + HV7131R)
sn9c2028 0c45:8001 Wild Planet Digital Spy Camera
sn9c2028 0c45:8003 Sakar #11199, #6637x, #67480 keychain cams
sn9c2028 0c45:8008 Mini-Shotz ms-350
sn9c2028 0c45:800a Vivitar Vivicam 3350B
sunplus 0d64:0303 Sunplus FashionCam DXG
ov519 0e96:c001 TRUST 380 USB2 SPACEC@M
etoms 102c:6151 Qcam Sangha CIF
@ -341,10 +351,11 @@ spca501 1776:501c Arowana 300K CMOS Camera
t613 17a1:0128 TASCORP JPEG Webcam, NGS Cyclops
vc032x 17ef:4802 Lenovo Vc0323+MI1310_SOC
pac207 2001:f115 D-Link DSB-C120
sq905c 2770:9050 sq905c
sq905c 2770:905c DualCamera
sq905 2770:9120 Argus Digital Camera DC1512
sq905c 2770:913d sq905c
sq905c 2770:9050 Disney pix micro (CIF)
sq905c 2770:9052 Disney pix micro 2 (VGA)
sq905c 2770:905c All 11 known cameras with this ID
sq905 2770:9120 All 24 known cameras with this ID
sq905c 2770:913d All 4 known cameras with this ID
spca500 2899:012c Toptro Industrial
ov519 8020:ef04 ov519
spca508 8086:0110 Intel Easy PC Camera

View file

@ -599,99 +599,13 @@ video_device::minor fields.
video buffer helper functions
-----------------------------
The v4l2 core API provides a standard method for dealing with video
buffers. Those methods allow a driver to implement read(), mmap() and
overlay() on a consistent way.
The v4l2 core API provides a set of standard methods (called "videobuf")
for dealing with video buffers. Those methods allow a driver to implement
read(), mmap() and overlay() in a consistent way. There are currently
methods for using video buffers on devices that support DMA with
scatter/gather method (videobuf-dma-sg), DMA with linear access
(videobuf-dma-contig), and vmalloced buffers, mostly used on USB drivers
(videobuf-vmalloc).
There are currently methods for using video buffers on devices that
supports DMA with scatter/gather method (videobuf-dma-sg), DMA with
linear access (videobuf-dma-contig), and vmalloced buffers, mostly
used on USB drivers (videobuf-vmalloc).
Any driver using videobuf should provide operations (callbacks) for
four handlers:
ops->buf_setup - calculates the size of the video buffers and keeps
them from using more than some maximum limit of RAM;
ops->buf_prepare - fills the video buffer structs and calls
videobuf_iolock() to allocate and prepare mmapped memory;
ops->buf_queue - advises the driver that another buffer was
requested (by read() or by QBUF);
ops->buf_release - frees any buffer that was allocated.
In order to use it, the driver needs to have code (generally called in
interrupt context) that will properly handle the buffer request lists,
announcing that a new buffer was filled.
The irq handling code should handle the videobuf task lists, in order
to advise videobuf that a new frame was filled and so honor a pending
request. The code generally looks like this:
if (list_empty(&dma_q->active))
return;
buf = list_entry(dma_q->active.next, struct vbuffer, vb.queue);
if (!waitqueue_active(&buf->vb.done))
return;
/* Some logic to handle the buf may be needed here */
list_del(&buf->vb.queue);
do_gettimeofday(&buf->vb.ts);
wake_up(&buf->vb.done);
Those are the videobuffer functions used on drivers, implemented on
videobuf-core:
- Videobuf init functions
videobuf_queue_sg_init()
Initializes the videobuf infrastructure. This function should be
called before any other videobuf function on drivers that uses DMA
Scatter/Gather buffers.
videobuf_queue_dma_contig_init
Initializes the videobuf infrastructure. This function should be
called before any other videobuf function on drivers that need DMA
contiguous buffers.
videobuf_queue_vmalloc_init()
Initializes the videobuf infrastructure. This function should be
called before any other videobuf function on USB (and other drivers)
that need a vmalloced type of videobuf.
- videobuf_iolock()
Prepares the videobuf memory for the proper method (read, mmap, overlay).
- videobuf_queue_is_busy()
Checks if a videobuf is streaming.
- videobuf_queue_cancel()
Stops video handling.
- videobuf_mmap_free()
frees mmap buffers.
- videobuf_stop()
Stops video handling, ends mmap and frees mmap and other buffers.
- V4L2 api functions. Those functions correspond to VIDIOC_foo ioctls:
videobuf_reqbufs(), videobuf_querybuf(), videobuf_qbuf(),
videobuf_dqbuf(), videobuf_streamon(), videobuf_streamoff().
- V4L1 api function (corresponds to VIDIOCMBUF ioctl):
videobuf_cgmbuf()
This function is used to provide backward compatibility with V4L1
API.
- Some help functions for read()/poll() operations:
videobuf_read_stream()
For continuous stream read()
videobuf_read_one()
For snapshot read()
videobuf_poll_stream()
polling help function
The best way to understand it is to take a look at the vivi driver. One
of the main reasons for vivi is to be a videobuf usage example.
vivi_thread_tick() does the task that the IRQ callback would do on PCI
drivers (or the irq callback on USB).
Please see Documentation/video4linux/videobuf for more information on how
to use the videobuf layer.

View file

@ -0,0 +1,360 @@
An introduction to the videobuf layer
Jonathan Corbet <corbet@lwn.net>
Current as of 2.6.33
The videobuf layer functions as a sort of glue layer between a V4L2 driver
and user space. It handles the allocation and management of buffers for
the storage of video frames. There is a set of functions which can be used
to implement many of the standard POSIX I/O system calls, including read(),
poll(), and, happily, mmap(). Another set of functions can be used to
implement the bulk of the V4L2 ioctl() calls related to streaming I/O,
including buffer allocation, queueing and dequeueing, and streaming
control. Using videobuf imposes a few design decisions on the driver
author, but the payback comes in the form of reduced code in the driver and
a consistent implementation of the V4L2 user-space API.
Buffer types
Not all video devices use the same kind of buffers. In fact, there are (at
least) three common variations:
- Buffers which are scattered in both the physical and (kernel) virtual
address spaces. (Almost) all user-space buffers are like this, but it
makes great sense to allocate kernel-space buffers this way as well when
it is possible. Unfortunately, it is not always possible; working with
this kind of buffer normally requires hardware which can do
scatter/gather DMA operations.
- Buffers which are physically scattered, but which are virtually
contiguous; buffers allocated with vmalloc(), in other words. These
buffers are just as hard to use for DMA operations, but they can be
useful in situations where DMA is not available but virtually-contiguous
buffers are convenient.
- Buffers which are physically contiguous. Allocation of this kind of
buffer can be unreliable on fragmented systems, but simpler DMA
controllers cannot deal with anything else.
Videobuf can work with all three types of buffers, but the driver author
must pick one at the outset and design the driver around that decision.
[It's worth noting that there's a fourth kind of buffer: "overlay" buffers
which are located within the system's video memory. The overlay
functionality is considered to be deprecated for most use, but it still
shows up occasionally in system-on-chip drivers where the performance
benefits merit the use of this technique. Overlay buffers can be handled
as a form of scattered buffer, but there are very few implementations in
the kernel and a description of this technique is currently beyond the
scope of this document.]
Data structures, callbacks, and initialization
Depending on which type of buffers are being used, the driver should
include one of the following files:
<media/videobuf-dma-sg.h> /* Physically scattered */
<media/videobuf-vmalloc.h> /* vmalloc() buffers */
<media/videobuf-dma-contig.h> /* Physically contiguous */
The driver's data structure describing a V4L2 device should include a
struct videobuf_queue instance for the management of the buffer queue,
along with a list_head for the queue of available buffers. There will also
need to be an interrupt-safe spinlock which is used to protect (at least)
the queue.
The next step is to write four simple callbacks to help videobuf deal with
the management of buffers:
struct videobuf_queue_ops {
int (*buf_setup)(struct videobuf_queue *q,
unsigned int *count, unsigned int *size);
int (*buf_prepare)(struct videobuf_queue *q,
struct videobuf_buffer *vb,
enum v4l2_field field);
void (*buf_queue)(struct videobuf_queue *q,
struct videobuf_buffer *vb);
void (*buf_release)(struct videobuf_queue *q,
struct videobuf_buffer *vb);
};
buf_setup() is called early in the I/O process, when streaming is being
initiated; its purpose is to tell videobuf about the I/O stream. The count
parameter will be a suggested number of buffers to use; the driver should
check it for rationality and adjust it if need be. As a practical rule, a
minimum of two buffers are needed for proper streaming, and there is
usually a maximum (which cannot exceed 32) which makes sense for each
device. The size parameter should be set to the expected (maximum) size
for each frame of data.
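
The checks described above might look roughly like the following. This is a simplified sketch: the videobuf_queue argument is left out so the code compiles on its own, and the limits, the function name, and the width/height/bytes_per_pixel parameters are assumptions for an imaginary device rather than anything taken from a real driver.

```c
/* Hypothetical limits for an imaginary device. */
#define DEMO_MIN_BUFFERS 2
#define DEMO_MAX_BUFFERS 32

/* Sketch of the checks a buf_setup() callback might make: clamp the
 * suggested buffer count and report the maximum frame size in bytes. */
static int demo_buf_setup(unsigned int *count, unsigned int *size,
			  unsigned int width, unsigned int height,
			  unsigned int bytes_per_pixel)
{
	if (*count < DEMO_MIN_BUFFERS)
		*count = DEMO_MIN_BUFFERS;
	if (*count > DEMO_MAX_BUFFERS)
		*count = DEMO_MAX_BUFFERS;
	*size = width * height * bytes_per_pixel;
	return 0;
}
```

A real callback would take the frame geometry from the driver's own state (reached through the queue) instead of from extra parameters.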
Each buffer (in the form of a struct videobuf_buffer pointer) will be
passed to buf_prepare(), which should set the buffer's size, width, height,
and field fields properly. If the buffer's state field is
VIDEOBUF_NEEDS_INIT, the driver should pass it to:
int videobuf_iolock(struct videobuf_queue* q, struct videobuf_buffer *vb,
struct v4l2_framebuffer *fbuf);
Among other things, this call will usually allocate memory for the buffer.
Finally, the buf_prepare() function should set the buffer's state to
VIDEOBUF_PREPARED.
When a buffer is queued for I/O, it is passed to buf_queue(), which should
put it onto the driver's list of available buffers and set its state to
VIDEOBUF_QUEUED. Note that this function is called with the queue spinlock
held; if it tries to acquire the lock again, things will come to a screeching
halt. Yes, this is the voice of experience. Note also that videobuf may
wait on the first buffer in the queue; placing other buffers in front of it
could again gum up the works. So use list_add_tail() to enqueue buffers.
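
The ordering argument can be seen in a small user-space model. The list code below is a minimal imitation of the kernel's circular list_head written for this sketch, not the kernel header itself, and struct model_buffer is a made-up stand-in. In a real driver no locking is needed inside the callback because, as noted, the caller already holds the spinlock.

```c
#include <stddef.h>

/* Minimal user-space imitation of the kernel's circular list_head. */
struct list_head { struct list_head *next, *prev; };

void list_init(struct list_head *head)
{
	head->next = head->prev = head;
}

/* Insert entry just before head, which is the tail of the list. */
void list_add_tail(struct list_head *entry, struct list_head *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

enum { MODEL_PREPARED = 1, MODEL_QUEUED = 2 };

struct model_buffer {
	struct list_head queue;	/* first member, so the cast below is valid */
	int index;
	int state;
};

/* Model of buf_queue(): append to the driver's list, mark as queued. */
void model_buf_queue(struct list_head *active, struct model_buffer *vb)
{
	list_add_tail(&vb->queue, active);
	vb->state = MODEL_QUEUED;
}

/* Oldest queued buffer, or NULL if the list is empty. */
struct model_buffer *model_first(struct list_head *active)
{
	if (active->next == active)
		return NULL;
	return (struct model_buffer *)active->next;
}
```

Buffers queued as a, then b, come back out with a at the front, so whatever is waiting on the first buffer is never overtaken by a later arrival.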
Finally, buf_release() is called when a buffer is no longer intended to be
used. The driver should ensure that there is no I/O active on the buffer,
then pass it to the appropriate free routine(s):
/* Scatter/gather drivers */
int videobuf_dma_unmap(struct videobuf_queue *q,
struct videobuf_dmabuf *dma);
int videobuf_dma_free(struct videobuf_dmabuf *dma);
/* vmalloc drivers */
void videobuf_vmalloc_free(struct videobuf_buffer *buf);
/* Contiguous drivers */
void videobuf_dma_contig_free(struct videobuf_queue *q,
struct videobuf_buffer *buf);
One way to ensure that a buffer is no longer under I/O is to pass it to:
int videobuf_waiton(struct videobuf_buffer *vb, int non_blocking, int intr);
Here, vb is the buffer, non_blocking indicates whether non-blocking I/O
should be used (it should be zero in the buf_release() case), and intr
controls whether an interruptible wait is used.
File operations
At this point, much of the work is done; much of the rest is slipping
videobuf calls into the implementation of the other driver callbacks. The
first step is in the open() function, which must initialize the
videobuf queue. The function to use depends on the type of buffer used:
void videobuf_queue_sg_init(struct videobuf_queue *q,
struct videobuf_queue_ops *ops,
struct device *dev,
spinlock_t *irqlock,
enum v4l2_buf_type type,
enum v4l2_field field,
unsigned int msize,
void *priv);
void videobuf_queue_vmalloc_init(struct videobuf_queue *q,
struct videobuf_queue_ops *ops,
struct device *dev,
spinlock_t *irqlock,
enum v4l2_buf_type type,
enum v4l2_field field,
unsigned int msize,
void *priv);
void videobuf_queue_dma_contig_init(struct videobuf_queue *q,
struct videobuf_queue_ops *ops,
struct device *dev,
spinlock_t *irqlock,
enum v4l2_buf_type type,
enum v4l2_field field,
unsigned int msize,
void *priv);
In each case, the parameters are the same: q is the queue structure for the
device, ops is the set of callbacks as described above, dev is the device
structure for this video device, irqlock is an interrupt-safe spinlock to
protect access to the data structures, type is the buffer type used by the
device (cameras will use V4L2_BUF_TYPE_VIDEO_CAPTURE, for example), field
describes which field is being captured (often V4L2_FIELD_NONE for
progressive devices), msize is the size of any containing structure used
around struct videobuf_buffer, and priv is a private data pointer which
shows up in the priv_data field of struct videobuf_queue. Note that these
are void functions which, evidently, are immune to failure.
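
The msize parameter is easier to picture with an example. The sketch below is user-space C with invented names: struct model_vb stands in for struct videobuf_buffer, struct mydrv_buffer is a hypothetical driver structure wrapped around it, and the macro is a hand-written equivalent of the kernel's container_of().

```c
#include <stddef.h>

/* Stand-in for struct videobuf_buffer; the real one has many more fields. */
struct model_vb {
	unsigned int size;
	int state;
};

/* Hypothetical driver buffer wrapping the generic one. */
struct mydrv_buffer {
	struct model_vb vb;	/* generic part, placed first in this model */
	unsigned long dma_desc;	/* made-up per-buffer driver data */
};

/* Same idea as the kernel's container_of(), spelled out for user space. */
#define model_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* The value such a driver would pass as msize. */
size_t mydrv_msize(void)
{
	return sizeof(struct mydrv_buffer);
}

/* Recover the driver's structure from the generic pointer in a callback. */
struct mydrv_buffer *to_mydrv_buffer(struct model_vb *vb)
{
	return model_container_of(vb, struct mydrv_buffer, vb);
}
```

Each callback is handed the generic pointer; a conversion like to_mydrv_buffer() is how the driver gets back to its own per-buffer data.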
V4L2 capture drivers can be written to support either of two APIs: the
read() system call and the rather more complicated streaming mechanism. As
a general rule, it is necessary to support both to ensure that all
applications have a chance of working with the device. Videobuf makes it
easy to do that with the same code. To implement read(), the driver need
only make a call to one of:
ssize_t videobuf_read_one(struct videobuf_queue *q,
char __user *data, size_t count,
loff_t *ppos, int nonblocking);
ssize_t videobuf_read_stream(struct videobuf_queue *q,
char __user *data, size_t count,
loff_t *ppos, int vbihack, int nonblocking);
Either one of these functions will read frame data into data, returning the
amount actually read; the difference is that videobuf_read_one() will only
read a single frame, while videobuf_read_stream() will read multiple frames
if they are needed to satisfy the count requested by the application. A
typical driver read() implementation will start the capture engine, call
one of the above functions, then stop the engine before returning (though a
smarter implementation might leave the engine running for a little while in
anticipation of another read() call happening in the near future).
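
The difference between the two calls can be shown with a toy model. Nothing here is the videobuf implementation: the eight-byte frames and the model_* names are assumptions, and the model simply discards the unread tail of a partially consumed frame.

```c
#include <stddef.h>
#include <string.h>

#define FRAME_SIZE 8	/* toy frame size in bytes */

/* Stand-in capture source: frame n is FRAME_SIZE bytes of value n. */
struct model_source {
	unsigned char next_frame;
};

/* Copy at most one frame into data; returns the number of bytes copied. */
size_t model_read_one(struct model_source *src, unsigned char *data,
		      size_t count)
{
	size_t n = count < FRAME_SIZE ? count : FRAME_SIZE;

	memset(data, src->next_frame++, n);
	return n;
}

/* Keep pulling frames until count bytes have been delivered. */
size_t model_read_stream(struct model_source *src, unsigned char *data,
			 size_t count)
{
	size_t done = 0;

	while (done < count)
		done += model_read_one(src, data + done, count - done);
	return done;
}
```

A 20-byte request gets 8 bytes back from the first function and all 20 from the second, with the stream version drawing on three frames to do it.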
The poll() function can usually be implemented with a direct call to:
unsigned int videobuf_poll_stream(struct file *file,
struct videobuf_queue *q,
poll_table *wait);
Note that the actual wait queue eventually used will be the one associated
with the first available buffer.
When streaming I/O is done to kernel-space buffers, the driver must support
the mmap() system call to enable user space to access the data. In many
V4L2 drivers, the often-complex mmap() implementation simplifies to a
single call to:
int videobuf_mmap_mapper(struct videobuf_queue *q,
struct vm_area_struct *vma);
Everything else is handled by the videobuf code.
The release() function requires two separate videobuf calls:
void videobuf_stop(struct videobuf_queue *q);
int videobuf_mmap_free(struct videobuf_queue *q);
The call to videobuf_stop() terminates any I/O in progress - though it is
still up to the driver to stop the capture engine. The call to
videobuf_mmap_free() will ensure that all buffers have been unmapped; if
so, they will all be passed to the buf_release() callback. If buffers
remain mapped, videobuf_mmap_free() returns an error code instead. The
purpose is clearly to cause the closing of the file descriptor to fail if
buffers are still mapped, but every driver in the 2.6.32 kernel cheerfully
ignores its return value.
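
For what it is worth, a release path which does pay attention to that return value might be modeled as follows. This is a user-space sketch with invented names; the -1 is a placeholder for whatever error code the real function returns, which this text does not specify.

```c
#define MODEL_NBUFS 4

struct model_queue {
	int streaming;			/* I/O in progress */
	int mapped[MODEL_NBUFS];	/* nonzero while user space has a mapping */
	int released[MODEL_NBUFS];	/* set where buf_release() would be called */
};

/* Stand-in for videobuf_stop(). */
void model_stop(struct model_queue *q)
{
	q->streaming = 0;
}

/* Stand-in for videobuf_mmap_free(): refuse if anything is still mapped. */
int model_mmap_free(struct model_queue *q)
{
	int i;

	for (i = 0; i < MODEL_NBUFS; i++)
		if (q->mapped[i])
			return -1;	/* placeholder error code */
	for (i = 0; i < MODEL_NBUFS; i++)
		q->released[i] = 1;
	return 0;
}

/* Model of release(): stop I/O, then report whether the buffers went away. */
int model_release(struct model_queue *q)
{
	model_stop(q);
	return model_mmap_free(q);
}
```

In this model, I/O stops unconditionally, but no buffer is released while a mapping remains and the caller gets to hear about it.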
ioctl() operations
The V4L2 API includes a very long list of driver callbacks to respond to
the many ioctl() commands made available to user space. A number of these
- those associated with streaming I/O - turn almost directly into videobuf
calls. The relevant helper functions are:
int videobuf_reqbufs(struct videobuf_queue *q,
struct v4l2_requestbuffers *req);
int videobuf_querybuf(struct videobuf_queue *q, struct v4l2_buffer *b);
int videobuf_qbuf(struct videobuf_queue *q, struct v4l2_buffer *b);
int videobuf_dqbuf(struct videobuf_queue *q, struct v4l2_buffer *b,
int nonblocking);
int videobuf_streamon(struct videobuf_queue *q);
int videobuf_streamoff(struct videobuf_queue *q);
int videobuf_cgmbuf(struct videobuf_queue *q, struct video_mbuf *mbuf,
int count);
So, for example, a VIDIOC_REQBUFS call turns into a call to the driver's
vidioc_reqbufs() callback which, in turn, usually only needs to locate the
proper struct videobuf_queue pointer and pass it to videobuf_reqbufs().
These support functions can replace a great deal of buffer management
boilerplate in a lot of V4L2 drivers.
The vidioc_streamon() and vidioc_streamoff() functions will be a bit more
complex, of course, since they will also need to deal with starting and
stopping the capture engine. videobuf_cgmbuf(), called from the driver's
vidiocgmbuf() function, only exists if the V4L1 compatibility module has
been selected with CONFIG_VIDEO_V4L1_COMPAT, so its use must be surrounded
with #ifdef directives.
Buffer allocation
Thus far, we have talked about buffers, but have not looked at how they are
allocated. The scatter/gather case is the most complex on this front. The
driver can leave allocation entirely up to the
videobuf layer; in this case, buffers will be allocated as anonymous
user-space pages and will be very scattered indeed. If the application is
using user-space buffers, no allocation is needed; the videobuf layer will
take care of calling get_user_pages() and filling in the scatterlist array.
If the driver needs to do its own memory allocation, it should be done in
the vidioc_reqbufs() function, *after* calling videobuf_reqbufs(). The
first step is a call to:
struct videobuf_dmabuf *videobuf_to_dma(struct videobuf_buffer *buf);
The returned videobuf_dmabuf structure (defined in
<media/videobuf-dma-sg.h>) includes a couple of relevant fields:
struct scatterlist *sglist;
int sglen;
The driver must allocate an appropriately-sized scatterlist array and
populate it with pointers to the pieces of the allocated buffer; sglen
should be set to the length of the array.
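
The bookkeeping involved can be sketched in user space. The structures below are simplified stand-ins invented for this example (the kernel's struct scatterlist refers to pages rather than plain pointers), and the 4096-byte page size is an assumption.

```c
#include <stdlib.h>

#define MODEL_PAGE_SIZE 4096u	/* assumed page size */

/* Stand-in for struct scatterlist: one contiguous piece of the buffer. */
struct model_sg {
	unsigned char *addr;
	unsigned int length;
};

/* Stand-in for the two videobuf_dmabuf fields named above. */
struct model_dmabuf {
	struct model_sg *sglist;
	int sglen;
};

/* Describe buf (size bytes) as page-sized pieces; returns 0 or -1. */
int model_fill_sglist(struct model_dmabuf *dma, unsigned char *buf,
		      unsigned int size)
{
	unsigned int n = (size + MODEL_PAGE_SIZE - 1) / MODEL_PAGE_SIZE;
	unsigned int i;

	if (size == 0)
		return -1;
	dma->sglist = calloc(n, sizeof(*dma->sglist));
	if (!dma->sglist)
		return -1;

	for (i = 0; i < n; i++) {
		unsigned int off = i * MODEL_PAGE_SIZE;
		unsigned int left = size - off;

		dma->sglist[i].addr = buf + off;
		dma->sglist[i].length =
			left < MODEL_PAGE_SIZE ? left : MODEL_PAGE_SIZE;
	}
	dma->sglen = (int)n;	/* length of the array, as described above */
	return 0;
}
```

A 10000-byte buffer ends up as three entries: two full pages and a final piece of 1808 bytes.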
Drivers using the vmalloc() method need not (and cannot) concern themselves
with buffer allocation at all; videobuf will handle those details. The
same is normally true of contiguous-DMA drivers as well; videobuf will
allocate the buffers (with dma_alloc_coherent()) when it sees fit. That
means that these drivers may be trying to do high-order allocations at any
time, an operation which is not always guaranteed to work. Some drivers
play tricks by allocating DMA space at system boot time; videobuf does not
currently play well with those drivers.
As of 2.6.31, contiguous-DMA drivers can work with a user-supplied buffer,
as long as that buffer is physically contiguous. Normal user-space
allocations will not meet that criterion, but buffers obtained from other
kernel drivers, or those contained within huge pages, will work with these
drivers.
Filling the buffers
The final part of a videobuf implementation has no direct callback - it's
the portion of the code which actually puts frame data into the buffers,
usually in response to interrupts from the device. For all types of
drivers, this process works approximately as follows:
- Obtain the next available buffer and make sure that somebody is actually
waiting for it.
- Get a pointer to the memory and put video data there.
- Mark the buffer as done and wake up the process waiting for it.
The first step above is done by looking at the driver-managed list_head structure
- the one which is filled in the buf_queue() callback. Because starting
the engine and enqueueing buffers are done in separate steps, it's possible
for the engine to be running without any buffers available - in the
vmalloc() case especially. So the driver should be prepared for the list
to be empty. It is equally possible that nobody is yet interested in the
buffer; the driver should not remove it from the list or fill it until a
process is waiting on it. That test can be done by examining the buffer's
done field (a wait_queue_head_t structure) with waitqueue_active().
A buffer's state should be set to VIDEOBUF_ACTIVE before being mapped for
DMA; that ensures that the videobuf layer will not try to do anything with
it while the device is transferring data.
For scatter/gather drivers, the needed memory pointers will be found in the
scatterlist structure described above. Drivers using the vmalloc() method
can get a memory pointer with:
void *videobuf_to_vmalloc(struct videobuf_buffer *buf);
For contiguous DMA drivers, the function to use is:
dma_addr_t videobuf_to_dma_contig(struct videobuf_buffer *buf);
The contiguous DMA API goes out of its way to hide the kernel-space address
of the DMA buffer from drivers.
The final step is to set the size field of the relevant videobuf_buffer
structure to the actual size of the captured image, set state to
VIDEOBUF_DONE, then call wake_up() on the done queue. At this point, the
buffer is owned by the videobuf layer and the driver should not touch it
again.
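
Put together, the whole sequence looks something like the following user-space model. It is a sketch under several assumptions: the names are invented, a singly linked list replaces the driver's list_head, has_waiter stands in for a waitqueue_active() test on the done field, woken stands in for the wake_up() call, and memcpy() plays the part of the device's DMA.

```c
#include <stddef.h>
#include <string.h>

enum model_state { MODEL_QUEUED, MODEL_ACTIVE, MODEL_DONE };

struct model_buffer {
	struct model_buffer *next;	/* driver's FIFO of queued buffers */
	enum model_state state;
	int has_waiter;			/* models waitqueue_active() on done */
	int woken;			/* models wake_up() on done */
	unsigned int size;
	unsigned char mem[64];
};

/*
 * Model of the work done when the device signals a finished frame.
 * Returns 1 if a buffer was completed, 0 if the frame was dropped.
 */
int model_frame_ready(struct model_buffer **head,
		      const unsigned char *frame, unsigned int len)
{
	struct model_buffer *vb = *head;

	if (!vb)			/* engine running, nothing queued */
		return 0;
	if (!vb->has_waiter)		/* nobody wants it yet: leave it queued */
		return 0;
	if (len > sizeof(vb->mem))
		len = sizeof(vb->mem);

	*head = vb->next;		/* take it off the driver's list */
	vb->state = MODEL_ACTIVE;	/* videobuf leaves it alone from here */
	memcpy(vb->mem, frame, len);
	vb->size = len;			/* actual size of the captured data */
	vb->state = MODEL_DONE;
	vb->woken = 1;			/* the driver must not touch vb after this */
	return 1;
}
```

Note the two early exits: an empty list and a buffer with no waiter both cause the frame to be dropped without disturbing the queue.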
Developers who are interested in more information can go into the relevant
header files; there are a few low-level functions declared there which have
not been talked about here. Also worthwhile is the vivi driver
(drivers/media/video/vivi.c), which is maintained as an example of how V4L2
drivers should be written. Vivi only uses the vmalloc() API, but it's good
enough to get started with. Note also that all of these calls are exported
GPL-only, so they will not be available to non-GPL kernel modules.


@ -41,6 +41,7 @@ Possible debug options are
P Poisoning (object and padding)
U User tracking (free and alloc)
T Trace (please only use on single slabs)
A Toggle failslab filter mark for the cache
O Switch debugging off for caches that would have
caused higher minimum slab orders
- Switch all debugging off (useful if the kernel is


@ -166,19 +166,13 @@ NUMA
numa=noacpi Don't parse the SRAT table for NUMA setup
numa=fake=CMDLINE
If a number, fakes CMDLINE nodes and ignores NUMA setup of the
actual machine. Otherwise, system memory is configured
depending on the sizes and coefficients listed. For example:
numa=fake=2*512,1024,4*256,*128
gives two 512M nodes, a 1024M node, four 256M nodes, and the
rest split into 128M chunks. If the last character of CMDLINE
is a *, the remaining memory is divided up equally among its
coefficient:
numa=fake=2*512,2*
gives two 512M nodes and the rest split into two nodes.
Otherwise, the remaining system RAM is allocated to an
additional node.
numa=fake=<size>[MG]
If given as a memory unit, fills all system RAM with nodes of
size interleaved over physical nodes.
numa=fake=<N>
If given as an integer, fills all system RAM with N fake nodes
interleaved over physical nodes.
ACPI
