Documentation update for PAT. Reflect the latest API details. Also, adds details about ways to get more info in order to debug PAT. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
PAT (Page Attribute Table)

x86 Page Attribute Table (PAT) allows for setting the memory attribute at
page level granularity. PAT is complementary to the MTRR settings, which allow
for setting of memory types over physical address ranges. However, PAT is
more flexible than MTRR due to its capability to set attributes at page level
and also due to the fact that there are no hardware limitations on the number
of such attribute settings allowed. The added flexibility comes with a
guideline: the same physical memory must not be mapped through multiple
virtual addresses with different memory types (memory type aliasing).

PAT allows for different types of memory attributes. The most commonly used
ones that will be supported at this time are Write-back (WB), Uncached (UC),
Write-combined (WC) and Uncached Minus (UC-).


PAT APIs
--------

There are many different APIs in the kernel that allow setting of memory
attributes at the page level. In order to avoid aliasing, these interfaces
should be used thoughtfully. Below is a table of the interfaces available,
their intended usage and their memory attribute relationships. Internally,
these APIs use a reserve_memtype()/free_memtype() interface on the physical
address range to avoid any aliasing.


-------------------------------------------------------------------
API                    |    RAM   |  ACPI,...  |  Reserved/Holes  |
-----------------------|----------|------------|------------------|
                       |          |            |                  |
ioremap                |    --    |    UC-     |       UC-        |
                       |          |            |                  |
ioremap_cache          |    --    |    WB      |       WB         |
                       |          |            |                  |
ioremap_nocache        |    --    |    UC-     |       UC-        |
                       |          |            |                  |
ioremap_wc             |    --    |    --      |       WC         |
                       |          |            |                  |
set_memory_uc          |    UC-   |    --      |       --         |
 set_memory_wb         |          |            |                  |
                       |          |            |                  |
set_memory_wc          |    WC    |    --      |       --         |
 set_memory_wb         |          |            |                  |
                       |          |            |                  |
pci sysfs resource     |    --    |    --      |       UC-        |
                       |          |            |                  |
pci sysfs resource_wc  |    --    |    --      |       WC         |
 is IORESOURCE_PREFETCH|          |            |                  |
                       |          |            |                  |
pci proc               |    --    |    --      |       UC-        |
 !PCIIOC_WRITE_COMBINE |          |            |                  |
                       |          |            |                  |
pci proc               |    --    |    --      |       WC         |
 PCIIOC_WRITE_COMBINE  |          |            |                  |
                       |          |            |                  |
/dev/mem               |    --    |  WB/WC/UC- |    WB/WC/UC-     |
 read-write            |          |            |                  |
                       |          |            |                  |
/dev/mem               |    --    |    UC-     |       UC-        |
 mmap SYNC flag        |          |            |                  |
                       |          |            |                  |
/dev/mem               |    --    |  WB/WC/UC- |    WB/WC/UC-     |
 mmap !SYNC flag       |          |(from exist-|  (from exist-    |
 and                   |          |  ing alias)|    ing alias)    |
 any alias to this area|          |            |                  |
                       |          |            |                  |
/dev/mem               |    --    |    WB      |       WB         |
 mmap !SYNC flag       |          |            |                  |
 no alias to this area |          |            |                  |
 and                   |          |            |                  |
 MTRR says WB          |          |            |                  |
                       |          |            |                  |
/dev/mem               |    --    |    --      |       UC-        |
 mmap !SYNC flag       |          |            |                  |
 no alias to this area |          |            |                  |
 and                   |          |            |                  |
 MTRR says !WB         |          |            |                  |
                       |          |            |                  |
-------------------------------------------------------------------

Notes:

-- in the above table means "Not suggested usage for the API". Some of the --'s
are strictly enforced by the kernel. Some others are not really enforced
today, but may be enforced in future.

For ioremap and pci access through /sys or /proc - The actual type returned
can be more restrictive, in case of any existing aliasing for that address.
For example: If there is an existing uncached mapping, a new ioremap_wc can
return an uncached mapping in place of the write-combine mapping requested.

set_memory_[uc|wc] and set_memory_wb should be used in pairs, where a driver
will first make a region uc or wc and switch it back to wb after use.

Over time writes to /proc/mtrr will be deprecated in favor of using PAT based
interfaces. Users writing to /proc/mtrr are encouraged to use the interfaces
above instead.

Drivers should use ioremap_[uc|wc] to access PCI BARs with [uc|wc] access
types.

Drivers should use set_memory_[uc|wc] to set access type for RAM ranges.


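As an illustration of the "/dev/mem mmap SYNC flag" rows in the table, the
following user-space sketch opens a file with or without O_SYNC and maps a
region of it. It assumes that the "SYNC flag" in the table refers to O_SYNC
on the open file; map_region() is a hypothetical helper name, not an existing
interface, and the sketch does not try to show what the kernel does
internally.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Map len bytes of path starting at offset (offset must be page aligned).
 * With sync != 0 the file is opened with O_SYNC, which for /dev/mem is
 * assumed to correspond to the "mmap SYNC flag" rows of the table (UC-).
 * Returns MAP_FAILED on error. map_region() is a hypothetical helper.
 */
static void *map_region(const char *path, off_t offset, size_t len, int sync)
{
	int flags = O_RDWR | (sync ? O_SYNC : 0);
	int fd = open(path, flags);
	void *p;

	if (fd < 0)
		return MAP_FAILED;

	p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, offset);
	close(fd);	/* the mapping stays valid after the fd is closed */
	return p;
}
```

A caller would pass "/dev/mem" as path and a page-aligned physical address in
an ACPI or reserved range as offset, then release the mapping with munmap().
This normally requires root, and the kernel configuration may further
restrict which ranges /dev/mem exposes.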
PAT debugging
-------------

With CONFIG_DEBUG_FS enabled, the PAT memtype list can be examined with:

# mount -t debugfs debugfs /sys/kernel/debug
# cat /sys/kernel/debug/x86/pat_memtype_list
PAT memtype list:
uncached-minus @ 0x7fadf000-0x7fae0000
uncached-minus @ 0x7fb19000-0x7fb1a000
uncached-minus @ 0x7fb1a000-0x7fb1b000
uncached-minus @ 0x7fb1b000-0x7fb1c000
uncached-minus @ 0x7fb1c000-0x7fb1d000
uncached-minus @ 0x7fb1d000-0x7fb1e000
uncached-minus @ 0x7fb1e000-0x7fb25000
uncached-minus @ 0x7fb25000-0x7fb26000
uncached-minus @ 0x7fb26000-0x7fb27000
uncached-minus @ 0x7fb27000-0x7fb28000
uncached-minus @ 0x7fb28000-0x7fb2e000
uncached-minus @ 0x7fb2e000-0x7fb2f000
uncached-minus @ 0x7fb2f000-0x7fb30000
uncached-minus @ 0x7fb31000-0x7fb32000
uncached-minus @ 0x80000000-0x90000000

This list shows the physical address ranges that currently have a memtype
reserved and the PAT setting used to access each of those ranges.

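When the list is long, a small user-space tool can summarize it. The sketch
below parses entries in the "type @ 0xstart-0xend" form shown in the sample
output and adds up the number of bytes covered. It assumes the line format
is exactly as in the sample and treats each range as [start, end), which is
how adjacent sample entries sharing a boundary read. parse_memtype_line()
and memtype_total_bytes() are hypothetical helper names.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Parse one entry such as "uncached-minus @ 0x7fadf000-0x7fae0000".
 * On success store the type name and the range bounds and return 0.
 * Return -1 for lines that do not match, such as the header line.
 */
static int parse_memtype_line(const char *line, char type[32],
			      uint64_t *start, uint64_t *end)
{
	if (sscanf(line, "%31s @ 0x%" SCNx64 "-0x%" SCNx64,
		   type, start, end) != 3)
		return -1;

	return *end > *start ? 0 : -1;
}

/* Sum the sizes, in bytes, of all entries that can be parsed from fp. */
static uint64_t memtype_total_bytes(FILE *fp)
{
	char line[128], type[32];
	uint64_t start, end, total = 0;

	while (fgets(line, sizeof(line), fp))
		if (parse_memtype_line(line, type, &start, &end) == 0)
			total += end - start;

	return total;
}
```

The stream passed to memtype_total_bytes() would typically come from
fopen("/sys/kernel/debug/x86/pat_memtype_list", "r") after debugfs has been
mounted as shown above.
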
Another, more verbose way of getting PAT related debug messages is the
"debugpat" boot parameter. With this parameter, various debug messages are
printed to the dmesg log.