docs/vm: idle_page_tracking.txt: convert to ReST format
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com> Signed-off-by: Jonathan Corbet <corbet@lwn.net>
This commit is contained in:
parent
b53ba58845
commit
e3f2025a57
1 changed files with 36 additions and 19 deletions
|
@ -1,4 +1,11 @@
|
|||
MOTIVATION
|
||||
.. _idle_page_tracking:
|
||||
|
||||
==================
|
||||
Idle Page Tracking
|
||||
==================
|
||||
|
||||
Motivation
|
||||
==========
|
||||
|
||||
The idle page tracking feature allows to track which memory pages are being
|
||||
accessed by a workload and which are idle. This information can be useful for
|
||||
|
@ -8,10 +15,14 @@ or deciding where to place the workload within a compute cluster.
|
|||
|
||||
It is enabled by CONFIG_IDLE_PAGE_TRACKING=y.
|
||||
|
||||
USER API
|
||||
.. _user_api:
|
||||
|
||||
The idle page tracking API is located at /sys/kernel/mm/page_idle. Currently,
|
||||
it consists of the only read-write file, /sys/kernel/mm/page_idle/bitmap.
|
||||
User API
|
||||
========
|
||||
|
||||
The idle page tracking API is located at ``/sys/kernel/mm/page_idle``.
|
||||
Currently, it consists of the only read-write file,
|
||||
``/sys/kernel/mm/page_idle/bitmap``.
|
||||
|
||||
The file implements a bitmap where each bit corresponds to a memory page. The
|
||||
bitmap is represented by an array of 8-byte integers, and the page at PFN #i is
|
||||
|
@ -19,8 +30,9 @@ mapped to bit #i%64 of array element #i/64, byte order is native. When a bit is
|
|||
set, the corresponding page is idle.
|
||||
|
||||
A page is considered idle if it has not been accessed since it was marked idle
|
||||
(for more details on what "accessed" actually means see the IMPLEMENTATION
|
||||
DETAILS section). To mark a page idle one has to set the bit corresponding to
|
||||
(for more details on what "accessed" actually means see the :ref:`Implementation
|
||||
Details <impl_details>` section).
|
||||
To mark a page idle one has to set the bit corresponding to
|
||||
the page by writing to the file. A value written to the file is OR-ed with the
|
||||
current bitmap value.
|
||||
|
||||
|
@ -30,9 +42,9 @@ page types (e.g. SLAB pages) an attempt to mark a page idle is silently ignored,
|
|||
and hence such pages are never reported idle.
|
||||
|
||||
For huge pages the idle flag is set only on the head page, so one has to read
|
||||
/proc/kpageflags in order to correctly count idle huge pages.
|
||||
``/proc/kpageflags`` in order to correctly count idle huge pages.
|
||||
|
||||
Reading from or writing to /sys/kernel/mm/page_idle/bitmap will return
|
||||
Reading from or writing to ``/sys/kernel/mm/page_idle/bitmap`` will return
|
||||
-EINVAL if you are not starting the read/write on an 8-byte boundary, or
|
||||
if the size of the read/write is not a multiple of 8 bytes. Writing to
|
||||
this file beyond max PFN will return -ENXIO.
|
||||
|
@ -41,21 +53,25 @@ That said, in order to estimate the amount of pages that are not used by a
|
|||
workload one should:
|
||||
|
||||
1. Mark all the workload's pages as idle by setting corresponding bits in
|
||||
/sys/kernel/mm/page_idle/bitmap. The pages can be found by reading
|
||||
/proc/pid/pagemap if the workload is represented by a process, or by
|
||||
filtering out alien pages using /proc/kpagecgroup in case the workload is
|
||||
placed in a memory cgroup.
|
||||
``/sys/kernel/mm/page_idle/bitmap``. The pages can be found by reading
|
||||
``/proc/pid/pagemap`` if the workload is represented by a process, or by
|
||||
filtering out alien pages using ``/proc/kpagecgroup`` in case the workload
|
||||
is placed in a memory cgroup.
|
||||
|
||||
2. Wait until the workload accesses its working set.
|
||||
|
||||
3. Read /sys/kernel/mm/page_idle/bitmap and count the number of bits set. If
|
||||
one wants to ignore certain types of pages, e.g. mlocked pages since they
|
||||
are not reclaimable, he or she can filter them out using /proc/kpageflags.
|
||||
3. Read ``/sys/kernel/mm/page_idle/bitmap`` and count the number of bits set.
|
||||
If one wants to ignore certain types of pages, e.g. mlocked pages since they
|
||||
are not reclaimable, he or she can filter them out using
|
||||
``/proc/kpageflags``.
|
||||
|
||||
See Documentation/vm/pagemap.txt for more information about /proc/pid/pagemap,
|
||||
/proc/kpageflags, and /proc/kpagecgroup.
|
||||
See Documentation/vm/pagemap.txt for more information about
|
||||
``/proc/pid/pagemap``, ``/proc/kpageflags``, and ``/proc/kpagecgroup``.
|
||||
|
||||
IMPLEMENTATION DETAILS
|
||||
.. _impl_details:
|
||||
|
||||
Implementation Details
|
||||
======================
|
||||
|
||||
The kernel internally keeps track of accesses to user memory pages in order to
|
||||
reclaim unreferenced pages first on memory shortage conditions. A page is
|
||||
|
@ -77,7 +93,8 @@ When a dirty page is written to swap or disk as a result of memory reclaim or
|
|||
exceeding the dirty memory limit, it is not marked referenced.
|
||||
|
||||
The idle memory tracking feature adds a new page flag, the Idle flag. This flag
|
||||
is set manually, by writing to /sys/kernel/mm/page_idle/bitmap (see the USER API
|
||||
is set manually, by writing to ``/sys/kernel/mm/page_idle/bitmap`` (see the
|
||||
:ref:`User API <user_api>`
|
||||
section), and cleared automatically whenever a page is referenced as defined
|
||||
above.
|
||||
|
||||
|
|
Loading…
Reference in a new issue