Document lowmem_reserve_ratio
Though the lower_zone_protection was changed to lowmem_reserve_ratio, the document has been not changed. The lowmem_reserve_ratio seems quite hard to estimate, but there is no guidance. This patch is to change document for it. Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com> Cc: Andrea Arcangeli <andrea@cpushare.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This commit is contained in:
parent
b5beb1caff
commit
7786fa9ac5
1 changed files with 60 additions and 14 deletions
|
@ -1336,7 +1336,7 @@ legacy_va_layout
|
|||
If non-zero, this sysctl disables the new 32-bit mmap mmap layout - the kernel
|
||||
will use the legacy (2.4) layout for all processes.
|
||||
|
||||
lower_zone_protection
|
||||
lowmem_reserve_ratio
|
||||
---------------------
|
||||
|
||||
For some specialised workloads on highmem machines it is dangerous for
|
||||
|
@ -1356,25 +1356,71 @@ captured into pinned user memory.
|
|||
mechanism will also defend that region from allocations which could use
|
||||
highmem or lowmem).
|
||||
|
||||
The `lower_zone_protection' tunable determines how aggressive the kernel is
|
||||
in defending these lower zones. The default value is zero - no
|
||||
protection at all.
|
||||
The `lowmem_reserve_ratio' tunable determines how aggressive the kernel is
|
||||
in defending these lower zones.
|
||||
|
||||
If you have a machine which uses highmem or ISA DMA and your
|
||||
applications are using mlock(), or if you are running with no swap then
|
||||
you probably should increase the lower_zone_protection setting.
|
||||
you probably should change the lowmem_reserve_ratio setting.
|
||||
|
||||
The units of this tunable are fairly vague. It is approximately equal
|
||||
to "megabytes," so setting lower_zone_protection=100 will protect around 100
|
||||
megabytes of the lowmem zone from user allocations. It will also make
|
||||
those 100 megabytes unavailable for use by applications and by
|
||||
pagecache, so there is a cost.
|
||||
The lowmem_reserve_ratio is an array. You can see them by reading this file.
|
||||
-
|
||||
% cat /proc/sys/vm/lowmem_reserve_ratio
|
||||
256 256 32
|
||||
-
|
||||
Note: # of this elements is one fewer than number of zones. Because the highest
|
||||
zone's value is not necessary for following calculation.
|
||||
|
||||
The effects of this tunable may be observed by monitoring
|
||||
/proc/meminfo:LowFree. Write a single huge file and observe the point
|
||||
at which LowFree ceases to fall.
|
||||
But, these values are not used directly. The kernel calculates # of protection
|
||||
pages for each zones from them. These are shown as array of protection pages
|
||||
in /proc/zoneinfo like followings. (This is an example of x86-64 box).
|
||||
Each zone has an array of protection pages like this.
|
||||
|
||||
A reasonable value for lower_zone_protection is 100.
|
||||
-
|
||||
Node 0, zone DMA
|
||||
pages free 1355
|
||||
min 3
|
||||
low 3
|
||||
high 4
|
||||
:
|
||||
:
|
||||
numa_other 0
|
||||
protection: (0, 2004, 2004, 2004)
|
||||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
||||
pagesets
|
||||
cpu: 0 pcp: 0
|
||||
:
|
||||
-
|
||||
These protections are added to score to judge whether this zone should be used
|
||||
for page allocation or should be reclaimed.
|
||||
|
||||
In this example, if normal pages (index=2) are required to this DMA zone and
|
||||
pages_high is used for watermark, the kernel judges this zone should not be
|
||||
used because pages_free(1355) is smaller than watermark + protection[2]
|
||||
(4 + 2004 = 2008). If this protection value is 0, this zone would be used for
|
||||
normal page requirement. If requirement is DMA zone(index=0), protection[0]
|
||||
(=0) is used.
|
||||
|
||||
zone[i]'s protection[j] is calculated by following exprssion.
|
||||
|
||||
(i < j):
|
||||
zone[i]->protection[j]
|
||||
= (total sums of present_pages from zone[i+1] to zone[j] on the node)
|
||||
/ lowmem_reserve_ratio[i];
|
||||
(i = j):
|
||||
(should not be protected. = 0;
|
||||
(i > j):
|
||||
(not necessary, but looks 0)
|
||||
|
||||
The default values of lowmem_reserve_ratio[i] are
|
||||
256 (if zone[i] means DMA or DMA32 zone)
|
||||
32 (others).
|
||||
As above expression, they are reciprocal number of ratio.
|
||||
256 means 1/256. # of protection pages becomes about "0.39%" of total present
|
||||
pages of higher zones on the node.
|
||||
|
||||
If you would like to protect more pages, smaller values are effective.
|
||||
The minimum value is 1 (1/1 -> 100%).
|
||||
|
||||
page-cluster
|
||||
------------
|
||||
|
|
Loading…
Reference in a new issue