2005-12-16 14:43:46 -07:00
#ifndef _ASM_POWERPC_PGTABLE_64K_H
#define _ASM_POWERPC_PGTABLE_64K_H

2005-11-06 17:06:55 -07:00
#include <asm-generic/pgtable-nopud.h>

#define PTE_INDEX_SIZE	12
#define PMD_INDEX_SIZE	12
#define PUD_INDEX_SIZE	0
#define PGD_INDEX_SIZE	4

#define PTE_TABLE_SIZE	(sizeof(real_pte_t) << PTE_INDEX_SIZE)
#define PMD_TABLE_SIZE	(sizeof(pmd_t) << PMD_INDEX_SIZE)
#define PGD_TABLE_SIZE	(sizeof(pgd_t) << PGD_INDEX_SIZE)

#define PTRS_PER_PTE	(1 << PTE_INDEX_SIZE)
#define PTRS_PER_PMD	(1 << PMD_INDEX_SIZE)
#define PTRS_PER_PGD	(1 << PGD_INDEX_SIZE)

[PATCH] ppc64: Fix bug in SLB miss handler for hugepages
This patch, however, should be applied on top of the 64k-page-size patch to
fix some problems with hugepage (some pre-existing, another introduced by
this patch).
The patch fixes a bug in the SLB miss handler for hugepages on ppc64
introduced by the dynamic hugepage patch (commit id
c594adad5653491813959277fb87a2fef54c4e05) due to a misunderstanding of the
srd instruction's behaviour (mea culpa). The problem arises when a 64-bit
process maps some hugepages in the low 4GB of the address space (unusual).
In this case, as well as the 256M segment in question being marked for
hugepages, other segments at 32G intervals will be incorrectly marked for
hugepages.
In the process, this patch tweaks the semantics of the hugepage bitmaps to
be more sensible. Previously, an address below 4G was marked for hugepages
if the appropriate segment bit in the "low areas" bitmask was set *or* if
the low bit in the "high areas" bitmap was set (which would mark all
addresses below 1TB for hugepage). With this patch, any given address is
governed by a single bitmap. Addresses below 4GB are marked for hugepage
if and only if their bit is set in the "low areas" bitmap (256M
granularity). Addresses between 4GB and 1TB are marked for hugepage iff
the low bit in the "high areas" bitmap is set. Higher addresses are marked
for hugepage iff their bit in the "high areas" bitmap is set (1TB
granularity).
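
The single-bitmap rule described above can be sketched in user-space C as follows. This is only an illustration of the stated semantics, not the kernel's code: the struct, the field names, the 16-bit width of the "high areas" bitmap, and the function name are all assumptions made for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the two bitmaps (names are assumptions). */
struct hugepage_areas {
	uint16_t low;	/* bit n: 256M segment n, covering addresses below 4GB */
	uint16_t high;	/* bit n: 1TB area n; bit 0 covers 4GB..1TB only */
};

#define EX_SZ_4G	(1ULL << 32)
#define EX_SEG_SHIFT	28	/* 256M granularity */
#define EX_AREA_SHIFT	40	/* 1TB granularity */

/* Each address is governed by exactly one bitmap. */
static bool addr_is_hugepage(const struct hugepage_areas *a, uint64_t addr)
{
	if (addr < EX_SZ_4G)
		return (a->low >> (addr >> EX_SEG_SHIFT)) & 1;

	/* 4GB..1TB yields area 0, i.e. the low bit of the high bitmap. */
	uint64_t area = addr >> EX_AREA_SHIFT;

	if (area >= 16)		/* outside the assumed 16-bit bitmap */
		return false;
	return (a->high >> area) & 1;
}
```

With only bit 3 of the low bitmap set, 0x30000000 is a hugepage address while 0x830000000 (the same segment offset 32G higher) is not, which is the behaviour the fix is meant to guarantee.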
To avoid conflicts, this patch must be applied on top of BenH's pending
patch for 64k base page size [0]. As such, this patch also addresses a
hugepage problem introduced by that patch. That patch allows hugepages of
1MB in size on hardware which supports it, however, that won't work when
using 4k pages (4 level pagetable), because in that case hugepage PTEs are
stored at the PMD level, and each PMD entry maps 2MB. This patch simply
disallows hugepages in that case (we can do something cleverer to re-enable
them some other day).
Built, booted, and a handful of hugepage related tests passed on POWER5
LPAR (both ARCH=powerpc and ARCH=ppc64).
[0] http://gate.crashing.org/~benh/ppc64-64k-pages.diff
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07 01:57:52 -07:00
/* With 4k base page size, hugepage PTEs go at the PMD level */
#define MIN_HUGEPTE_SHIFT	PAGE_SHIFT

2005-11-06 17:06:55 -07:00
/* PMD_SHIFT determines what a second-level page table entry can map */
#define PMD_SHIFT	(PAGE_SHIFT + PTE_INDEX_SIZE)
#define PMD_SIZE	(1UL << PMD_SHIFT)
#define PMD_MASK	(~(PMD_SIZE-1))

/* PGDIR_SHIFT determines what a third-level page table entry can map */
#define PGDIR_SHIFT	(PMD_SHIFT + PMD_INDEX_SIZE)
#define PGDIR_SIZE	(1UL << PGDIR_SHIFT)
#define PGDIR_MASK	(~(PGDIR_SIZE-1))
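
As a worked example of the shifts above: assuming PAGE_SHIFT is 16 (64K base pages, which this header does not define itself), PMD_SHIFT comes out as 28 (256MB per PMD entry) and PGDIR_SHIFT as 40 (1TB per PGD entry), for 44 bits of address in total. The stand-alone sketch below splits an address into its three table indices under that assumption; every `ex_`/`EX_` name is hypothetical.

```c
#include <stdint.h>

/* Assumed value: 64K base pages. The index sizes mirror the header. */
enum {
	EX_PAGE_SHIFT	= 16,
	EX_PTE_BITS	= 12,
	EX_PMD_BITS	= 12,
	EX_PGD_BITS	= 4,
	EX_PMD_SHIFT	= EX_PAGE_SHIFT + EX_PTE_BITS,	/* 28 */
	EX_PGDIR_SHIFT	= EX_PMD_SHIFT + EX_PMD_BITS,	/* 40 */
};

/* Index into the top-level (PGD) table. */
static unsigned ex_pgd_index(uint64_t ea)
{
	return (ea >> EX_PGDIR_SHIFT) & ((1u << EX_PGD_BITS) - 1);
}

/* Index into the second-level (PMD) table. */
static unsigned ex_pmd_index(uint64_t ea)
{
	return (ea >> EX_PMD_SHIFT) & ((1u << EX_PMD_BITS) - 1);
}

/* Index into the PTE page. */
static unsigned ex_pte_index(uint64_t ea)
{
	return (ea >> EX_PAGE_SHIFT) & ((1u << EX_PTE_BITS) - 1);
}
```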

/* Additional PTE bits (don't change without checking asm in hash_low.S) */
#define _PAGE_HPTE_SUB	0x0ffff000 /* combo only: sub pages HPTE bits */
#define _PAGE_HPTE_SUB0	0x08000000 /* combo only: first sub page */
#define _PAGE_COMBO	0x10000000 /* this is a combo 4k page */
[POWERPC] Allow drivers to map individual 4k pages to userspace
Some drivers have resources that they want to be able to map into
userspace that are 4k in size. On a kernel configured with 64k pages
we currently end up mapping the 4k we want plus another 60k of
physical address space, which could contain anything. This can
introduce security problems, for example in the case of an infiniband
adaptor where the other 60k could contain registers that some other
program is using for its communications.
This patch adds a new function, remap_4k_pfn, which drivers can use to
map a single 4k page to userspace regardless of whether the kernel is
using a 4k or a 64k page size. Like remap_pfn_range, it would
typically be called in a driver's mmap function. It only maps a
single 4k page, which on a 64k page kernel appears replicated 16 times
throughout a 64k page. On a 4k page kernel it reduces to a call to
remap_pfn_range.
The way this works on a 64k kernel is that a new bit, _PAGE_4K_PFN,
gets set on the linux PTE. This alters the way that __hash_page_4K
computes the real address to put in the HPTE. The RPN field of the
linux PTE becomes the 4k RPN directly rather than being interpreted as
a 64k RPN. Since the RPN field is 32 bits, this means that physical
addresses being mapped with remap_4k_pfn have to be below 2^44,
i.e. 0x100000000000.
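
The 2^44 limit follows from the 32-bit RPN field holding a 4k page number: 32 + 12 = 44 address bits. A small stand-alone check of that arithmetic, using hypothetical helper names that are not part of the patch, might look like this:

```c
#include <stdint.h>

#define EX_RPN_BITS	32	/* width of the RPN field, per the text above */
#define EX_SHIFT_4K	12	/* log2 of a 4k page */

/* First physical address that no longer fits for a given page shift. */
static uint64_t ex_phys_limit(unsigned rpn_bits, unsigned page_shift)
{
	return 1ULL << (rpn_bits + page_shift);
}

/* Nonzero if the 4k page number of phys fits in the RPN field. */
static int ex_fits_4k_rpn(uint64_t phys)
{
	return (phys >> EX_SHIFT_4K) < (1ULL << EX_RPN_BITS);
}
```

For comparison, the same field interpreted as a 64k page number (shift 16) reaches 2^48.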
The patch also factors out the code in arch/powerpc/mm/hash_utils_64.c
that deals with demoting a process to use 4k pages into one function
that gets called in the various different places where we need to do
that. There were some discrepancies between exactly what was done in
the various places, such as a call to spu_flush_all_slbs in one case
but not in others.
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-04-03 05:24:02 -06:00
#define _PAGE_4K_PFN	0x20000000 /* PFN is for a single 4k page */
2007-05-08 00:27:28 -06:00
/* Note the full page bits must be in the same location as for normal
 * 4k pages, as the same assembly will be used to insert 64K pages
 * whether the kernel has CONFIG_PPC_64K_PAGES or not
 */
2005-11-06 17:06:55 -07:00
#define _PAGE_F_SECOND	0x00008000 /* full page: hidx bits */
#define _PAGE_F_GIX	0x00007000 /* full page: hidx bits */

/* PTE flags to conserve for HPTE identification */
#define _PAGE_HPTEFLAGS	(_PAGE_BUSY | _PAGE_HASHPTE | _PAGE_HPTE_SUB |\
			 _PAGE_COMBO)

/* Shift to put page number into pte.
 *
 * That gives us a max RPN of 32 bits, which means a max of 48 bits
 * of addressable physical space.
 * We could get 3 more bits here by setting PTE_RPN_SHIFT to 29 but
 * 32 makes PTEs more readable for debugging for now :)
 */
#define PTE_RPN_SHIFT	(32)
#define PTE_RPN_MAX	(1UL << (64 - PTE_RPN_SHIFT))
#define PTE_RPN_MASK	(~((1UL<<PTE_RPN_SHIFT)-1))

/* _PAGE_CHG_MASK is the mask of bits that are to be preserved across
 * pgprot changes
 */
#define _PAGE_CHG_MASK	(PTE_RPN_MASK | _PAGE_HPTEFLAGS | _PAGE_DIRTY | \
			 _PAGE_ACCESSED)

/* Bits to mask out from a PMD to get to the PTE page */
#define PMD_MASKED_BITS		0x1ff
/* Bits to mask out from a PGD/PUD to get to the PMD page */
#define PUD_MASKED_BITS		0x1ff

/* Manipulate "rpte" values */
#define __real_pte(e,p)	((real_pte_t) { \
	(e), pte_val(*((p) + PTRS_PER_PTE)) })
#define __rpte_to_hidx(r,index)	((pte_val((r).pte) & _PAGE_COMBO) ? \
	(((r).hidx >> ((index)<<2)) & 0xf) : ((pte_val((r).pte) >> 12) & 0xf))
#define __rpte_to_pte(r)	((r).pte)
#define __rpte_sub_valid(rpte, index) \
	(pte_val((rpte).pte) & (_PAGE_HPTE_SUB0 >> (index)))

/* Trick: we set __end to va + 64k, which happens to work for
 * a 16M page as well, as we want only one iteration
 */
#define pte_iterate_hashed_subpages(rpte, psize, va, index, shift)	    \
	do {								    \
		unsigned long __end = va + PAGE_SIZE;			    \
		unsigned __split = (psize == MMU_PAGE_4K ||		    \
				    psize == MMU_PAGE_64K_AP);		    \
		shift = mmu_psize_defs[psize].shift;			    \
		for (index = 0; va < __end; index++, va += (1 << shift)) {  \
			if (!__split || __rpte_sub_valid(rpte, index)) do { \

#define pte_iterate_hashed_end() } while(0); } } while(0)

2007-05-08 00:27:28 -06:00
#define pte_pagesize_index(mm, addr, pte)	\
2006-06-14 18:45:18 -06:00
	(((pte) & _PAGE_COMBO)? MMU_PAGE_4K: MMU_PAGE_64K)
2005-11-06 17:06:55 -07:00
2007-04-03 05:24:02 -06:00
#define remap_4k_pfn(vma, addr, pfn, prot)				\
	remap_pfn_range((vma), (addr), (pfn), PAGE_SIZE,		\
			__pgprot(pgprot_val((prot)) | _PAGE_4K_PFN))

2005-12-16 14:43:46 -07:00
#endif /* _ASM_POWERPC_PGTABLE_64K_H */