2005-04-16 16:20:36 -06:00
/*
 * PowerPC64 SLB support.
 *
 * Copyright (C) 2004 David Gibson <dwg@au.ibm.com>, IBM
 * Based on earlier code written by:
 * Dave Engebretsen and Mike Corrigan {engebret|mikejc}@us.ibm.com
 *    Copyright (c) 2001 Dave Engebretsen
 * Copyright (C) 2002 Anton Blanchard <anton@au.ibm.com>, IBM
 *
 *
 *      This program is free software; you can redistribute it and/or
 *      modify it under the terms of the GNU General Public License
 *      as published by the Free Software Foundation; either version
 *      2 of the License, or (at your option) any later version.
 */

#undef DEBUG

#include <asm/pgtable.h>
#include <asm/mmu.h>
#include <asm/mmu_context.h>
#include <asm/paca.h>
#include <asm/cputable.h>
#include <asm/cacheflush.h>
#include <asm/smp.h>
#include <asm/firmware.h>
#include <linux/compiler.h>
#include <asm/udbg.h>

#ifdef DEBUG
[POWERPC] vmemmap fixes to use smaller pages
This changes vmemmap to use a different region (region 0xf) of the
address space, and to configure the page size of that region
dynamically at boot.
The problem with the current approach of always using 16M pages is that
it's not well suited to machines that have small amounts of memory such
as small partitions on pseries, or PS3's.
In fact, on the PS3, failure to allocate the 16M page backing vmemmap
tends to prevent hotplugging the HV's "additional" memory, thus limiting
the available memory even more, from my experience down to something
like 80M total, which makes it really not very usable.
The logic used by my patch to choose the vmemmap page size is:
- If 16M pages are available and there's 1G or more RAM at boot,
use that size.
- Else if 64K pages are available, use that
- Else use 4K pages
I've tested on a POWER6 (16M pages) and on an iSeries POWER3 (4K pages)
and it seems to work fine.
Note that I intend to change the way we organize the kernel regions &
SLBs so the actual region will change from 0xf back to something else at
one point, as I simplify the SLB miss handler, but that will be for a
later patch.
Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-04-29 23:41:48 -06:00
#define DBG(fmt...) printk(fmt)
#else
#define DBG pr_debug
#endif
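
The three-way page-size choice listed in the vmemmap commit message above can be sketched as a small pure function. This is only an illustration of the stated rules: the enum, the function name `choose_vmemmap_psize` and its parameters are hypothetical and are not the kernel's own identifiers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical page-size identifiers; not the kernel's MMU_PAGE_* values. */
enum page_size { PAGE_4K, PAGE_64K, PAGE_16M };

/*
 * Pick the vmemmap page size following the rules in the commit message:
 * 16M if available and at least 1G of RAM at boot, else 64K if
 * available, else 4K.
 */
static enum page_size choose_vmemmap_psize(bool have_16m, bool have_64k,
					    uint64_t boot_ram_bytes)
{
	const uint64_t one_gib = UINT64_C(1) << 30;

	if (have_16m && boot_ram_bytes >= one_gib)
		return PAGE_16M;
	if (have_64k)
		return PAGE_64K;
	return PAGE_4K;
}
```

For example, a machine with 16M pages but only 256M of RAM at boot falls through to 64K pages when they are available, which is the small-partition case the commit message is concerned with.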

extern void slb_allocate_realmode(unsigned long ea);
extern void slb_allocate_user(unsigned long ea);

static void slb_allocate(unsigned long ea)
{
	/* Currently, we do real mode for all SLBs including user, but
	 * that will change if we bring back dynamic VSIDs
	 */
	slb_allocate_realmode(ea);
}

[POWERPC] Bolt in SLB entry for kernel stack on secondary cpus
This fixes a regression reported by Kamalesh Bulabel where a POWER4
machine would crash because of an SLB miss at a point where the SLB
miss exception was unrecoverable. This regression is tracked at:
http://bugzilla.kernel.org/show_bug.cgi?id=10082
SLB misses at such points shouldn't happen because the kernel stack is
the only memory accessed other than things in the first segment of the
linear mapping (which is mapped at all times by entry 0 of the SLB).
The context switch code ensures that SLB entry 2 covers the kernel
stack, if it is not already covered by entry 0. None of entries 0
to 2 are ever replaced by the SLB miss handler.
Where this went wrong is that the context switch code assumes it
doesn't have to write to SLB entry 2 if the new kernel stack is in the
same segment as the old kernel stack, since entry 2 should already be
correct. However, when we start up a secondary cpu, it calls
slb_initialize, which doesn't set up entry 2. This is correct for
the boot cpu, where we will be using a stack in the kernel BSS at this
point (i.e. init_thread_union), but not necessarily for secondary
cpus, whose initial stack can be allocated anywhere. This doesn't
cause any immediate problem since the SLB miss handler will just
create an SLB entry somewhere else to cover the initial stack.
In fact it's possible for the cpu to go quite a long time without SLB
entry 2 being valid. Eventually, though, the entry created by the SLB
miss handler will get overwritten by some other entry, and if the next
access to the stack is at an unrecoverable point, we get the crash.
This fixes the problem by making slb_initialize create a suitable
entry for the kernel stack, if we are on a secondary cpu and the stack
isn't covered by SLB entry 0. This requires initializing the
get_paca()->kstack field earlier, so I do that in smp_create_idle
where the current field is initialized. This also abstracts a bit of
the computation that mk_esid_data in slb.c does so that it can be used
in slb_initialize.
Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-05-01 22:29:12 -06:00
#define slb_esid_mask(ssize)	\
	(((ssize) == MMU_SEGSIZE_256M)? ESID_MASK: ESID_MASK_1T)

static inline unsigned long mk_esid_data(unsigned long ea, int ssize,
					 unsigned long slot)
{
	return (ea & slb_esid_mask(ssize)) | SLB_ESID_V | slot;
}

#define slb_vsid_shift(ssize)	\
	((ssize) == MMU_SEGSIZE_256M? SLB_VSID_SHIFT: SLB_VSID_SHIFT_1T)

static inline unsigned long mk_vsid_data(unsigned long ea, int ssize,
					 unsigned long flags)
{
	return (get_kernel_vsid(ea, ssize) << slb_vsid_shift(ssize)) | flags |
		((unsigned long) ssize << SLB_VSID_SSIZE_SHIFT);
}
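
To make the bit layout built by mk_esid_data() and mk_vsid_data() concrete, here is a standalone sketch. All constants are assumptions modelled on the kernel headers of this period and should be checked against the real definitions; the names are hypothetical, and the VSID is passed in directly because get_kernel_vsid() is not modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values; verify against the kernel's MMU headers before use. */
#define SEGSIZE_256M		0
#define SEGSIZE_1TB		1
#define ESID_MASK_256M		UINT64_C(0xfffffffff0000000)
#define ESID_MASK_1TB		UINT64_C(0xffffff0000000000)
#define ESID_VALID		(UINT64_C(1) << 27)
#define VSID_SHIFT_256M		12
#define VSID_SHIFT_1TB		24
#define VSID_SSIZE_SHIFT	62

/* ESID word: segment base of ea, valid bit, and the SLB slot index. */
static uint64_t esid_word(uint64_t ea, int ssize, uint64_t slot)
{
	uint64_t mask = (ssize == SEGSIZE_256M) ? ESID_MASK_256M : ESID_MASK_1TB;

	return (ea & mask) | ESID_VALID | slot;
}

/* VSID word: shifted VSID, protection/page-size flags, segment size. */
static uint64_t vsid_word(uint64_t vsid, int ssize, uint64_t flags)
{
	int shift = (ssize == SEGSIZE_256M) ? VSID_SHIFT_256M : VSID_SHIFT_1TB;

	assert(ssize == SEGSIZE_256M || ssize == SEGSIZE_1TB);
	return (vsid << shift) | flags | ((uint64_t)ssize << VSID_SSIZE_SHIFT);
}
```

Under these assumptions, a 256M-segment kernel address such as 0xc000000012345678 placed in slot 2 keeps only its segment base, gains the valid bit, and carries the slot number in the low bits.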

static inline void slb_shadow_update(unsigned long ea, int ssize,
				     unsigned long flags,
				     unsigned long entry)
{
	/*
	 * Clear the ESID first so the entry is not valid while we are
	 * updating it.  No write barriers are needed here, provided
	 * we only update the current CPU's SLB shadow buffer.
	 */
	get_slb_shadow()->save_area[entry].esid = 0;
	get_slb_shadow()->save_area[entry].vsid = mk_vsid_data(ea, ssize, flags);
	get_slb_shadow()->save_area[entry].esid = mk_esid_data(ea, ssize, entry);
}
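
The invalidate, write, revalidate order used by slb_shadow_update() can be shown in isolation. The sketch below uses a hypothetical three-entry buffer and an assumed valid bit. It also declares the buffer volatile so that the compiler keeps the three stores in program order; that is a choice made for this sketch, not something the code above does.

```c
#include <assert.h>
#include <stdint.h>

#define SHADOW_ENTRIES	3			/* hypothetical: one per bolted slot */
#define ENTRY_VALID	(UINT64_C(1) << 27)	/* assumed valid bit */

struct shadow_entry {
	uint64_t esid;
	uint64_t vsid;
};

/* volatile: sketch-only choice to keep the stores in order. */
static volatile struct shadow_entry shadow[SHADOW_ENTRIES];

static void shadow_update(unsigned int slot, uint64_t esid, uint64_t vsid)
{
	assert(slot < SHADOW_ENTRIES);
	shadow[slot].esid = 0;			/* 1. entry becomes invalid */
	shadow[slot].vsid = vsid;		/* 2. new translation data */
	shadow[slot].esid = esid | ENTRY_VALID;	/* 3. entry becomes valid again */
}

static void shadow_clear(unsigned int slot)
{
	assert(slot < SHADOW_ENTRIES);
	shadow[slot].esid = 0;			/* vsid is left as it was */
}
```

A reader of the buffer that only trusts entries whose ESID has the valid bit set will therefore never pair a new ESID with an old VSID, whichever point in the update it observes.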

static inline void slb_shadow_clear(unsigned long entry)
{
	get_slb_shadow()->save_area[entry].esid = 0;
}

static inline void create_shadowed_slbe(unsigned long ea, int ssize,
					unsigned long flags,
					unsigned long entry)
{
	/*
	 * Updating the shadow buffer before writing the SLB ensures
	 * we don't get a stale entry here if we get preempted by PHYP
	 * between these two statements.
	 */
	slb_shadow_update(ea, ssize, flags, entry);

	asm volatile("slbmte %0,%1" :
		     : "r" (mk_vsid_data(ea, ssize, flags)),
		       "r" (mk_esid_data(ea, ssize, entry))
		     : "memory" );
}

void slb_flush_and_rebolt(void)
{
	/* If you change this make sure you change SLB_NUM_BOLTED
	 * appropriately too. */
	unsigned long linear_llp, vmalloc_llp, lflags, vflags;
	unsigned long ksp_esid_data, ksp_vsid_data;

	WARN_ON(!irqs_disabled());

	linear_llp = mmu_psize_defs[mmu_linear_psize].sllp;
	vmalloc_llp = mmu_psize_defs[mmu_vmalloc_psize].sllp;
	lflags = SLB_VSID_KERNEL | linear_llp;
	vflags = SLB_VSID_KERNEL | vmalloc_llp;

	ksp_esid_data = mk_esid_data(get_paca()->kstack, mmu_kernel_ssize, 2);
	if ((ksp_esid_data & ~0xfffffffUL) <= PAGE_OFFSET) {
		ksp_esid_data &= ~SLB_ESID_V;
		ksp_vsid_data = 0;
		slb_shadow_clear(2);
	} else {
		/* Update stack entry; others don't change */
		slb_shadow_update(get_paca()->kstack, mmu_kernel_ssize, lflags, 2);
		ksp_vsid_data = get_slb_shadow()->save_area[2].vsid;
	}

	/*
	 * We can't take a PMU exception in the following code, so hard
	 * disable interrupts.
	 */
	hard_irq_disable();

	/* We need to do this all in asm, so we're sure we don't touch
	 * the stack between the slbia and rebolting it. */
	asm volatile("isync\n"
		     "slbia\n"
		     /* Slot 1 - first VMALLOC segment */
		     "slbmte %0,%1\n"
		     /* Slot 2 - kernel stack */
		     "slbmte %2,%3\n"
		     "isync"
		     :: "r"(mk_vsid_data(VMALLOC_START, mmu_kernel_ssize, vflags)),
		        "r"(mk_esid_data(VMALLOC_START, mmu_kernel_ssize, 1)),
		        "r"(ksp_vsid_data),
		        "r"(ksp_esid_data)
		     : "memory");
}

void slb_vmalloc_update(void)
{
	unsigned long vflags;

	vflags = SLB_VSID_KERNEL | mmu_psize_defs[mmu_vmalloc_psize].sllp;
	slb_shadow_update(VMALLOC_START, mmu_kernel_ssize, vflags, 1);
	slb_flush_and_rebolt();
}

/* Helper function to compare esids.  There are four cases to handle.
 * 1. The system is not 1T segment size capable.  Use the GET_ESID compare.
 * 2. The system is 1T capable, both addresses are < 1T, use the GET_ESID compare.
 * 3. The system is 1T capable, only one of the two addresses is > 1T.  This is not a match.
 * 4. The system is 1T capable, both addresses are > 1T, use the GET_ESID_1T macro to compare.
 */
static inline int esids_match(unsigned long addr1, unsigned long addr2)
{
	int esid_1t_count;

	/* System is not 1T segment size capable. */
	if (!cpu_has_feature(CPU_FTR_1T_SEGMENT))
		return (GET_ESID(addr1) == GET_ESID(addr2));

	esid_1t_count = (((addr1 >> SID_SHIFT_1T) != 0) +
				((addr2 >> SID_SHIFT_1T) != 0));

	/* both addresses are < 1T */
	if (esid_1t_count == 0)
		return (GET_ESID(addr1) == GET_ESID(addr2));

	/* One address < 1T, the other > 1T.  Not a match */
	if (esid_1t_count == 1)
		return 0;

	/* Both addresses are > 1T. */
	return (GET_ESID_1T(addr1) == GET_ESID_1T(addr2));
}
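
The four cases handled by esids_match() can be exercised outside the kernel with a small stand-in. In this sketch the CPU feature test is replaced by a plain `has_1t` parameter, and the shift values (28 for 256M segments, 40 for 1T segments) are assumptions; the function and macro names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SID_SHIFT_256M	28	/* assumed: 256M segments */
#define SID_SHIFT_1TB	40	/* assumed: 1T segments */

static bool esids_match_sketch(uint64_t a, uint64_t b, bool has_1t)
{
	int above_1t;

	/* Case 1: no 1T support, compare 256M segment numbers. */
	if (!has_1t)
		return (a >> SID_SHIFT_256M) == (b >> SID_SHIFT_256M);

	/* Count how many of the two addresses lie at or above 1T. */
	above_1t = ((a >> SID_SHIFT_1TB) != 0) + ((b >> SID_SHIFT_1TB) != 0);

	/* Case 2: both below 1T, compare 256M segment numbers. */
	if (above_1t == 0)
		return (a >> SID_SHIFT_256M) == (b >> SID_SHIFT_256M);

	/* Case 3: one on each side of 1T, never a match. */
	if (above_1t == 1)
		return false;

	/* Case 4: both at or above 1T, compare 1T segment numbers. */
	return (a >> SID_SHIFT_1TB) == (b >> SID_SHIFT_1TB);
}
```

Two addresses inside the same 1T segment but in different 256M segments match only when `has_1t` is true, which is the behaviour the helper was introduced to capture.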

/* Flush all user entries from the segment table of the current processor. */
void switch_slb(struct task_struct *tsk, struct mm_struct *mm)
{
	unsigned long offset = get_paca()->slb_cache_ptr;
	unsigned long slbie_data = 0;
	unsigned long pc = KSTK_EIP(tsk);
	unsigned long stack = KSTK_ESP(tsk);
	unsigned long unmapped_base;

	if (!cpu_has_feature(CPU_FTR_NO_SLBIE_B) &&
	    offset <= SLB_CACHE_ENTRIES) {
		int i;
		asm volatile("isync" : : : "memory");
		for (i = 0; i < offset; i++) {
			slbie_data = (unsigned long)get_paca()->slb_cache[i]
				<< SID_SHIFT; /* EA */
			slbie_data |= user_segment_size(slbie_data)
				<< SLBIE_SSIZE_SHIFT;
			slbie_data |= SLBIE_C; /* C set for user addresses */
			asm volatile("slbie %0" : : "r" (slbie_data));
		}
		asm volatile("isync" : : : "memory");
	} else {
		slb_flush_and_rebolt();
	}

	/* Workaround POWER5 < DD2.1 issue */
	if (offset == 1 || offset > SLB_CACHE_ENTRIES)
		asm volatile("slbie %0" : : "r" (slbie_data));

	get_paca()->slb_cache_ptr = 0;
	get_paca()->context = mm->context;

	/*
	 * preload some userspace segments into the SLB.
	 */
	if (test_tsk_thread_flag(tsk, TIF_32BIT))
		unmapped_base = TASK_UNMAPPED_BASE_USER32;
	else
		unmapped_base = TASK_UNMAPPED_BASE_USER64;

	if (is_kernel_addr(pc))
		return;
	slb_allocate(pc);

	if (esids_match(pc,stack))
		return;

	if (is_kernel_addr(stack))
		return;
	slb_allocate(stack);

	if (esids_match(pc,unmapped_base) || esids_match(stack,unmapped_base))
		return;

	if (is_kernel_addr(unmapped_base))
		return;
	slb_allocate(unmapped_base);
}
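
The preload sequence at the end of switch_slb() (pc, then stack, then the unmapped base, skipping kernel addresses and segments already covered) can be sketched as a function that returns the addresses it would preload. This is a simplification under stated assumptions: only 256M segments are compared, the kernel boundary is taken to be 0xc000000000000000, and every name here is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define KERNEL_BASE	UINT64_C(0xc000000000000000)	/* assumed PAGE_OFFSET */
#define SEG_SHIFT	28				/* assumed 256M segments */

static bool same_segment(uint64_t a, uint64_t b)
{
	return (a >> SEG_SHIFT) == (b >> SEG_SHIFT);
}

/*
 * Fill out[] with the addresses that would be preloaded, in order,
 * and return how many there are (0 to 3).  Like the code above, the
 * sequence stops at the first address that needs no new entry.
 */
static size_t preload_list(uint64_t pc, uint64_t stack,
			   uint64_t unmapped_base, uint64_t out[3])
{
	size_t n = 0;

	assert(out != NULL);

	if (pc >= KERNEL_BASE)
		return n;
	out[n++] = pc;

	if (same_segment(pc, stack) || stack >= KERNEL_BASE)
		return n;
	out[n++] = stack;

	if (same_segment(pc, unmapped_base) ||
	    same_segment(stack, unmapped_base) ||
	    unmapped_base >= KERNEL_BASE)
		return n;
	out[n++] = unmapped_base;
	return n;
}
```

Note the early exits: a task whose pc is a kernel address gets no preloaded user segments at all, and a stack that shares the pc's segment ends the sequence before the unmapped base is considered.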

static inline void patch_slb_encoding(unsigned int *insn_addr,
				      unsigned int immed)
{
	/* Assume the instruction had a "0" immediate value, just
	 * "or" in the new value
	 */
	*insn_addr |= immed;
	flush_icache_range((unsigned long)insn_addr, 4+
			   (unsigned long)insn_addr);
}
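
The patching step in patch_slb_encoding() relies on the target instruction having been assembled with a zero immediate, so that OR-ing in the value is enough. A minimal sketch of just that bit manipulation follows; it assumes a 16-bit immediate in the low half of a 32-bit instruction word, uses a hypothetical function name, and leaves out the instruction-cache flush that the real code performs.

```c
#include <assert.h>
#include <stdint.h>

/*
 * OR a 16-bit immediate into an instruction word whose immediate
 * field is currently zero, and return the patched word.
 */
static uint32_t patch_immediate(uint32_t insn, uint16_t immed)
{
	/* The same precondition the code above assumes in its comment. */
	assert((insn & 0xffffu) == 0);
	return insn | immed;
}
```

For instance, 0x39400000 is the PowerPC encoding of `li r10,0` (that is, `addi r10,0,0`); patching in 0x0500 yields 0x39400500, which loads 0x500 instead. On real hardware the modified word must also be flushed from the data cache and invalidated in the instruction cache before it is executed.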

void slb_initialize(void)
{
	unsigned long linear_llp, vmalloc_llp, io_llp;
	unsigned long lflags, vflags;
	static int slb_encoding_inited;
	extern unsigned int *slb_miss_kernel_load_linear;
	extern unsigned int *slb_miss_kernel_load_io;
	extern unsigned int *slb_compare_rr_to_size;
#ifdef CONFIG_SPARSEMEM_VMEMMAP
	extern unsigned int *slb_miss_kernel_load_vmemmap;
	unsigned long vmemmap_llp;
#endif

	/* Prepare our SLB miss handler based on our page size */
	linear_llp = mmu_psize_defs[mmu_linear_psize].sllp;
	io_llp = mmu_psize_defs[mmu_io_psize].sllp;
	vmalloc_llp = mmu_psize_defs[mmu_vmalloc_psize].sllp;
	get_paca()->vmalloc_sllp = SLB_VSID_KERNEL | vmalloc_llp;
#ifdef CONFIG_SPARSEMEM_VMEMMAP
	vmemmap_llp = mmu_psize_defs[mmu_vmemmap_psize].sllp;
#endif
	if (!slb_encoding_inited) {
		slb_encoding_inited = 1;
		patch_slb_encoding(slb_miss_kernel_load_linear,
				   SLB_VSID_KERNEL | linear_llp);
		patch_slb_encoding(slb_miss_kernel_load_io,
				   SLB_VSID_KERNEL | io_llp);
		patch_slb_encoding(slb_compare_rr_to_size,
				   mmu_slb_size);

		DBG("SLB: linear LLP = %04lx\n", linear_llp);
		DBG("SLB: io LLP = %04lx\n", io_llp);

#ifdef CONFIG_SPARSEMEM_VMEMMAP
		patch_slb_encoding(slb_miss_kernel_load_vmemmap,
				   SLB_VSID_KERNEL | vmemmap_llp);
		DBG("SLB: vmemmap LLP = %04lx\n", vmemmap_llp);
#endif
	}

	get_paca()->stab_rr = SLB_NUM_BOLTED;

	/* On iSeries the bolted entries have already been set up by
	 * the hypervisor from the lparMap data in head.S */
	if (firmware_has_feature(FW_FEATURE_ISERIES))
		return;

	lflags = SLB_VSID_KERNEL | linear_llp;
	vflags = SLB_VSID_KERNEL | vmalloc_llp;

	/* Invalidate the entire SLB (even slot 0) & all the ERATS */
	asm volatile("isync":::"memory");
	asm volatile("slbmte %0,%0"::"r" (0) : "memory");
	asm volatile("isync; slbia; isync":::"memory");
	create_shadowed_slbe(PAGE_OFFSET, mmu_kernel_ssize, lflags, 0);

	create_shadowed_slbe(VMALLOC_START, mmu_kernel_ssize, vflags, 1);

	/* For the boot cpu, we're running on the stack in init_thread_union,
	 * which is in the first segment of the linear mapping, and also
	 * get_paca()->kstack hasn't been initialized yet.
	 * For secondary cpus, we need to bolt the kernel stack entry now.
	 */
	slb_shadow_clear(2);
	if (raw_smp_processor_id() != boot_cpuid &&
	    (get_paca()->kstack & slb_esid_mask(mmu_kernel_ssize)) > PAGE_OFFSET)
		create_shadowed_slbe(get_paca()->kstack,
				     mmu_kernel_ssize, lflags, 2);

	asm volatile("isync":::"memory");
}
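
The decision made at the end of slb_initialize(), and described in the "Bolt in SLB entry for kernel stack on secondary cpus" commit message, is a two-part test: the CPU is not the boot CPU, and the segment holding its kernel stack starts above the first segment of the linear mapping, which slot 0 already covers. A minimal sketch of that predicate follows, assuming 256M segments and a linear mapping base of 0xc000000000000000; the names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define LINEAR_BASE	UINT64_C(0xc000000000000000)	/* assumed PAGE_OFFSET */
#define SEG_MASK_256M	UINT64_C(0xfffffffff0000000)	/* assumed ESID_MASK */

/*
 * True when slot 2 must be bolted for the kernel stack: a secondary
 * CPU whose stack segment is not the one already mapped by slot 0.
 */
static bool needs_stack_slot(int cpu, int boot_cpu, uint64_t kstack)
{
	return cpu != boot_cpu && (kstack & SEG_MASK_256M) > LINEAR_BASE;
}
```

The boot CPU is excluded regardless of the stack address because, as the comment in slb_initialize() notes, its kstack field has not been initialized at that point.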