kernel-fxtec-pro1x

Author	SHA1	Message	Date
Eduardo Habkost	0814e0bace	x86, 64-bit: split set_pte_vaddr() We will need to set a pte on l3_user_pgt. Extract set_pte_vaddr_pud() from set_pte_vaddr(), that will accept the l3 page table as parameter. This change should be a no-op for existing code. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: xen-devel <xen-devel@lists.xensource.com> Cc: Stephen Tweedie <sct@redhat.com> Cc: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:11:09 +02:00
Jeremy Fitzhardinge	7c934d3990	x86, 64-bit: create small vmemmap mappings if PSE not available If PSE is not available, then fall back to 4k page mappings for the vmemmap area. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: xen-devel <xen-devel@lists.xensource.com> Cc: Stephen Tweedie <sct@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:11:08 +02:00
Jeremy Fitzhardinge	4f9c11dd49	x86, 64-bit: adjust mapping of physical pagetables to work with Xen This makes a few of changes to the construction of the initial pagetables to work better with paravirt_ops/Xen. The main areas are: 1. Support non-PSE mapping of memory, since Xen doesn't currently allow 2M pages to be mapped in guests. 2. Make sure that the ioremap alias of all pages are dropped before attaching the new page to the pagetable. This avoids having writable aliases of pagetable pages. 3. Preserve existing pagetable entries, rather than overwriting. Its possible that a fair amount of pagetable has already been constructed, so reuse what's already in place rather than ignoring and overwriting it. The algorithm relies on the invariant that any page which is part of the kernel pagetable is itself mapped in the linear memory area. This way, it can avoid using ioremap on a pagetable page. The invariant holds because it maps memory from low to high addresses, and also allocates memory from low to high. Each allocated page can map at least 2M of address space, so the mapped area will always progress much faster than the allocated area. It relies on the early boot code mapping enough pages to get started. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: xen-devel <xen-devel@lists.xensource.com> Cc: Stephen Tweedie <sct@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:11:07 +02:00
Jeremy Fitzhardinge	d8d5900ef8	x86: preallocate and prepopulate separately Jan Beulich points out that vmalloc_sync_all() assumes that the kernel's pmd is always expected to be present in the pgd. The current pgd construction code will add the pgd to the pgd_list before its pmds have been pre-populated, thereby making it visible to vmalloc_sync_all(). However, because pgd_prepopulate_pmd also does the allocation, it may block and cannot be done under spinlock. The solution is to preallocate the pmds out of the spinlock, then populate them while holding the pgd_list lock. This patch also pulls the pmd preallocation and mop-up functions out to be common, assuming that the compiler will generate no code for them when PREALLOCTED_PMDS is 0. Also, there's no need for pgd_ctor to clear the pgd again, since it's allocated as a zeroed page. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: xen-devel <xen-devel@lists.xensource.com> Cc: Stephen Tweedie <sct@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: Jan Beulich <jbeulich@novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:11:02 +02:00
Jeremy Fitzhardinge	eba0045ff8	x86/paravirt: add a pgd_alloc/free hooks Add hooks which are called at pgd_alloc/free time. The pgd_alloc hook may return an error code, which if non-zero, causes the pgd allocation to be failed. The hooks may be used to allocate/free auxillary per-pgd information. also fix: > * Ingo Molnar <mingo@elte.hu> wrote: > > include/asm/pgalloc.h: In function ‘paravirt_pgd_free': > include/asm/pgalloc.h:14: error: parameter name omitted > arch/x86/kernel/entry_64.S: In file included from > arch/x86/kernel/traps_64.c:51:include/asm/pgalloc.h: In function ‘paravirt_pgd_free': > include/asm/pgalloc.h:14: error: parameter name omitted Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: xen-devel <xen-devel@lists.xensource.com> Cc: Stephen Tweedie <sct@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:11:01 +02:00
Jeremy Fitzhardinge	67350a5c45	x86: simplify vmalloc_sync_all vmalloc_sync_all() is only called from register_die_notifier and alloc_vm_area. Neither is on any performance-critical paths, so vmalloc_sync_all() itself is not on any hot paths. Given that the optimisations in vmalloc_sync_all add a fair amount of code and complexity, and are fairly hard to evaluate for correctness, it's better to just remove them to simplify the code rather than worry about its absolute performance. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: xen-devel <xen-devel@lists.xensource.com> Cc: Stephen Tweedie <sct@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:10:59 +02:00
Yinghai Lu	c987d12f84	x86: remove end_pfn in 64bit and use max_pfn directly. Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:10:38 +02:00
Yinghai Lu	d86623a0d5	x86: add table_top check for alloc_low_page in 64 bit that range is from find_e820_area, so don't try to use end_pfn to see if out of boundary...use table_top instead to avoid possible strange result while cross the boundary... also change early_printk to printk, because init_memory_mapping is after early param parsing, and console=uart8250 already working at that time. Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:10:36 +02:00
Yinghai Lu	1a0db38e5f	x86: get max_pfn_mapped in init_memory_mapping so don't shift that in the loop Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:10:35 +02:00
Yinghai Lu	3a58a2a6c8	x86: introduce init_memory_mapping for 32bit #3 move kva related early backto initmem_init for numa32 Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:10:33 +02:00
Yinghai Lu	cfb0e53b05	x86: introduce init_memory_mapping for 32bit #2 moving relocate_initrd early Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:10:32 +02:00
Yinghai Lu	4e29684c40	x86: introduce init_memory_mapping for 32bit #1 ... so can we use mem below max_low_pfn earlier. this allows us to move several functions more early instead of waiting to after paging_init. That includes moving relocate_initrd() earlier in the bootup, and kva related early setup done in initmem_init. (in followup patches) Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:10:32 +02:00
Jeremy Fitzhardinge	4583ed514e	x86, 64-bit: unify early_ioremap The 32-bit early_ioremap will work equally well for 64-bit, so just use it. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: xen-devel <xen-devel@lists.xensource.com> Cc: Stephen Tweedie <sct@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:10:28 +02:00
Jeremy Fitzhardinge	bb23e403e5	x86, 64-bit: use p??_populate() to attach pages to pagetable Use the _populate() functions to attach new pages to a pagetable, to make sure the right paravirt_ops calls get called. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: xen-devel <xen-devel@lists.xensource.com> Cc: Stephen Tweedie <sct@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:10:27 +02:00
Yinghai Lu	11cd0bc140	x86: move some func calling from setup_arch to paging_init those function depend on paging setup pgtable, so they could access the ram in bootmem region but just get mapped. Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:50:24 +02:00
Yinghai Lu	c09434571d	x86: numa32 pfn print out using hex instead Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:50:23 +02:00
Yinghai Lu	6a07a0edac	x86: fix compile warning in init_64.c len is long and ret is only for NUMA Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:50:23 +02:00
Yinghai Lu	346cafecde	x86: clean up min_low_pfn for 32bit we already had early_res support, so don't need to track min_low_pfn. keep it to 0 always. also use init_bootmem_node instead of init_bootmem, so don't touch min_low_pfn. Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:50:21 +02:00
Yinghai Lu	2ec65f8b89	x86: clean up using max_low_pfn on 32-bit so that max_low_pfn is not changed after it is set. so we can move that early and out of initmem_init. could call find_low_pfn_range just after max_pfn is set. also could move reserve_initrd out of setup_bootmem_allocator so 32bit is more like 64bit. Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:50:20 +02:00
Yinghai Lu	bef1568d97	x86: move reservetop and vmalloc parsing to pgtable_32.c also change reserve_top_address to __init attibute Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:50:19 +02:00
Yinghai Lu	90d967e0ef	x86: move find_max_low_pfn to init_32.c Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:50:18 +02:00
Yinghai Lu	225c37d71b	x86: introduce reserve_initrd Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:50:16 +02:00
Yinghai Lu	b2ac82a090	x86: introduce initmem_init for 32 bit Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:50:15 +02:00
Yinghai Lu	1f75d7e32e	x86: introduce initmem_init for 64 bit Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:50:14 +02:00
Yinghai Lu	d52d53b8a5	RFC x86: try to remove arch_get_ram_range want to remove arch_get_ram_range, and use early_node_map instead. Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:48:27 +02:00
Jeremy Fitzhardinge	a7bf0bd5e6	build: add __page_aligned_data and __page_aligned_bss Making a variable page-aligned by using __attribute__((section(".data.page_aligned"))) is fragile because if sizeof(variable) is not also a multiple of page size, it leaves variables in the remainder of the section unaligned. This patch introduces two new qualifiers, __page_aligned_data and __page_aligned_bss to set the section and the alignment of variables. This makes page-aligned variables more robust because the linker will make sure they're aligned properly. Unfortunately it requires all page-aligned data to use these macros... Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:48:13 +02:00
Ingo Molnar	6236af82d8	Merge branch 'x86/fixmap' into x86/devel Conflicts: arch/x86/mm/init_64.c Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:24:29 +02:00
Ingo Molnar	2b4fa851b2	Merge branch 'x86/numa' into x86/devel Conflicts: arch/x86/Kconfig arch/x86/kernel/e820.c arch/x86/kernel/efi_64.c arch/x86/kernel/mpparse.c arch/x86/kernel/setup.c arch/x86/kernel/setup_32.c arch/x86/mm/init_64.c include/asm-x86/proto.h Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 11:59:23 +02:00
Bernhard Walle	3fd052b1b4	x86: add flags parameter to reserve_bootmem_generic() This patch adds a 'flags' parameter to reserve_bootmem_generic() like it already has been added in reserve_bootmem() with commit `72a7fe3967`. It also changes all users to use BOOTMEM_DEFAULT, which doesn't effectively change the behaviour. Since the change is x86-specific, I don't think it's necessary to add a new API for migration. There are only 4 users of that function. The change is necessary for the next patch, using reserve_bootmem_generic() for crashkernel reservation. Signed-off-by: Bernhard Walle <bwalle@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 11:49:49 +02:00
Thomas Gleixner	886533a3e3	x86: numa_64.c fix shadowed variable sparse mutters: arch/x86/mm/numa_64.c:195:27: warning: symbol 'end_pfn' shadows an earlier one Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 11:31:29 +02:00
Thomas Gleixner	864fc31ea5	x86: numa_64.c make local variables static plat_node_bdata, cmdline, nodemap_addr, nodemap_size are local to numa_64.c. Make them static Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 11:31:28 +02:00
Mike Travis	9f248bde9d	x86: remove the static 256k node_to_cpumask_map * Consolidate node_to_cpumask operations and remove the 256k byte node_to_cpumask_map. This is done by allocating the node_to_cpumask_map array after the number of possible nodes (nr_node_ids) is known. * Debug printouts when CONFIG_DEBUG_PER_CPU_MAPS is active have been increased. It now shows faults when calling node_to_cpumask() and node_to_cpumask_ptr(). For inclusion into sched-devel/latest tree. Based on: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git + sched-devel/latest .../mingo/linux-2.6-sched-devel.git Signed-off-by: Mike Travis <travis@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-07-08 11:31:24 +02:00
Mike Travis	23ca4bba3e	x86: cleanup early per cpu variables/accesses v4 * Introduce a new PER_CPU macro called "EARLY_PER_CPU". This is used by some per_cpu variables that are initialized and accessed before there are per_cpu areas allocated. ["Early" in respect to per_cpu variables is "earlier than the per_cpu areas have been setup".] This patchset adds these new macros: DEFINE_EARLY_PER_CPU(_type, _name, _initvalue) EXPORT_EARLY_PER_CPU_SYMBOL(_name) DECLARE_EARLY_PER_CPU(_type, _name) early_per_cpu_ptr(_name) early_per_cpu_map(_name, _idx) early_per_cpu(_name, _cpu) The DEFINE macro defines the per_cpu variable as well as the early map and pointer. It also initializes the per_cpu variable and map elements to "_initvalue". The early_* macros provide access to the initial map (usually setup during system init) and the early pointer. This pointer is initialized to point to the early map but is then NULL'ed when the actual per_cpu areas are setup. After that the per_cpu variable is the correct access to the variable. The early_per_cpu() macro is not very efficient but does show how to access the variable if you have a function that can be called both "early" and "late". It tests the early ptr to be NULL, and if not then it's still valid. Otherwise, the per_cpu variable is used instead: #define early_per_cpu(_name, _cpu) \ (early_per_cpu_ptr(_name) ? \ early_per_cpu_ptr(_name)[_cpu] : \ per_cpu(_name, _cpu)) A better method is to actually check the pointer manually. In the case below, numa_set_node can be called both "early" and "late": void __cpuinit numa_set_node(int cpu, int node) { int cpu_to_node_map = early_per_cpu_ptr(x86_cpu_to_node_map); if (cpu_to_node_map) cpu_to_node_map[cpu] = node; else per_cpu(x86_cpu_to_node_map, cpu) = node; } Add a flag "arch_provides_topology_pointers" that indicates pointers to topology cpumask_t maps are available. Otherwise, use the function returning the cpumask_t value. This is useful if cpumask_t set size is very large to avoid copying data on to/off of the stack. * The coverage of CONFIG_DEBUG_PER_CPU_MAPS has been increased while the non-debug case has been optimized a bit. * Remove an unreferenced compiler warning in drivers/base/topology.c * Clean up #ifdef in setup.c For inclusion into sched-devel/latest tree. Based on: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git + sched-devel/latest .../mingo/linux-2.6-sched-devel.git Signed-off-by: Mike Travis <travis@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-07-08 11:31:20 +02:00
Ingo Molnar	3de352bbd8	Merge branch 'x86/mpparse' into x86/devel Conflicts: arch/x86/Kconfig arch/x86/kernel/io_apic_32.c arch/x86/kernel/setup_64.c arch/x86/mm/init_32.c Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 11:14:58 +02:00
Yinghai Lu	a4caa18efe	x86: fix compiling when CONFIG_X86_MPPARSE is not set Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 10:39:19 +02:00
Yinghai Lu	064d25f120	x86: merge setup_memory_map with e820 ... and kill e820_32/64.c and e820_32/64.h Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 10:38:25 +02:00
Yinghai Lu	cc9f7a0ccf	x86: kill bad_ppro so don't punish all other cpus without that problem when init highmem Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 10:38:19 +02:00
Yinghai Lu	b5bc6c0e55	x86, mm: use add_highpages_with_active_regions() for high pages init v2 use early_node_map to init high pages, so we can remove page_is_ram() and page_is_reserved_early() in the big loop with add_one_highpage also remove page_is_reserved_early(), it is not needed anymore. v2: fix the build of other platforms Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 10:37:25 +02:00
Yinghai Lu	cc1050bafe	x86: replace shrink_active_range() with remove_active_range() in case we have kva before ramdisk on a node, we still need to use those ranges. v2: reserve_early kva ram area, in case there are holes in highmem, to avoid those area could be treat as free high pages. Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 10:36:29 +02:00
Yinghai Lu	d2dbf34332	x86: clean up reserve_bootmem_generic() and port it to 32-bit 1. add reserve_bootmem_generic for 32bit 2. change len to unsigned long 3. make early_res_to_bootmem to use it Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 10:36:17 +02:00
Bernhard Walle	8b2ef1d728	x86: add flags parameter to reserve_bootmem_generic() This patch adds a 'flags' parameter to reserve_bootmem_generic() like it already has been added in reserve_bootmem() with commit `72a7fe3967`. It also changes all users to use BOOTMEM_DEFAULT, which doesn't effectively change the behaviour. Since the change is x86-specific, I don't think it's necessary to add a new API for migration. There are only 4 users of that function. The change is necessary for the next patch, using reserve_bootmem_generic() for crashkernel reservation. Signed-off-by: Bernhard Walle <bwalle@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 10:34:54 +02:00
Ingo Molnar	896395c290	Merge branch 'linus' into tmp.x86.mpparse.new	2008-07-08 10:32:56 +02:00
Ingo Molnar	58cf35228f	Merge branches 'x86/mmio', 'x86/delay', 'x86/idle', 'x86/oprofile', 'x86/debug', 'x86/ptrace' and 'x86/amd-iommu' into x86/devel	2008-07-08 09:46:15 +02:00
Ingo Molnar	6924d1ab8b	Merge branches 'x86/numa-fixes', 'x86/apic', 'x86/apm', 'x86/bitops', 'x86/build', 'x86/cleanups', 'x86/cpa', 'x86/cpu', 'x86/defconfig', 'x86/gart', 'x86/i8259', 'x86/intel', 'x86/irqstats', 'x86/kconfig', 'x86/ldt', 'x86/mce', 'x86/memtest', 'x86/pat', 'x86/ptemask', 'x86/resumetrace', 'x86/threadinfo', 'x86/timers', 'x86/vdso' and 'x86/xen' into x86/devel	2008-07-08 09:16:56 +02:00
Jiri Slaby	684eb0163a	x86_64: use PAGE_OFFSET in dump_pagetables Use PAGE_OFFSET macro instead of using 0xffff810000000000UL directly. Signed-off-by: Jiri Slaby <jirislaby@gmail.com> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: hpa@zytor.com Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-07-08 08:12:06 +02:00
Thomas Gleixner	6e92a5a615	x86: add sparse annotations to ioremap arch/x86/mm/ioremap.c:308:11: error: incompatible types in comparison expression (different address spaces) Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 08:12:05 +02:00
Thomas Gleixner	65280e613f	x86: janitor CPA statistics patch 1) Remove __meminit from update_pages_count. It is used inside split_pages() 2) Make the code depend on PROC_FS. Doing statistics for nothing is useless and not adding useless code is nice to the Linux tiny folks. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 08:12:05 +02:00
Andi Kleen	ce0c0e50f9	x86, generic: CPA add statistics about state of direct mapping v4 Add information about the mapping state of the direct mapping to /proc/meminfo. I chose /proc/meminfo because that is where all the other memory statistics are too and it is a generally useful metric even outside debugging situations. A lot of split kernel pages means the kernel will run slower. This way we can see how many large pages are really used for it and how many are split. Useful for general insight into the kernel. v2: Add hotplug locking to 64bit to plug a very obscure theoretical race. 32bit doesn't need it because it doesn't support hotadd for lowmem. Fix some typos v3: Rename dpages_cnt Add CONFIG ifdef for count update as requested by tglx Expand description v4: Fix stupid bugs added in v3 Move update_page_count to pageattr.c Signed-off-by: Andi Kleen <andi@firstfloor.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 08:11:45 +02:00
Ingo Molnar	4d51c7587b	Merge branch 'tracing/mmiotrace-mergefixups' into tracing/mmiotrace	2008-07-07 08:08:19 +02:00
Ingo Molnar	d763d5edf9	Merge branch 'linus' into tracing/mmiotrace	2008-07-07 08:07:35 +02:00
Gustavo Fernando Padovan	95c60b08c6	x86: remove unnecessary #ifdef CONFIG_X86_32...#else Remove the #ifdef conditional because this comparison is already done in user_mode_vm(). Signed-off-by: Gustavo F. Padovan <gustavo@las.ic.unicamp.br> Cc: akpm@osdl.org Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-03 15:09:10 +02:00
Andrew Morton	27df66a406	arch/x86/mm/init_64.c: early_memtest(): fix types fix this warning: arch/x86/mm/init_64.c: In function 'early_memtest': arch/x86/mm/init_64.c:524: warning: passing argument 2 of 'find_e820_area_size' from incompatible pointer type Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-03 10:15:51 +02:00
Vegard Nossum	f294a8ce21	x86: small unifications of address printing 'man 3 printf' tells me that %p should be printed as if by %#x, but this is not true for the kernel, which does not use the '0x' prefix for the %p conversion specifier. A small cast to (void *) is also prettier than #ifdef/#else/#endif. Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-01 16:21:41 +02:00
Daniel J Blueman	0b1faeef5f	x86: section/warning fixes WARNING: arch/x86/mm/built-in.o(.text+0x3a1): Section mismatch in reference from the function set_pte_phys() to the function .init.text:spp_getpage() The function set_pte_phys() references the function __init spp_getpage(). This is often because set_pte_phys lacks a __init annotation or the annotation of spp_getpage is wrong. arch/x86/mm/init_64.c: In function 'early_memtest': arch/x86/mm/init_64.c:520: warning: passing argument 2 of 'find_e820_area_size' from incompatible pointer type Signed-off-by: Daniel J Blueman <daniel.blueman@gmail.com> Cc: "Linus Torvalds" <torvalds@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-06-26 15:33:09 +02:00
Jens Axboe	15c8b6c1aa	on_each_cpu(): kill unused 'retry' parameter It's not even passed on to smp_call_function() anymore, since that was removed. So kill it. Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-06-26 11:24:38 +02:00
Andreas Herrmann	33af9039cb	x86: pat.c final cleanup of loop body in reserve_memtype Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Cc: Suresh B Siddha <suresh.b.siddha@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-06-24 13:05:59 +02:00
Andreas Herrmann	64fe44c38b	x86: pat.c introduce function to check for conflicts with existing memtypes ... to strip down loop body in reserve_memtype. Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Cc: Suresh B Siddha <suresh.b.siddha@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-06-24 13:05:57 +02:00
Andreas Herrmann	f6887264de	x86: pat.c consolidate list_add handling in reserve_memtype Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Cc: Suresh B Siddha <suresh.b.siddha@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-06-24 13:05:55 +02:00
Andreas Herrmann	3e9c83b309	x86: pat.c consolidate error/debug messages in reserve_memtype ... and move last debug message out of locked section. Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Cc: Suresh B Siddha <suresh.b.siddha@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-06-24 13:05:51 +02:00
Andreas Herrmann	69e26be9b1	x86: pat.c more trivial changes - add BUG statement to catch invalid start and end parameters - No need to track the actual type in both req_type and actual_type -- keep req_type unchanged. - removed (IMHO) superfluous comments Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Cc: Suresh B Siddha <suresh.b.siddha@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-06-24 13:05:50 +02:00
Andreas Herrmann	ac97991ec9	x86: pat.c choose more crisp variable names - parse/ml => entry (within list_for_each and friends) - new_entry => new - ret_type => new_type (to avoid confusion with req_type) (... to make it more readable...) Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Cc: Suresh B Siddha <suresh.b.siddha@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-06-24 13:05:49 +02:00
Andreas Herrmann	bcc643dc28	x86: introduce macro to check whether an address range is in the ISA range Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Cc: Suresh B Siddha <suresh.b.siddha@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-06-24 13:05:48 +02:00
Ingo Molnar	a1d5a8691f	x86: unify __set_fixmap, fix fix build failure: arch/x86/mm/pgtable.c:280: warning: ‘enum fixed_addresses’ declared inside parameter list arch/x86/mm/pgtable.c:280: warning: its scope is only this definition or declaration, which is probably not what you want arch/x86/mm/pgtable.c:280: error: parameter 1 (‘idx’) has incomplete type Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-06-20 15:36:57 +02:00
Jeremy Fitzhardinge	aeaaa59c7e	x86/paravirt/xen: add set_fixmap pv_mmu_ops Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-06-20 15:09:56 +02:00
Jeremy Fitzhardinge	d494a96125	x86: implement set_pte_vaddr Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-06-20 15:09:54 +02:00
Jeremy Fitzhardinge	7c7e6e07e2	x86: unify __set_fixmap In both cases, I went with the 32-bit behaviour. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-06-20 15:09:51 +02:00
Jiri Slaby	a1bf9631be	x86, MM: virtual address debug, v2 I've removed the test from phys_to_nid and made a function from __phys_addr only when the debugging is enabled (on x86_32). Signed-off-by: Jiri Slaby <jirislaby@gmail.com> Cc: tglx@linutronix.de Cc: hpa@zytor.com Cc: Mike Travis <travis@sgi.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: <x86@kernel.org> Cc: linux-mm@kvack.org Cc: Jiri Slaby <jirislaby@gmail.com> Cc: Andi Kleen <andi@firstfloor.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-06-19 13:32:22 +02:00
Jiri Slaby	59ea746337	MM: virtual address debug Add some (configurable) expensive sanity checking to catch wrong address translations on x86. - create linux/mmdebug.h file to be able include this file in asm headers to not get unsolvable loops in header files - __phys_addr on x86_32 became a function in ioremap.c since PAGE_OFFSET, is_vmalloc_addr and VMALLOC_* non-constasts are undefined if declared in page_32.h - add __phys_addr_const for initializing doublefault_tss.__cr3 Tested on 386, 386pae, x86_64 and x86_64 numa=fake=2. Contains Andi's enable numa virtual address debug patch. Signed-off-by: Jiri Slaby <jirislaby@gmail.com> Cc: Andi Kleen <andi@firstfloor.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-06-19 13:31:42 +02:00
Andreas Herrmann	dd0c7c4903	x86: shrink pat_x_mtrr_type to its essentials Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Cc: Suresh B Siddha <suresh.b.siddha@intel.com> Cc: Hugh Dickins <hugh@veritas.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-06-19 12:57:40 +02:00
Hugh Dickins	6cf514fce1	x86: PAT: make pat_x_mtrr_type() more readable Clean up over-complications in pat_x_mtrr_type(). And if reserve_memtype() ignores stray req_type bits when pat_enabled, it's better to mask them off when not also. Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-06-18 11:51:08 +02:00
Ingo Molnar	faeca31d06	Merge branch 'linus' into x86/pat	2008-06-16 11:20:28 +02:00
Ingo Molnar	064a32d82c	Merge branch 'linus' into x86/memtest	2008-06-16 11:19:53 +02:00
Ingo Molnar	1791a78c0b	Merge branch 'linus' into x86/cleanups	2008-06-16 11:17:50 +02:00
Ingo Molnar	cb9aa97c21	Merge branch 'linus' into tracing/mmiotrace-mergefixups	2008-06-16 11:16:46 +02:00
Linus Torvalds	cbfa66b88d	Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: fix pointer type warning in arch/x86/mm/init_64.c:early_memtest x86, lockdep: fix "WARNING: at kernel/lockdep.c:2658 check_flags+0x4c/0x128()" x86: fix an incompatible pointer type warning on 64-bit compilations x86: fix lockdep warning during suspend-to-ram x86: fix unused variable 'loops' warning in arch/x86/boot/a20.c Revert "x86: fix ioapic bug again" x86: fix asm warning in head_32.S x86: fix endless page faults in mount_block_root for Linux 2.6 geode: fix modular build	2008-06-12 12:55:32 -07:00
Kevin Winchester	f8a45704f5	x86: fix pointer type warning in arch/x86/mm/init_64.c:early_memtest Changed the call to find_e820_area_size to pass u64 instead of unsigned long. Signed-off-by: Kevin Winchester <kjwinchester@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-06-12 21:36:23 +02:00
David Howells	eb53e9f3ea	x86: fix an incompatible pointer type warning on 64-bit compilations Fix an incompatible pointer type warning on x86_64 compilations. early_memtest() is passing a u64* to find_e820_area_size() which is expecting an unsigned long. Change t_start and t_size to unsigned long as those are also 64-bit types on x88_64. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-06-12 21:27:15 +02:00
Henry Nestler	b29c701dea	x86: fix endless page faults in mount_block_root for Linux 2.6 Page faults in kernel address space between PAGE_OFFSET up to VMALLOC_START should not try to map as vmalloc. Fix rarely endless page faults inside mount_block_root for root filesystem at boot time. All 32bit kernels up to 2.6.25 can fail into this hole. I can not present this under native linux kernel. I see, that the 64bit has fixed the problem. I copied the same lines into 32bit part. Recorded debugs are from coLinux kernel 2.6.22.18 (virtualisation): http://www.henrynestler.com/colinux/testing/pfn-check-0.7.3/20080410-antinx/bug16-recursive-page-fault-endless.txt The physicaly memory was trimmed down to 192MB to better catch the bug. More memory gets the bug more rarely. Details, how every x86 32bit system can fail: Start from "mount_block_root", http://lxr.linux.no/linux/init/do_mounts.c#L297 There the variable "fs_names" got one memory page with 4096 bytes. Variable "p" walks through the existing file system types. The first string is no problem. But, with the second loop in mount_block_root the offset of "p" is not at beginning of page, the offset is for example +9, if "reiserfs" is the first in list. Than calls do_mount_root, and lands in sys_mount. Remember: Variable "type_page" contains now "fs_type+9" and not contains a full page. The sys_mount copies 4096 bytes with function "exact_copy_from_user()": http://lxr.linux.no/linux/fs/namespace.c#L1540 Mostly exist pages after the buffer "fs_names+4096+9" and the page fault handler was not called. No problem. In the case, if the page after "fs_names+4096" is not mapped, the page fault handler was called from http://lxr.linux.no/linux/fs/namespace.c#L1320 The do_page_fault gots an address 0xc03b4000. It's kernel address, address >= TASK_SIZE, but not from vmalloc! It's from "__getname()" alias "kmem_cache_alloc". The "error_code" is 0. "vmalloc_fault" will be call: http://lxr.linux.no/linux/arch/i386/mm/fault.c#L332 "vmalloc_fault" tryed to find the physical page for a non existing virtual memory area. The macro "pte_present" in vmalloc_fault() got a next page fault for 0xc0000ed0 at: http://lxr.linux.no/linux/arch/i386/mm/fault.c#L282 No PTE exist for such virtual address. The page fault handler was trying to sync the physical page for the PTE lockup. This called vmalloc_fault() again for address 0xc000000, and that also was not existing. The endless began... In normal case the cpu would still loop with disabled interrrupts. Under coLinux this was catched by a stack overflow inside printk debugs. Signed-off-by: Henry Nestler <henry.nestler@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-06-12 21:26:07 +02:00
Andreas Herrmann	499f8f84b8	x86: rename pat_wc_enabled to pat_enabled BTW, what does pat_wc_enabled stand for? Does it mean "write-combining"? Currently it is used to globally switch on or off PAT support. Thus I renamed it to pat_enabled. I think this increases readability (and hope that I didn't miss something). Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-06-12 10:14:27 +02:00
Andreas Herrmann	cd7a4e936d	x86: PAT: fixed checkpatch errors (and whitespaces) x86: PAT: fixed checkpatch errors (and whitespaces) Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-06-12 10:14:24 +02:00
Andreas Herrmann	97cfab6ac4	x86: PAT: fix ambiguous paranoia check in pat_init() Starting with commit `8d4a430085` (x86: cleanup PAT cpu validation) the PAT CPU feature flag is not cleared anymore. Now the error message "PAT enabled, but CPU feature cleared" in pat_init() is misleading. Furthermore the current code does not check for existence of the PAT CPU feature flag if a CPU is whitelisted in validate_pat_support. This patch clears pat_wc_enabled if boot CPU has no PAT feature flag and adapts the paranoia check. Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-06-12 10:14:22 +02:00
Venki Pallipadi	c26421d019	x86: fix Xorg crash with xf86MapVidMem error Clarify the usage of mtrr_lookup() in PAT code, and to make PAT code resilient to mtrr lookup problems. Specifically, pat_x_mtrr_type() is restructured to highlight, under what conditions we look for mtrr hint. pat_x_mtrr_type() uses a default type when there are any errors in mtrr lookup (still maintaining the pat consistency). And, reserve_memtype() highlights its usage ot mtrr_lookup for request type of '-1' and also defaults in a sane way on any mtrr lookup failure. pat.c looks at mtrr type of a range to get a hint on what mapping type to request when user/API: (1) hasn't specified any type (/dev/mem mapping) and we do not want to take performance hit by always mapping UC_MINUS. This will be the case for /dev/mem mappings used to map BIOS area or ACPI region which are WB'able. In this case, as long as MTRR is not WB, PAT will request UC_MINUS for such mappings. (2) user/API requests WB mapping while in reality MTRR may have UC or WC. In this case, PAT can map as WB (without checking MTRR) and still effective type will be UC or WC. But, a subsequent request to map same region as UC or WC may fail, as the region will get trackked as WB in PAT list. Looking at MTRR hint helps us to track based on effective type rather than what user requested. Again, here mtrr_lookup is only used as hint and we fallback to WB mapping (as requested by user) as default. In both cases, after using the mtrr hint, we still go through the memtype list to make sure there are no inconsistencies among multiple users. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Tested-by: Rufus & Azrael <rufus-azrael@numericable.fr> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-06-12 10:03:55 +02:00
Fenghua Yu	39b8931b5c	ACPI: handle invalid ACPI SLIT table This is a SLIT sanity checking patch. It moves slit_valid() function to generic ACPI code and does sanity checking for both x86 and ia64. It sets up node_distance with LOCAL_DISTANCE and REMOTE_DISTANCE when hitting invalid SLIT table on ia64. It also cleans up unused variable localities in acpi_parse_slit() on x86. Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Len Brown <len.brown@intel.com>	2008-06-11 19:13:46 -04:00
Yinghai Lu	9043f00796	x86, numa, 32-bit: use find_e820_area() to find KVA RAM on node don't assume we can use RAM near the end of every node. Esp systems that have few memory and they could have kva address and kva RAM all below max_low_pfn. Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-06-10 11:31:52 +02:00
Yinghai Lu	cc1a9d86ce	mm, x86: shrink_active_range() should check all Now we are using register_e820_active_regions() instead of add_active_range() directly. So end_pfn could be different between the value in early_node_map to node_end_pfn. So we need to make shrink_active_range() smarter. shrink_active_range() is a generic MM function in mm/page_alloc.c but it is only used on 32-bit x86. Should we move it back to some file in arch/x86? Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-06-10 11:31:44 +02:00
Huang, Ying	d0ec2c6f2c	x86: reserve highmem pages via reserve_early This patch makes early reserved highmem pages become reserved pages. This can be used for highmem pages allocated by bootloader such as EFI memory map, linked list of setup_data, etc. Signed-off-by: Huang Ying <ying.huang@intel.com> Cc: andi@firstfloor.org Cc: mingo@redhat.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-06-05 15:10:02 +02:00
Andrew Morton	be524fb960	x86: section mismatch fix Fix this: WARNING: vmlinux.o(.text+0x114bb): Section mismatch in reference from the function nopat() to the function .cpuinit.text:pat_disable() The function nopat() references the function __cpuinit pat_disable(). This is often because nopat lacks a __cpuinit annotation or the annotation of pat_disable is wrong. Reported-by: "Fabio Comolli" <fabio.comolli@gmail.com> Cc: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-06-04 13:11:47 +02:00
Venki Pallipadi	282c454cd3	x86: fix Xorg crash with xf86MapVidMem error Clarify the usage of mtrr_lookup() in PAT code, and to make PAT code resilient to mtrr lookup problems. Specifically, pat_x_mtrr_type() is restructured to highlight, under what conditions we look for mtrr hint. pat_x_mtrr_type() uses a default type when there are any errors in mtrr lookup (still maintaining the pat consistency). And, reserve_memtype() highlights its usage ot mtrr_lookup for request type of '-1' and also defaults in a sane way on any mtrr lookup failure. pat.c looks at mtrr type of a range to get a hint on what mapping type to request when user/API: (1) hasn't specified any type (/dev/mem mapping) and we do not want to take performance hit by always mapping UC_MINUS. This will be the case for /dev/mem mappings used to map BIOS area or ACPI region which are WB'able. In this case, as long as MTRR is not WB, PAT will request UC_MINUS for such mappings. (2) user/API requests WB mapping while in reality MTRR may have UC or WC. In this case, PAT can map as WB (without checking MTRR) and still effective type will be UC or WC. But, a subsequent request to map same region as UC or WC may fail, as the region will get trackked as WB in PAT list. Looking at MTRR hint helps us to track based on effective type rather than what user requested. Again, here mtrr_lookup is only used as hint and we fallback to WB mapping (as requested by user) as default. In both cases, after using the mtrr hint, we still go through the memtype list to make sure there are no inconsistencies among multiple users. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Tested-by: Rufus & Azrael <rufus-azrael@numericable.fr> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-06-04 13:11:47 +02:00
Kevin Winchester	511631011d	x86: fix pointer type warning in arch/x86/mm/init_64.c:early_memtest Changed the call to find_e820_area_size to pass u64 instead of unsigned long. Signed-off-by: Kevin Winchester <kjwinchester@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-06-04 13:11:47 +02:00
Hugh Dickins	2884f110d5	x86: fix bad pmd ffff810000207xxx(9090909090909090) OGAWA Hirofumi and Fede have reported rare pmd_ERROR messages: mm/memory.c:127: bad pmd ffff810000207xxx(9090909090909090). Initialization's cleanup_highmap was leaving alignment filler behind in the pmd for MODULES_VADDR: when vmalloc's guard page would occupy a new page table, it's not allocated, and then module unload's vfree hits the bad 9090 pmd entry left over. Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-06-04 13:11:47 +02:00
Ingo Molnar	226e9a93a2	x86: ioremap fix failing nesting check Mika Kukkonen noticed that the nesting check in early_iounmap() is not actually done. Reported-by: Mika Kukkonen <mikukkon@srv1-m700-lanp.koti> Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: torvalds@linux-foundation.org Cc: arjan@linux.intel.com Cc: mikukkon@iki.fi Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-06-04 13:11:47 +02:00
Yinghai Lu	7b2a0a6c48	x86: make 32-bit use e820_register_active_regions() this way 32-bit is more similar to 64-bit, and smarter e820 and numa. Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-06-04 12:01:58 +02:00
Yinghai Lu	84b56fa46b	x86, numa, 32-bit: make sure get we kva space when 1/3 user/kernel split is used, and less memory is installed, or if we have a big hole below 4g, max_low_pfn is still using 3g-128m try to go down from max_low_pfn until we get it. otherwise will panic. need to make 32-bit code to use register_e820_active_regions ... later. Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-06-04 12:01:55 +02:00
Yinghai Lu	6af61a7614	x86: clean up max_pfn_mapped usage - 32-bit on 32-bit in head_32.S after initial page table is done, we get initial max_pfn_mapped, and then kernel_physical_mapping_init will give us a final one. We need to use that to make sure find_e820_area will get valid addresses for boot_map and for NODE_DATA(0) on numa32. XEN PV and lguest may need to assign max_pfn_mapped too. Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-06-03 13:26:28 +02:00
Yinghai Lu	287572cb38	x86, numa, 32-bit: avoid clash between ramdisk and kva use find_e820_area to get address space... Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-06-03 13:26:28 +02:00
Yinghai Lu	e8c27ac919	x86, numa, 32-bit: print out debug info on all kvas also fix the print out of node_remap_end_vaddr Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-06-03 13:26:26 +02:00
Yinghai Lu	0596152388	x86, 32-bit: change propagate_e820_map() back to find_max_pfn() we don't need to call memory_present that early. numa and sparse will call memory_present later and might even fail, it will call memory_present for the full range. also for sparse it will call alloc_bootmem ... before we set up bootmem. Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-06-03 13:26:25 +02:00
Yinghai Lu	b66cd72073	x86: set node_remap_size[0] in fallback path ... otherwise alloc_remap will not get node_mem_map from kva area, and alloc_node_mem_map has to alloc_bootmem_node to get mem_map. It will use two low address copies ... Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-06-03 13:26:25 +02:00
Yinghai Lu	ba924c81dd	x86, numa, 32-bit: increase max_elements to 1024 so every element will represent 64M instead of 256M. AMD opteron could have HW memory hole remapping, so could have [0, 8g + 64M) on node0. Reduce element size to 64M to keep that on node 0 Later we need to use find_e820_area() to allocate memory_node_map like on 64-bit. But need to move memory_present out of populate_mem_map... Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-06-03 13:26:24 +02:00
Ingo Molnar	471b3c1b01	x86, numaq 32-bit: build fix fix: drivers/built-in.o: In function `acpi_numa_init': : undefined reference to `acpi_numa_arch_fixup' which can happen with ACPI && NUMAQ.	2008-06-03 12:49:06 +02:00
Ingo Molnar	cf3d0cb8a1	x86: 32-bit numa, build fix on Summit it's possible to have: CONFIG_ACPI_SRAT=y CONFIG_HAVE_ARCH_PARSE_SRAT=y in which case acpi.h defines the acpi_numa_slit_init() and acpi_numa_processor_affinity_init() methods as a macro.	2008-06-03 11:45:09 +02:00
Ingo Molnar	b28852d670	x86: add dummy acpi_numa_processor_affinity_init() implementation on 32-bit allow CONFIG_ACPI_NUMA builds to succeed.	2008-06-03 10:23:53 +02:00
Ingo Molnar	2772f54bf3	x86: add acpi_numa_slit_init() dummy implementation on 32-bit allow CONFIG_ACPI_NUMA builds to succeed on 32-bit.	2008-06-03 10:23:48 +02:00
Yinghai Lu	a5481280b2	x86: extend e820 early_res support 32bit -fix #5 reserve early numa kva, so it will not clash with new RAMDISK Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-05-31 09:55:53 +02:00
Yinghai Lu	163872950d	x86: extend e820 early_res support 32bit -fix #4 reserve_early pgdata for 32bit numa Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-05-31 09:55:50 +02:00
Paul Jackson	e9197bf011	x86 boot: remove some unused extern function declarations Remove three extern declarations for routines that don't exist. Fix a typo in a comment. Signed-off-by: Paul Jackson <pj@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-05-25 10:55:10 +02:00
Ingo Molnar	668a6c3654	- fix mmioftrace + rcu merge interaction Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-05-25 09:51:43 +02:00
Thomas Gleixner	48e2395722	x86: fixup the fallout of the bitops changes Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-05-25 08:58:36 +02:00
Jan Beulich	4e50e62ce5	x86: eliminate duplicate consistency checks in init_32.c Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-05-25 08:58:30 +02:00
Thomas Gleixner	6a1673ae22	x86: make memory_add_physaddr_to_nid depend on MEMORY_HOTPLUG memory_add_physaddr_to_nid() is only used in the CONFIG_MEMORY_HOTPLUG_SPARSE \|\| CONFIG_ACPI_HOTPLUG_MEMORY case. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-05-25 08:58:29 +02:00
Thomas Gleixner	11034d5597	x86: init64.c include initrd.h free_initrd_mem needs a prototype, which is in linux/initrd.h Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-05-25 08:58:27 +02:00
Thomas Gleixner	55d4f22abc	x86: k8topology cleanup variable declarations Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-05-25 08:58:26 +02:00
Thomas Gleixner	d34c08958f	x86: k8topology fix shadow variable sparse mutters: arch/x86/mm/k8topology_64.c:108:7: warning: symbol 'nodeid' shadows an earlier one Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-05-25 08:58:26 +02:00
Thomas Gleixner	0eafe234a2	x86: k8topology add missing header k8_scan_nodes is global and needs a prototype. Add the header file which contains it. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-05-25 08:58:26 +02:00
Andrew Morton	46dd98a5c0	arch/x86/mm/pat.c: use boot_cpu_has() arch/x86/mm/pat.c: In function 'phys_mem_access_prot_allowed': arch/x86/mm/pat.c:526: warning: passing argument 2 of 'constant_test_bit' from incompatible pointer type arch/x86/mm/pat.c:526: warning: passing argument 2 of 'variable_test_bit' from incompatible pointer type arch/x86/mm/pat.c:527: warning: passing argument 2 of 'constant_test_bit' from incompatible pointer type arch/x86/mm/pat.c:527: warning: passing argument 2 of 'variable_test_bit' from incompatible pointer type arch/x86/mm/pat.c:528: warning: passing argument 2 of 'constant_test_bit' from incompatible pointer type arch/x86/mm/pat.c:528: warning: passing argument 2 of 'variable_test_bit' from incompatible pointer type arch/x86/mm/pat.c:529: warning: passing argument 2 of 'constant_test_bit' from incompatible pointer type arch/x86/mm/pat.c:529: warning: passing argument 2 of 'variable_test_bit' from incompatible pointer type Don't open-code test_bit() on a __u32 Cc: Andrea Arcangeli <andrea@qumranet.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-05-25 08:50:25 +02:00
Pekka Paalanen	790e2a290b	x86 mmiotrace: page level is unsigned Fixes some sparse warnings. Signed-off-by: Pekka Paalanen <pq@iki.fi> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-05-24 11:27:47 +02:00
Pekka Paalanen	a50445d76c	mmiotrace: rename kmmio_probe::user_data to :private. Signed-off-by: Pekka Paalanen <pq@iki.fi> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-05-24 11:27:41 +02:00
Pekka Paalanen	dee310d0ad	x86 mmiotrace: use resource_size_t for phys addresses Signed-off-by: Pekka Paalanen <pq@iki.fi> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-05-24 11:27:36 +02:00
Pekka Paalanen	87e547fe41	x86 mmiotrace: fix page-unaligned ioremaps mmiotrace_ioremap() expects to receive the original unaligned map phys address and size. Also fix {un,}register_kmmio_probe() to deal properly with unaligned size. Signed-off-by: Pekka Paalanen <pq@iki.fi> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-05-24 11:27:32 +02:00
Pekka Paalanen	970e6fa038	mmiotrace: code style cleanups From c2da03771e29159627c5c7b9509ec70bce9f91ee Mon Sep 17 00:00:00 2001 From: Pekka Paalanen <pq@iki.fi> Date: Mon, 28 Apr 2008 21:25:22 +0300 Signed-off-by: Pekka Paalanen <pq@iki.fi> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-05-24 11:27:28 +02:00
Pekka Paalanen	7423d1115f	x86 mmiotrace: dynamically disable non-boot CPUs From 8979ee55cb6a429c4edd72ebec2244b849f6a79a Mon Sep 17 00:00:00 2001 From: Pekka Paalanen <pq@iki.fi> Date: Sat, 12 Apr 2008 00:18:57 +0300 Mmiotrace is not reliable with multiple CPUs and may miss events. Drop to single CPU when mmiotrace is activated. Signed-off-by: Pekka Paalanen <pq@iki.fi> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-05-24 11:26:52 +02:00
Randy Dunlap	0663bb6cd9	mmiotrace: fix printk format Fix gcc printk format warnings: next-20080415/arch/x86/mm/mmio-mod.c: In function 'print_pte': next-20080415/arch/x86/mm/mmio-mod.c:154: warning: format '%lx' expects type 'long unsigned int', but argument 3 has type 'pteval_t' next-20080415/arch/x86/mm/mmio-mod.c:154: warning: format '%lx' expects type 'long unsigned int', but argument 4 has type 'pteval_t' next-20080415/arch/x86/mm/mmio-mod.c: At top level: next-20080415/arch/x86/mm/mmio-mod.c:403: warning: 'downed_cpus' defined but not used Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-05-24 11:26:26 +02:00
Pekka Paalanen	e4b37ee686	x86 mmiotrace: remove ISA_trace parameter. This had become a no-op. Signed-off-by: Pekka Paalanen <pq@iki.fi> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-05-24 11:25:44 +02:00
Pekka Paalanen	ff3a3e9ba5	x86 mmiotrace: move files into arch/x86/mm/. Signed-off-by: Pekka Paalanen <pq@iki.fi> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-05-24 11:25:37 +02:00
Pekka Paalanen	d61fc44853	x86: mmiotrace, preview 2 Kconfig.debug, Makefile and testmmiotrace.c style fixes. Use real mutex instead of mutex. Fix failure path in register probe func. kmmio: RCU read-locked over single stepping. Generate mapping id's. Make mmio-mod.c built-in and rewrite its locking. Add debugfs file to enable/disable mmiotracing. kmmio: use irqsave spinlocks. Lots of cleanups in mmio-mod.c Marker file moved from /proc into debugfs. Call mmiotrace entrypoints directly from ioremap.c. Signed-off-by: Pekka Paalanen <pq@iki.fi> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-05-24 11:22:24 +02:00
Pekka Paalanen	0fd0e3da45	x86: mmiotrace full patch, preview 1 kmmio.c handles the list of mmio probes with callbacks, list of traced pages, and attaching into the page fault handler and die notifier. It arms, traps and disarms the given pages, this is the core of mmiotrace. mmio-mod.c is a user interface, hooking into ioremap functions and registering the mmio probes. It also decodes the required information from trapped mmio accesses via the pre and post callbacks in each probe. Currently, hooking into ioremap functions works by redefining the symbols of the target (binary) kernel module, so that it calls the traced versions of the functions. The most notable changes done since the last discussion are: - kmmio.c is a built-in, not part of the module - direct call from fault.c to kmmio.c, removing all dynamic hooks - prepare for unregistering probes at any time - make kmmio re-initializable and accessible to more than one user - rewrite kmmio locking to remove all spinlocks from page fault path Can I abuse call_rcu() like I do in kmmio.c:unregister_kmmio_probe() or is there a better way? The function called via call_rcu() itself calls call_rcu() again, will this work or break? There I need a second grace period for RCU after the first grace period for page faults. Mmiotrace itself (mmio-mod.c) is still a module, I am going to attack that next. At some point I will start looking into how to make mmiotrace a tracer component of ftrace (thanks for the hint, Ingo). Ftrace should make the user space part of mmiotracing as simple as 'cat /debug/trace/mmio > dump.txt'. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-05-24 11:22:12 +02:00
Pekka Paalanen	10c43d2eb5	x86: explicit call to mmiotrace in do_page_fault() The custom page fault handler list is replaced with a single function pointer. All related functions and variables are renamed for mmiotrace. Signed-off-by: Pekka Paalanen <pq@iki.fi> Cc: Christoph Hellwig <hch@infradead.org> Cc: Arjan van de Ven <arjan@infradead.org> Cc: pq@iki.fi Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-05-24 11:21:55 +02:00
Pekka Paalanen	75bb88350e	x86 mmiotrace: use lookup_address() Use lookup_address() from pageattr.c instead of doing the same manually. Also had to EXPORT_SYMBOL_GPL(lookup_address) to make this work for modules. This also fixes "undefined symbol 'init_mm'" compile error for x86_32. Signed-off-by: Pekka Paalanen <pq@iki.fi> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-05-24 11:21:32 +02:00
Pekka Paalanen	72b59d67f8	x86_64: fix kernel rodata NX setting Without CONFIG_DYNAMIC_FTRACE, mark_rodata_ro() would mark a wrong number of pages as no-execute. The bug was introduced in the patch "ftrace: dont write protect kernel text". The symptom was machine reboot after a CPU hotplug. Signed-off-by: Pekka Paalanen <pq@iki.fi> Acked-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-05-23 21:53:07 +02:00
Pekka Paalanen	86069782d6	x86: add a list for custom page fault handlers. Provides kernel modules a way to register custom page fault handlers. On every page fault this will call a list of registered functions. The functions may handle the fault and force do_page_fault() to return immediately. This functionality is similar to the now removed page fault notifiers. Custom page fault handlers are used by debugging and reverse engineering tools. Mmiotrace is one such tool and a patch to add it into the tree will follow. The custom page fault handlers are called earlier in do_page_fault() than the page fault notifiers were. Signed-off-by: Pekka Paalanen <pq@iki.fi> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-05-23 21:16:38 +02:00
Steven Rostedt	8f0f996e80	ftrace: dont write protect kernel text Dynamic ftrace cant work when the kernel has its text write protected. This patch keeps the kernel from being write protected when dynamic ftrace is in place. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-05-23 21:16:22 +02:00
Andy Whitcroft	84d6bd0e27	x86: cope with no remap space being allocated for a numa node When allocating the pgdat's for numa nodes on x86_32 we attempt to place them in the numa remap space for that node. However should the node not have any remap space allocated (such as due to having non-ram pages in the remap location in the node) then we will incorrectly place the pgdat at zero. Check we have remap available, falling back to node 0 memory where we do not. Signed-off-by: Andy Whitcroft <apw@shadowen.org> Acked-by: Mel Gorman <mel@csn.ul.ie> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-05-20 14:10:50 +02:00
Andy Whitcroft	b9ada4281c	x86: reinstate numa remap for SPARSEMEM on x86 NUMA systems Recent kernels have been panic'ing trying to allocate memory early in boot, in __alloc_pages: BUG: unable to handle kernel paging request at 00001568 IP: [<c10407b6>] __alloc_pages+0x33/0x2cc pdpt = 00000000013a5001 pde = 0000000000000000 Oops: 0000 [#1] SMP Modules linked in: Pid: 1, comm: swapper Not tainted (2.6.25 #78) EIP: 0060:[<c10407b6>] EFLAGS: 00010246 CPU: 0 EIP is at __alloc_pages+0x33/0x2cc EAX: 00001564 EBX: 000412d0 ECX: 00001564 EDX: 000005c3 ESI: f78012a0 EDI: 00000001 EBP: 00001564 ESP: f7871e50 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 Process swapper (pid: 1, ti=f7870000 task=f786f670 task.ti=f7870000) Stack: 00000000 f786f670 00000010 00000000 0000b700 000412d0 f78012a0 00000001 00000000 c105b64d 00000000 000412d0 f78012a0 f7803120 00000000 c105c1c5 00000010 f7803144 000412d0 00000001 f7803130 f7803120 f78012a0 00000001 Call Trace: [<c105b64d>] kmem_getpages+0x94/0x129 [<c105c1c5>] cache_grow+0x8f/0x123 [<c105c689>] ____cache_alloc_node+0xb9/0xe4 [<c105c999>] kmem_cache_alloc_node+0x92/0xd2 [<c1018929>] build_sched_domains+0x536/0x70d [<c100b63c>] do_flush_tlb_all+0x0/0x3f [<c100b63c>] do_flush_tlb_all+0x0/0x3f [<c10572d6>] interleave_nodes+0x23/0x5a [<c105c44f>] alternate_node_alloc+0x43/0x5b [<c1018b47>] arch_init_sched_domains+0x46/0x51 [<c136e85e>] kernel_init+0x0/0x82 [<c137ac19>] sched_init_smp+0x10/0xbb [<c136e8a1>] kernel_init+0x43/0x82 [<c10035cf>] kernel_thread_helper+0x7/0x10 Debugging this showed that the NODE_DATA() for nodes other than node 0 were all NULL. Tracing this back showed that the NODE_DATA() pointers were being initialised to each nodes remap space. However under SPARSEMEM remap is disabled which leads to the pgdat's being placed incorrectly at kernel virtual address 0. Leading to the panic when attempting to allocate memory from these nodes. Numa remap was disabled in the commit below. This occured while fixing problems triggered when attempting to boot x86_32 NUMA SPARSEMEM kernels on non-numa hardware. x86: make NUMA work on 32-bit commit `1b000a5dbe` The real problem is believed to be related to other alignment issues in the regions blocked out from the bootmem allocator for small memory systems, and has been fixed separately. Therefore re-enable remap for SPARSMEM, which fixes pgdat allocation issues. Testing confirms that SPARSMEM NUMA kernels will boot correctly with this part of the change reverted. Signed-off-by: Andy Whitcroft <apw@shadowen.org> Acked-by: Mel Gorman <mel@csn.ul.ie> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-05-20 14:10:48 +02:00
Ingo Molnar	a8ac1ae3a2	Merge branch 'linus' into x86/pat	2008-05-19 09:23:07 +02:00
Thomas Gleixner	b4ef290d7c	Merge branch 'linus' into x86/pat	2008-05-18 11:54:43 +02:00
Avi Kivity	31f4d870b0	x86: fix crash on cpu hotplug on pat-incapable machines pat_disable() is __init, which means it goes away after booting is complete. Unfortunately it is used by the hotplug code if the machine is not pat-capable, causing a crash. Fix by marking pat_disable() as __cpuinit. Signed-off-by: Avi Kivity <avi@qumranet.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-05-17 22:57:20 +02:00
Pranith Kumar	afc8534380	x86: arch/x86/mm/pat.c - fix warning fix this warning: arch/x86/mm/pat.c: In function `phys_mem_access_prot_allowed': arch/x86/mm/pat.c:558: warning: long long unsigned int format, long unsigned int arg (arg 6) arch/x86/mm/pat.c: In function `map_devmem': arch/x86/mm/pat.c:580: warning: long long unsigned int format, long unsigned int arg (arg 6) Signed-off-by: D Pranith Kumar <bobby.prani@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-05-13 19:39:30 +02:00
Hugh Dickins	61165d7a03	x86: fix app crashes after SMP resume After resume on a 2cpu laptop, kernel builds collapse with a sed hang, sh or make segfault (often on 20295564), real-time signal to cc1 etc. Several hurdles to jump, but a manually-assisted bisect led to -rc1's `d2bcbad5f3` x86: do not zap_low_mappings in __smp_prepare_cpus. Though the low mappings were removed at bootup, they were left behind (with Global flags helping to keep them in TLB) after resume or cpu online, causing the crashes seen. Reinstate zap_low_mappings (with local __flush_tlb_all) for each cpu_up on x86_32. This used to be serialized by smp_commenced_mask: that's now gone, but a low_mappings flag will do. No need for native_smp_cpus_done to repeat the zap: let mem_init zap BSP's low mappings just like on UP. (In passing, fix error code from native_cpu_up: do_boot_cpu returns a variety of diagnostic values, Dprintk what it says but convert to -EIO. And save_pg_dir separately before zap_low_mappings: doesn't matter now, but zapping twice in succession wiped out resume's swsusp_pg_dir.) That worked well on the duo and one quad, but wouldn't boot 3rd or 4th cpu on P4 Xeon, oopsing just after unlock_ipi_call_lock. The TLB flush IPI now being sent reveals a long-standing bug: the booting cpu has its APIC readied in smp_callin at the top of start_secondary, but isn't put into the cpu_online_map until just before that unlock_ipi_call_lock. So native_smp_call_function_mask to online cpus would send_IPI_allbutself, including the cpu just coming up, though it has been excluded from the count to wait for: by the time it handles the IPI, the call data on native_smp_call_function_mask's stack may well have been overwritten. So fall back to send_IPI_mask while cpu_online_map does not match cpu_callout_map: perhaps there's a better APICological fix to be made at the start_secondary end, but I wouldn't know that. Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-05-13 19:36:12 +02:00
Yinghai Lu	0327318445	x86_64: simplify the memtest parameter setting use CONFIG_MEMTEST only. if it is set, will have memtest=0 (disabled) need to have memtest=4 in command line to test more patterns. Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-05-12 21:28:15 +02:00
Venki Pallipadi	77b52b4c5c	x86: add "debugpat" boot option enable debug messages by a boot option "debugpat". Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-05-12 21:28:07 +02:00
Thomas Gleixner	8d4a430085	x86: cleanup PAT cpu validation Move the scattered checks for PAT support to a single function. Its moved to addon_cpuid_features.c as this file is shared between 32 and 64 bit. Remove the manipulation of the PAT feature bit and just disable PAT in the PAT layer, based on the PAT bit provided by the CPU and the current CPU version/model white list. Change the boot CPU check so it works on Voyager somewhere in the future as well :) Also panic, when a secondary has PAT disabled but the primary one has alrady switched to PAT. We have no way to undo that. The white list is kept for now to ensure that we can rely on known to work CPU types and concentrate on the software induced problems instead of fighthing CPU erratas and subtle wreckage caused by not yet verified CPUs. Once the PAT code has stabilized enough, we can remove the white list and open the can of worms. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-05-08 15:43:51 +02:00
Hugh Dickins	aeed5fce37	x86: fix PAE pmd_bad bootup warning Fix warning from pmd_bad() at bootup on a HIGHMEM64G HIGHPTE x86_32. That came from `9fc34113f6` x86: debug pmd_bad(); but we understand now that the typecasting was wrong for PAE in the previous version: pagetable pages above 4GB looked bad and stopped Arjan from booting. And revert that `cded932b75` x86: fix pmd_bad and pud_bad to support huge pages. It was the wrong way round: we shouldn't weaken every pmd_bad and pud_bad check to let huge pages slip through - in part they check that we _don't_ have a huge page where it's not expected. Put the x86 pmd_bad() and pud_bad() definitions back to what they have long been: they can be improved (x86_32 should use PTE_MASK, to stop PAE thinking junk in the upper word is good; and x86_64 should follow x86_32's stricter comparison, to stop thinking any subset of required bits is good); but that should be a later patch. Fix Hans' good observation that follow_page() will never find pmd_huge() because that would have already failed the pmd_bad test: test pmd_huge in between the pmd_none and pmd_bad tests. Tighten x86's pmd_huge() check? No, once it's a hugepage entry, it can get quite far from a good pmd: for example, PROT_NONE leaves it with only ACCESSED of the KERN_PGTABLE bits. However... though follow_page() contains this and another test for huge pages, so it's nice to keep it working on them, where does it actually get called on a huge page? get_user_pages() checks is_vm_hugetlb_page(vma) to to call alternative hugetlb processing, as does unmap_vmas() and others. Signed-off-by: Hugh Dickins <hugh@veritas.com> Earlier-version-tested-by: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Jeff Chua <jeff.chua.linux@gmail.com> Cc: Hans Rosenfeld <hans.rosenfeld@amd.com> Cc: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-05-06 13:08:58 -07:00
Thomas Gleixner	48b83d2425	x86: undo visws/numaq build changes arch/x86/pci/Makefile_32 has a nasty detail. VISWS and NUMAQ build override the generic pci-y rules. This needs a proper cleanup, but that needs more thoughts. Undo commit `895d30935e` x86: numaq fix do not override the existing pci-y rule when adding visws or numaq rules. There is also a stupid init function ordering problem vs. acpi.o Add comments to the Makefile to avoid tripping over this again. Remove the srat stub code in discontig_32.c to allow a proper NUMAQ build. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-05-04 20:04:45 +02:00
Andres Salomon	cb8ab687c3	x86: ioremap ram check fix `bdd3cee2e4` (x86: ioremap(), extend check to all RAM pages) breaks OLPC's ioremap call. The ioremap that OLPC uses is: romsig = ioremap(0xffffffc0, 16); The commit that breaks it is basically: - for (pfn = phys_addr >> PAGE_SHIFT; pfn < max_pfn_mapped && - (pfn << PAGE_SHIFT) < last_addr; pfn++) { + for (pfn = phys_addr >> PAGE_SHIFT; + (pfn << PAGE_SHIFT) < last_addr; pfn++) { + Previously, the 'pfn < max_pfn_mapped' check would've caused us to not enter the loop. Removing that check means we loop infinitely. The reason for that is because pfn is 0xfffff, and last_addr is 0xffffffcf. The remaining check that is used to exit the loop is not sufficient; when pfn<<PAGE_SHIFT is 0xfffff000, that is less than 0xffffffcf; when we increment pfn and it overflows (pfn == 0x100000), pfn<<PAGE_SHIFT ends up being 0. That, of course, is less than last_addr. In effect, pfn<<PAGE_SHIFT is never lower than last_addr. The simple fix for this is to limit the last_addr check to the PAGE_MASK; a patch is below. Signed-off-by: Andres Salomon <dilinger@debian.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-30 23:15:35 +02:00
Suresh Siddha	de33c442ed	x86 PAT: fix performance drop for glx, use UC minus for ioremap(), ioremap_nocache() and pci_mmap_page_range() Use UC_MINUS for ioremap(), ioremap_nocache() instead of strong UC. Once all the X drivers move to ioremap_wc(), we can go back to strong UC semantics for ioremap() and ioremap_nocache(). To avoid attribute aliasing issues, pci_mmap_page_range() will also use UC_MINUS for default non write-combining mapping request. Next steps: a) change all the video drivers using ioremap() or ioremap_nocache() and adding WC MTTR using mttr_add() to ioremap_wc() b) for strict usage, we can go back to strong uc semantics for ioremap() and ioremap_nocache() after some grace period for completing step-a. c) user level X server needs to use the appropriate method for setting up WC mapping (like using resourceX_wc sysfs file instead of adding MTRR for WC and using /dev/mem or resourceX under /sys) Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-30 23:15:35 +02:00
Ingo Molnar	2544a873ab	revert: "x86: ioremap(), extend check to all RAM pages" Vegard Nossum reported a large (150 seconds) boot delay during bootup, and bisected it to "x86: ioremap(), extend check to all RAM pages" (commit `bdd3cee2e4`). Revert this commit for now. Bisected-by: Vegard Nossum <vegard.nossum@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-30 23:15:34 +02:00
Adrian Bunk	b9e017e04b	x86: unexport kmap_atomic_to_page This patch removes the no longer used export of kmap_atomic_to_page. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-30 23:15:34 +02:00
Linus Torvalds	5f78e4d339	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86-bigbox-pci * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86-bigbox-pci: x86: add pci=check_enable_amd_mmconf and dmi check x86: work around io allocation overlap of HT links acpi: get boot_cpu_id as early for k8_scan_nodes x86_64: don't need set default res if only have one root bus x86: double check the multi root bus with fam10h mmconf x86: multi pci root bus with different io resource range, on 64-bit x86: use bus conf in NB conf fun1 to get bus range on, on 64-bit x86: get mp_bus_to_node early x86 pci: remove checking type for mmconfig probe x86: remove unneeded check in mmconf reject driver core: try parent numa_node at first before using default x86: seperate mmconf for fam10h out from setup_64.c x86: if acpi=off, force setting the mmconf for fam10h x86_64: check MSR to get MMCONFIG for AMD Family 10h x86_64: check and enable MMCONFIG for AMD Family 10h x86_64: set cfg_size for AMD Family 10h in case MMCONFIG x86: mmconf enable mcfg early x86: clear pci_mmcfg_virt when mmcfg get rejected x86: validate against acpi motherboard resources Fixed up fairly trivial conflicts in arch/x86/pci/{init.c,pci.h} due to OLPC support manually.	2008-04-29 08:26:51 -07:00
Christoph Lameter	2301696932	vmallocinfo: add caller information Add caller information so that /proc/vmallocinfo shows where the allocation request for a slice of vmalloc memory originated. Results in output like this: 0xffffc20000000000-0xffffc20000801000 8392704 alloc_large_system_hash+0x127/0x246 pages=2048 vmalloc vpages 0xffffc20000801000-0xffffc20000806000 20480 alloc_large_system_hash+0x127/0x246 pages=4 vmalloc 0xffffc20000806000-0xffffc20000c07000 4198400 alloc_large_system_hash+0x127/0x246 pages=1024 vmalloc vpages 0xffffc20000c07000-0xffffc20000c0a000 12288 alloc_large_system_hash+0x127/0x246 pages=2 vmalloc 0xffffc20000c0a000-0xffffc20000c0c000 8192 acpi_os_map_memory+0x13/0x1c phys=cff68000 ioremap 0xffffc20000c0c000-0xffffc20000c0f000 12288 acpi_os_map_memory+0x13/0x1c phys=cff64000 ioremap 0xffffc20000c10000-0xffffc20000c15000 20480 acpi_os_map_memory+0x13/0x1c phys=cff65000 ioremap 0xffffc20000c16000-0xffffc20000c18000 8192 acpi_os_map_memory+0x13/0x1c phys=cff69000 ioremap 0xffffc20000c18000-0xffffc20000c1a000 8192 acpi_os_map_memory+0x13/0x1c phys=fed1f000 ioremap 0xffffc20000c1a000-0xffffc20000c1c000 8192 acpi_os_map_memory+0x13/0x1c phys=cff68000 ioremap 0xffffc20000c1c000-0xffffc20000c1e000 8192 acpi_os_map_memory+0x13/0x1c phys=cff68000 ioremap 0xffffc20000c1e000-0xffffc20000c20000 8192 acpi_os_map_memory+0x13/0x1c phys=cff68000 ioremap 0xffffc20000c20000-0xffffc20000c22000 8192 acpi_os_map_memory+0x13/0x1c phys=cff68000 ioremap 0xffffc20000c22000-0xffffc20000c24000 8192 acpi_os_map_memory+0x13/0x1c phys=cff68000 ioremap 0xffffc20000c24000-0xffffc20000c26000 8192 acpi_os_map_memory+0x13/0x1c phys=e0081000 ioremap 0xffffc20000c26000-0xffffc20000c28000 8192 acpi_os_map_memory+0x13/0x1c phys=e0080000 ioremap 0xffffc20000c28000-0xffffc20000c2d000 20480 alloc_large_system_hash+0x127/0x246 pages=4 vmalloc 0xffffc20000c2d000-0xffffc20000c31000 16384 tcp_init+0xd5/0x31c pages=3 vmalloc 0xffffc20000c31000-0xffffc20000c34000 12288 alloc_large_system_hash+0x127/0x246 pages=2 vmalloc 0xffffc20000c34000-0xffffc20000c36000 8192 init_vdso_vars+0xde/0x1f1 0xffffc20000c36000-0xffffc20000c38000 8192 pci_iomap+0x8a/0xb4 phys=d8e00000 ioremap 0xffffc20000c38000-0xffffc20000c3a000 8192 usb_hcd_pci_probe+0x139/0x295 [usbcore] phys=d8e00000 ioremap 0xffffc20000c3a000-0xffffc20000c3e000 16384 sys_swapon+0x509/0xa15 pages=3 vmalloc 0xffffc20000c40000-0xffffc20000c61000 135168 e1000_probe+0x1c4/0xa32 phys=d8a20000 ioremap 0xffffc20000c61000-0xffffc20000c6a000 36864 _xfs_buf_map_pages+0x8e/0xc0 vmap 0xffffc20000c6a000-0xffffc20000c73000 36864 _xfs_buf_map_pages+0x8e/0xc0 vmap 0xffffc20000c73000-0xffffc20000c7c000 36864 _xfs_buf_map_pages+0x8e/0xc0 vmap 0xffffc20000c7c000-0xffffc20000c7f000 12288 e1000e_setup_tx_resources+0x29/0xbe pages=2 vmalloc 0xffffc20000c80000-0xffffc20001481000 8392704 pci_mmcfg_arch_init+0x90/0x118 phys=e0000000 ioremap 0xffffc20001481000-0xffffc20001682000 2101248 alloc_large_system_hash+0x127/0x246 pages=512 vmalloc 0xffffc20001682000-0xffffc20001e83000 8392704 alloc_large_system_hash+0x127/0x246 pages=2048 vmalloc vpages 0xffffc20001e83000-0xffffc20002204000 3674112 alloc_large_system_hash+0x127/0x246 pages=896 vmalloc vpages 0xffffc20002204000-0xffffc2000220d000 36864 _xfs_buf_map_pages+0x8e/0xc0 vmap 0xffffc2000220d000-0xffffc20002216000 36864 _xfs_buf_map_pages+0x8e/0xc0 vmap 0xffffc20002216000-0xffffc2000221f000 36864 _xfs_buf_map_pages+0x8e/0xc0 vmap 0xffffc2000221f000-0xffffc20002228000 36864 _xfs_buf_map_pages+0x8e/0xc0 vmap 0xffffc20002228000-0xffffc20002231000 36864 _xfs_buf_map_pages+0x8e/0xc0 vmap 0xffffc20002231000-0xffffc20002234000 12288 e1000e_setup_rx_resources+0x35/0x122 pages=2 vmalloc 0xffffc20002240000-0xffffc20002261000 135168 e1000_probe+0x1c4/0xa32 phys=d8a60000 ioremap 0xffffc20002261000-0xffffc2000270c000 4894720 sys_swapon+0x509/0xa15 pages=1194 vmalloc vpages 0xffffffffa0000000-0xffffffffa0022000 139264 module_alloc+0x4f/0x55 pages=33 vmalloc 0xffffffffa0022000-0xffffffffa0029000 28672 module_alloc+0x4f/0x55 pages=6 vmalloc 0xffffffffa002b000-0xffffffffa0034000 36864 module_alloc+0x4f/0x55 pages=8 vmalloc 0xffffffffa0034000-0xffffffffa003d000 36864 module_alloc+0x4f/0x55 pages=8 vmalloc 0xffffffffa003d000-0xffffffffa0049000 49152 module_alloc+0x4f/0x55 pages=11 vmalloc 0xffffffffa0049000-0xffffffffa0050000 28672 module_alloc+0x4f/0x55 pages=6 vmalloc [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Christoph Lameter <clameter@sgi.com> Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Hugh Dickins <hugh@veritas.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-28 08:58:21 -07:00
Jeremy Fitzhardinge	180c06efce	hotplug-memory: make online_page() common All architectures use an effectively identical definition of online_page(), so just make it common code. x86-64, ia64, powerpc and sh are actually identical; x86-32 is slightly different. x86-32's differences arise because it puts its hotplug pages in the highmem zone. We can handle this in the generic code by inspecting the page to see if its in highmem, and update the totalhigh_pages count appropriately. This leaves init_32.c:free_new_highpage with a single caller, so I folded it into add_one_highpage_init. I also removed an incorrect comment referring to the NUMA case; any NUMA details have already been dealt with by the time online_page() is called. [akpm@linux-foundation.org: fix indenting] Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Acked-by: Dave Hansen <dave@linux.vnet.ibm.com> Reviewed-by: KAMEZAWA Hiroyuki <kamez.hiroyu@jp.fujitsu.com> Tested-by: KAMEZAWA Hiroyuki <kamez.hiroyu@jp.fujitsu.com> Cc: Yasunori Goto <y-goto@jp.fujitsu.com> Cc: Christoph Lameter <clameter@sgi.com> Acked-by: Ingo Molnar <mingo@elte.hu> Acked-by: Yasunori Goto <y-goto@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-28 08:58:17 -07:00
Ingo Molnar	f022bfd582	x86: PAT fix Adrian Bunk noticed the following Coverity report: > Commit `e7f260a276` > (x86: PAT use reserve free memtype in mmap of /dev/mem) > added the following gem to arch/x86/mm/pat.c: > > <-- snip --> > > ... > int phys_mem_access_prot_allowed(struct file file, unsigned long pfn, > unsigned long size, pgprot_t vma_prot) > { > u64 offset = ((u64) pfn) << PAGE_SHIFT; > unsigned long flags = _PAGE_CACHE_UC_MINUS; > unsigned long ret_flags; > ... > ... (nothing that touches ret_flags) > ... > if (flags != _PAGE_CACHE_UC_MINUS) { > retval = reserve_memtype(offset, offset + size, flags, NULL); > } else { > retval = reserve_memtype(offset, offset + size, -1, &ret_flags); > } > > if (retval < 0) > return 0; > > flags = ret_flags; > > if (pfn <= max_pfn_mapped && > ioremap_change_attr((unsigned long)__va(offset), size, flags) < 0) { > free_memtype(offset, offset + size); > printk(KERN_INFO > "%s:%d /dev/mem ioremap_change_attr failed %s for %Lx-%Lx\n", > current->comm, current->pid, > cattr_name(flags), > offset, offset + size); > return 0; > } > > vma_prot = __pgprot((pgprot_val(vma_prot) & ~_PAGE_CACHE_MASK) \| > flags); > return 1; > } > > <-- snip --> > > If (flags != _PAGE_CACHE_UC_MINUS) we pass garbage from the stack to > ioremap_change_attr() and/or __pgprot(). > > Spotted by the Coverity checker. the fix simplifies the code as we get rid of the 'ret_flags' complication. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-28 08:15:16 -07:00
Linus Torvalds	86cf02f8ea	x86 PAT: tone down debugging messages some more Ingo already fixed one of these at my request (in "x86 PAT: tone down debugging messages", commit `1ebcc654f0`), but there was another one he missed. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-27 11:59:30 -07:00
Yinghai Lu	cbf9bd603a	acpi: get boot_cpu_id as early for k8_scan_nodes [mingo@elte.hu: split from "x86_64: get boot_cpu_id as early for k8_scan_nodes] Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-26 23:41:04 +02:00
Yinghai Lu	c2b91e2eec	x86_64/mm: check and print vmemmap allocation continuous On big systems with lots of memory, don't print out too much during bootup, and make it easy to find if it is continuous. on 256G 8 sockets system will get [ffffe20000000000-ffffe20002bfffff] PMD -> [ffff810001400000-ffff810003ffffff] on node 0 [ffffe2001c700000-ffffe2001c7fffff] potential offnode page_structs [ffffe20002c00000-ffffe2001c7fffff] PMD -> [ffff81000c000000-ffff8100255fffff] on node 0 [ffffe20038700000-ffffe200387fffff] potential offnode page_structs [ffffe2001c800000-ffffe200387fffff] PMD -> [ffff810820200000-ffff81083c1fffff] on node 1 [ffffe20040000000-ffffe2007fffffff] PUD ->ffff811027a00000 on node 2 [ffffe20038800000-ffffe2003fffffff] PMD -> [ffff811020200000-ffff8110279fffff] on node 2 [ffffe20054700000-ffffe200547fffff] potential offnode page_structs [ffffe20040000000-ffffe200547fffff] PMD -> [ffff811027c00000-ffff81103c3fffff] on node 2 [ffffe20070700000-ffffe200707fffff] potential offnode page_structs [ffffe20054800000-ffffe200707fffff] PMD -> [ffff811820200000-ffff81183c1fffff] on node 3 [ffffe20080000000-ffffe200bfffffff] PUD ->ffff81202fa00000 on node 4 [ffffe20070800000-ffffe2007fffffff] PMD -> [ffff812020200000-ffff81202f9fffff] on node 4 [ffffe2008c700000-ffffe2008c7fffff] potential offnode page_structs [ffffe20080000000-ffffe2008c7fffff] PMD -> [ffff81202fc00000-ffff81203c3fffff] on node 4 [ffffe200a8700000-ffffe200a87fffff] potential offnode page_structs [ffffe2008c800000-ffffe200a87fffff] PMD -> [ffff812820200000-ffff81283c1fffff] on node 5 [ffffe200c0000000-ffffe200ffffffff] PUD ->ffff813037a00000 on node 6 [ffffe200a8800000-ffffe200bfffffff] PMD -> [ffff813020200000-ffff8130379fffff] on node 6 [ffffe200c4700000-ffffe200c47fffff] potential offnode page_structs [ffffe200c0000000-ffffe200c47fffff] PMD -> [ffff813037c00000-ffff81303c3fffff] on node 6 [ffffe200c4800000-ffffe200e07fffff] PMD -> [ffff813820200000-ffff81383c1fffff] on node 7 instead of a very long print out... Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-26 22:51:09 +02:00
Yinghai Lu	1a27fc0a42	x86_64: fix setup_node_bootmem to support big mem excluding with memmap typical case: four sockets system, every node has 4g ram, and we are using: memmap=10g$4g to mask out memory on node1 and node2 when numa is enabled, early_node_mem is used to get node_data and node_bootmap. if it can not get memory from the same node with find_e820_area(), it will use alloc_bootmem to get buff from previous nodes. so check it and print out some info about it. need to move early_res_to_bootmem into every setup_node_bootmem. and it takes range that node has. otherwise alloc_bootmem could return addr that reserved early. depends on "mm: make reserve_bootmem can crossed the nodes". Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-26 22:51:08 +02:00
Yinghai Lu	8b3cd09ed2	x86_64: make reserve_bootmem_generic() use new reserve_bootmem() "mm: make reserve_bootmem can crossed the nodes" provides new reserve_bootmem(), let reserve_bootmem_generic() use that. reserve_bootmem_generic() is used to reserve initramdisk, so this way we can make sure even when bootloader or kexec load ranges cross the node memory boundaries, reserve_bootmem still works. Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-26 22:51:08 +02:00
Venki Pallipadi	0124cecfc8	x86, PAT: disable /dev/mem mmap RAM with PAT disable /dev/mem mmap of RAM with PAT. It makes things safer and eliminates aliasing. A future improvement would be to avoid the range_is_allowed duplication. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-26 21:28:29 +02:00
Linus Torvalds	4a27214d7b	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86-fixes * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86-fixes: x86 PAT: decouple from nonpromisc devmem x86 PAT: tone down debugging messages	2008-04-26 09:50:58 -07:00
Dmitri Vorobiev	f7f17a67c5	x86: remove NexGen support It is claimed that NexGen CPUs were never shipped: http://lkml.org/lkml/2008/4/20/179 Also, the kernel support for these chips has been broken for a long time, the code intended to support NexGen thereby being essentially dead. As an outcome of the discussion that can be found using the URL above, this patch removes the NexGen support altogether. The changes in this patch survived a defconfig build for i386, a couple of successful randconfig builds, as well as a runtime test, which consisted in booting a 32-bit x86 box up to the shell prompt. Signed-off-by: Dmitri Vorobiev <dmitri.vorobiev@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-26 17:35:47 +02:00
Ingo Molnar	1ebcc654f0	x86 PAT: tone down debugging messages Linus reported these excessive debug printouts: > Overlap at 0xe0300000-0xe0400000 > Overlap at 0xe0300000-0xe0380000 > Overlap at 0xe0300000-0xe0400000 > Overlap at 0xe0300000-0xe0400000 > Overlap at 0xe0300000-0xe0400000 > Overlap at 0xe0300000-0xe0400000 > Overlap at 0xe0300000-0xe0400000 turn that into a pr_debug(). Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-26 16:01:26 +02:00
Linus Torvalds	bf16ae2509	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86-pat * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86-pat: generic: add ioremap_wc() interface wrapper /dev/mem: make promisc the default pat: cleanups x86: PAT use reserve free memtype in mmap of /dev/mem x86: PAT phys_mem_access_prot_allowed for dev/mem mmap x86: PAT avoid aliasing in /dev/mem read/write devmem: add range_is_allowed() check to mmap of /dev/mem x86: introduce /dev/mem restrictions with a config option	2008-04-25 12:48:08 -07:00
Linus Torvalds	4b7227ca32	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-xen-next * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-xen-next: (52 commits) xen: add balloon driver xen: allow compilation with non-flat memory xen: fold xen_sysexit into xen_iret xen: allow set_pte_at on init_mm to be lockless xen: disable preemption during tlb flush xen pvfb: Para-virtual framebuffer, keyboard and pointer driver xen: Add compatibility aliases for frontend drivers xen: Module autoprobing support for frontend drivers xen blkfront: Delay wait for block devices until after the disk is added xen/blkfront: use bdget_disk xen: Make xen-blkfront write its protocol ABI to xenstore xen: import arch generic part of xencomm xen: make grant table arch portable xen: replace callers of alloc_vm_area()/free_vm_area() with xen_ prefixed one xen: make include/xen/page.h portable moving those definitions under asm dir xen: add resend_irq_on_evtchn() definition into events.c Xen: make events.c portable for ia64/xen support xen: move events.c to drivers/xen for IA64/Xen support xen: move features.c from arch/x86/xen/features.c to drivers/xen xen: add missing definitions in include/xen/interface/vcpu.h which ia64/xen needs ...	2008-04-25 12:32:10 -07:00
Ingo Molnar	70c9f590ff	x86: remove set_fixmap() warning set_fixmap()+clear_fixmap() is safe. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-25 19:54:07 +02:00
Ingo Molnar	82a355f5a2	x86: make __set_fixmap() non-init Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-25 19:54:07 +02:00
Jeremy Fitzhardinge	85958b465c	x86: unify pgd ctor/dtor All pagetables need fundamentally the same setup and destruction, so just use the same code for everything. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Andi Kleen <ak@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-24 23:57:31 +02:00
Jeremy Fitzhardinge	68db065c84	x86: unify KERNEL_PGD_PTRS Make KERNEL_PGD_PTRS common, as previously it was only being defined for 32-bit. There are a couple of follow-on changes from this: - KERNEL_PGD_PTRS was being defined in terms of USER_PGD_PTRS. The definition of USER_PGD_PTRS doesn't really make much sense on x86-64, since it can have two different user address-space configurations. I renamed USER_PGD_PTRS to KERNEL_PGD_BOUNDARY, which is meaningful for all of 32/32, 32/64 and 64/64 process configurations. - USER_PTRS_PER_PGD was also defined and was being used for similar purposes. Converting its users to KERNEL_PGD_BOUNDARY left it completely unused, and so I removed it. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Andi Kleen <ak@suse.de> Cc: Zach Amsden <zach@vmware.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-24 23:57:31 +02:00
Jeremy Fitzhardinge	c20311e165	x86/pgtable.h: demacro ptep_clear_flush_young Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-24 23:57:31 +02:00
Jeremy Fitzhardinge	f9fbf1a36a	x86/pgtable.h: demacro ptep_test_and_clear_young Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-24 23:57:31 +02:00
Jeremy Fitzhardinge	ee5aa8d3ba	x86/pgtable.h: demacro ptep_set_access_flags Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-24 23:57:31 +02:00
Jeremy Fitzhardinge	2761fa0920	x86: add pud_alloc for 4-level pagetables Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-24 23:57:31 +02:00
Jeremy Fitzhardinge	6944a9c894	x86: rename paravirt_alloc_pt etc after the pagetable structure Rename (alloc\|release)_(pt\|pd) to pte/pmd to explicitly match the name of the appropriate pagetable level structure. [ x86.git merge work by Mark McLoughlin <markmc@redhat.com> ] Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-24 23:57:31 +02:00
Jeremy Fitzhardinge	394158559d	x86: move all the pgd_list handling to one place Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-24 23:57:31 +02:00
Jeremy Fitzhardinge	5a5f8f4224	x86: move pgalloc pud and pgd operations into common place Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-24 23:57:30 +02:00
Jeremy Fitzhardinge	170fdff705	x86: move pmd functions into common asm/pgalloc.h Common definitions for 3-level pagetable functions. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-24 23:57:30 +02:00
Jeremy Fitzhardinge	397f687ab7	x86: move pte functions into common asm/pgalloc.h Common definitions for 2-level pagetable functions. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-24 23:57:30 +02:00
Jeremy Fitzhardinge	1d262d3a49	x86: put paravirt stubs into common asm/pgalloc.h Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-24 23:57:30 +02:00
Ingo Molnar	1ec1fe73df	x86: xen unify x86 add common mm pgtable c fix Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-24 23:57:30 +02:00
Jeremy Fitzhardinge	4f76cd3822	x86: add common mm/pgtable.c Add a common arch/x86/mm/pgtable.c file for common pagetable functions. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-24 23:57:30 +02:00
Jeremy Fitzhardinge	79bf6d66ab	x86: convert pgalloc_64.h from macros to inlines Convert asm-x86/pgalloc_64.h from macros into functions (#include hell prevents __*_free_tlb from being inline, but they're probably a bit big to inline anyway). Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-24 23:57:30 +02:00
Ingo Molnar	28eb559b5b	pat: cleanups Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-24 23:40:47 +02:00
venkatesh.pallipadi@intel.com	e7f260a276	x86: PAT use reserve free memtype in mmap of /dev/mem Use reserve_memtype and free_memtype wrappers for /dev/mem mmaps. The memtype is slightly complicated here, given that we have to support existing X mappings. We fallback on UC_MINUS for that. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-24 23:40:47 +02:00
venkatesh.pallipadi@intel.com	f0970c13b6	x86: PAT phys_mem_access_prot_allowed for dev/mem mmap Introduce phys_mem_access_prot_allowed(), which checks whether the mapping is possible, without any conflicts and returns success or failure based on that. phys_mem_access_prot() by itself does not allow failure case. This ability to return error is needed for PAT where we may have aliasing conflicts. x86 setup __HAVE_PHYS_MEM_ACCESS_PROT and move x86 specific code out of /dev/mem into arch specific area. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-24 23:40:47 +02:00
venkatesh.pallipadi@intel.com	e045fb2a98	x86: PAT avoid aliasing in /dev/mem read/write Add xlate and unxlate around /dev/mem read/write. This sets up the mapping that can be used for /dev/mem read and write without aliasing worries. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-24 23:40:47 +02:00
Arjan van de Ven	ae531c26c5	x86: introduce /dev/mem restrictions with a config option This patch introduces a restriction on /dev/mem: Only non-memory can be read or written unless the newly introduced config option is set. The X server needs access to /dev/mem for the PCI space, but it doesn't need access to memory; both the file permissions and SELinux permissions of /dev/mem just make X effectively super-super powerful. With the exception of the BIOS area, there's just no valid app that uses /dev/mem on actual memory. Other popular users of /dev/mem are rootkits and the like. (note: mmap access of memory via /dev/mem was already not allowed since a really long time) People who want to use /dev/mem for kernel debugging can enable the config option. The restrictions of this patch have been in the Fedora and RHEL kernels for at least 4 years without any problems. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-24 23:40:47 +02:00
Ingo Molnar	a4928cffe6	"make namespacecheck" fixes Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-24 23:15:44 +02:00
Linus Torvalds	ec965350bb	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched-devel * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched-devel: (62 commits) sched: build fix sched: better rt-group documentation sched: features fix sched: /debug/sched_features sched: add SCHED_FEAT_DEADLINE sched: debug: show a weight tree sched: fair: weight calculations sched: fair-group: de-couple load-balancing from the rb-trees sched: fair-group scheduling vs latency sched: rt-group: optimize dequeue_rt_stack sched: debug: add some debug code to handle the full hierarchy sched: fair-group: SMP-nice for group scheduling sched, cpuset: customize sched domains, core sched, cpuset: customize sched domains, docs sched: prepatory code movement sched: rt: multi level group constraints sched: task_group hierarchy sched: fix the task_group hierarchy for UID grouping sched: allow the group scheduler to have multiple levels sched: mix tasks and groups ...	2008-04-21 15:40:24 -07:00
Mike Travis	f46bdf2db2	numa: move large array from stack to _initdata section * Move large array "struct bootnode nodes" from stack to _initdata section to reduce amount of stack space required. Cc: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Mike Travis <travis@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-19 19:44:58 +02:00
Glauber Costa	85c246ee16	x86: move definition to pci-dma.c Move dma_ops structure definition to pci-dma.c, where it belongs. Signed-off-by: Glauber Costa <gcosta@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-19 19:19:57 +02:00
Suresh Siddha	6ec6e0d9f2	srat, x86: add support for nodes spanning other nodes For example, If the physical address layout on a two node system with 8 GB memory is something like: node 0: 0-2GB, 4-6GB node 1: 2-4GB, 6-8GB Current kernels fail to boot/detect this NUMA topology. ACPI SRAT tables can expose such a topology which needs to be supported. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-19 19:19:55 +02:00
Ingo Molnar	fa5c463941	x86: rename find_max_pfn() to propagate_e820_map() this function doesnt just 'find' the max_pfn - it also has other side-effects such as registering sparse memory maps. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-19 19:19:55 +02:00
Randy Dunlap	4c8337ac42	x86: fix arch/x86/mm/ioremap.c warning Fix printk formats in x86/mm/ioremap.c: next-20080410/arch/x86/mm/ioremap.c:137: warning: format '%llx' expects type 'long long unsigned int', but argument 2 has type 'resource_size_t' next-20080410/arch/x86/mm/ioremap.c:188: warning: format '%llx' expects type 'long long unsigned int', but argument 2 has type 'resource_size_t' next-20080410/arch/x86/mm/ioremap.c:188: warning: format '%llx' expects type 'long long unsigned int', but argument 3 has type 'long unsigned int' Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-19 19:19:54 +02:00
WANG Cong	cf9b111c17	x86: remove pointless comments Remove old comments that include the old arch/i386 directory. Signed-off-by: WANG Cong <xiyou.wangcong@gmail.com> Acked-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-19 19:19:54 +02:00
Ingo Molnar	d1a4be630f	x86 PAT: fix mmap() of holes do not return a -EINVAL when mmap()-ing PCI holes. Signed-off-by: Ingo Molnar <mingo@elte.hu> Acked-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Acked-by: Suresh Siddha <suresh.b.siddha@intel.com> Acked-by: H. Peter Anvin <hpa@zytor.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Arjan van de Ven <arjan@linux.intel.com>	2008-04-18 23:40:49 +02:00
Mike Travis	b447a468fc	x86: clean up non-smp usage of cpu maps Cleanup references to the early cpu maps for the non-SMP configuration and remove some functions called for SMP configurations only. Cc: Andi Kleen <ak@suse.de> Cc: Christoph Lameter <clameter@sgi.com> Signed-off-by: Mike Travis <travis@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-17 17:41:34 +02:00
Jack Steiner	a65d1d644c	x86: increase size of APICID Increase the number of bits in an apicid from 8 to 32. By default, MP_processor_info() gets the APICID from the mpc_config_processor structure. However, this structure limits the size of APICID to 8 bits. This patch allows the caller of MP_processor_info() to optionally pass a larger APICID that will be used instead of the one in the mpc_config_processor struct. Signed-off-by: Jack Steiner <steiner@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-17 17:41:33 +02:00
gorcunov@gmail.com	6b6891f9c5	x86: cleanup - rename VM_MASK to X86_VM_MASK This patch renames VM_MASK to X86_VM_MASK (which in turn defined as alias to X86_EFLAGS_VM) to better distinguish from virtual memory flags. We can't just use X86_EFLAGS_VM instead because it is also used for conditional compilation Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-17 17:41:33 +02:00
Ingo Molnar	756a6c6855	x86: ioremap of 64-bit resource on 32-bit kernel fix Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-17 17:41:30 +02:00
Andi Kleen	f5c24a7fd0	x86: don't use large pages to map the first 2/4MB of memory Intel recommends to not use large pages for the first 1MB of the physical memory because there are fixed size MTRRs there which cause splitups in the TLBs. On AMD doing so is also a good idea. The implementation is a little different between 32bit and 64bit. On 32bit I just taught the initial page table set up about this because it was very simple to do. This also has the advantage that the risk of a prefetch ever seeing the page even if it only exists for a short time is minimized. On 64bit that is not quite possible, so use set_memory_4k() a little later (in check_bugs) instead. Signed-off-by: Andi Kleen <ak@suse.de> Acked-by: andreas.herrmann3@amd.com Cc: mingo@elte.hu Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-17 17:41:30 +02:00
Andi Kleen	c9caa02c52	x86: add set_memory_4k to pageattr.c Add a new function to force split large pages into 4k pages. This is needed for some followup optimizations. I had to add a new field to cpa_data to pass down the information that try_preserve_large_page should not run. Right now no set_page_4k() because I didn't need it and all the specialized users I have in mind would be more comfortable with pure addresses. I also didn't export it because it's unlikely external code needs it. Signed-off-by: Andi Kleen <ak@suse.de> Cc: andreas.herrmann3@amd.com Cc: mingo@elte.hu Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-17 17:41:30 +02:00
Andi Kleen	cc61503219	x86: account overlapped mappings in max_pfn_mapped When end_pfn is not aligned to 2MB (or 1GB) then the kernel might map more memory than end_pfn. Account this in max_pfn_mapped. Signed-off-by: Andi Kleen <ak@suse.de> Cc: andreas.herrmann3@amd.com Cc: mingo@elte.hu Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-17 17:41:30 +02:00
Thomas Gleixner	67794292c8	x86: replace the now useless max_pfn_mapped define Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-17 17:41:30 +02:00
Andi Kleen	7d1116a92d	x86: implement true end_pfn_mapped for 32bit Even on 32bit 2MB pages can map more memory than is in the true max_low_pfn if end_pfn is not highmem and not aligned to 2MB. Add a end_pfn_map similar to x86-64 that accounts for this fact. This is important for code that really needs to know about all mapping aliases. Signed-off-by: Andi Kleen <ak@suse.de> Cc: andreas.herrmann3@amd.com Cc: mingo@elte.hu Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-17 17:41:30 +02:00
Yinghai Lu	dcfe946520	x86: fix memtest print out Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-17 17:41:21 +02:00
Yinghai Lu	c64df70793	x86: memtest bootparam Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-17 17:41:21 +02:00
Venki Pallipadi	b450e5e816	x86: PAT bug fix for attribute type check after reserve_memtype, debug Make the PAT related printks in ioremap pr_debug. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-17 17:41:20 +02:00
Venki Pallipadi	dee7cbb210	x86: PAT bug fix for attribute type check after reserve_memtype Bug fixes for reserve_memtype() call in __ioremap and pci_mmap_page_range(). If reserve_memtype returns non-zero, then it is an error and subsequent free is not required. Requested and returned prot value check should be done when reserve_memtype returns success. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-17 17:41:20 +02:00
Yinghai Lu	9307cacad0	x86: pat cpu feature bit setting for known cpus Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-17 17:41:20 +02:00
Yinghai Lu	35605a1027	x86: enable PAT for amd k8 and fam10h make known_pat_cpu to think amd k8 and fam10h is ok too. also make tom2 below to be WRBACK Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-17 17:41:20 +02:00
venkatesh.pallipadi@intel.com	6997ab4982	x86: add PAT related debug prints Adds debug prints at critical code. Adds enough info in dmesg to allow us to do effective first round of analysis of any issues that may result due to PAT patch series. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-17 17:41:20 +02:00
venkatesh.pallipadi@intel.com	b310f381d2	x86: PAT add ioremap_wc() interface Introduce ioremap_wc for wc remap. (generic wrapper is in a later patch) Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-17 17:41:20 +02:00
venkatesh.pallipadi@intel.com	ef354af462	x86: PAT add set_memory_wc() interface Add a set_memory_wc interface(), similar to set_memory_uc interface. Callers has to call set_memory_uc, set_memory_wb and set_memory_wc, set_memory_wb as pairs. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-17 17:41:20 +02:00
venkatesh.pallipadi@intel.com	1219333dfd	x86: PAT use reserve free memtype in set_memory_uc Use reserve_memtype and free_memtype interfaces in set_memory_uc/set_memory_wb interfaces to avoid aliasing. Usage model of set_memory_uc and set_memory_wb is for RAM memory and users will first call set_memory_uc and call set_memory_wb after use to reset the attribute. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-17 17:41:19 +02:00
venkatesh.pallipadi@intel.com	d7677d4034	x86: PAT use reserve free memtype in ioremap and iounmap Use reserve_memtype and free_memtype interfaces in ioremap/iounmap to avoid aliasing. If there is an existing alias for the region, inherit the memory type from the alias. If there are conflicting aliases for the entire region, then fail ioremap. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-17 17:41:19 +02:00
venkatesh.pallipadi@intel.com	3a96ce8cac	x86: PAT make ioremap_change_attr non-static Make ioremap_change_attr() non-static and use prot_val in place of ioremap_mode. This interface is used in subsequent PAT patches. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-17 17:41:19 +02:00
Ingo Molnar	55c626820a	x86: revert ucminus change Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-17 17:41:19 +02:00
venkatesh.pallipadi@intel.com	2e5d9c857d	x86: PAT infrastructure patch Sets up pat_init() infrastructure. PAT MSR has following setting. PAT \|PCD \|\|PWT \|\|\| 000 WB _PAGE_CACHE_WB 001 WC _PAGE_CACHE_WC 010 UC- _PAGE_CACHE_UC_MINUS 011 UC _PAGE_CACHE_UC We are effectively changing WT from boot time setting to WC. UC_MINUS is used to provide backward compatibility to existing /dev/mem users(X). reserve_memtype and free_memtype are new interfaces for maintaining alias-free mapping. It is currently implemented in a simple way with a linked list and not optimized. reserve and free tracks the effective memory type, as a result of PAT and MTRR setting rather than what is actually requested in PAT. pat_init piggy backs on mtrr_init as the rules for setting both pat and mtrr are same. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-17 17:41:19 +02:00
Yinghai Lu	272b9cad6e	x86: early memtest to find bad ram do simple memtest after init_memory_mapping use find_e820_area_size to find all ram range that is not reserved. and do some simple bits test to find some bad ram. if find some bad ram, use reserve_early to exclude that range. Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-17 17:41:19 +02:00
Alexey Starikovskiy	ce3fe6b2bf	x86: use get_bios_ebda in mpparse_64.c Signed-off-by: Alexey Starikovskiy <astarikovskiy@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-17 17:41:05 +02:00
Johannes Weiner	1415d160c7	x86: Remove redundant display of free swap space in show_mem() Signed-off-by: Johannes Weiner <hannes@saeurebad.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-17 17:40:58 +02:00
Yinghai Lu	9a79cf9c1a	x86: sort address_markers for dump_pagetables otherwise Vmemmap and High Kernel Mapping string is not showing up. Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-17 17:40:58 +02:00
Mathieu Desnoyers	4e4eee0e01	x86: enhance DEBUG_RODATA support for hotplug and kprobes Standardize DEBUG_RODATA, removing special cases for hotplug and kprobes. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Cc: Andi Kleen <andi@firstfloor.org> Cc: pageexec@freemail.hu Cc: akpm@linux-foundation.org CC: Andi Kleen <andi@firstfloor.org> CC: pageexec@freemail.hu CC: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-17 17:40:58 +02:00
Ingo Molnar	9fc34113f6	x86: debug pmd_bad() Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-17 17:40:52 +02:00
Ingo Molnar	ba748d221e	x86: warn about RAM pages in ioremap() Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-17 17:40:52 +02:00
Ingo Molnar	bdd3cee2e4	x86: ioremap(), extend check to all RAM pages Suggested by Jan Beulich. Signed-off-by: Ingo Molnar <mingo@elte.hu> Acked-by: Jan Beulich <jbeulich@novell.com>	2008-04-17 17:40:51 +02:00
Thomas Gleixner	e3100c82ab	x86: check physical address range in ioremap Roland Dreier reported in http://lkml.org/lkml/2008/2/27/194 [ 8425.915139] BUG: unable to handle kernel paging request at ffffc20001a0a000 [ 8425.919087] IP: [<ffffffff8021dacc>] clflush_cache_range+0xc/0x25 [ 8425.919087] PGD 1bf80e067 PUD 1bf80f067 PMD 1bb497067 PTE 80000047000ee17b This is on a Intel machine with 36bit physical address space. The PTE entry references 47000ee000, which is outside of it. Add a check for the physical address space and warn/printk about the stupid caller. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-17 17:40:51 +02:00
Ian Campbell	c92a7a54d6	x86: reduce arch/x86/mm/ioremap.o size > Don't we have a special section for page-aligned data so it doesn't > waste most of two pages? We have .bss.page_aligned and it seems appropriate to use it. text data bss dec hex filename - 3388 8236 4 11628 2d6c ../build-32/arch/x86/mm/ioremap.o + 3388 48 4100 7536 1d70 ../build-32/arch/x86/mm/ioremap.o Signed-off-by: Ian Campbell <ijc@hellion.org.uk> Cc: Matt Mackall <mpm@selenic.com> Cc: Sam Ravnborg <sam@ravnborg.org> Cc: Huang Ying <ying.huang@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-17 17:40:47 +02:00
Yinghai Lu	04adf11435	x86: remove never used nodenumer in pda Signed-off-by: Yinghai Lu <yinghai.lu@sun.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-17 17:40:47 +02:00
Yinghai Lu	beafe91f1c	x86: get apic_id later in acpi_numa_processor_affinity_init we don't need get that so early. Signed-off-by: Yinghai Lu <yinghai.lu@sun.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-17 17:40:47 +02:00
Andi Kleen	ef9257668e	x86: do kernel direct mapping at boot using GB pages The AMD Fam10h CPUs support new Gigabyte page table entry for mapping 1GB at a time. Use this for the kernel direct mapping. Only done for 64bit because i386 does not support GB page tables. This only applies to the data portion of the direct mapping; the kernel text mapping stays with 2MB pages because the AMD Fam10h microarchitecture does not support GB ITLBs and AMD recommends against using GB mappings for code. Can be disabled with disable_gbpages on the kernel command line [ tglx@linutronix.de: simplify enable code ] [ Yinghai Lu <yinghai.lu@sun.com>: boot fix on 256 GB RAM ] Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-17 17:40:45 +02:00
Ingo Molnar	00d1c5e057	x86: add gbpages switches These new controls toggle experimental support for a new CPU feature, the straightforward extension of largepages from the pmd level to the pud level, which allows 1GB (kernel) TLBs instead of 2MB TLBs. Turn it off by default, as this code has not been tested well enough yet. Use the CONFIG_DIRECT_GBPAGES=y .config option or gbpages on the boot line can be used to enable it. If enabled in the .config then nogbpages boot option disables it. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-17 17:40:45 +02:00
H. Peter Anvin	fe770bf031	x86: clean up the page table dumper and add 32-bit support Clean up the page table dumper (fix boundary conditions, table driven address ranges, some formatting changes since it is no longer using the kernel log but a separate virtual file), and generalize to 32 bits. [ mingo@elte.hu: x86: fix the pagetable dumper ] Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-17 17:40:45 +02:00
Arjan van de Ven	926e5392ba	x86: add code to dump the (kernel) page tables for visual inspection by kernel developers This patch adds code to the kernel to have an (optional) /proc/kernel_page_tables debug file that basically dumps the kernel pagetables; this allows us kernel developers to verify that nothing fishy is going on and that the various mappings are set up correctly. This was quite useful in finding various change_page_attr() bugs, and is very likely to be useful in the future as well. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Cc: mingo@elte.hu Cc: tglx@tglx.de Cc: hpa@zytor.com Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-17 17:40:45 +02:00
H. Peter Anvin	2596e0fae0	x86: unify arch/x86/mm/Makefile Unify arch/x86/mm/Makefile between 32 and 64 bits. All configuration variables that are protected by Kconfig constraints have been put in the common part of the Makefile; however, the NUMA files are totally different between 32 and 64 bits and are handled via an ifdef. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-17 17:40:45 +02:00
Thomas Gleixner	ee7ae7a198	x86: add debug info to DEBUG_PAGEALLOC Add debug information for DEBUG_PAGEALLOC to get some statistics about the pool usage and split status. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-17 17:40:45 +02:00
Ingo Molnar	b4e0409a36	x86: check vmlinux limits, 64-bit these build-time and link-time checks would have prevented the vmlinux size regression. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-17 17:40:45 +02:00
Andrew Morton	9c312058b2	Avoid false positive warnings in kmap_atomic_prot() with DEBUG_HIGHMEM I believe http://bugzilla.kernel.org/show_bug.cgi?id=10318 is a false positive. There's no way in which networking will be using highmem pages here, so it won't be taking the KM_USER0 kmap slot, so there's no point in performing these checks. Cc: Pawel Staszewski <pstaszewski@artcom.pl> Cc: Ingo Molnar <mingo@elte.hu> Acked-by: Christoph Lameter <clameter@sgi.com> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> [ Really sad. We lose almost all real-life coverage of the debug tests with this patch. Now it will only report problems for the cases where people actually end up using a HIGHMEM page, not when they just _might_ use one. - Linus ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-03-28 13:08:14 -07:00
Ingo Molnar	3085354de6	x86: prefetch fix #2 Linus noticed a second bug and an uncleanliness: - we'd return on any instruction fetch fault - we'd use both the value of 16 and the PF_INSTR symbol which are the same and make no sense the cleanup nicely unifies this piece of logic. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-03-27 22:00:16 +01:00
Christoph Lameter	25e59881f1	x86: stricter check in follow_huge_addr() The first page of the compound page is determined in follow_huge_addr() but then PageCompound() only checks if the page is part of a compound page. PageHead() allows checking if this is indeed the first page of the compound. Cc: Jeremy Fitzhardinge <jeremy@goop.org> Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-03-27 16:08:45 +01:00
Ingo Molnar	bc713dcf35	x86: fix prefetch workaround some early Athlon XP's and Opterons generate bogus faults on prefetch instructions. The workaround for this regressed over .24 - reinstate it. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-03-27 16:08:44 +01:00
Suresh Siddha	d546b67a94	x86: fix performance drop for glx fix the 3D performance drop reported at: http://bugzilla.kernel.org/show_bug.cgi?id=10328 fb drivers are using ioremap()/ioremap_nocache(), followed by mtrr_add with WC attribute. Recent changes in page attribute code made both ioremap()/ioremap_nocache() mappings as UC (instead of previous UC-). This breaks the graphics performance, as the effective memory type is UC instead of expected WC. The correct way to fix this is to add ioremap_wc() (which uses UC- in the absence of PAT kernel support and WC with PAT) and change all the fb drivers to use this new ioremap_wc() API. We can take this correct and longer route for post 2.6.25. For now, revert back to the UC- behavior for ioremap/ioremap_nocache. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Cc: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-03-26 22:23:41 +01:00
Yinghai Lu	76c324182b	x86: fix trim mtrr not to setup_memory two times we could call find_max_pfn() directly instead of setup_memory() to get max_pfn needed for mtrr trimming. otherwise setup_memory() is called two times... that is duplicated... [ mingo@elte.hu: both Thomas and me simulated a double call to setup_bootmem_allocator() and can confirm that it is a real bug which can hang in certain configs. It's not been reported yet but that is probably due to the relatively scarce nature of MTRR-trimming systems. ] Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-03-26 22:23:41 +01:00
Linus Torvalds	b9e76a0074	x86-32: Pass the full resource data to ioremap() It appears that 64-bit PCI resources cannot possibly ever have worked on x86-32 even when the RESOURCES_64BIT config option was set, because any driver that tried to [pci_]ioremap() the resource would have been unable to do so because the high 32 bits would have been silently dropped on the floor by the ioremap() routines that only used "unsigned long". Change them to use "resource_size_t" instead, which properly encodes the whole 64-bit resource data if RESOURCES_64BIT is enabled. Acked-by: H. Peter Anvin <hpa@kernel.org> Acked-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-03-24 11:22:39 -07:00
Yinghai Lu	37bff62e98	x86_64: free_bootmem should take phys so use nodedata_phys directly. Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-03-21 17:06:15 +01:00
Thomas Gleixner	985a34bd75	x86: remove quicklists quicklists cause a serious memory leak on 32-bit x86, as documented at: http://bugzilla.kernel.org/show_bug.cgi?id=9991 the reason is that the quicklist pool is a special-purpose cache that grows out of proportion. It is not accounted for anywhere and users have no way to even realize that it's the quicklists that are causing RAM usage spikes. It was supposed to be a relatively small pool, but as demonstrated by KOSAKI Motohiro, they can grow as large as: Quicklists: 1194304 kB given how much trouble this code has caused historically, and given that Andrew objected to its introduction on x86 (years ago), the best option at this point is to remove them. [ any performance benefits of caching constructed pgds should be implemented in a more generic way (possibly within the page allocator), while still allowing constructed pages to be allocated by other workloads. ] Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-03-11 17:11:55 +01:00
Ingo Molnar	9a46d7e5b6	x86: ioremap, remove WARN_ON() Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-03-11 17:11:54 +01:00
Yinghai Lu	7c9e92b6cd	x86: not set node to cpu_to_node if the node is not online resolve boot problem reported by Mel Gorman: http://lkml.org/lkml/2008/2/13/404 init_cpu_to_node will use cpu->apic (from MADT or mptable) and apic->node(from SRAT or AMD config space with k8_bus_64.c) to have cpu->node mapping, and later identify_cpu will overwrite them again...(with nearby_node...) this patch checks if the node is online, otherwise it will not update cpu_node map. so keep cpu_node map to online node before identify_cpu..., to prevent possible error. Signed-off-by: Yinghai Lu <yinghai.lu@sun.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Acked-by: Thomas Gleixner <tglx@linutronix.de>	2008-03-04 17:10:12 +01:00
Rafael J. Wysocki	9b5cf48b06	x86: revert "x86: CPA: avoid split of alias mappings" Revert: commit `8be8f54bae` Author: Thomas Gleixner <tglx@linutronix.de> Date: Sat Feb 23 20:43:21 2008 +0100 x86: CPA: avoid split of alias mappings because it clearly mishandles the case when __change_page_attr(), called from __change_page_attr_set_clr(), changes cpa->processed to 1 and cpa_process_alias(cpa) is executed right after that. This crashes my x86-64 test box early in the boot process (ref. http://bugzilla.kernel.org/show_bug.cgi?id=10140#c4). Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-03-03 14:18:27 +01:00
Thomas Gleixner	8be8f54bae	x86: CPA: avoid split of alias mappings avoid over-eager large page splitup. When the target area needs to be split or is split already (ioremap) then the current code enforces the split of large mappings in the alias regions even if we could avoid it. Use a separate variable processed in the cpa_data structure to carry the number of pages which have been processed instead of reusing the numpages variable. This keeps numpages intact and gives the alias code a chance to keep large mappings intact. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-29 18:55:42 +01:00
Ingo Molnar	b16bf712f4	x86: fix leak un ioremap_page_range() failure Jan Beulich noticed it during code review that if a driver's ioremap() fails (say due to -ENOMEM) then we might leak the struct vm_area. Free it properly. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-29 18:55:42 +01:00
Ingo Molnar	88f3aec7af	x86: fix spontaneous reboot with allyesconfig bzImage recently the 64-bit allyesconfig bzImage kernel started spontaneously rebooting during early bootup. after a few fun hours spent with early init debugging, it turns out that we've got this rather annoying limit on the size of the kernel image: #define KERNEL_TEXT_SIZE (4010241024) which limit my vmlinux just happened to pass: text data bss dec hex filename 29703744 4222751 `8646224` 42572719 2899baf vmlinux 40 MB is 42572719 bytes, so my vmlinux was just 1.5% above this limit :-/ So it happily crashed right in head_64.S, which - as we all know - is the most debuggable code in the whole architecture ;-) So increase the limit to allow an up to 128MB kernel image to be mapped. (should anyone be that crazy or lazy) We have a full 4K of pagetable (level2_kernel_pgt) allocated for these mappings already, so there's no RAM overhead and the limit was rather pointless and arbitrary. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-26 12:55:56 +01:00

... 3 4 5 6 7 ...

774 commits