kernel-fxtec-pro1x

Author	SHA1	Message	Date
Gleb Natapov	6d77dbfc88	KVM: inject #UD if instruction emulation fails and exit to userspace Do not kill VM when instruction emulation fails. Inject #UD and report failure to userspace instead. Userspace may choose to reenter guest if vcpu is in userspace (cpl == 3) in which case guest OS will kill offending process and continue running. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-08-01 10:35:40 +03:00
Gui Jianfeng	54a4f0239f	KVM: MMU: make kvm_mmu_zap_page() return the number of pages it actually freed Currently, kvm_mmu_zap_page() returning the number of freed children sp. This might confuse the caller, because caller don't know the actual freed number. Let's make kvm_mmu_zap_page() return the number of pages it actually freed. Signed-off-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-08-01 10:35:39 +03:00
Gui Jianfeng	518c5a05e8	KVM: MMU: Fix debug output error in walk_addr() Fix a debug output error in walk_addr Signed-off-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-08-01 10:35:39 +03:00
Gui Jianfeng	f3b8c964a9	KVM: MMU: mark page table dirty when a pte is actually modified Sometime cmpxchg_gpte doesn't modify gpte, in such case, don't mark page table page as dirty. Signed-off-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-08-01 10:35:38 +03:00
Joerg Roedel	eec4b140c9	KVM: SVM: Allow EFER.LMSLE to be set with nested svm This patch enables setting of efer bit 13 which is allowed in all SVM capable processors. This is necessary for the SLES11 version of Xen 4.0 to boot with nested svm. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-08-01 10:35:38 +03:00
Joerg Roedel	3f10c846f8	KVM: SVM: Dump vmcb contents on failed vmrun This patch adds a function to dump the vmcb into the kernel log and calls it after a failed vmrun to ease debugging. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-08-01 10:35:38 +03:00
Avi Kivity	d94e1dc9af	KVM: Get rid of KVM_REQ_KICK KVM_REQ_KICK poisons vcpu->requests by having a bit set during normal operation. This causes the fast path check for a clear vcpu->requests to fail all the time, triggering tons of atomic operations. Fix by replacing KVM_REQ_KICK with a vcpu->guest_mode atomic. Signed-off-by: Avi Kivity <avi@redhat.com>	2010-08-01 10:35:37 +03:00
Gleb Natapov	54b8486f46	KVM: x86 emulator: do not inject exception directly into vcpu Return exception as a result of instruction emulation and handle injection in KVM code. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-08-01 10:35:37 +03:00
Gleb Natapov	95cb229530	KVM: x86 emulator: move interruptibility state tracking out of emulator Emulator shouldn't access vcpu directly. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-08-01 10:35:36 +03:00
Gleb Natapov	4d2179e1e9	KVM: x86 emulator: handle shadowed registers outside emulator Emulator shouldn't access vcpu directly. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-08-01 10:35:36 +03:00
Gleb Natapov	bdb475a323	KVM: x86 emulator: use shadowed register in emulate_sysexit() emulate_sysexit() should use shadowed registers copy instead of looking into vcpu state directly. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-08-01 10:35:36 +03:00
Gleb Natapov	ef050dc039	KVM: x86 emulator: set RFLAGS outside x86 emulator code Removes the need for set_flags() callback. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-08-01 10:35:35 +03:00
Gleb Natapov	95c5588652	KVM: x86 emulator: advance RIP outside x86 emulator code Return new RIP as part of instruction emulation result instead of updating KVM's RIP from x86 emulator code. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-08-01 10:35:35 +03:00
Gleb Natapov	3457e4192e	KVM: handle emulation failure case first If emulation failed return immediately. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-08-01 10:35:34 +03:00
Gleb Natapov	8fe681e984	KVM: do not inject #PF in (read\|write)_emulated() callbacks Return error to x86 emulator instead of injection exception behind its back. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-08-01 10:35:34 +03:00
Gleb Natapov	f181b96d4c	KVM: remove export of emulator_write_emulated() It is not called directly outside of the file it's defined in anymore. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-08-01 10:35:34 +03:00
Gleb Natapov	c3cd7ffaf5	KVM: x86 emulator: x86_emulate_insn() return -1 only in case of emulation failure Currently emulator returns -1 when emulation failed or IO is needed. Caller tries to guess whether emulation failed by looking at other variables. Make it easier for caller to recognise error condition by always returning -1 in case of failure. For this new emulator internal return value X86EMUL_IO_NEEDED is introduced. It is used to distinguish between error condition (which returns X86EMUL_UNHANDLEABLE) and condition that requires IO exit to userspace to continue emulation. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-08-01 10:35:33 +03:00
Gleb Natapov	411c35b7ef	KVM: fill in run->mmio details in (read\|write)_emulated function Fill in run->mmio details in (read\|write)_emulated function just like pio does. There is no point in filling only vcpu fields there just to copy them into vcpu->run a little bit later. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-08-01 10:35:33 +03:00
Gleb Natapov	e680080e65	KVM: x86 emulator: fix X86EMUL_RETRY_INSTR and X86EMUL_CMPXCHG_FAILED values Currently X86EMUL_PROPAGATE_FAULT, X86EMUL_RETRY_INSTR and X86EMUL_CMPXCHG_FAILED have the same value so caller cannot distinguish why function such as emulator_cmpxchg_emulated() (which can return both X86EMUL_PROPAGATE_FAULT and X86EMUL_CMPXCHG_FAILED) failed. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-08-01 10:35:32 +03:00
Gleb Natapov	338dbc9781	KVM: x86 emulator: make (get\|set)_dr() callback return error if it fails Make (get\|set)_dr() callback return error if it fails instead of injecting exception behind emulator's back. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-08-01 10:35:32 +03:00
Gleb Natapov	0f12244fe7	KVM: x86 emulator: make set_cr() callback return error if it fails Make set_cr() callback return error if it fails instead of injecting #GP behind emulator's back. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-08-01 10:35:31 +03:00
Gleb Natapov	79168fd1a3	KVM: x86 emulator: cleanup some direct calls into kvm to use existing callbacks Use callbacks from x86_emulate_ops to access segments instead of calling into kvm directly. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-08-01 10:35:31 +03:00
Gleb Natapov	5951c44237	KVM: x86 emulator: add get_cached_segment_base() callback to x86_emulate_ops On VMX it is expensive to call get_cached_descriptor() just to get segment base since multiple vmcs_reads are done instead of only one. Introduce new call back get_cached_segment_base() for efficiency. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-08-01 10:35:31 +03:00
Gleb Natapov	3fb1b5dbd3	KVM: x86 emulator: add (set\|get)_msr callbacks to x86_emulate_ops Add (set\|get)_msr callbacks to x86_emulate_ops instead of calling them directly. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-08-01 10:35:30 +03:00
Gleb Natapov	35aa5375d4	KVM: x86 emulator: add (set\|get)_dr callbacks to x86_emulate_ops Add (set\|get)_dr callbacks to x86_emulate_ops instead of calling them directly. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-08-01 10:35:30 +03:00
Gleb Natapov	414e6277fd	KVM: x86 emulator: handle "far address" source operand ljmp/lcall instruction operand contains address and segment. It can be 10 bytes long. Currently we decode it as two different operands. Fix it by introducing new kind of operand that can hold entire far address. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-08-01 10:35:30 +03:00
Gleb Natapov	b8a98945ea	KVM: x86 emulator: cleanup nop emulation Make it more explicit what we are checking for. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-08-01 10:35:29 +03:00
Gleb Natapov	f0c13ef1a8	KVM: x86 emulator: cleanup xchg emulation Dst operand is already initialized during decoding stage. No need to reinitialize. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-08-01 10:35:29 +03:00
Gleb Natapov	054fe9f6e3	KVM: x86 emulator: fix Move r/m16 to segment register decoding This instruction does not need generic decoding for its dst operand. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-08-01 10:35:28 +03:00
Gleb Natapov	9de4157367	KVM: x86 emulator: introduce read cache Introduce read cache which is needed for instruction that require more then one exit to userspace. After returning from userspace the instruction will be re-executed with cached read value. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-08-01 10:35:28 +03:00
Avi Kivity	1c11e71357	KVM: VMX: Avoid writing HOST_CR0 every entry cr0.ts may change between entries, so we copy cr0 to HOST_CR0 before each entry. That is slow, so instead, set HOST_CR0 to have TS set unconditionally (which is a safe value), and issue a clts() just before exiting vcpu context if the task indeed owns the fpu. Saves ~50 cycles/exit. Signed-off-by: Avi Kivity <avi@redhat.com>	2010-08-01 10:35:28 +03:00
Avi Kivity	08acfa1871	KVM: kvm_pdptr_read() may sleep Annotate it thusly. Signed-off-by: Avi Kivity <avi@redhat.com>	2010-08-01 10:35:27 +03:00
Takuya Yoshikawa	914ebccd2d	KVM: x86: avoid unnecessary bitmap allocation when memslot is clean Although we always allocate a new dirty bitmap in x86's get_dirty_log(), it is only used as a zero-source of copy_to_user() and freed right after that when memslot is clean. This patch uses clear_user() instead of doing this unnecessary zero-source allocation. Performance improvement: as we can expect easily, the time needed to allocate a bitmap is completely reduced. In my test, the improved ioctl was about 4 to 10 times faster than the original one for clean slots. Furthermore, reducing memory allocations and copies will produce good effects to caches too. Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-08-01 10:35:27 +03:00
Avi Kivity	c332c83ae7	KVM: VMX: Simplify vmx_get_nmi_mask() !! is not needed due to the cast to bool. Signed-off-by: Avi Kivity <avi@redhat.com>	2010-08-01 10:35:27 +03:00
Huang Ying	bf998156d2	KVM: Avoid killing userspace through guest SRAO MCE on unmapped pages In common cases, guest SRAO MCE will cause corresponding poisoned page be un-mapped and SIGBUS be sent to QEMU-KVM, then QEMU-KVM will relay the MCE to guest OS. But it is reported that if the poisoned page is accessed in guest after unmapping and before MCE is relayed to guest OS, userspace will be killed. The reason is as follows. Because poisoned page has been un-mapped, guest access will cause guest exit and kvm_mmu_page_fault will be called. kvm_mmu_page_fault can not get the poisoned page for fault address, so kernel and user space MMIO processing is tried in turn. In user MMIO processing, poisoned page is accessed again, then userspace is killed by force_sig_info. To fix the bug, kvm_mmu_page_fault send HWPOISON signal to QEMU-KVM and do not try kernel and user space MMIO processing for poisoned page. [xiao: fix warning introduced by avi] Reported-by: Max Asbock <masbock@linux.vnet.ibm.com> Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-08-01 10:35:26 +03:00
Andres Salomon	54e5bc020c	x86, olpc: Constify an olpc_ofw() arg The arguments passed to OFW shouldn't be modified; update the 'args' argument of olpc_ofw to reflect this. This saves us some later casting away of consts. Signed-off-by: Andres Salomon <dilinger@queued.net> LKML-Reference: <20100628220029.1555ac24@debian> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2010-07-30 18:02:21 -07:00
Andres Salomon	25971865d4	x86, olpc: Use pr_debug() for EC commands Unconditionally printing EC debug messages was helpful when we were actually debugging the EC, but during normal operation it can get pretty annoying. Using pr_debug allows us finer-grained control. Signed-off-by: Andres Salomon <dilinger@queued.net> LKML-Reference: <20100616231928.16b539f0@dev.queued.net> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2010-07-30 18:01:52 -07:00
Fenghua Yu	9792db6174	x86, cpu: Package Level Thermal Control, Power Limit Notification definitions Add package level thermal and power limit feature support. The two MSRs and features are new starting with Intel's Sandy Bridge processor. Please check Intel 64 and IA-32 Architectures SDMV Vol 3A 14.5.6 Power Limit Notification and 14.6 Package Level Thermal Management. This patch also fixes a bug which defines reverse THERM_INT_LOW_ENABLE bit and THERM_INT_HIGH_ENABLE bit. [ hpa: fixed up against current tip:x86/cpu ] Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> LKML-Reference: <1280448826-12004-2-git-send-email-fenghua.yu@intel.com> Reviewed-by: Len Brown <len.brown@intel.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2010-07-30 16:15:32 -07:00
Suresh Siddha	68f202e4e8	x86, mtrr: Use stop machine context to rendezvous all the cpu's Use the stop machine context rather than IPI's to rendezvous all the cpus for MTRR initialization that happens during cpu bringup or for MTRR modifications during runtime. This avoids deadlock scenario (reported by Prarit) like: cpu A holds a read_lock (tasklist_lock for example) with irqs enabled cpu B waits for the same lock with irqs disabled using write_lock_irq cpu C doing set_mtrr() (during AP bringup for example), which will try to rendezvous all the cpus using IPI's This will result in C and A come to the rendezvous point and waiting for B. B is stuck forever waiting for the lock and thus not reaching the rendezvous point. Using stop cpu (run in the process context of per cpu based keventd) to do this rendezvous, avoids this deadlock scenario. Also make sure all the cpu's are in the rendezvous handler before we proceed with the local_irq_save() on each cpu. This lock step disabling irqs on all the cpus will avoid other deadlock scenarios (for example involving with the blocking smp_call_function's etc). [ This problem is very old. Marking -stable only for 2.6.35 as the stop_one_cpu_nowait() API is present only in 2.6.35. Any older kernel interested in this fix need to do some more work in backporting this patch. ] Reported-by: Prarit Bhargava <prarit@redhat.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> LKML-Reference: <1280515602.2682.10.camel@sbsiddha-MOBL3.sc.intel.com> Acked-by: Prarit Bhargava <prarit@redhat.com> Cc: stable@kernel.org [2.6.35] Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2010-07-30 15:59:49 -07:00
Kulikov Vasiliy	1f7979ac53	x86/PCI: use for_each_pci_dev() Use for_each_pci_dev() to simplify the code. Signed-off-by: Kulikov Vasiliy <segooon@gmail.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2010-07-30 09:47:33 -07:00
Ben Hutchings	30da552428	PCI: MSI: Restore read_msi_msg_desc(); add get_cached_msi_msg_desc() commit 2ca1af9aa3285c6a5f103ed31ad09f7399fc65d7 "PCI: MSI: Remove unsafe and unnecessary hardware access" changed read_msi_msg_desc() to return the last MSI message written instead of reading it from the device, since it may be called while the device is in a reduced power state. However, the pSeries platform code really does need to read messages from the device, since they are initially written by firmware. Therefore: - Restore the previous behaviour of read_msi_msg_desc() - Add new functions get_cached_msi_msg{,_desc}() which return the last MSI message written - Use the new functions where appropriate Acked-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2010-07-30 09:41:39 -07:00
Bjorn Helgaas	2491762cfb	x86/PCI: use host bridge _CRS info on ASRock ALiveSATA2-GLAN This DMI quirk turns on "pci=use_crs" for the ALiveSATA2-GLAN because amd_bus.c doesn't handle this system correctly. The system has a single HyperTransport I/O chain, but has two PCI host bridges to buses 00 and 80. amd_bus.c learns the MMIO range associated with buses 00-ff and that this range is routed to the HT chain hosted at node 0, link 0: bus: [00, ff] on node 0 link 0 bus: 00 index 1 [mem 0x80000000-0xfcffffffff] This includes the address space for both bus 00 and bus 80, and amd_bus.c assumes it's all routed to bus 00. We find device 80:01.0, which BIOS left in the middle of that space, but we don't find a bridge from bus 00 to bus 80, so we conclude that 80:01.0 is unreachable from bus 00, and we move it from the original, working, address to something outside the bus 00 aperture, which does not work: pci 0000:80:01.0: reg 10: [mem 0xfebfc000-0xfebfffff 64bit] pci 0000:80:01.0: BAR 0: assigned [mem 0xfd00000000-0xfd00003fff 64bit] The BIOS told us everything we need to know to handle this correctly, so we're better off if we just pay attention, which lets us leave the 80:01.0 device at the original, working, address: ACPI: PCI Root Bridge [PCI0] (domain 0000 [bus 00-7f]) pci_root PNP0A03:00: host bridge window [mem 0x80000000-0xff37ffff] ACPI: PCI Root Bridge [PCI1] (domain 0000 [bus 80-ff]) pci_root PNP0A08:00: host bridge window [mem 0xfebfc000-0xfebfffff] This was a regression between 2.6.33 and 2.6.34. In 2.6.33, amd_bus.c was used only when we found multiple HT chains. `3e3da00c01`, which enabled amd_bus.c even on systems with a single HT chain, caused this failure. This quirk was written by Graham. If we ever enable "pci=use_crs" for machines from 2006 or earlir, this quirk should be removed. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=16007 Cc: stable@kernel.org Reported-by: Graham Ramsey <ramsey.graham@ntlworld.com> Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2010-07-30 09:30:31 -07:00
Mike Habeck	7bd1c365fd	x86/PCI: Add option to not assign BAR's if not already assigned The Linux kernel assigns BARs that a BIOS did not assign, most likely to handle broken BIOSes that didn't enumerate the devices correctly. On UV the BIOS purposely doesn't assign I/O BARs for certain devices/ drivers we know don't use them (examples, LSI SAS, Qlogic FC, ...). We purposely don't assign these I/O BARs because I/O Space is a very limited resource. There is only 64k of I/O Space, and in a PCIe topology that space gets divided up into 4k chucks (this is due to the fact that a pci-to-pci bridge's I/O decoder is aligned at 4k)... Thus a system can have at most 16 cards with I/O BARs: (64k / 4k = 16) SGI needs to scale to >16 devices with I/O BARs. So by not assigning I/O BARs on devices we know don't use them, we can do that (iff the kernel doesn't go and assign these BARs that the BIOS purposely didn't assign). This patch will not assign a resource to a device BAR if that BAR was not assigned by the BIOS, and the kernel cmdline option 'pci=nobar' was specified. This patch is closely modeled after the 'pci=norom' option that currently exists in the tree. Signed-off-by: Mike Habeck <habeck@sgi.com> Signed-off-by: Mike Travis <travis@sgi.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2010-07-30 09:29:12 -07:00
Jiri Slaby	73cd3b43f0	x86/PCI: pci, fix section mismatch pcibios_scan_specific_bus calls pci_scan_bus_on_node which is __devinit. Mark pcibios_scan_specific_bus __devinit as well since all users are now __init or __devinit. Signed-off-by: Jiri Slaby <jslaby@suse.cz> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2010-07-30 09:29:09 -07:00
Stefano Stabellini	ca65f9fc0c	Introduce CONFIG_XEN_PVHVM compile option This patch introduce a CONFIG_XEN_PVHVM compile time option to enable/disable Xen PV on HVM support. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>	2010-07-29 11:11:33 -07:00
Yasuaki Ishimatsu	3709c85735	x86: Ioremap: fix wrong physical address handling in PAT code The following two commits fixed a problem that x86 ioremap() doesn't handle physical address higher than 32-bit properly in X86_32 PAE mode. `ffa71f33a8` (x86, ioremap: Fix incorrect physical address handling in PAE mode) `35be1b716a` (x86, ioremap: Fix normal ram range check) But these fixes are not enough, since pat_pagerange_is_ram() in PAT code also has a same problem. This patch fixes it. Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com> LKML-Reference: <4C47DDCF.80300@jp.fujitsu.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2010-07-29 15:26:11 +02:00
Jason Wessel	ba773f7c51	x86,kgdb: Fix hw breakpoint regression HW breakpoints events stopped working correctly with kgdb as a result of commit: `018cbffe68` (Merge commit 'v2.6.33' into perf/core). The regression occurred because the behavior changed for setting NOTIFY_STOP as the return value to the die notifier if the breakpoint was known to the HW breakpoint API. Because kgdb is using the HW breakpoint API to register HW breakpoints slots, it must also now implement the overflow_handler call back else kgdb does not get to see the events from the die notifier. The kgdb_ll_trap function will be changed to be general purpose code which can allow an easy way to implement the hw_breakpoint API overflow call back. Signed-off-by: Jason Wessel <jason.wessel@windriver.com> Acked-by: Dongdong Deng <dongdong.deng@windriver.com> Acked-by: Frederic Weisbecker <fweisbec@gmail.com>	2010-07-28 19:10:30 -05:00
H. Peter Anvin	a378d9338e	x86, asm: Merge cmpxchg_486_u64() and cmpxchg8b_emu() We have two functions for doing exactly the same thing -- emulating cmpxchg8b on 486 and older hardware -- with different calling conventions, and yet doing the same thing. Drop the C version and use the assembly version, via alternatives, for both the local and non-local versions of cmpxchg8b. Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> LKML-Reference: <AANLkTikAmaDPji-TVDarmG1yD=fwbffcsmEU=YEuP+8r@mail.gmail.com>	2010-07-28 17:05:11 -07:00
H. Peter Anvin	90c8f92f5c	x86, asm: Move cmpxchg emulation code to arch/x86/lib Move cmpxchg emulation code from arch/x86/kernel/cpu (which is otherwise CPU identification) to arch/x86/lib, where other emulation code lives already. Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> LKML-Reference: <AANLkTikAmaDPji-TVDarmG1yD=fwbffcsmEU=YEuP+8r@mail.gmail.com>	2010-07-28 16:53:49 -07:00
H. Peter Anvin	a5b91606bd	x86, cpu: Export AMD errata definitions Exprot the AMD errata definitions, since they are needed by kvm_amd.ko if that is built as a module. Doing "make allmodconfig" during testing would have caught this. Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: Hans Rosenfeld <hans.rosenfeld@amd.com> LKML-Reference: <1280336972-865982-1-git-send-email-hans.rosenfeld@amd.com>	2010-07-28 16:23:20 -07:00
H. Peter Anvin	4532b305e8	x86, asm: Clean up and simplify <asm/cmpxchg.h> Remove the __xg() hack to create a memory barrier near xchg and cmpxchg; it has been there since 1.3.11 but should not be necessary with "asm volatile" and a "memory" clobber, neither of which were there in the original implementation. However, we should make this a volatile reference. Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> LKML-Reference: <AANLkTikAmaDPji-TVDarmG1yD=fwbffcsmEU=YEuP+8r@mail.gmail.com>	2010-07-28 15:24:09 -07:00
Hans Rosenfeld	1be85a6d93	x86, cpu: Use AMD errata checking framework for erratum 383 Use the AMD errata checking framework instead of open-coding the test. Signed-off-by: Hans Rosenfeld <hans.rosenfeld@amd.com> LKML-Reference: <1280336972-865982-3-git-send-email-hans.rosenfeld@amd.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2010-07-28 13:12:31 -07:00
Hans Rosenfeld	9d8888c2a2	x86, cpu: Clean up AMD erratum 400 workaround Remove check_c1e_idle() and use the new AMD errata checking framework instead. Signed-off-by: Hans Rosenfeld <hans.rosenfeld@amd.com> LKML-Reference: <1280336972-865982-2-git-send-email-hans.rosenfeld@amd.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2010-07-28 13:12:11 -07:00
Hans Rosenfeld	d78d671db4	x86, cpu: AMD errata checking framework Errata are defined using the AMD_LEGACY_ERRATUM() or AMD_OSVW_ERRATUM() macros. The latter is intended for newer errata that have an OSVW id assigned, which it takes as first argument. Both take a variable number of family-specific model-stepping ranges created by AMD_MODEL_RANGE(). Iff an erratum has an OSVW id, OSVW is available on the CPU, and the OSVW id is known to the hardware, it is used to determine whether an erratum is present. Otherwise, the model-stepping ranges are matched against the current CPU to find out whether the erratum applies. For certain special errata, the code using this framework might have to conduct further checks to make sure an erratum is really (not) present. Signed-off-by: Hans Rosenfeld <hans.rosenfeld@amd.com> LKML-Reference: <1280336972-865982-1-git-send-email-hans.rosenfeld@amd.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2010-07-28 13:12:04 -07:00
H. Peter Anvin	7d50d07da2	Merge remote branch 'linus/master' into x86/cpu	2010-07-28 13:11:28 -07:00
H. Peter Anvin	18642a57df	x86, vdso: Don't quote $nm in the script for checking vdso references Don't quote $nm in the script for checking the vdso for external references. Doing so breaks multiword constructs, like using CROSS_COMPILE='ccache '. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: H. Peter Anvin <hpa@zytor.com> LKML-Reference: <20100728134252.2e4c27cf.sfr@canb.auug.org.au>	2010-07-27 23:52:29 -07:00
H. Peter Anvin	69309a0590	x86, asm: Clean up and simplify set_64bit() Clean up and simplify set_64bit(). This code is quite old (1.3.11) and contains a fair bit of auxilliary machinery that current versions of gcc handle just fine automatically. Worse, the auxilliary machinery can actually cause an unnecessary spill to memory. Furthermore, the loading of the old value inside the loop in the 32-bit case is unnecessary: if the value doesn't match, the CMPXCHG8B instruction will already have loaded the "new previous" value for us. Clean up the comment, too, and remove page references to obsolete versions of the Intel SDM. Signed-off-by: H. Peter Anvin <hpa@zytor.com> LKML-Reference: <tip-*@vger.kernel.org>	2010-07-27 23:29:52 -07:00
H. Peter Anvin	d3608b5681	Merge remote branch 'origin/x86/urgent' into x86/asm	2010-07-27 23:28:28 -07:00
Jeremy Fitzhardinge	c7f52cdc2f	support multiple .discard.* sections to avoid section type conflicts gcc 4.4.4 will complain if you use a .discard section for both text and data ("causes a section type conflict"). Add support for ".discard.*" sections, and use .discard.text for a dummy function in the x86 RESERVE_BRK() macro. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>	2010-07-27 22:45:19 -07:00
H. Peter Anvin	113fc5a6e8	x86: Add memory modify constraints to xchg() and cmpxchg() xchg() and cmpxchg() modify their memory operands, not merely read them. For some versions of gcc the "memory" clobber has apparently dealt with the situation, but not for all. Originally-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: Glauber Costa <glommer@redhat.com> Cc: Avi Kivity <avi@redhat.com> Cc: Peter Palfrader <peter@palfrader.org> Cc: Greg KH <gregkh@suse.de> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Zachary Amsden <zamsden@redhat.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: <stable@kernel.org> LKML-Reference: <4C4F7277.8050306@zytor.com>	2010-07-27 17:14:02 -07:00
Joerg Roedel	7a42c4ff02	Merge branches 'iommu-api/2.6.36' and 'amd-iommu/2.6.36' into iommu/2.6.36	2010-07-27 18:19:32 +02:00
Joerg Roedel	80a506b8fd	x86/amd-iommu: Export cache-coherency capability This patch exports the capability of the AMD IOMMU to force cache coherency of DMA transactions through the IOMMU-API. This is required to disable some nasty hacks in KVM when this capability is not available. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2010-07-27 17:14:24 +02:00
John Stultz	f12a15be63	x86: Convert common clocksources to use clocksource_register_hz/khz This converts the most common of the x86 clocksources over to use clocksource_register_hz/khz. Signed-off-by: John Stultz <johnstul@us.ibm.com> LKML-Reference: <1279068988-21864-11-git-send-email-johnstul@us.ibm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2010-07-27 12:40:55 +02:00
John Stultz	7615856ebf	timkeeping: Fix update_vsyscall to provide wall_to_monotonic offset update_vsyscall() did not provide the wall_to_monotoinc offset, so arch specific implementations tend to reference wall_to_monotonic directly. This limits future cleanups in the timekeeping core, so this patch fixes the update_vsyscall interface to provide wall_to_monotonic, allowing wall_to_monotonic to be made static as planned in Documentation/feature-removal-schedule.txt Signed-off-by: John Stultz <johnstul@us.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Anton Blanchard <anton@samba.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Tony Luck <tony.luck@intel.com> LKML-Reference: <1279068988-21864-7-git-send-email-johnstul@us.ibm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2010-07-27 12:40:54 +02:00
John Stultz	592913ecb8	time: Kill off CONFIG_GENERIC_TIME Now that all arches have been converted over to use generic time via clocksources or arch_gettimeoffset(), we can remove the GENERIC_TIME config option and simplify the generic code. Signed-off-by: John Stultz <johnstul@us.ibm.com> LKML-Reference: <1279068988-21864-4-git-send-email-johnstul@us.ibm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2010-07-27 12:40:54 +02:00
John Stultz	8c73626ab2	x86: Fix vtime/file timestamp inconsistencies Due to vtime calling vgettimeofday(), its possible that an application could call time();create("stuff",O_RDRW); only to see the file's creation timestamp to be before the value returned by time. A similar way to reproduce the issue is to compare the vsyscall time() with the syscall time(), and observe ordering issues. The modified test case from Oleg Nesterov below can illustrate this: int main(void) { time_t sec1,sec2; do { sec1 = time(&sec2); sec2 = syscall(__NR_time, NULL); } while (sec1 <= sec2); printf("vtime: %d.000000\n", sec1); printf("time: %d.000000\n", sec2); return 0; } The proper fix is to make vtime use the same time value as current_kernel_time() (which is exported via update_vsyscall) instead of vgettime(). Thanks to Jiri Olsa for bringing up the issue and catching bugs in earlier verisons of this fix. Signed-off-by: John Stultz <johnstul@us.ibm.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Oleg Nesterov <oleg@redhat.com> LKML-Reference: <1279068988-21864-2-git-send-email-johnstul@us.ibm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2010-07-27 12:40:53 +02:00
Jeremy Fitzhardinge	b43275d661	xen/pvhvm: fix build problem when !CONFIG_XEN x86_hyper_xen_hvm is only defined when Xen is enabled in the kernel config. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>	2010-07-26 23:13:28 -07:00
Stefano Stabellini	5915100106	x86: Call HVMOP_pagetable_dying on exit_mmap. When a pagetable is about to be destroyed, we notify Xen so that the hypervisor can clear the related shadow pagetable. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>	2010-07-26 23:13:26 -07:00
Stefano Stabellini	c1c5413ad5	x86: Unplug emulated disks and nics. Add a xen_emul_unplug command line option to the kernel to unplug xen emulated disks and nics. Set the default value of xen_emul_unplug depending on whether or not the Xen PV frontends and the Xen platform PCI driver have been compiled for this kernel (modules or built-in are both OK). The user can specify xen_emul_unplug=ignore to enable PV drivers on HVM even if the host platform doesn't support unplug. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>	2010-07-26 23:13:25 -07:00
Stefano Stabellini	409771d258	x86: Use xen_vcpuop_clockevent, xen_clocksource and xen wallclock. Use xen_vcpuop_clockevent instead of hpet and APIC timers as main clockevent device on all vcpus, use the xen wallclock time as wallclock instead of rtc and use xen_clocksource as clocksource. The pv clock algorithm needs to work correctly for the xen_clocksource and xen wallclock to be usable, only modern Xen versions offer a reliable pv clock in HVM guests (XENFEAT_hvm_safe_pvclock). Using the hpet as clocksource means a VMEXIT every time we read/write to the hpet mmio addresses, pvclock give us a better rating without VMEXITs. Same goes for the xen wallclock and xen_vcpuop_clockevent Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Don Dutile <ddutile@redhat.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>	2010-07-26 23:13:25 -07:00
Linus Torvalds	1a041a23da	Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: Do not try to disable hpet if it hasn't been initialized before x86, i8259: Only register sysdev if we have a real 8259 PIC	2010-07-26 16:02:07 -07:00
Linus Torvalds	ee13cbdec4	Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq: [CPUFREQ] powernow-k8: Limit Pstate transition latency check [CPUFREQ] Fix PCC driver error path [CPUFREQ] fix double freeing in error path of pcc-cpufreq [CPUFREQ] pcc driver should check for pcch method before calling _OSC [CPUFREQ] fix memory leak in cpufreq_add_dev [CPUFREQ] revert "[CPUFREQ] remove rwsem lock from CPUFREQ_GOV_STOP call (second call site)"	2010-07-26 15:35:53 -07:00
Borislav Petkov	3581ced3b6	[CPUFREQ] powernow-k8: Limit Pstate transition latency check The Pstate transition latency check was added for broken F10h BIOSen which wrongly contain a value of 0 for transition and bus master latency. Fam11h and later, however, (will) have similar transition latency so extend that behavior for them too. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Signed-off-by: Dave Jones <davej@redhat.com>	2010-07-26 15:25:34 -04:00
Matthew Garrett	179ee43465	[CPUFREQ] Fix PCC driver error path The PCC cpufreq driver unmaps the mailbox address range if any CPUs fail to initialise, but doesn't do anything to remove the registered CPUs from the cpufreq core resulting in failures further down the line. We're better off simply returning a failure - the cpufreq core will unregister us cleanly if we end up with no successfully registered CPUs. Tidy up the failure path and also add a sanity check to ensure that the firmware gives us a realistic frequency - the core deals badly with that being set to 0. Signed-off-by: Matthew Garrett <mjg@redhat.com> Cc: Naga Chumbalkar <nagananda.chumbalkar@hp.com> Signed-off-by: Dave Jones <davej@redhat.com>	2010-07-26 15:25:34 -04:00
Daniel J Blueman	3847d223f2	[CPUFREQ] fix double freeing in error path of pcc-cpufreq Prevent double freeing on error path. Signed-off-by: Daniel J Blueman <daniel.blueman@gmail.com> Signed-off-by: Dave Jones <davej@redhat.com>	2010-07-26 15:25:34 -04:00
Matthew Garrett	47f8bcf362	[CPUFREQ] pcc driver should check for pcch method before calling _OSC The pcc specification documents an _OSC method that's incompatible with the one defined as part of the ACPI spec. This shouldn't be a problem as both are supposed to be guarded with a UUID. Unfortunately approximately nobody (including HP, who wrote this spec) properly check the UUID on entry to the _OSC call. Right now this could result in surprising behaviour if the pcc driver performs an _OSC call on a machine that doesn't implement the pcc specification. Check whether the PCCH method exists first in order to reduce this probability. Signed-off-by: Matthew Garrett <mjg@redhat.com> Cc: Naga Chumbalkar <nagananda.chumbalkar@hp.com> Signed-off-by: Dave Jones <davej@redhat.com>	2010-07-26 15:25:34 -04:00
Linus Torvalds	20ba5efb9c	Merge branch 'kvm-updates/2.6.35' of git://git.kernel.org/pub/scm/virt/kvm/kvm * 'kvm-updates/2.6.35' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: Use kmalloc() instead of vmalloc() for KVM_[GS]ET_MSR KVM: MMU: fix conflict access permissions in direct sp	2010-07-26 08:18:18 -07:00
Len Brown	0e1cf38889	Merge branch 'bugzilla-16396' into release	2010-07-24 23:26:22 -04:00
Rafael J. Wysocki	72ad5d77fb	ACPI / Sleep: Allow the NVS saving to be skipped during suspend to RAM Commit `2a6b69765a` (ACPI: Store NVS state even when entering suspend to RAM) caused the ACPI suspend code save the NVS area during suspend and restore it during resume unconditionally, although it is known that some systems need to use acpi_sleep=s4_nonvs for hibernation to work. To allow the affected systems to avoid saving and restoring the NVS area during suspend to RAM and resume, introduce kernel command line option acpi_sleep=nonvs and make acpi_sleep=s4_nonvs work as its alias temporarily (add acpi_sleep=s4_nonvs to the feature removal file). Addresses https://bugzilla.kernel.org/show_bug.cgi?id=16396 . Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Reported-and-tested-by: tomas m <tmezzadra@gmail.com> Signed-off-by: Len Brown <len.brown@intel.com>	2010-07-24 23:26:09 -04:00
Stefano Stabellini	ff4878089e	x86: Do not try to disable hpet if it hasn't been initialized before hpet_disable is called unconditionally on machine reboot if hpet support is compiled in the kernel. hpet_disable only checks if the machine is hpet capable but doesn't make sure that hpet has been initialized. [ tglx: Made it a one liner and removed the redundant hpet_address check ] Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Venkatesh Pallipadi <venki@google.com> LKML-Reference: <alpine.DEB.2.00.1007211726240.22235@kaball-desktop> Cc: stable@kernel.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2010-07-23 12:53:00 +02:00
Avi Kivity	7a73c0283d	KVM: Use kmalloc() instead of vmalloc() for KVM_[GS]ET_MSR We don't need more than a page, and vmalloc() is slower (much slower recently due to a regression). Signed-off-by: Avi Kivity <avi@redhat.com>	2010-07-23 09:07:14 +03:00
Xiao Guangrong	6aa0b9dec5	KVM: MMU: fix conflict access permissions in direct sp In no-direct mapping, we mark sp is 'direct' when we mapping the guest's larger page, but its access is encoded form upper page-struct entire not include the last mapping, it will cause access conflict. For example, have this mapping: [W] / PDE1 -> \|---\| P[W] \| \| LPA \ PDE2 -> \|---\| [R] P have two children, PDE1 and PDE2, both PDE1 and PDE2 mapping the same lage page(LPA). The P's access is WR, PDE1's access is WR, PDE2's access is RO(just consider read-write permissions here) When guest access PDE1, we will create a direct sp for LPA, the sp's access is from P, is W, then we will mark the ptes is W in this sp. Then, guest access PDE2, we will find LPA's shadow page, is the same as PDE's, and mark the ptes is RO. So, if guest access PDE1, the incorrect #PF is occured. Fixed by encode the last mapping access into direct shadow page Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-07-23 09:07:04 +03:00
Stefano Stabellini	016b6f5fe8	xen: Add suspend/resume support for PV on HVM guests. Suspend/resume requires few different things on HVM: the suspend hypercall is different; we don't need to save/restore memory related settings; except the shared info page and the callback mechanism. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>	2010-07-22 16:46:21 -07:00
Sheng Yang	38e20b07ef	x86/xen: event channels delivery on HVM. Set the callback to receive evtchns from Xen, using the callback vector delivery mechanism. The traditional way for receiving event channel notifications from Xen is via the interrupts from the platform PCI device. The callback vector is a newer alternative that allow us to receive notifications on any vcpu and doesn't need any PCI support: we allocate a vector exclusively to receive events, in the vector handler we don't need to interact with the vlapic, therefore we avoid a VMEXIT. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>	2010-07-22 16:45:59 -07:00
Sheng Yang	bee6ab53e6	x86: early PV on HVM features initialization. Initialize basic pv on hvm features adding a new Xen HVM specific hypervisor_x86 structure. Don't try to initialize xen-kbdfront and xen-fbfront when running on HVM because the backends are not available. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>	2010-07-22 16:45:35 -07:00
Jeremy Fitzhardinge	18f19aa62a	xen: Add support for HVM hypercalls. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>	2010-07-22 16:45:31 -07:00
Len Brown	4a973f2495	Merge branch 'bugzilla-15886' into release	2010-07-22 18:18:28 -04:00
Len Brown	718be4aaf3	ACPI: skip checking BM_STS if the BIOS doesn't ask for it It turns out that there is a bit in the _CST for Intel FFH C3 that tells the OS if we should be checking BM_STS or not. Linux has been unconditionally checking BM_STS. If the chip-set is configured to enable BM_STS, it can retard or completely prevent entry into deep C-states -- as illustrated by turbostat: http://userweb.kernel.org/~lenb/acpi/utils/pmtools/turbostat/ ref: Intel Processor Vendor-Specific ACPI Interface Specification table 4 "_CST FFH GAS Field Encoding" Bit 1: Set to 1 if OSPM should use Bus Master avoidance for this C-state https://bugzilla.kernel.org/show_bug.cgi?id=15886 Signed-off-by: Len Brown <len.brown@intel.com>	2010-07-22 16:54:27 -04:00
Thomas Renninger	4c21adf26f	x86 cpufreq, perf: Make trace_power_frequency cpufreq driver independent and fix the broken case if a core's frequency depends on others. trace_power_frequency was only implemented in a rather ungeneric way in acpi-cpufreq driver's target() function only. -> Move the call to trace_power_frequency to cpufreq.c:cpufreq_notify_transition() where CPUFREQ_POSTCHANGE notifier is triggered. This will support power frequency tracing by all cpufreq drivers. trace_power_frequency did not trace frequency changes correctly when the userspace governor was used or when CPU cores' frequency depend on each other. -> Moving this into the CPUFREQ_POSTCHANGE notifier and pass the cpu which gets switched automatically fixes this. Robert Schoene provided some important fixes on top of my initial quick shot version which are integrated in this patch: - Forgot some changes in power_end trace (TP_printk/variable names) - Variable dummy in power_end must now be cpu_id - Use static 64 bit variable instead of unsigned int for cpu_id [akpm@linux-foundation.org: build fix] Signed-off-by: Thomas Renninger <trenn@suse.de> Cc: davej@codemonkey.org.uk Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: Dave Jones <davej@codemonkey.org.uk> Acked-by: Arjan van de Ven <arjan@infradead.org> Cc: Robert Schoene <robert.schoene@tu-dresden.de> Tested-by: Robert Schoene <robert.schoene@tu-dresden.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2010-07-22 12:08:27 +02:00
Brian Gerst	650fb4393d	x86-64: Simplify loading initial_gs Load initial_gs as two 32-bit values instead of splitting a 64-bit value. Signed-off-by: Brian Gerst <brgerst@gmail.com> LKML-Reference: <1279371808-24804-3-git-send-email-brgerst@gmail.com> Reviewed-by: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2010-07-21 21:23:51 -07:00
Brian Gerst	cfaa71ee97	x86: Use symbolic MSR names Use symbolic MSR names instead of hardcoding the MSR index. Signed-off-by: Brian Gerst <brgerst@gmail.com> LKML-Reference: <1279371808-24804-2-git-send-email-brgerst@gmail.com> Reviewed-by: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2010-07-21 21:23:40 -07:00
Brian Gerst	8c06585d64	x86: Remove redundant K6 MSRs MSR_K6_EFER is unused, and MSR_K6_STAR is redundant with MSR_STAR. Signed-off-by: Brian Gerst <brgerst@gmail.com> LKML-Reference: <1279371808-24804-1-git-send-email-brgerst@gmail.com> Reviewed-by: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2010-07-21 21:23:05 -07:00
Roland McGrath	0327559151	x86: auditsyscall: fix fastpath return value after reschedule In the CONFIG_AUDITSYSCALL fast-path for x86 64-bit system calls, we can pass a bad return value and/or error indication for the system call to audit_syscall_exit(). This happens when TIF_NEED_RESCHED was set as the system call returned, so we went out to schedule() and came back to the exit-audit fast-path. The fix is to reload the user return value register from the pt_regs before using it for audit_syscall_exit(). Both the 32-bit kernel's fast path and the 64-bit kernel's 32-bit system call fast paths work slightly differently, so that they always leave the fast path entirely to reschedule and don't return there, so they don't have the analogous bugs. Reported-by: Alexander Viro <aviro@redhat.com> Signed-off-by: Roland McGrath <roland@redhat.com>	2010-07-21 17:44:12 -07:00
H. Peter Anvin	1cff92d8fd	x86, xsave: Make xstate_enable_boot_cpu() __init, protect on CPU 0 xstate_enable_boot_cpu() is, as the name implies, only used on the boot CPU; furthermore, it invokes alloc_bootmem(), which is __init; hence it needs to be tagged __init rather than __cpuinit. Furthermore, it is not safe in the long run to rely on CPU 0 only coming online during the early boot -- at some point we're going to support offlining (and re-onlining) the boot CPU, and at that point we must not call xstate_enable_boot_cpu() again. The code is a fair bit more obscure than one would like, because the __ref overrides aren't quite powerful enough. Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Acked-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Robert Richter <robert.richter@amd.com> LKML-Reference: <4C476236.1020302@zytor.com>	2010-07-21 15:33:54 -07:00
Robert Richter	4995b9dba9	x86, xsave: Add __init attribute to setup_xstate_features() This is called only from initialization code. Signed-off-by: Robert Richter <robert.richter@amd.com> LKML-Reference: <1279731838-1522-6-git-send-email-robert.richter@amd.com> Acked-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2010-07-21 14:06:05 -07:00
Robert Richter	45c2d7f462	x86, xsave: Make init_xstate_buf static The pointer is only used in xsave.c. Making it static. Signed-off-by: Robert Richter <robert.richter@amd.com> LKML-Reference: <1279731838-1522-5-git-send-email-robert.richter@amd.com> Acked-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2010-07-21 14:06:05 -07:00
Robert Richter	ee813d53a8	x86, xsave: Check cpuid level for XSTATE_CPUID (0x0d) The patch introduces the XSTATE_CPUID macro and adds a check that tests if XSTATE_CPUID exists. Signed-off-by: Robert Richter <robert.richter@amd.com> LKML-Reference: <1279731838-1522-4-git-send-email-robert.richter@amd.com> Acked-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2010-07-21 14:06:04 -07:00
Robert Richter	97e80a70db	x86, xsave: Introduce xstate enable functions The patch renames xsave_cntxt_init() and __xsave_init() into xstate_enable_boot_cpu() and xstate_enable() as this names are more meaningful. It also removes the duplicate xcr setup for the boot cpu. Signed-off-by: Robert Richter <robert.richter@amd.com> LKML-Reference: <1279731838-1522-3-git-send-email-robert.richter@amd.com> Acked-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2010-07-21 14:06:04 -07:00
Robert Richter	0e49bf66d2	x86, xsave: Separate fpu and xsave initialization As xsave also supports other than fpu features, it should be initialized independently of the fpu. This patch moves this out of fpu initialization. There is also a lot of cross referencing between fpu and xsave code. This patch reduces this by making xsave_cntxt_init() and init_thread_xstate() static functions. The patch moves the cpu_has_xsave check at the beginning of xsave_init(). All other checks may removed then. Signed-off-by: Robert Richter <robert.richter@amd.com> LKML-Reference: <1279731838-1522-2-git-send-email-robert.richter@amd.com> Acked-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2010-07-21 14:06:04 -07:00
Borislav Petkov	3f8afb77cd	x86, tlb: Clean up and correct used type smp_processor_id() returns an int and not an unsigned long. Also, since the function is small enough, there's no need for a local variable caching its value. No functionality change, just cleanup. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> LKML-Reference: <20100721124705.GA674@aftab> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-07-21 21:48:15 +02:00
Ingo Molnar	9dcdbf7a33	Merge branch 'linus' into perf/core Merge reason: Pick up the latest perf fixes. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-07-21 21:43:06 +02:00
Linus Torvalds	a4ce96ac35	Fix up trivial spelling errors ('taht' -> 'that') Pointed out by Lucas who found the new one in a comment in setup_percpu.c. And then I fixed the others that I grepped for. Reported-by: Lucas <canolucas@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-07-21 09:25:42 -07:00
Michel Lespinasse	b4bcb4c28c	x86, rwsem: Minor cleanups Clarified few comments and made initialization of %edx/%rdx more uniform accross __down_write_nested, __up_read and __up_write functions. Signed-off-by: Michel Lespinasse <walken@google.com> LKML-Reference: <201007202219.o6KMJkiA021048@imap1.linux-foundation.org> Acked-by: David Howells <dhowells@redhat.com> Cc: Mike Waychison <mikew@google.com> Cc: Suleiman Souhlal <suleiman@google.com> Cc: Ying Han <yinghan@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2010-07-20 17:41:14 -07:00
Michel Lespinasse	a751bd858b	x86, rwsem: Stay on fast path when count > 0 in __up_write() When count > 0 there is no need to take the call_rwsem_wake path. If we did take that path, it would just return without doing anything due to the active count not being zero. Signed-off-by: Michel Lespinasse <walken@google.com> LKML-Reference: <201007202219.o6KMJj9x021042@imap1.linux-foundation.org> Acked-by: David Howells <dhowells@redhat.com> Cc: Mike Waychison <mikew@google.com> Cc: Suleiman Souhlal <suleiman@google.com> Cc: Ying Han <yinghan@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2010-07-20 17:41:00 -07:00
Florian Zumbiehl	468c30f2bb	x86, iomap: Fix wrong page aligned size calculation in ioremapping code x86 early_iounmap(): fix off-by-one error in page alignment of allocation size for sizes where size%PAGE_SIZE==1. Signed-off-by: Florian Zumbiehl <florz@florz.de> LKML-Reference: <201007202219.o6KMJlES021058@imap1.linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2010-07-20 16:56:35 -07:00
Andres Salomon	92851e2fca	x86, mm: Create symbolic index into address_markers array Without this, adding entries into the address_markers array means adding more and more of an #ifdef maze in pt_dump_init(). By using indices, we can keep it a bit saner. Signed-off-by: Andres Salomon <dilinger@queued.net> LKML-Reference: <201007202219.o6KMJkUs021052@imap1.linux-foundation.org> Cc: Jordan Crouse <jordan.crouse@amd.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2010-07-20 16:56:19 -07:00
Yinghai Lu	9aebbdb637	x86, numa: fix boot without RAM on node0 again Commit `e534c7c5f8` ("numa: x86_64: use generic percpu var numa_node_id() implementation") broke numa systems that don't have ram on node0 when MEMORY_HOTPLUG is enabled, because cpu_up() will call cpu_to_node() before per_cpu(numa_node) is setup for APs. When Node0 doesn't have RAM, on x86, cpus already round it to nearest node with RAM in x86_cpu_to_node_map. and per_cpu(numa_node) is not set up until in c_init for APs. When later cpu_up() calling cpu_to_node() will get 0 again, and make it online even there is no RAM on node0. so later all APs can not booted up, and later will have panic. [ 1.611101] On node 0 totalpages: 0 ......... [ 2.608558] On node 0 totalpages: 0 [ 2.612065] Brought up 1 CPUs [ 2.615199] Total of 1 processors activated (3990.31 BogoMIPS). ... 93.225341] calling loop_init+0x0/0x1a4 @ 1 [ 93.229314] PERCPU: allocation failed, size=80 align=8, failed to populate [ 93.246539] Pid: 1, comm: swapper Tainted: G W 2.6.35-rc4-tip-yh-04371-gd64e6c4-dirty #354 [ 93.264621] Call Trace: [ 93.266533] [<ffffffff81125e43>] pcpu_alloc+0x83a/0x8e7 [ 93.270710] [<ffffffff81125f15>] __alloc_percpu+0x10/0x12 [ 93.285849] [<ffffffff8140786c>] alloc_disk_node+0x94/0x16d [ 93.291811] [<ffffffff81407956>] alloc_disk+0x11/0x13 [ 93.306157] [<ffffffff81503e51>] loop_alloc+0xa7/0x180 [ 93.310538] [<ffffffff8277ef48>] loop_init+0x9b/0x1a4 [ 93.324909] [<ffffffff8277eead>] ? loop_init+0x0/0x1a4 [ 93.329650] [<ffffffff810001f2>] do_one_initcall+0x57/0x136 [ 93.345197] [<ffffffff827486d0>] kernel_init+0x184/0x20e [ 93.348146] [<ffffffff81034954>] kernel_thread_helper+0x4/0x10 [ 93.365194] [<ffffffff81c7cc3c>] ? restore_args+0x0/0x30 [ 93.369305] [<ffffffff8274854c>] ? kernel_init+0x0/0x20e [ 93.386011] [<ffffffff81034950>] ? kernel_thread_helper+0x0/0x10 [ 93.392047] loop: out of memory ... Try to assign per_cpu(numa_node) early [akpm@linux-foundation.org: tidy up code comment] Signed-off-by: Yinghai <yinghai@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Tejun Heo <tj@kernel.org> Cc: Denys Vlasenko <vda.linux@googlemail.com> Acked-by: Lee Schermerhorn <lee.schermerhorn@hp.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-07-20 16:25:40 -07:00
Robert Richter	82d4150cec	x86, xsave: Move boot cpu initialization to xsave_init() This patch moves boot cpu initialization to xsave_init(). Now all cpus are initialized in one single function. Signed-off-by: Robert Richter <robert.richter@amd.com> LKML-Reference: <1279651857-24639-5-git-send-email-robert.richter@amd.com> Acked-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2010-07-20 16:21:44 -07:00
Robert Richter	db10db48b2	x86, xsave: 32/64 bit boot cpu check unification in initialization Boot cpu id is always 0, thus simplifying and unifying boot cpu check. boot_cpu_id is there for historical reasons and was renamed to boot_cpu_physical_apicid in patch: `c70dcb7` x86: change boot_cpu_id to boot_cpu_physical_apicid However, there are some remaining occurrences of boot_cpu_id that are never touched in the kernel and thus its value is always 0. Signed-off-by: Robert Richter <robert.richter@amd.com> LKML-Reference: <1279651857-24639-3-git-send-email-robert.richter@amd.com> Acked-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2010-07-20 16:21:38 -07:00
Robert Richter	7aa2b5f8ec	x86, xsave: Do not include asm/i387.h in asm/xsave.h There are no dependencies to asm/i387.h. Instead, if including only xsave.h the following error occurs: .../arch/x86/include/asm/i387.h:110: error: ‘XSTATE_FP’ undeclared (first use in this function) .../arch/x86/include/asm/i387.h:110: error: (Each undeclared identifier is reported only once .../arch/x86/include/asm/i387.h:110: error: for each function it appears in.) This patch fixes this. Signed-off-by: Robert Richter <robert.richter@amd.com> LKML-Reference: <1279651857-24639-2-git-send-email-robert.richter@amd.com> Acked-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2010-07-20 16:21:38 -07:00
Andi Kleen	fa10ba64ac	x86, gcc-4.6: Fix set but not read variables Just some dead code, no real bugs. Found by gcc 4.6 -Wall Signed-off-by: Andi Kleen <ak@linux.intel.com> LKML-Reference: <201007202219.o6KMJnQ0021072@imap1.linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2010-07-20 15:38:30 -07:00
Andi Kleen	5f755293ca	x86, gcc-4.6: Avoid unused by set variables in rdmsr Avoids quite a lot of warnings with a gcc 4.6 -Wall build because this happens in a commonly used header file (apic.h) Signed-off-by: Andi Kleen <ak@linux.intel.com> LKML-Reference: <201007202219.o6KMJme6021066@imap1.linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2010-07-20 15:38:18 -07:00
Adam Lackorzynski	087b255a2b	x86, i8259: Only register sysdev if we have a real 8259 PIC My platform makes use of the null_legacy_pic choice and oopses when doing a shutdown as the shutdown code goes through all the registered sysdevs and calls their shutdown method which in my case poke on a non-existing i8259. Imho the i8259 specific sysdev should only be registered if the i8259 is actually there. Do not register the sysdev function when the null_legacy_pic is used so that the i8259 resume, suspend and shutdown functions are not called. Signed-off-by: Adam Lackorzynski <adam@os.inf.tu-dresden.de> LKML-Reference: <201007202218.o6KMIJ3m020955@imap1.linux-foundation.org> Cc: Jacob Pan <jacob.jun.pan@intel.com> Cc: <stable@kernel.org> 2.6.34 Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2010-07-20 15:27:33 -07:00
Jeremy Fitzhardinge	f89e048e76	xen: make sure pages are really part of domain before freeing Scan the set of pages we're freeing and make sure they're actually owned by the domain before freeing. This generally won't happen on a domU (since Xen gives us contigious memory), but it could happen if there are some hardware mappings passed through. We only bother going up to the highest page Xen actually claimed to give us, since there's definitely nothing of ours above that. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>	2010-07-20 12:57:46 -07:00
Miroslav Rezanina	093d7b4639	xen: release unused free memory Scan an e820 table and release any memory which lies between e820 entries, as it won't be used and would just be wasted. At present this is just to release any memory beyond the end of the e820 map, but it will also deal with holes being punched in the map. Derived from patch by Miroslav Rezanina <mrezanin@redhat.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>	2010-07-20 12:51:22 -07:00
H. Peter Anvin	2decb194e6	x86, cpu: Split addon_cpuid_features.c addon_cpuid_features.c contains exactly two almost completely unrelated functions, plus has a long and very generic name. Split it into two files, scattered.c for the scattered feature flags, and topology.c for the topology information. Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> LKML-Reference: <tip-*@git.kernel.org>	2010-07-19 19:02:41 -07:00
H. Peter Anvin	278bc5f6ab	x86, cpu: Clean up formatting in cpufeature.h, remove override Clean up the formatting in cpufeature.h, and remove an unnecessary name override. Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> LKML-Reference: <tip-*@git.kernel.org>	2010-07-19 19:02:35 -07:00
Suresh Siddha	6bad06b768	x86, xsave: Use xsaveopt in context-switch path when supported xsaveopt is a more optimized form of xsave specifically designed for the context switch usage. xsaveopt doesn't save the state that's not modified from the prior xrstor. And if a specific feature state gets modified to the init state, then xsaveopt just updates the header bit in the xsave memory layout without updating the corresponding memory layout. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> LKML-Reference: <20100719230205.604014179@sbs-t61.sc.intel.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2010-07-19 17:52:24 -07:00
Suresh Siddha	29104e101d	x86, xsave: Sync xsave memory layout with its header for user handling With xsaveopt, if a processor implementation discern that a processor state component is in its initialized state it may modify the corresponding bit in the xsave_hdr.xstate_bv as '0', with out modifying the corresponding memory layout. Hence wHile presenting the xstate information to the user, we always ensure that the memory layout of a feature will be in the init state if the corresponding header bit is zero. This ensures the consistency and avoids the condition of the user seeing some some stale state in the memory layout during signal handling, debugging etc. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> LKML-Reference: <20100719230205.351459480@sbs-t61.sc.intel.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2010-07-19 17:51:30 -07:00
Suresh Siddha	a1488f8bf4	x86, xsave: Track the offset, size of state in the xsave layout Subleaves of the cpuid vector 0xd provides the offset and size of different feature state that are managed by the xsave/xrstor. Track this for the upcoming usage during signal handling. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> LKML-Reference: <20100719230205.262987929@sbs-t61.sc.intel.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2010-07-19 17:51:17 -07:00
Suresh Siddha	5734f62b66	x86, cpu: Enumerate xsaveopt Enumerate the xsaveopt feature. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> LKML-Reference: <20100719230205.604014179@sbs-t61.sc.intel.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2010-07-19 16:50:40 -07:00
Suresh Siddha	40e1d7a4ff	x86, cpu: Add xsaveopt cpufeature Add cpu feature bit support for the XSAVEOPT instruction. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> LKML-Reference: <20100719230205.523204988@sbs-t61.sc.intel.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2010-07-19 16:48:06 -07:00
Suresh Siddha	edb18f8ab0	x86, cpu: Make init_scattered_cpuid_features() consider cpuid subleaves Some cpuid features (like xsaveopt) are enumerated using cpuid subleaves. Extend init_scattered_cpuid_features() to take subleaf into account. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> LKML-Reference: <20100719230205.439900717@sbs-t61.sc.intel.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2010-07-19 16:47:59 -07:00
Linus Torvalds	d0c6f62584	Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, pci, mrst: Add extra sanity check in walking the PCI extended cap chain x86: Fix x2apic preenabled system with kexec x86: Force HPET readback_cmp for all ATI chipsets	2010-07-19 13:19:32 -07:00
Kulikov Vasiliy	6c54aabd5e	x86/amd-iommu: Use for_each_pci_dev() Use for_each_pci_dev() to simplify the code. Signed-off-by: Kulikov Vasiliy <segooon@gmail.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2010-07-19 15:41:13 +02:00
Pavel Machek	a2531293db	update email address pavel@suse.cz no longer works, replace it with working address. Signed-off-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2010-07-19 10:56:54 +02:00
Dave Chinner	7f8275d0d6	mm: add context argument to shrinker callback The current shrinker implementation requires the registered callback to have global state to work from. This makes it difficult to shrink caches that are not global (e.g. per-filesystem caches). Pass the shrinker structure to the callback so that users can embed the shrinker structure in the context the shrinker needs to operate on and get back to it in the callback via container_of(). Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>	2010-07-19 14:56:17 +10:00
Linus Torvalds	a9f7f2e74a	Merge branch 'x86/kprobes' of git://git.kernel.org/pub/scm/linux/kernel/git/frob/linux-2.6-roland * 'x86/kprobes' of git://git.kernel.org/pub/scm/linux/kernel/git/frob/linux-2.6-roland: x86: kprobes: fix swapped segment registers in kretprobe	2010-07-18 15:13:30 -07:00
Roland McGrath	a197479848	x86: kprobes: fix swapped segment registers in kretprobe In commit `f007ea26`, the order of the %es and %ds segment registers got accidentally swapped, so synthesized 'struct pt_regs' frames have the two values inverted. It's almost sure that these values never matter, and that they also never differ. But wrong is wrong. Signed-off-by: Roland McGrath <roland@redhat.com>	2010-07-18 15:05:34 -07:00
Jacob Pan	f82c3d71d6	x86, pci, mrst: Add extra sanity check in walking the PCI extended cap chain The fixed bar capability structure is searched in PCI extended configuration space. We need to make sure there is a valid capability ID to begin with otherwise, the search code may stuck in a infinite loop which results in boot hang. This patch adds additional check for cap ID 0, which is also invalid, and indicates end of chain. End of chain is supposed to have all fields zero, but that doesn't seem to always be the case in the field. Suggested-by: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> LKML-Reference: <1279306706-27087-1-git-send-email-jacob.jun.pan@linux.intel.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2010-07-16 16:52:15 -07:00
Yinghai Lu	fd19dce7ac	x86: Fix x2apic preenabled system with kexec Found one x2apic system kexec loop test failed when CONFIG_NMI_WATCHDOG=y (old) or CONFIG_LOCKUP_DETECTOR=y (current tip) first kernel can kexec second kernel, but second kernel can not kexec third one. it can be duplicated on another system with BIOS preenabled x2apic. First kernel can not kexec second kernel. It turns out, when kernel boot with pre-enabled x2apic, it will not execute disable_local_APIC on shutdown path. when init_apic_mappings() is called in setup_arch, it will skip setting of apic_phys when x2apic_mode is set. ( x2apic_mode is much early check_x2apic()) Then later, disable_local_APIC() will bail out early because !apic_phys. So check !x2apic_mode in x2apic_mode in disable_local_APIC with !apic_phys. another solution could be updating init_apic_mappings() to set apic_phys even for preenabled x2apic system. Actually even for x2apic system, that lapic address is mapped already in early stage. BTW: is there any x2apic preenabled system with apicid of boot cpu > 255? Signed-off-by: Yinghai Lu <yinghai@kernel.org> LKML-Reference: <4C3EB22B.3000701@kernel.org> Acked-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: stable@kernel.org Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2010-07-16 16:49:41 -07:00
Bjorn Helgaas	58c84eda07	PCI: fall back to original BIOS BAR addresses If we fail to assign resources to a PCI BAR, this patch makes us try the original address from BIOS rather than leaving it disabled. Linux tries to make sure all PCI device BARs are inside the upstream PCI host bridge or P2P bridge apertures, reassigning BARs if necessary. Windows does similar reassignment. Before this patch, if we could not move a BAR into an aperture, we left the resource unassigned, i.e., at address zero. Windows leaves such BARs at the original BIOS addresses, and this patch makes Linux do the same. This is a bit ugly because we disable the resource long before we try to reassign it, so we have to keep track of the BIOS BAR address somewhere. For lack of a better place, I put it in the struct pci_dev. I think it would be cleaner to attempt the assignment immediately when the claim fails, so we could easily remember the original address. But we currently claim motherboard resources in the middle, after attempting to claim PCI resources and before assigning new PCI resources, and changing that is a fairly big job. Addresses https://bugzilla.kernel.org/show_bug.cgi?id=16263 Reported-by: Andrew <nitr0@seti.kr.ua> Tested-by: Andrew <nitr0@seti.kr.ua> Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2010-07-16 11:39:48 -07:00
Joe Perches	b2691085d1	x86: Clean up arch/x86/kernel/cpu/mtrr/cleanup.c: use ";" not "," to terminate statements Also needed if pr_<level> becomes a bit more space efficient. Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> LKML-Reference: <1277768808.29157.280.camel@Joe-Laptop.home> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-07-15 18:21:22 +02:00
Thomas Gleixner	08be97962b	x86: Force HPET readback_cmp for all ATI chipsets commit `30a564be` (x86, hpet: Restrict read back to affected ATI chipset) restricted the workaround for the HPET bug to SMX00 chipsets. This was reasonable as those were the only ones against which we ever got a bug report. Stephan Wolf reported now that this patch breaks his IXP400 based machine. Though it's confirmed to work on other IXP400 based systems. To error out on the safe side, we force the HPET readback workaround for all ATI SMbus class chipsets. Reported-by: Stephan Wolf <stephan@letzte-bankreihe.de> LKML-Reference: <alpine.LFD.2.00.1007142134140.3321@localhost.localdomain> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Stephan Wolf <stephan@letzte-bankreihe.de> Acked-by: Borislav Petkov <borislav.petkov@amd.com>	2010-07-15 17:10:02 +02:00
Yinghai Lu	70b0d22d58	x86, setup: Only set early_serial_base after port is initialized putchar is using early_serial_base to check if port is initialized. So we only assign it after early_serial_init() is called, in case we need use VGA to debug early serial console. Also add display for port addr and baud. -v2: update to current tip Acked-by: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: Yinghai Lu <yinghai@kernel.org> LKML-Reference: <4C3E0171.6050008@kernel.org> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2010-07-14 12:07:16 -07:00
Linus Torvalds	bcefc8d0d3	Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: input: i8042 - add runtime check in x86's i8042_platform_init Revert "Input: fixup X86_MRST selects" Revert "Input: do not force selecting i8042 on Moorestown" x86, mrst: Add i8042_detect API for Moorestwon platform x86: Add i8042 pre-detection hook to x86_platform_ops x86, platform: Export x86_platform to modules	2010-07-13 17:31:11 -07:00
Linus Torvalds	177dd7e1eb	Merge branch 'kvm-updates/2.6.35' of git://git.kernel.org/pub/scm/virt/kvm/kvm * 'kvm-updates/2.6.35' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: MMU: flush remote tlbs when overwriting spte with different pfn KVM: VMX: Fix host MSR_KERNEL_GS_BASE corruption	2010-07-13 17:30:49 -07:00
H. Peter Anvin	3b770a2128	x86, alternatives: BUG on encountering an invalid CPU feature number Make the alternatives-patching code BUG on encountering an invalid CPU feature number. Should have done this a long time ago. Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: Yinghai Lu <yinhai@kernel.org> Cc: Suresh Siddha <suresh.b.siddha@intel.com> LKML-Reference: <tip-df378ccfc4dd04e263426ad805516915874774aa@git.kernel.org>	2010-07-13 14:57:50 -07:00
H. Peter Anvin	df378ccfc4	x86, alternatives: Fix one more open-coded 8-bit alternative number Fix a missing case of an 8-bit alternative number, buried inside an assembly macro. Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Reported-by: Yinghai Lu <yinhai@kernel.org> Cc: Suresh Siddha <suresh.b.siddha@intel.com> LKML-Reference: <4C3BDDA3.2060900@kernel.org>	2010-07-13 14:56:16 -07:00
Yinghai Lu	ce0aa5dd20	x86, setup: Make the setup code also accept console=uart8250 Make the boot code also accept the console=uart8250,io,0x2f8,115200n form of early console. Also add back simple_guess_base(), otherwise those simple_strtoull(,,0) are not going to work. Signed-off-by: Yinghai Lu <yinghai@kernel.org> LKML-Reference: <4C3CCE05.4090505@kernel.org> Acked-by: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2010-07-13 14:08:12 -07:00
Pekka Enberg	fa97bdf927	x86, setup: Early-boot serial I/O support This patch adds serial I/O support to the real-mode setup (very early boot) printf(). It's useful for debugging boot code when running Linux under KVM, for example. The actual code was lifted from early printk. Cc: Cyrill Gorcunov <gorcunov@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi> LKML-Reference: <1278835617-11368-1-git-send-email-penberg@cs.helsinki.fi> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2010-07-12 14:46:00 -07:00
Xiao Guangrong	91546356d0	KVM: MMU: flush remote tlbs when overwriting spte with different pfn After remove a rmap, we should flush all vcpu's tlb Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-07-12 14:05:56 -03:00
Kenji Kaneshige	35be1b716a	x86, ioremap: Fix normal ram range check Check for normal RAM in x86 ioremap() code seems to not work for the last page frame in the specified physical address range. Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com> LKML-Reference: <4C1AE6CD.1080704@jp.fujitsu.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2010-07-09 11:42:11 -07:00
Kenji Kaneshige	ffa71f33a8	x86, ioremap: Fix incorrect physical address handling in PAE mode Current x86 ioremap() doesn't handle physical address higher than 32-bit properly in X86_32 PAE mode. When physical address higher than 32-bit is passed to ioremap(), higher 32-bits in physical address is cleared wrongly. Due to this bug, ioremap() can map wrong address to linear address space. In my case, 64-bit MMIO region was assigned to a PCI device (ioat device) on my system. Because of the ioremap()'s bug, wrong physical address (instead of MMIO region) was mapped to linear address space. Because of this, loading ioatdma driver caused unexpected behavior (kernel panic, kernel hangup, ...). Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com> LKML-Reference: <4C1AE680.7090408@jp.fujitsu.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2010-07-09 11:42:03 -07:00
Ky Srinivasan	9279aa5506	x86: Export the symbol ms_hyperv This is needed so that the staging hyperv can properly access this symbol. Signed-off-by: K. Y. Srinivasan <ksrinivasan@novell.com> Acked-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-07-08 14:12:03 -07:00
Javier Martinez Canillas	5bbd4a336c	x86/apic/es7000_32: Remove unused variable In today's linux-next I got this compile warning: arch/x86/kernel/apic/es7000_32.c:132: warning: 'base' defined but not used Current patch solves the issue removing the unused variable. Signed-off-by: Javier Martinez Canillas <martinez.javier@gmail.com> Cc: Rakib Mullick <rakib.mullick@gmail.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Tejun Heo <tj@kernel.org> LKML-Reference: <1278546719.9020.4.camel@lenovo> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-07-08 09:14:37 +02:00
H. Peter Anvin	bdc802dcca	x86, cpu: Support the features flags in new CPUID leaf 7 Intel has defined CPUID leaf 7 as the next set of feature flags (see the AVX specification, version 007). Add support for this new feature flags word. Signed-off-by: H. Peter Anvin <hpa@zytor.com> LKML-Reference: <tip-*@vger.kernel.org>	2010-07-07 17:29:18 -07:00
Feng Tang	6d2cce6201	x86, mrst: Add i8042_detect API for Moorestwon platform It will just return 0 as there is no i8042 controller Signed-off-by: Feng Tang <feng.tang@intel.com> LKML-Reference: <1278342202-10973-3-git-send-email-feng.tang@intel.com> Acked-by: Dmitry Torokhov <dtor@mail.ru> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2010-07-07 17:05:06 -07:00
Feng Tang	c516ac5839	x86: Add i8042 pre-detection hook to x86_platform_ops Some x86 platforms like Intel MID platforms don't have i8042 controllers, and i8042 driver's probe to some legacy IO ports may hang the MID processor. With this hook, i8042 driver can runtime check and skip the probe when the pretection fail which also saves some probe time [ hpa note: this is currently a compile-time check, which breaks the i386 allyesconfig build. This patch series thus does fix a regression. ] Signed-off-by: Feng Tang <feng.tang@intel.com> LKML-Reference: <1278342202-10973-2-git-send-email-feng.tang@intel.com> Acked-by: Dmitry Torokhov <dtor@mail.ru> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2010-07-07 17:05:06 -07:00
H. Peter Anvin	72550b3ae5	x86, platform: Export x86_platform to modules Export x86_platform to modules in preparation of using it for i8042 discovery control. Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> LKML-Reference: <1278342202-10973-1-git-send-email-feng.tang@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Feng Tang <feng.tang@intel.com> Cc: Dmitry Torokhov <dtor@mail.ru>	2010-07-07 17:05:05 -07:00
H. Peter Anvin	83a7a2ad2a	x86, alternatives: Use 16-bit numbers for cpufeature index We already have cpufeature indicies above 255, so use a 16-bit number for the alternatives index. This consumes a padding field and so doesn't add any size, but it means that abusing the padding field to create assembly errors on overflow no longer works. We can retain the test simply by redirecting it to the .discard section, however. [ v3: updated to include open-coded locations ] Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> LKML-Reference: <tip-f88731e3068f9d1392ba71cc9f50f035d26a0d4f@git.kernel.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2010-07-07 10:36:28 -07:00
H. Peter Anvin	24da9c26f3	x86, cpu: Add CPU flags for F16C and RDRND Add support for the newly documented F16C (16-bit floating point conversions) and RDRND (RDRAND instruction) CPU feature flags. Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2010-07-07 10:15:12 -07:00
Linus Torvalds	47a716cf0c	Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: rbtree: Undo augmented trees performance damage and regression x86, Calgary: Limit the max PHB number to 256	2010-07-06 17:16:09 -07:00
Suresh Siddha	8e221b6db4	x86: Avoid unnecessary __clear_user() and xrstor in signal handling fxsave/xsave doesn't touch all the bytes in the memory layout used by these instructions. Specifically SW reserved (bytes 464..511) fields in the fxsave frame and the reserved fields in the xsave header. To present a clean context for the signal handling, just clear these fields instead of clearing the complete fxsave/xsave memory layout, when we dump these registers directly to the user signal frame. Also avoid the call to second xrstor (which inits the state not passed in the signal frame) in restore_user_xstate() if all the state has already been restored by the first xrstor. These changes improve the performance of signal handling(by ~3-5% as measured by the lat_sig). Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> LKML-Reference: <1277249017.2847.85.camel@sbs-t61.sc.intel.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2010-07-06 16:31:04 -07:00
Avi Kivity	da38f43859	KVM: VMX: Fix host MSR_KERNEL_GS_BASE corruption enter_lmode() and exit_lmode() modify the guest's EFER.LMA before calling vmx_set_efer(). However, the latter function depends on the value of EFER.LMA to determine whether MSR_KERNEL_GS_BASE needs reloading, via vmx_load_host_state(). With EFER.LMA changing under its feet, it took the wrong choice and corrupted userspace's %gs. This causes 32-on-64 host userspace to fault. Fix not touching EFER.LMA; instead ask vmx_set_efer() to change it. Signed-off-by: Avi Kivity <avi@redhat.com>	2010-07-06 11:41:31 +03:00
Peter Zijlstra	b945d6b255	rbtree: Undo augmented trees performance damage and regression Reimplement augmented RB-trees without sprinkling extra branches all over the RB-tree code (which lives in the scheduler hot path). This approach is 'borrowed' from Fabio's BFQ implementation and relies on traversing the rebalance path after the RB-tree-op to correct the heap property for insertion/removal and make up for the damage done by the tree rotations. For insertion the rebalance path is trivially that from the new node upwards to the root, for removal it is that from the deepest node in the path from the to be removed node that will still be around after the removal. [ This patch also fixes a video driver regression reported by Ali Gholami Rudi - the memtype->subtree_max_end was updated incorrectly. ] Acked-by: Suresh Siddha <suresh.b.siddha@intel.com> Acked-by: Venkatesh Pallipadi <venki@google.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Tested-by: Ali Gholami Rudi <ali@rudi.ir> Cc: Fabio Checconi <fabio@gandalf.sssup.it> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> LKML-Reference: <1275414172.27810.27961.camel@twins> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-07-05 14:43:50 +02:00
Cyrill Gorcunov	39ef13a4ac	perf, x86: P4 PMU -- redesign cache events To support cache events we have reserved the low 6 bits in hw_perf_event::config (which is a part of CCCR register configuration actually). These bits represent Replay Event mertic enumerated in enum P4_PEBS_METRIC. The caller should not care about which exact bits should be set and how -- the caller just chooses one P4_PEBS_METRIC entity and puts it into the config. The kernel will track it and set appropriate additional MSR registers (metrics) when needed. The reason for this redesign was the PEBS enable bit, which should not be set until DS (and PEBS sampling) support will be implemented properly. TODO ==== - PEBS sampling (note it's tricky and works with _one_ counter only so for HT machines it will be not that easy to handle both threads) - tracking of PEBS registers state, a user might need to turn PEBS off completely (ie no PEBS enable, no UOP_tag) but some other event may need it, such events clashes and should not run simultaneously, at moment we just don't support such events - eventually export user space bits in separate header which will allow user apps to configure raw events more conveniently. Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: Lin Ming <ming.m.lin@intel.com> Cc: Stephane Eranian <eranian@google.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <1278295769.9540.15.camel@minggr.sh.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-07-05 08:34:36 +02:00
Ingo Molnar	08f8ba0799	Merge commit 'v2.6.35-rc4' into perf/core Merge reason: Pick up the latest perf fixes Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-07-05 08:30:58 +02:00
Linus Torvalds	3f7d7b4bde	Merge branch 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: perf, x86: Fix incorrect branches event on AMD CPUs perf tools: Fix find tids routine by excluding "." and ".." x86: Send a SIGTRAP for user icebp traps	2010-07-04 20:20:53 -07:00
Vince Weaver	f287d332ce	perf, x86: Fix incorrect branches event on AMD CPUs While doing some performance counter validation tests on some assembly language programs I noticed that the "branches:u" count was very wrong on AMD machines. It looks like the wrong event was selected. Signed-off-by: Vince Weaver <vweaver1@eecs.utk.edu> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Robert Richter <robert.richter@amd.com> Cc: Borislav Petkov <borislav.petkov@amd.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: <stable@kernel.org> LKML-Reference: <alpine.DEB.2.00.1007011526010.23160@cl320.eecs.utk.edu> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-07-03 15:19:34 +02:00
Alexander Duyck	7475271004	x86: Drop CONFIG_MCORE2 check around setting of NET_IP_ALIGN This patch removes the CONFIG_MCORE2 check from around NET_IP_ALIGN. It is based on a suggestion from Andi Kleen. The assumption is that there are not any x86 cores where unaligned access is really slow, and this change would allow for a performance improvement to still exist on configurations that are not necessarily optimized for Core 2. Cc: Andi Kleen <ak@linux.intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: x86@kernel.org Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Acked-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-07-01 22:45:54 -07:00
Ingo Molnar	0a54cec0c2	Merge branch 'linus' into core/rcu Conflicts: fs/fs-writeback.c Merge reason: Resolve the conflict Note, i picked the version from Linus's tree, which effectively reverts the fs-writeback.c bits of: `b97181f`: fs: remove all rcu head initializations, except on_stack initializations As the upstream changes to this file changed this code heavily and the first attempt to resolve the conflict resulted in a non-booting kernel. It's safer to re-try this portion of the commit cleanly. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-07-01 09:31:25 +02:00
Darrick J. Wong	d596043d71	x86, Calgary: Limit the max PHB number to 256 The x3950 family can have as many as 256 PCI buses in a single system, so change the limits to the maximum. Since there can only be 256 PCI buses in one domain, we no longer need the BUG_ON check. Signed-off-by: Darrick J. Wong <djwong@us.ibm.com> LKML-Reference: <20100701004519.GQ15515@tux1.beaverton.ibm.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2010-06-30 22:41:42 -07:00
Alexander Duyck	ea812ca1b0	x86: Align skb w/ start of cacheline on newer core 2/Xeon Arch x86 architectures can handle unaligned accesses in hardware, and it has been shown that unaligned DMA accesses can be expensive on Nehalem architectures. As such we should overwrite NET_IP_ALIGN to resolve this issue. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: x86@kernel.org Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Acked-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-06-30 14:34:09 -07:00
Frederic Weisbecker	a1e80fafc9	x86: Send a SIGTRAP for user icebp traps Before we had a generic breakpoint layer, x86 used to send a sigtrap for any debug event that happened in userspace, except if it was caused by lazy dr7 switches. Currently we only send such signal for single step or breakpoint events. However, there are three other kind of debug exceptions: - debug register access detected: trigger an exception if the next instruction touches the debug registers. We don't use it. - task switch, but we don't use tss. - icebp/int01 trap. This instruction (0xf1) is undocumented and generates an int 1 exception. Unlike single step through TF flag, it doesn't set the single step origin of the exception in dr6. icebp then used to be reported in userspace using trap signals but this have been incidentally broken with the new breakpoint code. Reenable this. Since this is the only debug event that doesn't set anything in dr6, this is all we have to check. This fixes a regression in Wine where World Of Warcraft got broken as it uses this for software protection checks purposes. And probably other apps do. Reported-and-tested-by: Alexandre Julliard <julliard@winehq.org> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Prasad <prasad@linux.vnet.ibm.com> Cc: 2.6.33.x 2.6.34.x <stable@kernel.org>	2010-06-30 16:16:20 +02:00
Masami Hiramatsu	567a9fd867	kprobes/x86: Fix kprobes to skip prefixes correctly Fix resume_execution() and is_IF_modifier() to skip x86 instruction prefixes correctly by using x86 instruction attribute. Without this fix, resume_execution() can't handle instructions which have non-REX prefixes (REX prefixes are skipped). This will cause unexpected kernel panic by hitting bad address when a kprobe hits on two-byte ret (e.g. "repz ret" generated for Athlon/K8 optimization), because it just checks "repz" and can't recognize the "ret" instruction. These prefixes can be found easily with x86 instruction attribute. This patch introduces skip_prefixes() and uses it in resume_execution() and is_IF_modifier() to skip prefixes. Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> LKML-Reference: <4C298A6E.8070609@hitachi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-06-29 10:43:41 +02:00
Tejun Heo	b71ab8c202	workqueue: increase max_active of keventd and kill current_is_keventd() Define WQ_MAX_ACTIVE and create keventd with max_active set to half of it which means that keventd now can process upto WQ_MAX_ACTIVE / 2 - 1 works concurrently. Unless some combination can result in dependency loop longer than max_active, deadlock won't happen and thus it's unnecessary to check whether current_is_keventd() before trying to schedule a work. Kill current_is_keventd(). (Lockdep annotations are broken. We need lock_map_acquire_read_norecurse()) Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Tony Luck <tony.luck@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Oleg Nesterov <oleg@redhat.com>	2010-06-29 10:07:14 +02:00
Thomas Gleixner	f384c954c9	Merge branch 'linus' into perf/core Reason: Further changes conflict with upstream fixes Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2010-06-28 22:33:24 +02:00
Linus Torvalds	5904b3b81d	Merge branch 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: tracing: Fix undeclared ENOSYS in include/linux/tracepoint.h perf record: prevent kill(0, SIGTERM); perf session: Remove threads from tree on PERF_RECORD_EXIT perf/tracing: Fix regression of perf losing kprobe events perf_events: Fix Intel Westmere event constraints perf record: Don't call newt functions when not initialized	2010-06-28 12:24:43 -07:00
Linus Torvalds	ab8aadbda7	Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, Calgary: Increase max PHB number x86: Fix rebooting on Dell Precision WorkStation T7400 x86: Fix vsyscall on gcc 4.5 with -Os x86, pat: Proper init of memtype subtree_max_end um, hweight: Fix UML boot crash due to x86 optimized hweight x86, setup: Set ax register in boot vga query percpu, x86: Avoid warnings of unused variables in per cpu x86, irq: Rename gsi_end gsi_top, and fix off by one errors x86: use __ASSEMBLY__ rather than __ASSEMBLER__	2010-06-28 12:06:25 -07:00
Darrick J. Wong	499a00e92d	x86, Calgary: Increase max PHB number Newer systems (x3950M2) can have 48 PHBs per chassis and 8 chassis, so bump the limits up and provide an explanation of the requirements for each class. Signed-off-by: Darrick J. Wong <djwong@us.ibm.com> Acked-by: Muli Ben-Yehuda <muli@il.ibm.com> Cc: Corinna Schultz <cschultz@linux.vnet.ibm.com> Cc: <stable@kernel.org> LKML-Reference: <20100624212647.GI15515@tux1.beaverton.ibm.com> [ v2: Fixed build bug, added back PHBS_PER_CALGARY == 4 ] Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-06-25 16:14:58 +02:00
Frederic Weisbecker	f7809daf64	x86: Support for instruction breakpoints Instruction breakpoints need to have a specific length of 0 to be working. Bring this support but also take care the user is not trying to set an unsupported length, like a range breakpoint for example. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Prasad <prasad@linux.vnet.ibm.com> Cc: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Jason Wessel <jason.wessel@windriver.com>	2010-06-24 23:35:27 +02:00
Frederic Weisbecker	0c4519e825	x86: Set resume bit before returning from breakpoint exception Instruction breakpoints trigger before the instruction executes, and returning back from the breakpoint handler brings us again to the instruction that breakpointed. This naturally bring to a breakpoint recursion. To solve this, x86 has the Resume Bit trick. When the cpu flags have the RF flag set, the next instruction won't trigger any instruction breakpoint, and once this instruction is executed, RF is cleared back. This let's us jump back to the instruction that triggered the breakpoint without recursion. Use this when an instruction breakpoint triggers. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Prasad <prasad@linux.vnet.ibm.com> Cc: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Jason Wessel <jason.wessel@windriver.com>	2010-06-24 23:35:15 +02:00
Andres Salomon	75a9cac430	x86, olpc: Add comment about implicit optimization barrier Signed-off-by: Andres Salomon <dilinger@queued.net> Cc: H. Peter Anvin <hpa@linux.intel.com> LKML-Reference: <20100618174653.7755a39a@dev.queued.net> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-06-24 11:40:47 +02:00
Thomas Backlund	890ffedc7c	x86: Fix rebooting on Dell Precision WorkStation T7400 Dell Precision WorkStation T7400 freezes on reboot unless reboot=b is used. Reference: https://qa.mandriva.com/show_bug.cgi?id=58017 Signed-off-by: Thomas Backlund <tmb@mandriva.org> LKML-Reference: <4C1CC6E9.6000701@mandriva.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-06-20 09:24:13 +02:00
Andres Salomon	fd699c7655	x86, olpc: Add support for calling into OpenFirmware Add support for saving OFW's cif, and later calling into it to run OFW commands. OFW remains resident in memory, living within virtual range 0xff800000 - 0xffc00000. A single page directory entry points to the pgdir that OFW actually uses, so rather than saving the entire page table, we grab and install that one entry permanently in the kernel's page table. This is currently only used by the OLPC XO. Note that this particular calling convention breaks PAE and PAT, and so cannot be used on newer x86 hardware. Signed-off-by: Andres Salomon <dilinger@queued.net> LKML-Reference: <20100618174653.7755a39a@dev.queued.net> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2010-06-18 14:54:36 -07:00
H. Peter Anvin	05d0b0889c	x86, vdso: Error out if the vdso contains external references The vdso is a piece of userspace code which is supposed to be fully self-contained. Any external (undefined) reference is an error, and should be caught at compile time. This was giving us trouble when compiling with -Os on gcc 4.5.0, for example (failed inline). The need to do a buildtime check was pointed out by Andi Kleen. Reported-by: Andi Kleen <andi@firstfloor.org> LKML-Reference: <tip-*@vger.kernel.org> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2010-06-18 14:46:21 -07:00
Andi Kleen	124482935f	x86: Fix vsyscall on gcc 4.5 with -Os This fixes the -Os breaks with gcc 4.5 bug. rdtsc_barrier needs to be force inlined, otherwise user space will jump into kernel space and kill init. This also addresses http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44129 I believe. Signed-off-by: Andi Kleen <ak@linux.intel.com> LKML-Reference: <20100618210859.GA10913@basil.fritz.box> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: <stable@kernel.org>	2010-06-18 14:16:31 -07:00
Jiri Slaby	d7a0380dc3	x86-64, mm: Initialize VDSO earlier on 64 bits When initrd is in use and a driver does request_module() in its module_init (i.e. __initcall or device_initcall), a modprobe process is created with VDSO mapping. But VDSO is inited even in __initcall, i.e. on the same level (at the same time), so it may not be inited yet (link order matters). Move the VDSO initialization code earlier by switching to something before rootfs_initcall where initrd is loaded as rootfs. Specifically to subsys_initcall. Do it for standard 64-bit path (init_vdso_vars) and for compat (sysenter_setup), just in case people have 32-bit initrd and ia32 emulation built-in. i386 (pure 32-bit) is not affected, since sysenter_setup() is called from check_bugs()->identify_boot_cpu() in start_kernel() before rest_init()->kernel_thread(kernel_init) where even kernel_init() calls do_basic_setup()->do_initcalls(). What this patch fixes are early modprobe crashes such as: Unpacking initramfs... Freeing initrd memory: 9324k freed modprobe[368]: segfault at 7fff4429c020 ip 00007fef397e160c \ sp 00007fff442795c0 error 4 in ld-2.11.2.so[7fef397df000+1f000] Signed-off-by: Jiri Slaby <jslaby@suse.cz> LKML-Reference: <1276720242-13365-1-git-send-email-jslaby@suse.cz> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2010-06-18 13:48:14 -07:00
Robert Schöne	c882e0feb9	x86, perf: Add power_end event to process_*.c cpu_idle routine Systems using the idle thread from process_32.c and process_64.c do not generate power_end events which could be traced using perf. This patch adds the event generation for such systems. Signed-off-by: Robert Schoene <robert.schoene@tu-dresden.de> Acked-by: Arjan van de Ven <arjan@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <1276515440.5441.45.camel@localhost> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-06-18 11:35:10 +02:00
Marcin Slusarz	8b8f79b927	x86, kmmio/mmiotrace: Fix double free of kmmio_fault_pages After every iounmap mmiotrace has to free kmmio_fault_pages, but it can't do it directly, so it defers freeing by RCU. It usually works, but when mmiotraced code calls ioremap-iounmap multiple times without sleeping between (so RCU won't kick in and start freeing) it can be given the same virtual address, so at every iounmap mmiotrace will schedule the same pages for release. Obviously it will explode on second free. Fix it by marking kmmio_fault_pages which are scheduled for release and not adding them second time. Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com> Tested-by: Marcin Kocielnicki <koriakin@0x04.net> Tested-by: Shinpei KATO <shinpei@il.is.s.u-tokyo.ac.jp> Acked-by: Pekka Paalanen <pq@iki.fi> Cc: Stuart Bennett <stuart@freedesktop.org> Cc: Marcin Kocielnicki <koriakin@0x04.net> Cc: nouveau@lists.freedesktop.org Cc: <stable@kernel.org> LKML-Reference: <20100613215654.GA3829@joi.lan> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-06-18 11:30:09 +02:00
Ingo Molnar	646b1db495	Merge commit 'v2.6.35-rc3' into perf/core Merge reason: Go from -rc1 base to -rc3 base, merge in fixes.	2010-06-18 10:53:19 +02:00
Venkatesh Pallipadi	23016bf0d2	x86: Look for IA32_ENERGY_PERF_BIAS support The new IA32_ENERGY_PERF_BIAS MSR allows system software to give hardware a hint whether OS policy favors more power saving, or more performance. This allows the OS to have some influence on internal hardware power/performance tradeoffs where the OS has previously had no influence. The support for this feature is indicated by CPUID.06H.ECX.bit3, as documented in the Intel Architectures Software Developer's Manual. This patch discovers support of this feature and displays it as "epb" in /proc/cpuinfo. Signed-off-by: Venkatesh Pallipadi <venki@google.com> LKML-Reference: <alpine.LFD.2.00.1006032310160.6669@localhost.localdomain> Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2010-06-16 13:37:32 -07:00
Jiri Kosina	f1bbbb6912	Merge branch 'master' into for-next	2010-06-16 18:08:13 +02:00
Uwe Kleine-König	421f91d21a	fix typos concerning "initiali[zs]e" Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2010-06-16 18:05:05 +02:00
Paul E. McKenney	ec8c27e04f	mce: convert to rcu_dereference_index_check() The mce processing applies rcu_dereference_check() to integers used as array indices. This patch therefore moves mce to the new RCU API rcu_dereference_index_check() that avoids the sparse processing that would otherwise result in compiler errors. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com>	2010-06-14 16:37:28 -07:00
Len Brown	42de5532f4	Merge branch 'bugzilla-13931-sleep-nvs' into release Conflicts: drivers/acpi/sleep.c Signed-off-by: Len Brown <len.brown@intel.com>	2010-06-12 01:15:40 -04:00
Linus Torvalds	7ae1277a52	Merge branch 'pm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6 * 'pm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6: PM / x86: Save/restore MISC_ENABLE register	2010-06-11 14:19:45 -07:00
Linus Torvalds	eda054770e	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: PCI: clear bridge resource range if BIOS assigned bad one PCI: hotplug/cpqphp, fix NULL dereference Revert "PCI: create function symlinks in /sys/bus/pci/slots/N/" PCI: change resource collision messages from KERN_ERR to KERN_INFO	2010-06-11 14:15:44 -07:00
Venkatesh Pallipadi	6a4f3b5237	x86, pat: Proper init of memtype subtree_max_end subtree_max_end that was recently added to struct memtype was not getting properly initialized resulting in WARNING: kmemcheck: Caught 64-bit read from uninitialized memory in memtype_rb_augment_cb() reported here https://bugzilla.kernel.org/show_bug.cgi?id=16092 This change fixes the problem. Reported-by: Christian Casteyde <casteyde.christian@free.fr> Tested-by: Christian Casteyde <casteyde.christian@free.fr> Signed-off-by: Venkatesh Pallipadi <venki@google.com> LKML-Reference: <1276217101-11515-1-git-send-email-venki@google.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com>	2010-06-11 14:12:22 -07:00
Yinghai Lu	837c4ef13c	PCI: clear bridge resource range if BIOS assigned bad one Yannick found that video does not work with 2.6.34. The cause of this bug was that the BIOS had assigned the wrong range to the PCI bridge above the video device. Before 2.6.34 the kernel would have shrunk the size of the bridge window, but since `d65245c` PCI: don't shrink bridge resources the kernel will avoid shrinking BIOS ranges. So zero out the old range if we fail to claim it at boot time; this will cause us to allocate a new range at startup, restoring the 2.6.34 behavior. Fixes regression https://bugzilla.kernel.org/show_bug.cgi?id=16009. Reported-by: Yannick <yannick.roehlly@free.fr> Acked-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2010-06-11 13:24:51 -07:00
Huang Ying	a2d7b0d485	x86, mce: Use HW_ERR in MCE handler Use HW_ERR printk prefix in MCE handler. To make it more explicit that this is hardware error instead of software error. Signed-off-by: Huang Ying <ying.huang@intel.com> LKML-Reference: <1275978939.3444.668.camel@yhuang-dev.sh.intel.com> Reviewed-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2010-06-10 21:28:49 -07:00
Huang Ying	3c41758860	x86, mce: Fix MSR_IA32_MCI_CTL2 CMCI threshold setup It is reported that CMCI is not raised when number of corrected error reaches preset threshold. After inspection, it is found that MSR_IA32_MCI_CTL2 threshold field is not setup properly. This patch fixed it. Value of MCI_CTL2_CMCI_THRESHOLD_MASK is fixed according to x86_64 Software Developer's Manual too. Reported-by: Shaohui Zheng <shaohui.zheng@intel.com> Signed-off-by: Huang Ying <ying.huang@intel.com> LKML-Reference: <1275977350.3444.660.camel@yhuang-dev.sh.intel.com> Reviewed-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2010-06-10 21:27:36 -07:00
Huang Ying	1f9a0bd498	x86, mce: Rename MSR_IA32_MCx_CTL2 value Rename CMCI_EN to MCI_CTL2_CMCI_EN and CMCI_THRESHOLD_MASK to MCI_CTL2_CMCI_THRESHOLD_MASK to make naming consistent. Signed-off-by: Huang Ying <ying.huang@intel.com> LKML-Reference: <1275977348.3444.659.camel@yhuang-dev.sh.intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2010-06-10 21:27:26 -07:00
Andi Kleen	cf3bdc29fc	x86, setup: Set ax register in boot vga query Catch missing conversion to the register structure "glove box" scheme. Found by gcc 4.6's new warnings. Signed-off-by: Andi Kleen <ak@linux.intel.com> LKML-Reference: <20100610111040.F1781B1A2B@basil.firstfloor.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2010-06-10 15:24:29 -07:00
Andi Kleen	23b764d056	percpu, x86: Avoid warnings of unused variables in per cpu Avoid hundreds of warnings with a gcc 4.6 -Wall build. Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-06-11 00:03:45 +02:00
Matthew Garrett	dd4c4f17d7	suspend: Move NVS save/restore code to generic suspend functionality Saving platform non-volatile state may be required for suspend to RAM as well as hibernation. Move it to more generic code. Signed-off-by: Matthew Garrett <mjg@redhat.com> Acked-by: Rafael J. Wysocki <rjw@sisk.pl> Tested-by: Maxim Levitsky <maximlevitsky@gmail.com> Signed-off-by: Len Brown <len.brown@intel.com>	2010-06-10 11:02:34 -04:00
Stephane Eranian	d11007703c	perf_events: Fix Intel Westmere event constraints Based on Intel Vol3b (March 2010), the event SNOOPQ_REQUEST_OUTSTANDING is restricted to counters 0,1 so update the event table for Intel Westmere accordingly. Signed-off-by: Stephane Eranian <eranian@google.com> Cc: peterz@infradead.org Cc: paulus@samba.org Cc: davem@davemloft.net Cc: fweisbec@gmail.com Cc: perfmon2-devel@lists.sf.net Cc: eranian@gmail.com Cc: <stable@kernel.org> # .34.x LKML-Reference: <4c10cb56.5120e30a.2eb4.ffffc3de@mx.google.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-06-10 14:16:32 +02:00
Borislav Petkov	12d8a96128	x86, AMD: Extend support to future families Extend support to future families, and in particular: * extend direct mapping split of Tseg SMM area. * extend K8 flavored alternatives (NOPS). * rep movs* prefix is fast in ucode. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> LKML-Reference: <20100602182921.GA21557@aftab> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2010-06-09 15:57:47 -07:00
Borislav Petkov	8cc1176e5d	x86, cacheinfo: Carve out L3 cache slot accessors This is in preparation for disabling L3 cache indices after having received correctable ECCs in the L3 cache. Now we allow for initial setting of a disabled index slot (write once) and deny writing new indices to it after it has been disabled. Also, we deny using both slots to disable one and the same index. Userspace can restore the previously disabled indices by rewriting those sysfs entries when booting. Cleanup and reorganize code while at it. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> LKML-Reference: <20100602161840.GI18327@aftab> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2010-06-09 15:57:41 -07:00
Dan Carpenter	d6d4d4205c	x86, xsave: Cleanup return codes in check_for_xstate() The places which call check_for_xstate() only care about zero or non-zero so this patch doesn't change how the code runs, but it's a cleanup. The main reason for this patch is that I'm looking for places which don't return -EFAULT for copy_from_user() failures. Signed-off-by: Dan Carpenter <error27@gmail.com> LKML-Reference: <20100603100746.GU5483@bicker> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com>	2010-06-09 15:57:36 -07:00
Eric W. Biederman	a4384df3e2	x86, irq: Rename gsi_end gsi_top, and fix off by one errors When I introduced the global variable gsi_end I thought gsi_end on io_apics was one past the end of the gsi range for the io_apic. After it was pointed out the the range on io_apics was inclusive I changed my global variable to match. That was a big mistake. Inclusive semantics without a range start cannot describe the case when no gsi's are allocated. Describing the case where no gsi's are allocated is important in sfi.c and mpparse.c so that we can assign gsi numbers instead of blindly copying the gsi assignments the BIOS has done as we do in the acpi case. To keep from getting the global variable confused with the gsi range end rename it gsi_top. To allow describing the case where no gsi's are allocated have gsi_top be one place the highest gsi number seen in the system. This fixes an off by one bug in sfi.c: Reported-by: jacob pan <jacob.jun.pan@linux.intel.com> This fixes the same off by one bug in mpparse.c: This fixes an off unreachable by one bug in acpi/boot.c:irq_to_gsi Reported-by: Yinghai <yinghai.lu@oracle.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> LKML-Reference: <m17hm9jre7.fsf_-_@fess.ebiederm.org> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2010-06-09 13:34:06 -07:00
Ingo Molnar	c726b61c6a	Merge branch 'perf/core' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing into perf/core	2010-06-09 18:55:57 +02:00
Avi Kivity	69325a1225	KVM: MMU: Remove user access when allowing kernel access to gpte.w=0 page If cr0.wp=0, we have to allow the guest kernel access to a page with pte.w=0. We do that by setting spte.w=1, since the host cr0.wp must remain set so the host can write protect pages. Once we allow write access, we must remove user access otherwise we mistakenly allow the user to write the page. Reviewed-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-06-09 18:48:37 +03:00
Marcelo Tosatti	3be2264be3	KVM: MMU: invalidate and flush on spte small->large page size change Always invalidate spte and flush TLBs when changing page size, to make sure different sized translations for the same address are never cached in a CPU's TLB. Currently the only case where this occurs is when a non-leaf spte pointer is overwritten by a leaf, large spte entry. This can happen after dirty logging is disabled on a memslot, for example. Noticed by Andrea. KVM-Stable-Tag Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-06-09 18:48:36 +03:00
Joerg Roedel	67ec660777	KVM: SVM: Implement workaround for Erratum 383 This patch implements a workaround for AMD erratum 383 into KVM. Without this erratum fix it is possible for a guest to kill the host machine. This patch implements the suggested workaround for hypervisors which will be published by the next revision guide update. [jan: fix overflow warning on i386] [xiao: fix unused variable warning] Cc: stable@kernel.org Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-06-09 18:47:20 +03:00
Joerg Roedel	fe5913e4e1	KVM: SVM: Handle MCEs early in the vmexit process This patch moves handling of the MC vmexits to an earlier point in the vmexit. The handle_exit function is too late because the vcpu might alreadry have changed its physical cpu. Cc: stable@kernel.org Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-06-09 18:39:10 +03:00
Oleg Nesterov	018378c55b	x86: Unify save_stack_address() and save_stack_address_nosched() Cleanup. Factor the common code in save_stack_address() and save_stack_address_nosched(). Signed-off-by: Oleg Nesterov <oleg@redhat.com> Cc: Roland McGrath <roland@redhat.com> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Vegard Nossum <vegard.nossum@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> LKML-Reference: <20100603193243.GA31534@redhat.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>	2010-06-09 17:32:19 +02:00
Oleg Nesterov	147ec4d236	x86: Make save_stack_address() !CONFIG_FRAME_POINTER friendly If CONFIG_FRAME_POINTER=n, print_context_stack() shouldn't neglect the non-reliable addresses on stack, this is all we have if dump_trace(bp) is called with the wrong or zero bp. For example, /proc/pid/stack doesn't work if CONFIG_FRAME_POINTER=n. This patch obviously has no effect if CONFIG_FRAME_POINTER=y, otherwise it reverts `1650743c` "x86: don't save unreliable stack trace entries". Also, remove the unnecessary type-cast. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Cc: Roland McGrath <roland@redhat.com> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Vegard Nossum <vegard.nossum@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Andrew Morton <akpm@linux-foundation.org> LKML-Reference: <20100603193239.GA31530@redhat.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>	2010-06-09 17:32:15 +02:00
Peter Zijlstra	e78505958c	perf: Convert perf_event to local_t Since now all modification to event->count (and ->prev_count and ->period_left) are local to a cpu, change then to local64_t so we avoid the LOCK'ed ops. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-06-09 11:12:37 +02:00
Peter Zijlstra	1996bda2a4	arch: Implement local64_t On 64bit, local_t is of size long, and thus we make local64_t an alias. On 32bit, we fall back to atomic64_t. (architecture can provide optimized 32-bit version) (This new facility is to be used by perf events optimizations.) Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: linux-arch@vger.kernel.org Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-06-09 11:12:36 +02:00
Cyrill Gorcunov	68aa00ac0a	perf, x86: Make a second write to performance counter if needed On Netburst PMU we need a second write to a performance counter due to cpu erratum. A simple flag test instead of alternative instructions was choosen because wrmsrl is already a macro and if virtualization is turned on will need an additional wrapper call which is more expencise. nb: we should propably switch to jump-labels as only this facility reach the mainline. Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Robert Richter <robert.richter@amd.com> Cc: Lin Ming <ming.m.lin@intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <20100602212304.GC5264@lenovo> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-06-09 11:12:35 +02:00
Peter Zijlstra	8d2cacbbb8	perf: Cleanup {start,commit,cancel}_txn details Clarify some of the transactional group scheduling API details and change it so that a successfull ->commit_txn also closes the transaction. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <1274803086.5882.1752.camel@twins> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-06-09 11:12:34 +02:00
Frederic Weisbecker	b0f82b81fe	perf: Drop the skip argument from perf_arch_fetch_regs_caller Drop this argument now that we always want to rewind only to the state of the first caller. It means frame pointers are not necessary anymore to reliably get the source of an event. But this also means we need this helper to be a macro now, as an inline function is not an option since we need to know when to provide a default implentation. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Paul Mackerras <paulus@samba.org> Cc: David Miller <davem@davemloft.net> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>	2010-06-08 23:31:27 +02:00
Frederic Weisbecker	c9cf4dbb4d	x86: Unify dumpstack.h and stacktrace.h arch/x86/include/asm/stacktrace.h and arch/x86/kernel/dumpstack.h declare headers of objects that deal with the same topic. Actually most of the files that include stacktrace.h also include dumpstack.h Although dumpstack.h seems more reserved for internals of stack traces, those are quite often needed to define specialized stack trace operations. And perf event arch headers are going to need access to such low level operations anyway. So don't continue to bother with dumpstack.h as it's not anymore about isolated deep internals. v2: fix struct stack_frame definition conflict in sysprof Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Soeren Sandmann <sandmann@daimi.au.dk>	2010-06-08 23:29:52 +02:00
Livio Soares	e768aee89c	perf, x86: Small fix to cpuid10_edx Fixes to 'cpuid10_edx' to comply with Intel documentation. According to the Intel Manual, Volume 2A, Table 3-12, the cpuid for architecture performance monitoring returns, in EDX, two pieces of information: 1) Number of fixed-function counters (5 bits, not 4) 2) Width of fixed-function counters (8 bits) Signed-off-by: Livio Soares <livio@eecg.toronto.edu> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-06-08 20:27:04 +02:00
Andres Salomon	fc4ac7a5f5	x86: use __ASSEMBLY__ rather than __ASSEMBLER__ As Ingo pointed out in a separate patch, we should be using __ASSEMBLY__. Make that the case in pgtable headers. Signed-off-by: Andres Salomon <dilinger@queued.net> LKML-Reference: <20100605114042.35ac69c1@dev.queued.net> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2010-06-07 17:27:11 -07:00
Ondrej Zary	85a0e75397	PM / x86: Save/restore MISC_ENABLE register Save/restore MISC_ENABLE register on suspend/resume. This fixes OOPS (invalid opcode) on resume from STR on Asus P4P800-VM, which wakes up with MWAIT disabled. Fixes https://bugzilla.kernel.org/show_bug.cgi?id=15385 Signed-off-by: Ondrej Zary <linux@rainbow-software.org> Tested-by: Alan Stern <stern@rowland.harvard.edu> Acked-by: H. Peter Anvin <hpa@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>	2010-06-08 00:32:49 +02:00
Linus Torvalds	9a9620db07	Merge branch 'linux_next' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/i7core * 'linux_next' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/i7core: (83 commits) i7core_edac: Better describe the supported devices Add support for Westmere to i7core_edac driver i7core_edac: don't free on success i7core_edac: Add support for X5670 Always call i7core_[ur]dimm_check_mc_ecc_err i7core_edac: fix memory leak of i7core_dev EDAC: add __init to i7core_xeon_pci_fixup i7core_edac: Fix wrong device id for channel 1 devices i7core: add support for Lynnfield alternate address i7core_edac: Add initial support for Lynnfield i7core_edac: do not export static functions edac: fix i7core build edac: i7core_edac produces undefined behaviour on 32bit i7core_edac: Use a more generic approach for probing PCI devices i7core_edac: PCI device is called NONCORE, instead of NOCORE i7core_edac: Fix ringbuffer maxsize i7core_edac: First store, then increment i7core_edac: Better parse "any" addrmask i7core_edac: Use a lockless ringbuffer edac: Create an unique instance for each kobj ...	2010-06-04 15:39:54 -07:00
Robert Richter	d8a382d266	Merge remote branch 'tip/perf/urgent' into oprofile/urgent	2010-06-04 11:33:10 +02:00
Linus Torvalds	167b712904	Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, smpboot: Fix cores per node printing on boot x86/amd-iommu: Fall back to GART if initialization fails x86/amd-iommu: Fix crash when request_mem_region fails x86/mm: Remove unused DBG() macro arch/x86/kernel: Add missing spin_unlock	2010-06-03 15:47:22 -07:00
Linus Torvalds	b01b7dc283	Merge branch 'for-linus/bugfixes' of git://xenbits.xensource.com/people/ianc/linux-2.6 * 'for-linus/bugfixes' of git://xenbits.xensource.com/people/ianc/linux-2.6: xen: avoid allocation causing potential swap activity on the resume path xen: ensure timer tick is resumed even on CPU driving the resume	2010-06-03 15:46:09 -07:00
Linus Torvalds	f150dba6d4	Merge branch 'perf-fixes-for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perf-fixes-for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: perf: Fix crash in swevents perf buildid-list: Fix --with-hits event processing perf scripts python: Give field dict to unhandled callback perf hist: fix objdump output parsing perf-record: Check correct pid when forking perf: Do the comm inheritance per thread in event__process_task perf: Use event__process_task from perf sched perf: Process comm events by tid blktrace: Fix new kernel-doc warnings perf_events: Fix unincremented buffer base on partial copy perf_events: Fix event scheduling issues introduced by transactional API perf_events, trace: Fix perf_trace_destroy(), mutex went missing perf_events, trace: Fix probe unregister race perf_events: Fix races in group composition perf_events: Fix races and clean up perf_event and perf_mmap_data interaction	2010-06-03 15:45:26 -07:00
Ian Campbell	cd52e17ea8	xen: ensure timer tick is resumed even on CPU driving the resume The core suspend/resume code is run from stop_machine on CPU0 but parts of the suspend/resume machinery (including xen_arch_resume) are run on whichever CPU happened to schedule the xenwatch kernel thread. As part of the non-core resume code xen_arch_resume is called in order to restart the timer tick on non-boot processors. The boot processor itself is taken care of by core timekeeping code. xen_arch_resume uses smp_call_function which does not call the given function on the current processor. This means that we can end up with one CPU not receiving timer ticks if the xenwatch thread happened to be scheduled on CPU > 0. Use on_each_cpu instead of smp_call_function to ensure the timer tick is resumed everywhere. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Jeremy Fitzhardinge <jeremy@goop.org> Cc: Stable Kernel <stable@kernel.org> # .32.x	2010-06-03 09:34:04 +01:00
Borislav Petkov	4adc8b71cc	x86, smpboot: Fix cores per node printing on boot Percpu initialization happens now after booting the cores on the machine and this causes them all to be displayed as belonging to node 0: Jun 8 05:57:21 kepek kernel: [ 0.106999] Booting Node 0, Processors #1 #2 #3 #4 #5 #6 #7 #8 #9 #10 #11 #12 #13 #14 #15 #16 #17 #18 #19 #20 #21 #22 #23 Ok. Use early_cpu_to_node() to get the correct node of each core instead. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Cc: Mike Travis <travis@sgi.com> LKML-Reference: <20100601190455.GA14237@aftab> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-06-02 09:38:53 +02:00
Linus Torvalds	1f73897861	Merge branch 'for-35' of git://repo.or.cz/linux-kbuild * 'for-35' of git://repo.or.cz/linux-kbuild: (81 commits) kbuild: Revert part of `e8d400a` to resolve a conflict kbuild: Fix checking of scm-identifier variable gconfig: add support to show hidden options that have prompts menuconfig: add support to show hidden options which have prompts gconfig: remove show_debug option gconfig: remove dbg_print_ptype() and dbg_print_stype() kconfig: fix zconfdump() kconfig: some small fixes add random binaries to .gitignore kbuild: Include gen_initramfs_list.sh and the file list in the .d file kconfig: recalc symbol value before showing search results .gitignore: ignore *.lzo files headerdep: perlcritic warning scripts/Makefile.lib: Align the output of LZO kbuild: Generate modules.builtin in make modules_install Revert "kbuild: specify absolute paths for cscope" kbuild: Do not unnecessarily regenerate modules.builtin headers_install: use local file handles headers_check: fix perl warnings export_report: fix perl warnings ...	2010-06-01 08:55:52 -07:00
Ingo Molnar	c8fcb14fec	Merge branch 'amd-iommu/2.6.35' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/linux-2.6-iommu into x86/urgent	2010-06-01 11:45:45 +02:00
Joerg Roedel	d7f0776975	x86/amd-iommu: Fall back to GART if initialization fails This patch implements a fallback to the GART IOMMU if this is possible and the AMD IOMMU initialization failed. Otherwise the fallback would be nommu which is very problematic on machines with more than 4GB of memory or swiotlb which hurts io-performance. Cc: stable@kernel.org Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2010-06-01 10:20:15 +02:00
Joerg Roedel	e82752d8b5	x86/amd-iommu: Fix crash when request_mem_region fails When request_mem_region fails the error path tries to disable the IOMMUs. This accesses the mmio-region which was not allocated leading to a kernel crash. This patch fixes the issue. Cc: stable@kernel.org Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2010-06-01 10:03:08 +02:00
Joerg Roedel	1d61e73ab4	Merge commit 'v2.6.35-rc1' into amd-iommu/2.6.35	2010-06-01 09:57:49 +02:00
Akinobu Mita	e565813ab9	x86/mm: Remove unused DBG() macro DBG() macro for CONFIG_DEBUG_PER_CPU_MAPS is unused. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> LKML-Reference: <1274706291-13554-1-git-send-email-akinobu.mita@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-05-31 10:01:53 +02:00
Stephane Eranian	90151c35b1	perf_events: Fix event scheduling issues introduced by transactional API The transactional API patch between the generic and model-specific code introduced several important bugs with event scheduling, at least on X86. If you had pinned events, e.g., watchdog, and were over-committing the PMU, you would get bogus counts. The bug was showing up on Intel CPU because events would move around more often that on AMD. But the problem also existed on AMD, though harder to expose. The issues were: - group_sched_in() was missing a cancel_txn() in the error path - cpuc->n_added was not properly maintained, leading to missing actions in hw_perf_enable(), i.e., n_running being 0. You cannot update n_added until you know the transaction has succeeded. In case of failed transaction n_added was not adjusted back. - in case of failed transactions, event_sched_out() was called and eventually invoked x86_disable_event() to touch the HW reg. But with transactions, on X86, event_sched_in() does not touch HW registers, it simply collects events into a list. Thus, you could end up calling x86_disable_event() on a counter which did not correspond to the current event when idx != -1. The patch modifies the generic and X86 code to avoid all those problems. First, we keep track of the number of events added last. In case the transaction fails, we substract them from n_added. This approach is necessary (as opposed to delaying updates to n_added) because not all event updates use the transaction API, e.g., single events. Second, we encapsulate the event_sched_in() and event_sched_out() in group_sched_in() inside the transaction. That makes the operations symmetrical and you can also detect that you are inside a transaction and skip the HW reg access by checking cpuc->group_flag. With this patch, you can now overcommit the PMU even with pinned system-wide events present and still get valid counts. Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1274796225.5882.1389.camel@twins> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-05-31 08:46:10 +02:00
Linus Torvalds	17d30ac077	Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-2.6 * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-2.6: (47 commits) mfd: Rename twl5031 sih modules mfd: Storage class for timberdale should be before const qualifier mfd: Remove unneeded and dangerous clearing of clientdata mfd: New AB8500 driver gpio: Fix inverted rdc321x gpio data out registers mfd: Change rdc321x resources flags to IORESOURCE_IO mfd: Move pcf50633 irq related functions to its own file. mfd: Use threaded irq for pcf50633 mfd: pcf50633-adc: Fix potential race in pcf50633_adc_sync_read mfd: Fix pcf50633 bitfield logic in interrupt handler gpio: rdc321x needs to select MFD_CORE mfd: Use menuconfig for quicker config editing ARM: AB3550 board configuration and irq for U300 mfd: AB3550 core driver mfd: AB3100 register access change to abx500 API mfd: Renamed ab3100.h to abx500.h gpio: Add TC35892 GPIO driver mfd: Add Toshiba's TC35892 MFD core mfd: Delay to mask tsc irq in max8925 mfd: Remove incorrect wm8350 kfree ...	2010-05-30 09:13:08 -07:00
Linus Torvalds	021fad8b70	Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, cpufeature: Unbreak compile with gcc 3.x x86, pat: Fix memory leak in free_memtype x86, k8: Fix section mismatch for powernowk8_exit() lib/atomic64_test: fix missing include of linux/kernel.h x86: remove last traces of quicklist usage x86, setup: Phoenix BIOS fixup is needed on Dell Inspiron Mini 1012 x86: "nosmp" command line option should force the system into UP mode arch/x86/pci: use kasprintf x86, apic: ack all pending irqs when crashed/on kexec	2010-05-30 09:06:13 -07:00
Linus Torvalds	35926ff5fb	Revert "cpusets: randomize node rotor used in cpuset_mem_spread_node()" This reverts commit `0ac0c0d0f8`, which caused cross-architecture build problems for all the wrong reasons. IA64 already added its own version of __node_random(), but the fact is, there is nothing architectural about the function, and the original commit was just badly done. Revert it, since no fix is forthcoming. Requested-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-05-30 09:00:03 -07:00
Linus Torvalds	e4f2e5eaac	Merge branch 'idle-release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-idle-2.6 * 'idle-release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-idle-2.6: intel_idle: native hardware cpuidle driver for latest Intel processors ACPI: acpi_idle: touch TS_POLLING only in the non-MWAIT case acpi_pad: uses MONITOR/MWAIT, so it doesn't need to clear TS_POLLING sched: clarify commment for TS_POLLING ACPI: allow a native cpuidle driver to displace ACPI cpuidle: make cpuidle_curr_driver static cpuidle: add cpuidle_unregister_driver() error check cpuidle: fail to register if !CONFIG_CPU_IDLE	2010-05-28 16:14:17 -07:00
Linus Torvalds	9a90e09854	Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6 * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: (27 commits) ACPI: Don't let acpi_pad needlessly mark TSC unstable drivers/acpi/sleep.h: Checkpatch cleanup ACPI: Minor cleanup eliminating redundant PMTIMER_TICKS to NS conversion ACPI: delete unused c-state promotion/demotion data strucutures ACPI: video: fix acpi_backlight=video ACPI: EC: Use kmemdup drivers/acpi: use kasprintf ACPI, APEI, EINJ injection parameters support Add x64 support to debugfs ACPI, APEI, Use ERST for persistent storage of MCE ACPI, APEI, Error Record Serialization Table (ERST) support ACPI, APEI, Generic Hardware Error Source memory error support ACPI, APEI, UEFI Common Platform Error Record (CPER) header Unified UUID/GUID definition ACPI Hardware Error Device (PNP0C33) support ACPI, APEI, PCIE AER, use general HEST table parsing in AER firmware_first setup ACPI, APEI, Document for APEI ACPI, APEI, EINJ support ACPI, APEI, HEST table parsing ACPI, APEI, APEI supporting infrastructure ...	2010-05-28 14:42:18 -07:00
Len Brown	d3b383338f	Merge branch 'ht-delete-2.6.35' into release	2010-05-28 16:20:35 -04:00
Len Brown	91dd696439	Merge branch 'acpi_enable' into release	2010-05-28 16:17:27 -04:00
Len Brown	dc1544ea5d	Merge branch 'bjorn-pci-root-v4-2.6.35' into release	2010-05-28 16:17:16 -04:00
Len Brown	e45b7fa230	sched: clarify commment for TS_POLLING TS_POLLING set tells the scheduler an idle_task will poll need_resched() to look for work. TS_POLLING clear tells resched_task() and wake_up_idle_cpu() that the remote CPU's idle_task is now sleeping in idle, and thus requires a reschedule interrupt notice work. Update the description of TS_POLLING to reflect how it works. "idle task polling need_resched, skip sending interrupt" Wordsmithing-by: Milton Miller <miltonm@bga.com> Signed-off-by: Len Brown <len.brown@intel.com> Acked-by: Peter Zijlstra <peterz@infradead.org>	2010-05-27 21:07:05 -04:00
Florian Fainelli	f52e62dcc1	x86: remove rdc321x_defs.h This file is replaced by a cleaner version with the adding of a MFD driver for the southbridge. Signed-off-by: Florian Fainelli <florian@openwrt.org> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2010-05-28 01:37:30 +02:00
Linus Torvalds	c5617b200a	Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (61 commits) tracing: Add __used annotation to event variable perf, trace: Fix !x86 build bug perf report: Support multiple events on the TUI perf annotate: Fix up usage of the build id cache x86/mmiotrace: Remove redundant instruction prefix checks perf annotate: Add TUI interface perf tui: Remove annotate from popup menu after failure perf report: Don't start the TUI if -D is used perf: Fix getline undeclared perf: Optimize perf_tp_event_match() perf: Remove more code from the fastpath perf: Optimize the !vmalloc backed buffer perf: Optimize perf_output_copy() perf: Fix wakeup storm for RO mmap()s perf-record: Share per-cpu buffers perf-record: Remove -M perf: Ensure that IOC_OUTPUT isn't used to create multi-writer buffers perf, trace: Optimize tracepoints by using per-tracepoint-per-cpu hlist to track events perf, trace: Optimize tracepoints by removing IRQ-disable from perf/tracepoint interaction perf tui: Allow disabling the TUI on a per command basis in ~/.perfconfig ...	2010-05-27 15:23:47 -07:00
H. Peter Anvin	1ba4f22c42	x86, cpufeature: Unbreak compile with gcc 3.x gcc 3 is too braindamaged to be able to compile static_cpu_has() -- apparently it can't tell that a constant passed to an inline function is still a constant -- so if we're using gcc 3, just use the dynamic test. This is bad for performance, but if you care about performance, don't use an ancient, known-to-optimize-poorly compiler. Reported-and-tested-by: Eric Dumazet <eric.dumazet@gmail.com> LKML-Reference: <4BF2FF82.7090005@zytor.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2010-05-27 12:02:00 -07:00
Lee Schermerhorn	e534c7c5f8	numa: x86_64: use generic percpu var numa_node_id() implementation x86 arch specific changes to use generic numa_node_id() based on generic percpu variable infrastructure. Back out x86's custom version of numa_node_id() Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com> Cc: Tejun Heo <tj@kernel.org> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Nick Piggin <npiggin@suse.de> Cc: David Rientjes <rientjes@google.com> Cc: Eric Whitney <eric.whitney@hp.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-05-27 09:12:57 -07:00
Lee Schermerhorn	7281201922	numa: add generic percpu var numa_node_id() implementation Rework the generic version of the numa_node_id() function to use the new generic percpu variable infrastructure. Guard the new implementation with a new config option: CONFIG_USE_PERCPU_NUMA_NODE_ID. Archs which support this new implemention will default this option to 'y' when NUMA is configured. This config option could be removed if/when all archs switch over to the generic percpu implementation of numa_node_id(). Arch support involves: 1) converting any existing per cpu variable implementations to use this implementation. x86_64 is an instance of such an arch. 2) archs that don't use a per cpu variable for numa_node_id() will need to initialize the new per cpu variable "numa_node" as cpus are brought on-line. ia64 is an example. 3) Defining USE_PERCPU_NUMA_NODE_ID in arch dependent Kconfig--e.g., when NUMA is configured. This is required because I have retained the old implementation by default to allow archs to be modified incrementally, as desired. Subsequent patches will convert x86_64 and ia64 to use this implemenation. Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com> Cc: Tejun Heo <tj@kernel.org> Cc: Mel Gorman <mel@csn.ul.ie> Reviewed-by: Christoph Lameter <cl@linux-foundation.org> Cc: Nick Piggin <npiggin@suse.de> Cc: David Rientjes <rientjes@google.com> Cc: Eric Whitney <eric.whitney@hp.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-05-27 09:12:57 -07:00
FUJITA Tomonori	1ef04370d8	asm-generic: remove ARCH_HAS_SG_CHAIN in scatterlist.h There are more architectures that don't support ARCH_HAS_SG_CHAIN than those that support it. This removes removes ARCH_HAS_SG_CHAIN in asm-generic/scatterlist.h and lets arhictectures to define it. It's clearer than defining ARCH_HAS_SG_CHAIN asm-generic/scatterlist.h and undefing it in arhictectures that don't support it. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-05-27 09:12:54 -07:00
Andrew Morton	4a14d84ea2	x86_32: use asm-generic/scatterlist.h Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-05-27 09:12:54 -07:00
FUJITA Tomonori	18e98307de	asm-generic: add NEED_SG_DMA_LENGTH to define sg_dma_len() There are only two ways to define sg_dma_len(); use sg->dma_length or sg->length. This patch introduces NEED_SG_DMA_LENGTH that enables architectures to choose sg->dma_length or sg->length. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Matt Turner <mattst88@gmail.com> Cc: Russell King <linux@arm.linux.org.uk> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-05-27 09:12:54 -07:00
FUJITA Tomonori	de006a071c	x86: remove unnecessary sync_single_range_* in swiotlb_dma_ops sync_single_range_for_cpu and sync_single_range_for_device hooks in swiotlb_dma_ops are unnecessary because sync_single_for_cpu and sync_single_for_device are used there. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-05-27 09:12:52 -07:00

... 3 4 5 6 7 ...

11137 commits