Commit graph

236 commits

Author SHA1 Message Date
Alexander Graf
bd7cdbb7fc KVM: PPC: Add emulation for dcba
Mac OS X uses the dcba instruction. According to the specification it doesn't
guarantee any functionality, so let's just emulate it as nop.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-17 12:17:15 +03:00
Alexander Graf
9fb244a2c2 KVM: PPC: Fix dcbz emulation
On most systems we need to emulate dcbz when running 32 bit guests. So
far we've been rather slack, not giving correct DSISR values to the guest.

This patch makes the emulation more accurate, introducing a difference
between "page not mapped" and "write protection fault". While at it, it
also speeds up dcbz emulation by an order of magnitude by using kmap.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-17 12:17:14 +03:00
Alexander Graf
a2b07664f6 KVM: PPC: Make build work without CONFIG_VSX/ALTIVEC
The FPU/Altivec/VSX enablement also brought access to some structure
elements that are only defined when the respective config options
are enabled.

Unfortuately I forgot to check for the config options at some places,
so let's do that now.

Unbreaks the build when CONFIG_VSX is not set.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-17 12:17:12 +03:00
Alexander Graf
ad0a048b09 KVM: PPC: Add OSI hypercall interface
MOL uses its own hypercall interface to call back into userspace when
the guest wants to do something.

So let's implement that as an exit reason, specify it with a CAP and
only really use it when userspace wants us to.

The only user of it so far is MOL.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-17 12:17:10 +03:00
Alexander Graf
71fbfd5f38 KVM: Add support for enabling capabilities per-vcpu
Some times we don't want all capabilities to be available to all
our vcpus. One example for that is the OSI interface, implemented
in the next patch.

In order to have a generic mechanism in how to enable capabilities
individually, this patch introduces a new ioctl that can be used
for this purpose. That way features we don't want in all guests or
userspace configurations can just not be enabled and we're good.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-17 12:17:09 +03:00
Alexander Graf
ca7f4203b9 KVM: PPC: Implement alignment interrupt
Mac OS X has some applications - namely the Finder - that require alignment
interrupts to work properly. So we need to implement them.

But the spec for 970 and 750 also looks different. While 750 requires the
DSISR and DAR fields to reflect some instruction bits (DSISR) and the fault
address (DAR), the 970 declares this as an optional feature. So we need
to reconstruct DSISR and DAR manually.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-17 12:17:07 +03:00
Alexander Graf
1c85e73303 KVM: PPC: Implement emulation for lbzux and lhax
We get MMIOs with the weirdest instructions. But every time we do,
we need to improve our emulator to implement them.

So let's do that - this time it's lbzux and lhax's round.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-17 12:17:06 +03:00
Alexander Graf
1bec1677ca KVM: PPC: Make XER load 32 bit
We have a 32 bit value in the PACA to store XER in. We also do an stw
when storing XER in there. But then we load it with ld, completely
screwing it up on every entry.

Welcome to the Big Endian world.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-17 12:17:04 +03:00
Alexander Graf
c04a695a44 KVM: PPC: Implement BAT reads
BATs can't only be written to, you can also read them out!
So let's implement emulation for reading BAT values again.

While at it, I also made BAT setting flush the segment cache,
so we're absolutely sure there's no MMU state left when writing
BATs.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-17 12:17:03 +03:00
Alexander Graf
c664876c6d KVM: PPC: Implement mfsr emulation
We emulate the mfsrin instruction already, that passes the SR number
in a register value. But we lacked support for mfsr that encoded the
SR number in the opcode.

So let's implement it.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-17 12:17:01 +03:00
Alexander Graf
a56cf347c2 KVM: PPC: Load VCPU for register fetching
When trying to read or store vcpu register data, we should also make
sure the vcpu is actually loaded, so we're 100% sure we get the correct
values.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-17 12:16:59 +03:00
Alexander Graf
c2453693d4 KVM: PPC: Don't reload FPU with invalid values
When the guest activates the FPU, we load it up. That's fine when
it wasn't activated before on the host, but if it was we end up
reloading FPU values from last time the FPU was deactivated on the
host without writing the proper values back to the vcpu struct.

This patch checks if the FPU is enabled already and if so just doesn't
bother activating it, making FPU operations survive guest context switches.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-17 12:16:57 +03:00
Alexander Graf
8963221d7d KVM: PPC: Split instruction reading out
The current check_ext function reads the instruction and then does
the checking. Let's split the reading out so we can reuse it for
different functions.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-17 12:16:56 +03:00
Alexander Graf
4b389ca2e7 KVM: PPC: Book3S_32 guest MMU fixes
This patch makes the VSID of mapped pages always reflecting all special cases
we have, like split mode.

It also changes the tlbie mask to 0x0ffff000 according to the spec. The mask
we used before was incorrect.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-17 12:16:54 +03:00
Alexander Graf
c8027f1652 KVM: PPC: Make DSISR 32 bits wide
DSISR is only defined as 32 bits wide. So let's reflect that in the
structs too.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-17 12:16:53 +03:00
Alexander Graf
18978768d8 KVM: PPC: Allow userspace to unset the IRQ line
Userspace can tell us that it wants to trigger an interrupt. But
so far it can't tell us that it wants to stop triggering one.

So let's interpret the parameter to the ioctl that we have anyways
to tell us if we want to raise or lower the interrupt line.

Signed-off-by: Alexander Graf <agraf@suse.de>

v2 -> v3:

 - Add CAP for unset irq
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-17 12:16:51 +03:00
Alexander Graf
3eeafd7da2 KVM: PPC: Ensure split mode works
On PowerPC we can go into MMU Split Mode. That means that either
data relocation is on but instruction relocation is off or vice
versa.

That mode didn't work properly, as we weren't always flushing
entries when going into a new split mode, potentially mapping
different code or data that we're supposed to.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-17 12:16:49 +03:00
Wei Yongjun
06056bfb94 KVM: PPC: Do not create debugfs if fail to create vcpu
If fail to create the vcpu, we should not create the debugfs
for it.

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Acked-by: Alexander Graf <agraf@suse.de>
Cc: stable@kernel.org
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-17 12:15:21 +03:00
Roel Kluin
4f018c513a KVM: PPC: Keep index within boundaries in kvmppc_44x_emul_tlbwe()
An index of KVM44x_GUEST_TLB_SIZE is already one too large.

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Acked-by: Hollis Blanchard <hollis@penguinppc.org>
Acked-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-05-13 01:33:04 -03:00
Alexander Graf
a595405df9 KVM: PPC: Destory timer on vcpu destruction
When we destory a vcpu, we should also make sure to kill all pending
timers that could still be up. When not doing this, hrtimers might
dereference null pointers trying to call our code.

This patch fixes spontanious kernel panics seen after closing VMs.

Signed-off-by: Alexander Graf <alex@csgraf.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-04-25 12:39:25 +03:00
Alexander Graf
7e821d3920 KVM: PPC: Memset vcpu to zeros
While converting the kzalloc we used to allocate our vcpu struct to
vmalloc, I forgot to memset the contents to zeros. That broke quite
a lot.

This patch memsets it to zero again.

Signed-off-by: Alexander Graf <alex@csgraf.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-04-25 12:39:21 +03:00
Alexander Graf
032c340731 KVM: PPC: Allocate vcpu struct using vmalloc
We used to use get_free_pages to allocate our vcpu struct. Unfortunately
that call failed on me several times after my machine had a big enough
uptime, as memory became too fragmented by then.

Fortunately, we don't need it to be page aligned any more! We can just
vmalloc it and everything's great.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-04-25 12:38:04 +03:00
Alexander Graf
964b6411af KVM: PPC: Simplify kvmppc_load_up_(FPU|VMX|VSX)
We don't need as complex code. I had some thinkos while writing it, figuring
I needed to support PPC32 paths on PPC64 which would have required DR=0, but
everything just runs fine with DR=1.

So let's make the functions simple C call wrappers that reserve some space on
the stack for the respective functions to clobber.

Fixes out-of-RMA-access (and thus guest FPU loading) on the PS3.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-04-25 12:38:01 +03:00
Alexander Graf
20a340abd3 KVM: PPC: Enable use of secondary htab bucket
We had code to make use of the secondary htab buckets, but kept that
disabled because it was unstable when I put it in.

I checked again if that's still the case and apparently it was only
exposing some instability that was there anyways before. I haven't
seen any badness related to usage of secondary htab entries so far.

This should speed up guest memory allocations by quite a bit, because
we now have more space to put PTEs in.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-04-25 12:37:58 +03:00
Alexander Graf
c10207fe86 KVM: PPC: Add capability for paired singles
We need to tell userspace that we can emulate paired single instructions.
So let's add a capability export.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-04-25 12:37:47 +03:00
Alexander Graf
831317b605 KVM: PPC: Implement Paired Single emulation
The one big thing about the Gekko is paired singles.

Paired singles are an extension to the instruction set, that adds 32 single
precision floating point registers (qprs), some SPRs to modify the behavior
of paired singled operations and instructions to deal with qprs to the
instruction set.

Unfortunately, it also changes semantics of existing operations that affect
single values in FPRs. In most cases they get mirrored to the coresponding
QPR.

Thanks to that we need to emulate all FPU operations and all the new paired
single operations too.

In order to achieve that, we use the just introduced FPU call helpers to
call the real FPU whenever the guest wants to modify an FPR. Additionally
we also fix up the QPR values along the way.

That way we can execute paired single FPU operations without implementing a
soft fpu.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-04-25 12:35:27 +03:00
Alexander Graf
e5c29e926c KVM: PPC: Enable program interrupt to do MMIO
When we get a program interrupt we usually don't expect it to perform an
MMIO operation. But why not? When we emulate paired singles, we can end
up loading or storing to an MMIO address - and the handling of those
happens in the program interrupt handler.

So let's teach the program interrupt handler how to deal with EMULATE_MMIO.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-04-25 12:35:24 +03:00
Alexander Graf
dba2e123e7 KVM: PPC: Fix error in BAT assignment
BATs didn't work. Well, they did, but only up to BAT3. As soon as we
came to BAT4 the offset calculation was screwed up and we ended up
overwriting BAT0-3.

Fortunately, Linux hasn't been using BAT4+. It's still a good
idea to write correct code though.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-04-25 12:35:18 +03:00
Alexander Graf
963cf3dc63 KVM: PPC: Add helpers to call FPU instructions
To emulate paired single instructions, we need to be able to call FPU
operations from within the kernel. Since we don't want gcc to spill
arbitrary FPU code everywhere, we tell it to use a soft fpu.

Since we know we can really call the FPU in safe areas, let's also add
some calls that we can later use to actually execute real world FPU
operations on the host's FPU.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-04-25 12:35:15 +03:00
Alexander Graf
aba3bd7ffe KVM: PPC: Make ext giveup non-static
We need to call the ext giveup handlers from code outside of book3s.c.
So let's make it non-static.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-04-25 12:35:12 +03:00
Alexander Graf
5467a97d0f KVM: PPC: Make software load/store return eaddr
The Book3S KVM implementation contains some helper functions to load and store
data from and to virtual addresses.

Unfortunately, this helper used to keep the physical address it so nicely
found out for us to itself. So let's change that and make it return the
physical address it resolved.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-04-25 12:35:09 +03:00
Alexander Graf
71db408936 KVM: PPC: Implement mtsr instruction emulation
The Book3S_32 specifications allows for two instructions to modify segment
registers: mtsrin and mtsr.

Most normal operating systems use mtsrin, because it allows to define which
segment it wants to change using a register. But since I was trying to run
an embedded guest, it turned out to be using mtsr with hardcoded values.

So let's also emulate mtsr. It's a valid instruction after all.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-04-25 12:35:06 +03:00
Alexander Graf
e425a6de1a KVM: PPC: Fix typo in book3s_32 debug code
There's a typo in the debug ifdef of the book3s_32 mmu emulation. While trying
to debug something I stumbled across that and wanted to save anyone after me
(or myself later) from having to debug that again.

So let's fix the ifdef.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-04-25 12:35:03 +03:00
Alexander Graf
d1bab74c51 KVM: PPC: Preload FPU when possible
There are some situations when we're pretty sure the guest will use the
FPU soon. So we can save the churn of going into the guest, finding out
it does want to use the FPU and going out again.

This patch adds preloading of the FPU when it's reasonable.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-04-25 12:34:59 +03:00
Alexander Graf
c8c0b6f2f7 KVM: PPC: Combine extension interrupt handlers
When we for example get an Altivec interrupt, but our guest doesn't support
altivec, we need to inject a program interrupt, not an altivec interrupt.

The same goes for paired singles. When an altivec interrupt arrives, we're
pretty sure we need to emulate the instruction because it's a paired single
operation.

So let's make all the ext handlers aware that they need to jump to the
program interrupt handler when an extension interrupt arrives that
was not supposed to arrive for the guest CPU.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-04-25 12:34:56 +03:00
Alexander Graf
d6d549b207 KVM: PPC: Add Gekko SPRs
The Gekko has some SPR values that differ from other PPC core values and
also some additional ones.

Let's add support for them in our mfspr/mtspr emulator.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-04-25 12:34:53 +03:00
Alexander Graf
3c402a75ea KVM: PPC: Add hidden flag for paired singles
The Gekko implements an extension called paired singles. When the guest wants
to use that extension, we need to make sure we're not running the host FPU,
because all FPU instructions need to get emulated to accomodate for additional
operations that occur.

This patch adds an hflag to track if we're in paired single mode or not.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-04-25 12:34:50 +03:00
Alexander Graf
37f5bca64e KVM: PPC: Add AGAIN type for emulation return
Emulation of an instruction can have different outcomes. It can succeed,
fail, require MMIO, do funky BookE stuff - or it can just realize something's
odd and will be fixed the next time around.

Exactly that is what EMULATE_AGAIN means. Using that flag we can now tell
the caller that nothing happened, but we still want to go back to the
guest and see what happens next time we come around.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-04-25 12:34:47 +03:00
Alexander Graf
3587d5348c KVM: PPC: Teach MMIO Signedness
The guest I was trying to get to run uses the LHA and LHAU instructions.
Those instructions basically do a load, but also sign extend the result.

Since we need to fill our registers by hand when doing MMIO, we also need
to sign extend manually.

This patch implements sign extended MMIO and the LHA(U) instructions.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-04-25 12:34:44 +03:00
Alexander Graf
b104d06632 KVM: PPC: Enable MMIO to do 64 bits, fprs and qprs
Right now MMIO access can only happen for GPRs and is at most 32 bit wide.
That's actually enough for almost all types of hardware out there.

Unfortunately, the guest I was using used FPU writes to MMIO regions, so
it ended up writing 64 bit MMIOs using FPRs and QPRs.

So let's add code to handle those odd cases too.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-04-25 12:34:41 +03:00
Takuya Yoshikawa
87bf6e7de1 KVM: fix the handling of dirty bitmaps to avoid overflows
Int is not long enough to store the size of a dirty bitmap.

This patch fixes this problem with the introduction of a wrapper
function to calculate the sizes of dirty bitmaps.

Note: in mark_page_dirty(), we have to consider the fact that
  __set_bit() takes the offset as int, not long.

Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-04-20 13:06:55 +03:00
Tejun Heo
5a0e3ad6af include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files.  percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

percpu.h -> slab.h dependency is about to be removed.  Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability.  As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.

  http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the followings.

* Scan files for gfp and slab usages and update includes such that
  only the necessary includes are there.  ie. if only gfp is used,
  gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
  blocks and try to put the new include such that its order conforms
  to its surrounding.  It's put in the include block which contains
  core kernel includes, in the same order that the rest are ordered -
  alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
  doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
  because the file doesn't have fitting include block), it prints out
  an error message indicating which .h file needs to be added to the
  file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
   over 4000 files, deleting around 700 includes and adding ~480 gfp.h
   and ~3000 slab.h inclusions.  The script emitted errors for ~400
   files.

2. Each error was manually checked.  Some didn't need the inclusion,
   some needed manual addition while adding it to implementation .h or
   embedding .c file was more appropriate for others.  This step added
   inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
   from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
   e.g. lib/decompress_*.c used malloc/free() wrappers around slab
   APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
   editing them as sprinkling gfp.h and slab.h inclusions around .h
   files could easily lead to inclusion dependency hell.  Most gfp.h
   inclusion directives were ignored as stuff from gfp.h was usually
   wildly available and often used in preprocessor macros.  Each
   slab.h inclusion directive was examined and added manually as
   necessary.

6. percpu.h was updated not to include slab.h.

7. Build test were done on the following configurations and failures
   were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
   distributed build env didn't work with gcov compiles) and a few
   more options had to be turned off depending on archs to make things
   build (like ipr on powerpc/64 which failed due to missing writeq).

   * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
   * powerpc and powerpc64 SMP allmodconfig
   * sparc and sparc64 SMP allmodconfig
   * ia64 SMP allmodconfig
   * s390 SMP allmodconfig
   * alpha SMP allmodconfig
   * um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
   a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2010-03-30 22:02:32 +09:00
Linus Torvalds
c812a51d11 Merge branch 'kvm-updates/2.6.34' of git://git.kernel.org/pub/scm/virt/kvm/kvm
* 'kvm-updates/2.6.34' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (145 commits)
  KVM: x86: Add KVM_CAP_X86_ROBUST_SINGLESTEP
  KVM: VMX: Update instruction length on intercepted BP
  KVM: Fix emulate_sys[call, enter, exit]()'s fault handling
  KVM: Fix segment descriptor loading
  KVM: Fix load_guest_segment_descriptor() to inject page fault
  KVM: x86 emulator: Forbid modifying CS segment register by mov instruction
  KVM: Convert kvm->requests_lock to raw_spinlock_t
  KVM: Convert i8254/i8259 locks to raw_spinlocks
  KVM: x86 emulator: disallow opcode 82 in 64-bit mode
  KVM: x86 emulator: code style cleanup
  KVM: Plan obsolescence of kernel allocated slots, paravirt mmu
  KVM: x86 emulator: Add LOCK prefix validity checking
  KVM: x86 emulator: Check CPL level during privilege instruction emulation
  KVM: x86 emulator: Fix popf emulation
  KVM: x86 emulator: Check IOPL level during io instruction emulation
  KVM: x86 emulator: fix memory access during x86 emulation
  KVM: x86 emulator: Add Virtual-8086 mode of emulation
  KVM: x86 emulator: Add group9 instruction decoding
  KVM: x86 emulator: Add group8 instruction decoding
  KVM: do not store wqh in irqfd
  ...

Trivial conflicts in Documentation/feature-removal-schedule.txt
2010-03-05 13:12:34 -08:00
Liu Yu
daf5e27109 KVM: ppc/booke: Set ESR and DEAR when inject interrupt to guest
Old method prematurely sets ESR and DEAR.
Move this part after we decide to inject interrupt,
which is more like hardware behave.

Signed-off-by: Liu Yu <yu.liu@freescale.com>
Acked-by: Hollis Blanchard <hollis@penguinppc.org>
Acked-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-03-01 12:36:10 -03:00
Liu Yu
da15bf436b KVM: PPC E500: fix tlbcfg emulation
commit 55fb1027c1cf9797dbdeab48180da530e81b1c39 doesn't update tlbcfg correctly.
Fix it.

And since guest OS likes 'fixed' hardware,
initialize tlbcfg everytime when guest access is useless.
So move this part to init code.

Signed-off-by: Liu Yu <yu.liu@freescale.com>
Acked-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-03-01 12:36:06 -03:00
Liu Yu
a9040f2742 KVM: PPC: Add PVR/PIR init for E500
commit 513579e3a3 change the way
we emulate PVR/PIR,
which left PVR/PIR uninitialized on E500, and make guest puzzled.

Signed-off-by: Liu Yu <yu.liu@freescale.com>
Acked-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-03-01 12:36:05 -03:00
Liu Yu
d86be077a4 KVM: PPC E500: Add register l1csr0 emulation
Latest kernel start to access l1csr0 to contron L1.
We just tell guest no operation is on going.

Signed-off-by: Liu Yu <yu.liu@freescale.com>
Acked-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-03-01 12:36:05 -03:00
Marcelo Tosatti
6474920477 KVM: fix cleanup_srcu_struct on vm destruction
cleanup_srcu_struct on VM destruction remains broken:

BUG: unable to handle kernel paging request at ffffffffffffffff
IP: [<ffffffff802533d2>] srcu_read_lock+0x16/0x21
RIP: 0010:[<ffffffff802533d2>]  [<ffffffff802533d2>] srcu_read_lock+0x16/0x21
Call Trace:
 [<ffffffffa05354c4>] kvm_arch_vcpu_uninit+0x1b/0x48 [kvm]
 [<ffffffffa05339c6>] kvm_vcpu_uninit+0x9/0x15 [kvm]
 [<ffffffffa0569f7d>] vmx_free_vcpu+0x7f/0x8f [kvm_intel]
 [<ffffffffa05357b5>] kvm_arch_destroy_vm+0x78/0x111 [kvm]
 [<ffffffffa053315b>] kvm_put_kvm+0xd4/0xfe [kvm]

Move it to kvm_arch_destroy_vm.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Reported-by: Jan Kiszka <jan.kiszka@siemens.com>
2010-03-01 12:36:01 -03:00
Alexander Graf
a76f8497fd KVM: PPC: Move Shadow MSR calculation to function
We keep a copy of the MSR around that we use when we go into the guest context.

That copy is basically the normal process MSR flags OR some allowed guest
specified MSR flags. We also AND the external providers into this, so we get
traps on FPU usage when we haven't activated it on the host yet.

Currently this calculation is part of the set_msr function that we use whenever
we set the guest MSR value. With the external providers, we also have the case
that we don't modify the guest's MSR, but only want to update the shadow MSR.

So let's move the shadow MSR parts to a separate function that we then use
whenever we only need to update it. That way we don't accidently kvm_vcpu_block
within a preempt notifier context.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-03-01 12:35:56 -03:00
Alexander Graf
f7adbba1e5 KVM: PPC: Keep SRR1 flags around in shadow_msr
SRR1 stores more information that just the MSR value. It also stores
valuable information about the type of interrupt we received, for
example whether the storage interrupt we just got was because of a
missing htab entry or not.

We use that information to speed up the exit path.

Now if we get preempted before we can interpret the shadow_msr values,
we get into vcpu_put which then calls the MSR handler, which then sets
all the SRR1 information bits in shadow_msr to 0. Great.

So let's preserve the SRR1 specific bits in shadow_msr whenever we set
the MSR. They don't hurt.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-03-01 12:35:56 -03:00
Alexander Graf
180a34d2d3 KVM: PPC: Add support for FPU/Altivec/VSX
When our guest starts using either the FPU, Altivec or VSX we need to make
sure Linux knows about it and sneak into its process switching code
accordingly.

This patch makes accesses to the above parts of the system work inside the
VM.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-03-01 12:35:52 -03:00
Alexander Graf
d5e528136c KVM: PPC: Add helper functions to call real mode loaders
Linux contains quite some bits of code to load FPU, Altivec and VSX lazily for
a task. It calls those bits in real mode, coming from an interrupt handler.

For KVM we better reuse those, so let's wrap a bit of trampoline magic around
them and then we can call them from normal module code.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-03-01 12:35:52 -03:00
Alexander Graf
4b5c9b7f9b KVM: PPC: Make large pages work
An SLB entry contains two pieces of information related to size:

  1) PTE size
  2) SLB size

The L bit defines the PTE be "large" (usually means 16MB),
SLB_VSID_B_1T defines that the SLB should span 1 GB instead of the
default 256MB.

Apparently I messed things up and just put those two in one box,
shaked it heavily and came up with the current code which handles
large pages incorrectly, because it also treats large page SLB entries
as "1TB" segment entries.

This patch splits those two features apart, making Linux guests boot
even when they have > 256MB.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-03-01 12:35:50 -03:00
Alexander Graf
5f2b105a1d KVM: PPC: Pass through program interrupts
When we get a program interrupt in guest kernel mode, we try to emulate the
instruction.

If that doesn't fail, we report to the user and try again - at the exact same
instruction pointer. So if the guest kernel really does trigger an invalid
instruction, we loop forever.

So let's better go and forward program exceptions to the guest when we don't
know the instruction we're supposed to emulate.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-03-01 12:35:49 -03:00
Alexander Graf
ff1ca3f983 KVM: PPC: Pass program interrupt flags to the guest
When we need to reinject a program interrupt into the guest, we also need to
reinject the corresponding flags into the guest.

Signed-off-by: Alexander Graf <agraf@suse.de>
Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-03-01 12:35:49 -03:00
Alexander Graf
d35feb26ef KVM: PPC: Fix HID5 setting code
The code to unset HID5.dcbz32 is broken.
This patch makes it do the right rotate magic.

Signed-off-by: Alexander Graf <agraf@suse.de>
Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-03-01 12:35:49 -03:00
Alexander Graf
25a8a02d26 KVM: PPC: Emulate trap SRR1 flags properly
Book3S needs some flags in SRR1 to get to know details about an interrupt.

One such example is the trap instruction. It tells the guest kernel that
a program interrupt is due to a trap using a bit in SRR1.

This patch implements above behavior, making WARN_ON behave like WARN_ON.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-03-01 12:35:49 -03:00
Alexander Graf
021ec9c69f KVM: PPC: Call SLB patching code in interrupt safe manner
Currently we're racy when doing the transition from IR=1 to IR=0, from
the module memory entry code to the real mode SLB switching code.

To work around that I took a look at the RTAS entry code which is faced
with a similar problem and did the same thing:

  A small helper in linear mapped memory that does mtmsr with IR=0 and
  then RFIs info the actual handler.

Thanks to that trick we can safely take page faults in the entry code
and only need to be really wary of what to do as of the SLB switching
part.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-03-01 12:35:49 -03:00
Alexander Graf
bc90923e27 KVM: PPC: Get rid of unnecessary RFI
Using an RFI in IR=1 is dangerous. We need to set two SRRs and then do an RFI
without getting interrupted at all, because every interrupt could potentially
overwrite the SRR values.

Fortunately, we don't need to RFI in at least this particular case of the code,
so we can just replace it with an mtmsr and b.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-03-01 12:35:49 -03:00
Alexander Graf
b4433a7cce KVM: PPC: Implement 'skip instruction' mode
To fetch the last instruction we were interrupted on, we enable DR in early
exit code, where we are still in a very transitional phase between guest
and host state.

Most of the time this seemed to work, but another CPU can easily flush our
TLB and HTAB which makes us go in the Linux page fault handler which totally
breaks because we still use the guest's SLB entries.

To work around that, let's introduce a second KVM guest mode that defines
that whenever we get a trap, we don't call the Linux handler or go into
the KVM exit code, but just jump over the faulting instruction.

That way a potentially bad lwz doesn't trigger any faults and we can later
on interpret the invalid instruction we fetched as "fetch didn't work".

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-03-01 12:35:48 -03:00
Alexander Graf
7e57cba060 KVM: PPC: Use PACA backed shadow vcpu
We're being horribly racy right now. All the entry and exit code hijacks
random fields from the PACA that could easily be used by different code in
case we get interrupted, for example by a #MC or even page fault.

After discussing this with Ben, we figured it's best to reserve some more
space in the PACA and just shove off some vcpu state to there.

That way we can drastically improve the readability of the code, make it
less racy and less complex.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-03-01 12:35:48 -03:00
Alexander Graf
992b5b29b5 KVM: PPC: Add helpers for CR, XER
We now have helpers for the GPRs, so let's also add some for CR and XER.

Having them in the PACA simplifies code a lot, as we don't need to care
about where to store CC or not to overflow any integers.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-03-01 12:35:47 -03:00
Alexander Graf
8e5b26b55a KVM: PPC: Use accessor functions for GPR access
All code in PPC KVM currently accesses gprs in the vcpu struct directly.

While there's nothing wrong with that wrt the current way gprs are stored
and loaded, it doesn't suffice for the PACA acceleration that will follow
in this patchset.

So let's just create little wrapper inline functions that we call whenever
a GPR needs to be read from or written to. The compiled code shouldn't really
change at all for now.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-03-01 12:35:47 -03:00
Alexander Graf
97c4cfbe89 KVM: PPC: Enable lightweight exits again
The PowerPC C ABI defines that registers r14-r31 need to be preserved across
function calls. Since our exit handler is written in C, we can make use of that
and don't need to reload r14-r31 on every entry/exit cycle.

This technique is also used in the BookE code and is called "lightweight exits"
there. To follow the tradition, it's called the same in Book3S.

So far this optimization was disabled though, as the code didn't do what it was
expected to do, but failed to work.

This patch fixes and enables lightweight exits again.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-03-01 12:35:46 -03:00
Alexander Graf
b480f780f0 KVM: PPC: Fix typo in rebolting code
When we're loading bolted entries into the SLB again, we're checking if an
entry is in use and only slbmte it when it is.

Unfortunately, the check always goes to the skip label of the first entry,
resulting in an endless loop when it actually gets triggered.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-03-01 12:35:46 -03:00
Marcelo Tosatti
79fac95ecf KVM: convert slots_lock to a mutex
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-03-01 12:35:45 -03:00
Marcelo Tosatti
f7784b8ec9 KVM: split kvm_arch_set_memory_region into prepare and commit
Required for SRCU convertion later.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-03-01 12:35:44 -03:00
Marcelo Tosatti
46a26bf557 KVM: modify memslots layout in struct kvm
Have a pointer to an allocated region inside struct kvm.

[alex: fix ppc book 3s]

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-03-01 12:35:43 -03:00
Alexander Graf
0bb1fb7178 KVM: powerpc: Remove AGGRESSIVE_DEC
Because we now emulate the DEC interrupt according to real life behavior,
there's no need to keep the AGGRESSIVE_DEC hack around.

Let's just remove it.

Signed-off-by: Alexander Graf <agraf@suse.de>
Acked-by: Acked-by: Hollis Blanchard <hollis@penguinppc.org>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-03-01 12:35:42 -03:00
Alexander Graf
7706664d39 KVM: powerpc: Improve DEC handling
We treated the DEC interrupt like an edge based one. This is not true for
Book3s. The DEC keeps firing until mtdec is issued again and thus clears
the interrupt line.

So let's implement this logic in KVM too. This patch moves the line clearing
from the firing of the interrupt to the mtdec emulation.

This makes PPC64 guests work without AGGRESSIVE_DEC defined.

Signed-off-by: Alexander Graf <agraf@suse.de>
Acked-by: Acked-by: Hollis Blanchard <hollis@penguinppc.org>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-03-01 12:35:42 -03:00
Alexander Graf
583617b786 KVM: powerpc: Move vector to irqprio resolving to separate function
We're using a switch table to find the irqprio that belongs to a specific
interrupt vector. This table is part of the interrupt inject logic.

Since we'll add a new function to stop interrupts, let's move this table
out of the injection logic into a separate function.

Signed-off-by: Alexander Graf <agraf@suse.de>
Acked-by: Acked-by: Hollis Blanchard <hollis@penguinppc.org>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-03-01 12:35:41 -03:00
Avi Kivity
50eb2a3cd0 KVM: Add KVM_MMIO kconfig item
s390 doesn't have mmio, this will simplify ifdefing it out.

Signed-off-by: Avi Kivity <avi@redhat.com>
2010-03-01 12:35:41 -03:00
David S. Miller
2bb4646fce Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2010-02-16 22:09:29 -08:00
Alexander Graf
e1f829b6f4 KVM: powerpc: Show timing option only on embedded
Embedded PowerPC KVM has an exit timing implementation to track and evaluate
how much time was spent in which exit path.

For Book3S, we don't implement it. So let's not expose it as a config option
either.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-01-25 12:26:36 -02:00
David S. Miller
51c24aaaca Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2010-01-23 00:31:06 -08:00
Michael S. Tsirkin
3a4d5c94e9 vhost_net: a kernel-level virtio server
What it is: vhost net is a character device that can be used to reduce
the number of system calls involved in virtio networking.
Existing virtio net code is used in the guest without modification.

There's similarity with vringfd, with some differences and reduced scope
- uses eventfd for signalling
- structures can be moved around in memory at any time (good for
  migration, bug work-arounds in userspace)
- write logging is supported (good for migration)
- support memory table and not just an offset (needed for kvm)

common virtio related code has been put in a separate file vhost.c and
can be made into a separate module if/when more backends appear.  I used
Rusty's lguest.c as the source for developing this part : this supplied
me with witty comments I wouldn't be able to write myself.

What it is not: vhost net is not a bus, and not a generic new system
call. No assumptions are made on how guest performs hypercalls.
Userspace hypervisors are supported as well as kvm.

How it works: Basically, we connect virtio frontend (configured by
userspace) to a backend. The backend could be a network device, or a tap
device.  Backend is also configured by userspace, including vlan/mac
etc.

Status: This works for me, and I haven't see any crashes.
Compared to userspace, people reported improved latency (as I save up to
4 system calls per packet), as well as better bandwidth and CPU
utilization.

Features that I plan to look at in the future:
- mergeable buffers
- zero copy
- scalability tuning: figure out the best threading model to use

Note on RCU usage (this is also documented in vhost.h, near
private_pointer which is the value protected by this variant of RCU):
what is happening is that the rcu_dereference() is being used in a
workqueue item.  The role of rcu_read_lock() is taken on by the start of
execution of the workqueue item, of rcu_read_unlock() by the end of
execution of the workqueue item, and of synchronize_rcu() by
flush_workqueue()/flush_work(). In the future we might need to apply
some gcc attribute or sparse annotation to the function passed to
INIT_WORK(). Paul's ack below is for this RCU usage.

(Includes fixes by Alan Cox <alan@linux.intel.com>,
David L Stevens <dlstevens@us.ibm.com>,
Chris Wright <chrisw@redhat.com>)

Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-01-15 01:43:29 -08:00
Alexander Graf
5279aeb4b9 KVM: powerpc: Fix mtsrin in book3s_64 mmu
We were shifting the Ks/Kp/N bits one bit too far on mtsrin. It took
me some time to figure that out, so I also put in some debugging and a
comment explaining the conversion.

This fixes current OpenBIOS boot on PPC64 KVM.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-12-27 13:36:34 -02:00
Benjamin Herrenschmidt
bcd6acd51f Merge commit 'origin/master' into next
Conflicts:
	include/linux/kvm.h
2009-12-09 17:14:38 +11:00
Alexander Graf
e15a113700 powerpc/kvm: Sync guest visible MMU state
Currently userspace has no chance to find out which virtual address space we're
in and resolve addresses. While that is a big problem for migration, it's also
unpleasent when debugging, as gdb and the monitor don't work on virtual
addresses.

This patch exports enough of the MMU segment state to userspace to make
debugging work and thus also includes the groundwork for migration.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-12-08 16:02:50 +11:00
Hollis Blanchard
c0a187e12d KVM: powerpc: Fix BUILD_BUG_ON condition
The old BUILD_BUG_ON implementation didn't work with __builtin_constant_p().
Fixing that revealed this test had been inverted for a long time without
anybody noticing...

Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-12-03 09:32:22 +02:00
Alexander Graf
10474ae894 KVM: Activate Virtualization On Demand
X86 CPUs need to have some magic happening to enable the virtualization
extensions on them. This magic can result in unpleasant results for
users, like blocking other VMMs from working (vmx) or using invalid TLB
entries (svm).

Currently KVM activates virtualization when the respective kernel module
is loaded. This blocks us from autoloading KVM modules without breaking
other VMMs.

To circumvent this problem at least a bit, this patch introduces on
demand activation of virtualization. This means, that instead
virtualization is enabled on creation of the first virtual machine
and disabled on destruction of the last one.

So using this, KVM can be easily autoloaded, while keeping other
hypervisors usable.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-12-03 09:32:10 +02:00
Avi Kivity
367e1319b2 KVM: Return -ENOTTY on unrecognized ioctls
Not the incorrect -EINVAL.

Signed-off-by: Avi Kivity <avi@redhat.com>
2009-12-03 09:32:08 +02:00
Benjamin Herrenschmidt
e0ea8b2c06 powerpc/kvm: Fix non-modular build
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-11-05 17:17:12 +11:00
Benjamin Herrenschmidt
d4e09f8432 Merge branch 'kvm' into next 2009-11-05 17:16:13 +11:00
Benjamin Herrenschmidt
41c8c46bfe powerpc/kvm: Remove problematic BUILD_BUG_ON statement
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-11-05 17:16:08 +11:00
Benjamin Herrenschmidt
38634e6769 powerpc/kvm: Remove problematic BUILD_BUG_ON statement
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-11-05 17:10:34 +11:00
Alexander Graf
544c6761bb Use hrtimers for the decrementer
Following S390's good example we should use hrtimers for the decrementer too!
This patch converts the timer from the old mechanism to hrtimers.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-11-05 16:51:05 +11:00
Alexander Graf
346b2762a7 Fix trace.h
It looks like the variable "pc" is defined. At least the current code always
failed on me stating that "pc" is already defined somewhere else.

Let's use _pc instead, because that doesn't collide.

Is this the right approach? Does it break on 440 too? If not, why not?

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-11-05 16:50:27 +11:00
Alexander Graf
c4f9c779f1 Include Book3s_64 target in buildsystem
Now we have everything in place to be able to build KVM, so let's add it
as config option and in the Makefile.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-11-05 16:50:26 +11:00
Alexander Graf
0186fd0373 Export KVM symbols for module
To be able to keep KVM as module, we need to export the SLB trampoline
addresses to the module, so it knows where to jump to.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-11-05 16:50:25 +11:00
Alexander Graf
513579e3a3 Add desktop PowerPC specific emulation
Little opcodes behave differently on desktop and embedded PowerPC cores.
In order to reflect those differences, let's add some #ifdef code to emulate.c.

We could probably also handle them in the core specific emulation files, but I
would prefer to reuse as much code as possible.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-11-05 16:49:57 +11:00
Alexander Graf
9a7a9b09fe Add mfdec emulation
We support setting the DEC to a certain value right now. Doing that basically
triggers the CPU local timer.

But there's also an mfdec command that enabled the OS to read the decrementor.

This is required at least by all desktop and server PowerPC Linux kernels. It
can't really hurt to allow embedded ones to do it as well though.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-11-05 16:49:56 +11:00
Alexander Graf
c215c6e49f Add book3s_64 specific opcode emulation
There are generic parts of PowerPC that can be shared across all
implementations and specific parts that only apply to BookE or desktop PPCs.

This patch adds emulation for desktop specific opcodes that don't apply
to BookE CPUs.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-11-05 16:49:55 +11:00
Alexander Graf
0123518081 Add book3s_32 guest MMU
This patch adds an implementation for a G3/G4 MMU, so we can run G3 and
G4 guests in KVM on Book3s_64.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-11-05 16:49:55 +11:00
Alexander Graf
e71b2a39af Add book3s_64 guest MMU
To be able to run a guest, we also need to implement a guest MMU.

This patch adds MMU handling for Book3s_64 guests.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-11-05 16:49:55 +11:00
Alexander Graf
0d8dc681c8 Add book3s_64 Host MMU handling
We designed the Book3S port of KVM as modular as possible. Most
of the code could be easily used on a Book3S_32 host as well.

The main difference between 32 and 64 bit cores is the MMU. To keep
things well separated, we treat the book3s_64 MMU as one possible compile
option.

This patch adds all the MMU helpers the rest of the code needs in
order to modify the host's MMU, like setting PTEs and segments.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-11-05 16:49:54 +11:00
Alexander Graf
2f4cf5e42d Add book3s.c
This adds the book3s core handling file. Here everything that is generic to
desktop PowerPC cores is handled, including interrupt injections, MSR settings,
etc.

It basically takes over the same role as booke.c for embedded PowerPCs.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-11-05 16:49:54 +11:00
Alexander Graf
c862125c8a Add interrupt handling code
Getting from host state to the guest is only half the story. We also need
to return to our host context and handle whatever happened to get us out of
the guest.

On PowerPC every guest exit is an interrupt. So all we need to do is trap
the host's interrupt handlers and get into our #VMEXIT code to handle it.

PowerPCs also have a register that can add an offset to the interrupt handlers'
adresses which is what the booke KVM code uses. Unfortunately that is a
hypervisor ressource and we also want to be able to run KVM when we're running
in an LPAR. So we have to hook into the Linux interrupt handlers.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-11-05 16:49:54 +11:00
Alexander Graf
5126ed3760 Add SLB switching code for entry/exit
This is the really low level of guest entry/exit code.

Book3s_64 has an SLB, which stores all ESID -> VSID mappings we're
currently aware of.

The segments in the guest differ from the ones on the host, so we need
to switch the SLB to tell the MMU that we're in a new context.

So we store a shadow of the guest's SLB in the PACA, switch to that on
entry and only restore bolted entries on exit, leaving the rest to the
Linux SLB fault handler.

That way we get a really clean way of switching the SLB.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-11-05 16:49:53 +11:00
Alexander Graf
29eb61bca1 Add book3s_64 highmem asm code
This is the of entry / exit code. In order to switch between host and guest
context, we need to switch register state and call the exit code handler on
exit.

This assembly file does exactly that. To finally enter the guest it calls
into book3s_64_slb.S. On exit it gets jumped at from book3s_64_slb.S too.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-11-05 16:49:53 +11:00