Whilst IOMMU is enabled for the Intel GPU on Ironlake, it appears that
using WC writes to update the PTE on the GPU fails miserably. The
result looks like the majority of the writes do not land leading to
lots of screen corruption and a hard system hang.
v2: s/</<=/ to preserve the current exclusion of Sandybridge
Reported-by: Nathan Myers <ncm@cantrip.org>
Bugzilla: https://bugzilla.freedesktop.org/show_bug.cgi?id=60391
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Tested-by: Nathan Myers <ncm@cantrip.org>
[danvet: Remove cc: stable and add tested-by.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
When I refactored the code initially, I forgot that gen2 uses a
different bar for the CPU mappable aperture. The agp-less code knows
nothing of generations less than 5, so we have to expand the gtt_probe
function to include the mappable base and end.
It was originally broken by me:
commit baa09f5fd8
Author: Ben Widawsky <ben@bwidawsk.net>
Date: Thu Jan 24 13:49:57 2013 -0800
drm/i915: Add probe and remove to the gtt ops
Reported-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
With the probe call in our dispatch table, we can now cut away the
last three remaining members in the intel_gtt shared struct and so
remove it completely.
v2: Rebased on top of Daniel's series
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
[danvet: bikeshed commit message a bit.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
It is no longer used in the i915 code, so isolate it from the shared
struct.
This was originally part of:
commit 0e275518f325418d559c05327775bff894b237f7
Author: Ben Widawsky <ben@bwidawsk.net>
Date: Mon Jan 14 13:35:33 2013 -0800
agp/intel: decouple more of the agp-i915 sharing
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
That commit had some other hunks which can't be used due to issues
Daniel found in previous commits.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
[danvet: drop squash notice from the commit since it's imo ok to keep
this one separate.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The reasoning behind our code taking two paths depending upon whether or
not we may have been configured for IOMMU isn't clear to me. It should
always be safe to use the pci mapping functions as they are designed to
abstract the decision we were handling in i915.
Aside from simpler code, removing another member for the intel_gtt
struct is a nice motivation.
I ran this by Chris, and he wasn't concerned about the extra kzalloc,
and memory references vs. page_to_phys calculation in the case without
IOMMU.
v2: Update commit message
v3: Remove needs_dmar addition from Zhenyu upstream
This reverts (and then other stuff)
commit 20652097da
Author: Zhenyu Wang <zhenyuw@linux.intel.com>
Date: Thu Dec 13 23:47:47 2012 +0800
drm/i915: Fix missed needs_dmar setting
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com> (v2)
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
[danvet: Squash in follow-up fix to remove the bogus hunk which
deleted the dma_mask configuration for gen6+.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We already had a mapping in both (minus the phys_addr in AGP).
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
And, move it to where the rest of the logic is.
There is some slight functionality changes. There was extra paranoid
checks in AGP code making sure we never do idle maps on gen2 parts. That
was not duplicated as the simple PCI id check should do the right thing.
v2: use IS_GEN5 && IS_MOBILE check instead. For now, this is the same as
IS_IRONLAKE_M but is more future proof. The workaround docs hint that
more than one platform may be effected, but we've never seen such a
platform in the wild. (Rodrigo, Daniel)
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com> (v1)
Cc: Dave Airlie <airlied@redhat.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This removes an unused field from the AGP structure and moves it into
the dev_priv structure (with a slightly better name). This builds upon
the kill-agp series already merged.
GSM is a well defined term in the bspec:
GSM: Graphics Stolen Memory
GTT stolen space is defined for storage of the GFX GTT entries in
physical memory. IA can not access GSM directly , it can only access via
GTTMMADR. GT can access GSM directly or through GTTMMADR.
This is not the entire stolen space.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
v2: Accidently removed an ILK case in i9xx_setup (Nicely found by Chris)
CC: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by [v1] : Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
As a quick hack we make the old intel_gtt structure mutable so we can
fool a bunch of the existing code which depends on elements in that data
structure. We can/should try to remove this in a subsequent patch.
This should preserve the old gtt init behavior which upon writing these
patches seems incorrect. The next patch will fix these things.
The one exception is VLV which doesn't have the preserved flush control
write behavior. Since we want to do that for all GEN6+ stuff, we'll
handle that in a later patch. Mainstream VLV support doesn't actually
exist yet anyway.
v2: Update the comment to remove the "voodoo"
Check that the last pte written matches what we readback
v3: actually kill cache_level_to_agp_type since most of the flags will
disappear in an upcoming patch
v4: v3 was actually not what we wanted (Daniel)
Make the ggtt bind assertions better and stricter (Chris)
Fix some uncaught errors at gtt init (Chris)
Some other random stuff that Chris wanted
v5: check for i==0 in gen6_ggtt_bind_object to shut up gcc (Ben)
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by [v4]: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: Make the cache_level -> agp_flags conversion for pre-gen6 a
tad more robust by mapping everything != CACHE_NONE to the cached agp
flag - we have a 1:1 uncached mapping, but different modes of
cacheable (at least on later generations). Suggested by Chris Wilson.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
It doesn't work since the gtt pte range sits in the middle of the mmio
bar. We didn't notice that since both my and Chris' gen2 machines
don't support PAT and hence all wc io mapping request will
automatically be demoted to uc.
This regression has been introduce in
commit edef7e685d
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Fri Sep 14 11:57:47 2012 +0100
agp/intel: Use a write-combining map for updating PTEs
Reported-by: Egbert Eich <eich@pdx.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=55834
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Rewriting the PTE entries using an WC mapping is roughly an order of
magnitude faster than through the uncached mapping. This makes an
observable difference on workloads that cycle through large numbers of
buffers, for example Chromium using ShmPixmaps where virtually all the
CPU time is currently spent rebinding the userptr.
v2: Limit the WC mapping to older generations as we have observed that
the TLB invalidation on SandyBridge+ is unreliable with WC updates.
See i-g-t/tests/gem_gtt_cpu_tlb
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Rather than have multiple data structures for describing our page layout
in conjunction with the array of pages, we can migrate all users over to
a scatterlist.
One major advantage, other than unifying the page tracking structures,
this offers is that we replace the vmalloc'ed array (which can be up to
a megabyte in size) with a chain of individual pages which helps reduce
memory pressure.
The disadvantage is that we then do not have a simple array to iterate,
or to access randomly. The common case for this is in the relocation
processing, which will typically fit within a single scatterlist page
and so be almost the same cost as the simple array. For iterating over
the array, the extra function call could be optimised away, but in
reality is an insignificant cost of either binding the pages, or
performing the pwrite/pread.
v2: Fix drm_clflush_sg() to not invoke wbinvd as well! And fix the
trivial compile error from rebasing.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
They've changed it ... for no apparent reason. Meh.
V2: remove unused 'is_hsw' field.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Also properly indent the HB IDs.
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
VLV is a gen7 device, but we don't currently handle that in the switch.
So add it and write the PTEs correctly.
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The PTE format is similar to SNB, but we don't support an MLC and don't
need chipset flushing.
Note: I have my questions whether this is right, given that MLC died
for snb & ivb, that ivb has grown a L3$ cache instead (which vlv seems
to have, too) and that the LLC bit here isn't actually LLC, but just
means 'snoop cpu caches'.
But I plan to burn this all with the heat of a thousands suns in my
gtt rework, so who cares ;-)
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
[danvet: Added note.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
When drm/i915 is in control of the gtt, we need to call
the enable function at all the relevant places ourselves.
Reviewed-by: Jani Nikula <jani.nikula@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We need this thing much earlier, and it doesn't make sense
in the hw enabling function intel_enable_gtt - this does not
change over a suspend/resume cycle ...
Reviewed-by: Jani Nikula <jani.nikula@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
To be able to directly set up the intel-gtt code from drm/i915 and
avoid setting up the fake-agp driver we need to prepare a few things:
- pass both the bridge and gpu pci_dev to the probe function and add
code to handle the gpu pdev both being present (for drm/i915) and
not present (fake agp).
- add refcounting to the remove function so that unloading drm/i915
doesn't kill the fake agp driver
v2: Fix up the cleanup and refcount, noticed by Jani Nikula.
Reviewed-by: Jani Nikula <jani.nikula@linux.intel.com>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We only need it to fake the agp interface and don't actually
use it in the driver anywhere. Hence conditionalize that.
This is just a prep patch to eventually disable the fake agp
driver on gen6+.
Reviewed-by: Jani Nikula <jani.nikula@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
For that to work we need to export the base address of the gtt
mmio window from intel-gtt. Also replace all other uses of
dev->agp by values we already have at hand.
Reviewed-by: Jani Nikula <jani.nikula@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This is a leftover from the conversion of the i81x fake agp driver
over to the new intel-gtt code layoute.
Reviewed-by: Jani Nikula <jani.nikula@linux.intel.com>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Daniel Vetter wrote
First pull request for 3.5-next, slightly large than usual because new
things kept coming in since the last pull for 3.4.
Highlights:
- first batch of hw enablement for vlv (Jesse et al) and hsw (Eugeni). pci
ids are not yet added, and there's still quite a few patches to merge
(mostly modesetting). To make QA easier I've decided to merge this stuff
in pieces.
- loads of cleanups and prep patches spurred by the above. Especially vlv
is a real frankenstein chip, but also hsw is stretching our driver's
code design. Expect more to come in this area for 3.5.
- more gmbus fixes, cleanups and improvements by Daniel Kurtz. Again,
there are more patches needed (and some already queued up), but I wanted
to split this a bit for better testing.
- pwrite/pread rework and retuning. This series has been in the works for
a few months already and a lot of i-g-t tests have been created for it.
Now it's finally ready to be merged. Note that one patch in this series
touches include/pagemap.h, that patch is acked-by akpm.
- reduce mappable pressure and relocation throughput improvements from
Chris.
- mmap offset exhaustion mitigation by Chris Wilson.
- a start at figuring out which codepaths in our messy dri1/ums+gem/kms
driver we actually need to support by bailing out of unsupported case.
The driver now refuses to load without kms on gen6+ and disallows a few
ioctls that userspace never used in certain cases. More of this will
definitely come.
- More decoupling of global gtt and ppgtt.
- Improved dual-link lvds detection by Takashi Iwai.
- Shut up the compiler + plus fix the fallout (Ben)
- Inverted panel brightness handling (mostly Acer manages to break things
in this way).
- Small fixlets and adjustements and some minor things to help debugging.
Regression-wise QA reported quite a few issues on ivb, but all of them
turned out to be hw stability issues which are already fixed in
drm-intel-fixes (QA runs the nightly regression tests on -next alone,
without -fixes automatically merged in). There's still one issue open on
snb, it looks like occlusion query writes are not quite as cache coherent
as we've expected. With some of the pwrite adjustements we can now
reliably hit this. Kernel workaround for it is in the works."
* 'drm-intel-next' of git://people.freedesktop.org/~danvet/drm-intel: (101 commits)
drm/i915: VCS is not the last ring
drm/i915: Add a dual link lvds quirk for MacBook Pro 8,2
drm/i915: make quirks more verbose
drm/i915: dump the DMA fetch addr register on pre-gen6
drm/i915/sdvo: Include YRPB as an additional TV output type
drm/i915: disallow gem init ioctl on ilk
drm/i915: refuse to load on gen6+ without kms
drm/i915: extract gt interrupt handler
drm/i915: use render gen to switch ring irq functions
drm/i915: rip out old HWSTAM missed irq WA for vlv
drm/i915: open code gen6+ ring irqs
drm/i915: ring irq cleanups
drm/i915: add SFUSE_STRAP registers for digital port detection
drm/i915: add WM_LINETIME registers
drm/i915: add WRPLL clocks
drm/i915: add LCPLL control registers
drm/i915: add SSC offsets for SBI access
drm/i915: add port clock selection support for HSW
drm/i915: add S PLL control
drm/i915: add PIXCLK_GATE register
...
Conflicts:
drivers/char/agp/intel-agp.h
drivers/char/agp/intel-gtt.c
drivers/gpu/drm/i915/i915_debugfs.c
This adds product definitions for desktop, mobile and server boards.
v2: split into a separate patch, add .has_pch_split feature.
Signed-off-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Totally unexpected that this regressed. Luckily it sounds like we just
need to have dmar disable on the igfx, not the entire system. At least
that's what a few days of testing between Tony Vroon and me indicates.
Reported-by: Tony Vroon <tony@linx.net>
Cc: Tony Vroon <tony@linx.net>
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=43024
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This adds PCI ID for IVB GT2 server variant which we were missing.
Signed-off-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
[danvet: fix up conflict because the patch has been diffed against next. tsk.]
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
... and bind it right to the PCI id.
Note that there are still a few things to fix here:
- we need to move the tlb flush to a better place in drm/i915.
- we need to check snoop support on vlv and implement it.
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
[danvet: squash follow-on patch and add todo items to commit msg.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We need to flush the Gunit TLB when we update GTT PTEs on VLV, but the
register for doing so is above the range we normally map. Map the whole
register space to make sure we can get it.
v2: only map the larger space on gen7+ (Daniel)
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We need this because ppgtt page directory entries need to be in the
global gtt pagetable.
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Tested-by: Chris Wilson <chris@chris-wilson.co.uk>
Tested-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
To implement a PPGTT for drm/i915 that fully aliases the GTT, we also
need to properly alias the scratch page.
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Tested-by: Chris Wilson <chris@chris-wilson.co.uk>
Tested-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Kernels with no iommu support cannot ever need the Ironlake
work-around, so never enable it in that case.
Might be better to completely remove the work-around from the kernel
in this case?
Signed-off-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
The semi-colon is a typo here and it makes the if statement
unconditional.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Keith Packard <keithp@keithp.com>
* 'drm-core-next' of git://people.freedesktop.org/~airlied/linux: (290 commits)
Revert "drm/ttm: add a way to bo_wait for either the last read or last write"
Revert "drm/radeon/kms: add a new gem_wait ioctl with read/write flags"
vmwgfx: Don't pass unused arguments to do_dirty functions
vmwgfx: Emulate depth 32 framebuffers
drm/radeon: Lower the severity of the radeon lockup messages.
drm/i915/dp: Fix eDP on PCH DP on CPT/PPT
drm/i915/dp: Introduce is_cpu_edp()
drm/i915: use correct SPD type value
drm/i915: fix ILK+ infoframe support
drm/i915: add DP test request handling
drm/i915: read full receiver capability field during DP hot plug
drm/i915/dp: Remove eDP special cases from bandwidth checks
drm/i915/dp: Fix the math in intel_dp_link_required
drm/i915/panel: Always record the backlight level again (but cleverly)
i915: Move i915_read/write out of line
drm/i915: remove transcoder PLL mashing from mode_set per specs
drm/i915: if transcoder disable fails, say which
drm/i915: set watermarks for third pipe on IVB
drm/i915: export a CPT mode set verification function
drm/i915: fix transcoder PLL select masking
...
Idle the GPU before doing any unmaps. We know if VT-d is in use through
an exported variable from iommu code.
This should avoid a known HW issue.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Keith Packard <keithp@keithp.com>
Just use the Sandy Bridge routines.
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Reviewed-by: Keith Packard <keithp@keithp.com>
Signed-off-by: Keith Packard <keithp@keithp.com>
We can only utilize the stolen portion of the GTT if we are in sole
charge of the hardware. This is only true if using GEM and KMS,
otherwise VESA continues to access stolen memory.
Reported-by: Arnd Bergmann <arnd@arndb.de>
Reported-by: Frederic Weisbecker <fweisbec@gmail.com>
Tested-by: Jiri Olsa <jolsa@redhat.com>
Tested-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
They got mixed up when the switch was converted to a table in 2007.
Signed-off-by: Oswald Buddenhagen <ossi@kde.org>
[ickle: minor changes for 2.6.37+]
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Previous to the last GTT rework we always rewrote the GTT then unmapped the
object, somehow this got reversed in the rework in 2.6.37-rc5 timeframe.
This fix needs to go to stable in an alternate form since the code changed.
This fixes DMAR reports on my Ironlake HP2540p.
Signed-off-by: Dave Airlie <airlied@redhat.com>
This code was setting up the status page before setting the DMAR-is-on-bit,
so we were getting DMAR errors on the status page. Reverse the two bits
of init code to the correct result.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Flush the chipset write buffers before and after adjusting the GTT base
register, just in case. We only modify this value upon initialisation
(boot and resume) so there should be no outstanding writes, however
there are always those persistent PGTBL_ER that keep getting reported
upon resume.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
This fixes regression from a6963596a1,
that missed to set cached memory type in GTT entry.
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Add a missing NULL check and fix the wrong address passed to kunmap()
in i830_cleanup().
Cc: stable@kernel.org
Signed-off-by: Takashi Iwai <tiwai@suse.de>
[danvet: added cc stable]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Just some minor shuffling to get rid of any agp traces in the
exported functions.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
The intel drm calls the chipset functions now directly. Userspace
never called the corresponding ioctl, hence it can be killed, too.
Cc: Dave Airlie <airlied@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>