Intel Merrifield is known to have an SDIO interface and on Intel Edison board a
WiFi card is wired to it.
Enable the interface here to allow WiFi card enumeration.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Refactor intel_mrfld_mmc_probe_slot() to use switch case. The change allows to
add a support for SD and SDIO interfaces without any pain.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Hisilicon Hikey have no tuning function in dw_mmc-k3.c,
so we must do the tuning function stub when we init UHS card.
Signed-off-by: Jin Guojun <kid.jin@hisilicon.com>
Signed-off-by: Jaehoon Chung <jh80.chung@samsung.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
The following log we found indicate the fact that dw_mmc
didn't treat EBE or SBE as a similar problem as CRC error.
-EIO is quite not informative as it may indicate that the device
is broken rather than that of tuning stuff.
...
[ 89.057226] bcmsdh_sdmmc: Failed to Read byte F1:@0x1001f=ff, Err: -5
[ 89.058811] bcmsdh_sdmmc: Failed to Read byte F1:@0x1001f=ff, Err: -5
[ 89.059415] bcmsdh_sdmmc: Failed to Read byte F1:@0x1000e=ff, Err: -84
[ 89.254248] dwmmc_rockchip fe310000.dwmmc: Successfully tuned phase to 199
[ 89.273912] dhd_set_suspend: Remove extra suspend setting
[ 89.274478] dhd_enable_packet_filter: enter, value = 0
64 bytes from 112.90.83.112: icmp_seq=24 ttl=53 time=1321 ms
64 bytes from 112.90.83.112: icmp_seq=25 ttl=53 time=319 ms
64 bytes from 112.90.83.112: icmp_seq=26 ttl=53 time=69.8 ms
64 bytes from 112.90.83.112: icmp_seq=27 ttl=53 time=37.5 ms
...
For the host, when failing to sample cmd's response due to
tuning stuff, we still return -EIO as it's quite vague to figure
out whether it related to signal or just the broken devices, especially
for the card type detection when booting kernel as all things go well
but the cmd set used.
But for the data phase, if receiving the cmd's response which
carriess data transfer, we should have more confidence that it
is very probably related to the tuning stuff.
Just as the log shown above, we sometimes suffer too much
this kind of pain as the dw_mmc return -EIO for the case, so
mmc-core will not do retune and caller drivers like bcm's wifi
driver, still retry the failure more and more until dw_mmc
finally generate CRC.
Adrian suggested that drivers who care the specific cases should
call mmc_retune_needed rather than doing it in mmc core. It makes
sense but I'm considering that -EILSEQ actually means illegal sequence
, so we use it for CRC cases. Meanwhile, SBE/EBE indicate the illegal
sequence of start bit or end bit for data0~7. So I realize that we should
use -EILSEQ for them both as well CRC cases.
Suggested-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
Signed-off-by: Jaehoon Chung <jh80.chung@samsung.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Dwmmc host controller may in unknown state when entering kernel boot. One
example is when booting from eMMC, bootloader need initialize MMC host
controller into some state so it can read. In order to make sure MMC host
controller in a clean initial state, this reset support is added.
With this patch, a 'resets' property can be added into dw_mmc device
tree node. The hardware logic is: dwmmc host controller IP receives a reset
signal from a 'reset provider' (eg. power management unit). The 'resets'
property points to this reset signal. So, during dwmmc driver probe,
it can use this signal to reset itself.
Refer to [1] for more information.
[1] Documentation/devicetree/bindings/reset/reset.txt
Signed-off-by: Guodong Xu <guodong.xu@linaro.org>
Signed-off-by: Xinwei Kong <kong.kongxinwei@hisilicon.com>
Signed-off-by: Zhangfei Gao <zhangfei.gao@linaro.org>
Reviewed-by: Shawn Lin <shawn.lin@rock-chips.com>
Signed-off-by: Jaehoon Chung <jh80.chung@samsung.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Add resets property to synopsys-dw-mshc bindings. It is intended to
represent the hardware reset signal present internally in some host
controller IC designs.
See Documentation/devicetree/bindings/reset/reset.txt for details.
Signed-off-by: Guodong Xu <guodong.xu@linaro.org>
Reviewed-by: Shawn Lin <shawn.lin@rock-chips.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Jaehoon Chung <jh80.chung@samsung.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
If ciu/biu clock are NULL, clk_disable_unprepare should be just
returned. In clk_disable_unprepare(), already checked whether clk is
error or NULL.
Signed-off-by: Jaehoon Chung <jh80.chung@samsung.com>
Reviewed-by: Shawn Lin <shawn.lin@rock-chips.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
The intention to remove it comes from the conflict of
what the mmc-core does with the way dw_mmc treats disable-wp.
We could see that 'disable-wp' is supported by core but
it's deprecated by dw_mmc as we don't expect it to be existed
for each slot subnode but should be in the parent node. Based
on searching for all the upstream dts using dw_mmc, we're
confident that none of them use the deprecated way. Maybe
we should take old dtb in consideration but it was a flag day
since the time we was considering to take it away. The fact is
that there are none of dts using the deprecated way since v3.18
or even earlier. So personally I don't believe the old dtb
would/could bootup current kernel(may not?). Let's remove it now.
Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
Signed-off-by: Jaehoon Chung <jh80.chung@samsung.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Chipsets before Exynos5420 did not support HS400 so if MMC core tries to
configure HS400 timing, this might or might not work. Warn in such
cases because this is DTB misconfiguration.
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Reviewed-by: Alim Akhtar <alim.akhtar@samsung.com>
Signed-off-by: Jaehoon Chung <jh80.chung@samsung.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
CMD23 aka SET_BLOCK_COUNT was introduced with MMC v3.1.
Older versions of the specification allowed to terminate
multi-block transfers only with CMD12.
The patch fixes the following problem:
mmc0: new MMC card at address 0001
mmcblk0: mmc0:0001 SDMB-16 15.3 MiB
mmcblk0: timed out sending SET_BLOCK_COUNT command, card status 0x400900
...
blk_update_request: I/O error, dev mmcblk0, sector 0
Buffer I/O error on dev mmcblk0, logical block 0, async page read
mmcblk0: unable to read partition table
Signed-off-by: Daniel Glöckner <dg@emlix.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
The commit 1ef5e49e46 ("mmc: sdhci-of-esdhc: add/remove some quirks
according to vendor version") moved sdhci-of-esdhc away from using the
->platform_init() callback.
As it was the only user of it and that it seems reasonable to believe that
it won't be needed again, let's just remove it.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
The commit 52ac7acf41 ("mmc: sdhci-pci: Convert to use managed functions
pcim_* and devm_*") converted ->probe() / ->remove() functions to use device
managed resource API. Here is a follow up to cover sdhci_pci_probe_slot() and
sdhci_pci_remove_slot().
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
When using mmc_io_rw_extended, it's intent to avoid null
pointer of card and invalid func number. But actually it
didn't prevent that as the seg_size already use the card.
Currently the wrapper function sdio_io_rw_ext_helper already
use card before calling mmc_io_rw_extended, so we should move
this check to there. As to the func number, it was token from
'(ocr & 0x70000000) >> 28' which should be enough to guarantee
that it won't be larger than 7. But we should prevent the
caller like wifi drivers modify this value. So let's move this
check into sdio_io_rw_ext_helper either.
Also we remove the BUG_ON for mmc_send_io_op_cond since all
possible paths calling this function are protected by checking
the arguments in advance. After deploying these changes, we
could not see any panic within SDIO API even if func drivers
abuse the SDIO func APIs.
Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
packed should always exist without calling its cleanup function
explicitly. Moreover, we have use it when preparing packed list.
So I don't believe we should ever fall into this check again when
doing mmc_blk_packed_hdr_wrq_prep or mmc_blk_end_packed_req,etc.
And the code of mmc_blk_end_packed_req is trying to use packed before
checking it which makes it quite weird. This patch is trying to
remove these two checks and move it to the mmc_blk_prep_packed_list.
If we find packed is null, then we should never use MMC_BLK_PACKED_CMD.
By doing this, we could fall back to non-packed request if finding null
packed, though it's impossible theoretically.
After removing these two BUG_ONs, we also remove all other similar
checks within the routine of mmc_blk_issue_rw_rq which checks the
error handling of packed request.
Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
The switch failure message in mmc_select_timing() had been removed
since that is invalid: commit 0400ed0a08 ("mmc: core: remove the
invalid message in mmc_select_timing")
Now, in the case when mmc_select_hs() return error in mmc_select_timing(),
there is nothing to print failure message.
Let's make for mmc_select_hs() print message itself in the failure case.
Signed-off-by: Jungseung Lee <js07.lee@samsung.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Populating card_busy caused a side-effect on a chip variant we don't
have documentation for (r8a73a4). So, enable it and voltage switching
only on devices known to support those features.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Fixes: 452e5eef6d ("mmc: tmio: Add UHS-I mode support")
card_busy is only used/tested on SDHI for R-Car Gen2 and later.
Move it to the SDHI driver, so we can then activate it conditionally
depending on the SDHI type.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
The sunxi-mmc driver does not take into account the processor may be big
endian when writing the DMA descriptors. This causes cards not to be
detected when running a big-endian kernel. Change the descriptors for
IDMA to use __le32 and ensure they are suitably swapped before writing.
Tested successfully on the Cubieboard2.
Signed-off-by: Michael Weiser <michael.weiser@gmx.de>
Acked-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Cc: Ulf Hansson <ulf.hansson@linaro.org>
Cc: Chen-Yu Tsai <wens@csie.org>
Cc: linux-mmc@vger.kernel.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
clk_round_rate() may return an error. Check it.
Signed-off-by: Jean-Francois Moine <moinejf@free.fr>
Acked-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
PHY intended to be used with the Arasan SDHCI 5.1 controller has trouble
turning on when the card clock is slow or off. Strangely these problems
appear to show up consistently on some boards while other boards work
fine, but on the boards where it shows up the problem reproduces 100% of
the time and is quite consistent in its behavior.
These problems can be fixed by always making sure that we power on the
PHY (and turn on its DLL) when the card clock is faster than about 50
MHz. Once on, we need to make sure that we never power down the PHY /
turn off its DLL until the clock is faster again.
We'll add logic for handling this into the sdhci-of-arasan driver. Note
that right now the only user of a PHY in the sdhci-of-arasan driver is
arasan,sdhci-5.1. It's presumed that all arasan,sdhci-5.1 PHY
implementations need this workaround, so the logic is only contingent on
having a PHY to control. If future Arasan controllers don't have this
problem we can add code to decide if we want this flow or not.
Also note that we check for slow clocks by checking for <= 400 kHz
rather than checking for 50 MHz. This keeps things the most consistent
and also means we can power the PHY on at max speed (where the DLL will
lock fastest). Presumably anyone who intends to run with a card clock
of < 50 MHz and > 400 kHz will be running on a device where this problem
is fixed anyway.
I believe this brings some resolution to the problems reported before.
See the commit 6fc09244d7 ("mmc: sdhci-of-arasan: Revert: Always power
the PHY off/on when clock changes").
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Shawn Lin <shawn.lin@rock-chips.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
wait_event_interruptible_timeout() will return early if the blocked
process receives a signal, causing the driver to abort the tuning
procedure and possibly leaving the controller in a bad state. Since the
tuning command is expected to complete quickly (<50ms) and we've set a
timeout, use wait_event_timeout() instead.
Signed-off-by: Christopher Freeman <cfreeman@nvidia.com>
Tested-by: Robert Foss <robert.foss@collabora.com>
Signed-off-by: Robert Foss <robert.foss@collabora.com>
Reviewed-by: Benson Leung <bleung@chromium.org>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
i.MX USDHC Reference Manual has a mistake, for the register SYS_CTRL,
the DTOCV(bit 19~16) means the data timeout counter value. When DTOCV
is set to 0xF, it means SDCLK << 29, not SDCLK << 28.
Signed-off-by: Haibo Chen <haibo.chen@nxp.com>
Acked-by: Dong Aisheng <aisheng.dong@nxp.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Now, when call esdhc_set_timeout() to set the data timeout counter value,
IPP_RST_N(bit 23) is wrongly affected. This patch add a mask to avoid this.
Signed-off-by: Haibo Chen <haibo.chen@nxp.com>
Acked-by: Dong Aisheng <aisheng.dong@nxp.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
The field "owner" is set by core. Thus delete an extra initialisation.
Generated by: scripts/coccinelle/api/platform_no_drv_owner.cocci
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
The SD Status register contains several important fields related to the
SD Card proprietary features.
Those fields may be used by user space applications for vendor specific
usage.
None of those fields are exported today by the driver to user space.
In this patch, we are reading the SD Status register and exporting
(using MMC_DEV_ATTR) the SD Status register to the user space.
Signed-off-by: Uri Yanai <uri.yanai@sandisk.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
kmalloc will print enough information in case of failure.
Signed-off-by: Wolfram Sang <wsa-dev@sang-engineering.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Use of_property_read_bool to check for the existence of a property.
The semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)
// <smpl>
@@
expression e1,e2;
statement S2,S1;
@@
- if (of_get_property(e1,e2,NULL))
+ if (of_property_read_bool(e1,e2))
S1 else S2
// </smpl>
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
The core MMC code adds two (optional) regulator properites that drivers
should use to get their supplies. This is not documented anywhere so add
information on it.
Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
In clk_set_rate() or clk_prepare_enable() error handling case, the
error return code ret is not set, so sdhci_bcm_kona_probe() return
0 in those error cases.
Signed-off-by: Wei Yongjun <weiyj.lk@gmail.com>
Acked-by: Ray Jui <ray.jui@broadcom.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
wait_for_completion_timeout_interruptible returns long not unsigned long
so dma_time, which is used exclusively here, is changed to long.
Fixes: 1b66e94e6b ("mmc: moxart: Add MOXA ART SD/MMC driver")
Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Some devices need a while to boot their firmware after providing clks /
de-asserting resets before they are ready to receive sdio commands.
This commits adds a post-power-on-delay-ms devicetree property to
mmc-pwrseq-simple for use with such devices.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
On some boards (android tablets) different batches use different sdio
wifi modules. This is not a problem since mmc/sdio is an enumerable bus,
so we only need to describe and activate the mmc controller in dt and
then the kernel will automatically load the right driver.
Sometimes it is useful to specify certain ethernet properties for these
"unknown" sdio devices, specifically we want the boot-loader to be able
to set "local-mac-address" as some of these sdio wifi modules come without
an eeprom / without a factory programmed mac address.
Since the exact device is unknown (differs per batch) we cannot use
a wifi-chip specific compatible, thus sometimes it is desirable to have a
mmc function node, without having to make up an otherwise unused compatible
for the node, so make the compatible property optional.
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Use the new sun7i-a20-mmc compatible for the mmc controllers on sun7i
and newer.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
It turns out that sun4i (A10) and sun5i (A13 & co) do not have sample
clocks, so add a new sun7i-a20-mmc compatible and do not try to use
sample clocks on sun4i / sun5i.
Since sun4i / sun5i do not have sample clocks, they cannot (reliably) do
DDR rates, so only set MMC_CAP_1_8V_DDR when we do have sample clks.
Note this patch leaves the clk_prepare_enable() / clk_disable_unprepare()
calls to the sample clks as-is, without adding checks for them being
NULL. All the clk_foo calls accept a NULL clk and will return success when
called with a NULL clk.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Add a sunxi_mmc_clk_set_phase() helper function.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Create a struct to hold the various model / compatible string dependend
settings.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
When support for the sample clks was added calls to prepare_enable
were added to the probe path, but matching calls to disable_unprepare
were forgotten in the remove path, this fixes this.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
some issues. This contains one fix by me and one by Al. I'm sure that
he'll come up with more but for now I tested these patches and they
don't appear to have any negative impact on tracing.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJX6FvrAAoJEKKk/i67LK/8EuIH/Arf6vJidYsmbe57WQp8PU3I
bldem6ePj6zgZ2ZqPlSGCs1J2DcK4Bh3lPVxdx7rRKVWSd/Zoj+i83hvObusR8M7
Qs1G92bJTvvVO3aPfiN0GvKGdKfGn45L+j0BcBauiTRKqnj3PkhOhIP2/ks0ewSk
qeq7R3xxo/FDs26AHS69Hm0PIIw7btyhXNX4GB3Il7IIA5/nUknw3C+bjVj86tYX
R4iElcHEhplgoSjKuLgNIRZGUnEFtsm/fnohYXpHacLTUKNXnTDY230x/OKc1yyB
1vOfHS/y5s3XSJ1lcgSjYeNc51lK8NiDASaptZSUnOookKSAooUTFELNzpbc0sg=
=+Fr3
-----END PGP SIGNATURE-----
Merge tag 'trace-v4.8-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracefs fixes from Steven Rostedt:
"Al Viro has been looking at the tracefs code, and has pointed out some
issues. This contains one fix by me and one by Al. I'm sure that
he'll come up with more but for now I tested these patches and they
don't appear to have any negative impact on tracing"
* tag 'trace-v4.8-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
fix memory leaks in tracing_buffers_splice_read()
tracing: Move mutex to protect against resetting of seq data
When building XFS with -Werror, it now fails with:
include/linux/pagemap.h: In function 'fault_in_multipages_readable':
include/linux/pagemap.h:602:16: error: variable 'c' set but not used [-Werror=unused-but-set-variable]
volatile char c;
^
This is a regression caused by commit e23d4159b1 ("fix
fault_in_multipages_...() on architectures with no-op access_ok()").
Fix it by re-adding the "(void)c" trick taht was previously used to make
the compiler think the variable is used.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The NUMA balancing logic uses an arch-specific PROT_NONE page table flag
defined by pte_protnone() or pmd_protnone() to mark PTEs or huge page
PMDs respectively as requiring balancing upon a subsequent page fault.
User-defined PROT_NONE memory regions which also have this flag set will
not normally invoke the NUMA balancing code as do_page_fault() will send
a segfault to the process before handle_mm_fault() is even called.
However if access_remote_vm() is invoked to access a PROT_NONE region of
memory, handle_mm_fault() is called via faultin_page() and
__get_user_pages() without any access checks being performed, meaning
the NUMA balancing logic is incorrectly invoked on a non-NUMA memory
region.
A simple means of triggering this problem is to access PROT_NONE mmap'd
memory using /proc/self/mem which reliably results in the NUMA handling
functions being invoked when CONFIG_NUMA_BALANCING is set.
This issue was reported in bugzilla (issue 99101) which includes some
simple repro code.
There are BUG_ON() checks in do_numa_page() and do_huge_pmd_numa_page()
added at commit c0e7cad to avoid accidentally provoking strange
behaviour by attempting to apply NUMA balancing to pages that are in
fact PROT_NONE. The BUG_ON()'s are consistently triggered by the repro.
This patch moves the PROT_NONE check into mm/memory.c rather than
invoking BUG_ON() as faulting in these pages via faultin_page() is a
valid reason for reaching the NUMA check with the PROT_NONE page table
flag set and is therefore not always a bug.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=99101
Reported-by: Trevor Saunders <tbsaunde@tbsaunde.org>
Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
- powernv/pci: Fix m64 checks for SR-IOV and window alignment from Russell Currey
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJX50CvAAoJEFHr6jzI4aWAqJAP/0/0D8YGwOuIoYD2GmfoasKR
TFbbuhX3xnfdiRG6w/sFBI3oh7icCw7hC+Qj1lNu9D3L/UkxOTBny+W07KvWzX44
Yu74nEHgq3mVrRAU4McztbKIUBK2zagGwwCcGZXZl/uQI1ylvmmpcH3xClQzF+oA
xKk8eB1OW2Ay6+y+FkSuyBHHSfww6QCk7ERPqaStCW9Uy+dDBjIwStLQuOpAhN/o
Z9K+JwpPJ8qgw1Pe9pvrD5MjcM0hR+tUZm6LklZCCk89feqlwcrz9cpOrmTdGuF+
n1iacpDaFf6IOlhI+6ImrT15llTgSk/nu9GNIRFDwOjVCuGy5aDQBtWuRFiVNggp
vkZWFSl594Jn5H9/s6MpMXygSl36NMKgM/ZKvUsEAe6mF0Kb9pZRB7b/aV+ajkCQ
rkQCe0KKSF6+D3wu3SmMe0NTc3/GkgxZN0lTnqUaB5PSRqwvVwurXugnAKr7arhj
JSu9/QSeOxNI5ytDF1Nf9/RN0DT+L1w0vun083DupyJkG1hrjzm9kI0lACQTr/QX
TxAWXGjiTsUOeM4pfNzqaJE4fNUc0TIc41jgWMx9qXzbKjhijgKEPtmyDMz93GVY
hFXyRAMsWUOsQGP5tiLFYG0PkNsmCDIwca+yg47EicBQGTpEsGLYUBRvIILYNBKI
ULl0yMLZWekl1rzthDdB
=35xQ
-----END PGP SIGNATURE-----
Merge tag 'powerpc-4.8-7' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull one more powerpc fix from Michael Ellerman:
"powernv/pci: Fix m64 checks for SR-IOV and window alignment from
Russell Currey"
* tag 'powerpc-4.8-7' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/powernv/pci: Fix m64 checks for SR-IOV and window alignment
The fixes to the radix tree test suite show that the multi-order case is
broken. The basic reason is that the radix tree code uses tagged
pointers with the "internal" bit in the low bits, and calculating the
pointer indices was supposed to mask off those bits. But gcc will
notice that we then use the index to re-create the pointer, and will
avoid doing the arithmetic and use the tagged pointer directly.
This cleans the code up, using the existing is_sibling_entry() helper to
validate the sibling pointer range (instead of open-coding it), and
using entry_to_node() to mask off the low tag bit from the pointer. And
once you do that, you might as well just use the now cleaned-up pointer
directly.
[ Side note: the multi-order code isn't actually ever used in the kernel
right now, and the only reason I didn't just delete all that code is
that Kirill Shutemov piped up and said:
"Well, my ext4-with-huge-pages patchset[1] uses multi-order entries.
It also converts shmem-with-huge-pages and hugetlb to them.
I'm okay with converting it to other mechanism, but I need
something. (I looked into Konstantin's RFC patchset[2]. It looks
okay, but I don't feel myself qualified to review it as I don't
know much about radix-tree internals.)"
[1] http://lkml.kernel.org/r/20160915115523.29737-1-kirill.shutemov@linux.intel.com
[2] http://lkml.kernel.org/r/147230727479.9957.1087787722571077339.stgit@zurg ]
Reported-by: Matthew Wilcox <mawilcox@microsoft.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Cedric Blancher <cedric.blancher@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When we replace a multiorder entry, check that all indices reflect the
new value.
Also, compile the test suite with -O2, which shows other problems with
the code due to some dodgy pointer operations in the radix tree code.
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The iter->seq can be reset outside the protection of the mutex. So can
reading of user data. Move the mutex up to the beginning of the function.
Fixes: d7350c3f45 ("tracing/core: make the read callbacks reentrants")
Cc: stable@vger.kernel.org # 2.6.30+
Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Commit 432c6bacbd ("MIPS: Use per-mm page to execute branch delay slot
instructions") accidentally removed use of the MIPS_FPU_EMU_INC_STATS
macro from do_dsemulret, leading to the ds_emul file in debugfs always
returning zero even though we perform delay slot emulations.
Fix this by re-adding the use of the MIPS_FPU_EMU_INC_STATS macro.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Fixes: 432c6bacbd ("MIPS: Use per-mm page to execute branch delay slot instructions")
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/14301/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
This patch fixes the possibility of a deadlock when bringing up
secondary CPUs.
The deadlock occurs because the set_cpu_online() is called before
synchronise_count_slave(). This can cause a deadlock if the boot CPU,
having scheduled another thread, attempts to send an IPI to the
secondary CPU, which it sees has been marked online. The secondary is
blocked in synchronise_count_slave() waiting for the boot CPU to enter
synchronise_count_master(), but the boot cpu is blocked in
smp_call_function_many() waiting for the secondary to respond to it's
IPI request.
Fix this by marking the CPU online in cpu_callin_map and synchronising
counters before declaring the CPU online and calculating the maps for
IPIs.
Signed-off-by: Matt Redfearn <matt.redfearn@imgtec.com>
Reported-by: Justin Chen <justinpopo6@gmail.com>
Tested-by: Justin Chen <justinpopo6@gmail.com>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: stable@vger.kernel.org # v4.1+
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/14302/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Pull perf fixes from Thomas Gleixner:
"Three fixlets for perf:
- add a missing NULL pointer check in the intel BTS driver
- make BTS an exclusive PMU because BTS can only handle one event at
a time
- ensure that exclusive events are limited to one PMU so that several
exclusive events can be scheduled on different PMU instances"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/core: Limit matching exclusive events to one PMU
perf/x86/intel/bts: Make it an exclusive PMU
perf/x86/intel/bts: Make sure debug store is valid