Pull block-related fixes from Jens Axboe:
- Improvements to the buffered and direct write IO plugging from
Fengguang.
- Abstract out the mapping of a bio in a request, and use that to
provide a blk_bio_map_sg() helper. Useful for mapping just a bio
instead of a full request.
- Regression fix from Hugh, fixing up a patch that went into the
previous release cycle (and marked stable, too) attempting to prevent
a loop in __getblk_slow().
- Updates to discard requests, fixing up the sizing and how we align
them. Also a change to disallow merging of discard requests, since
that doesn't really work properly yet.
- A few drbd fixes.
- Documentation updates.
* 'for-linus' of git://git.kernel.dk/linux-block:
block: replace __getblk_slow misfix by grow_dev_page fix
drbd: Write all pages of the bitmap after an online resize
drbd: Finish requests that completed while IO was frozen
drbd: fix drbd wire compatibility for empty flushes
Documentation: update tunable options in block/cfq-iosched.txt
Documentation: update tunable options in block/cfq-iosched.txt
Documentation: update missing index files in block/00-INDEX
block: move down direct IO plugging
block: remove plugging at buffered write time
block: disable discard request merge temporarily
bio: Fix potential memory leak in bio_find_or_create_slab()
block: Don't use static to define "void *p" in show_partition_start()
block: Add blk_bio_map_sg() helper
block: Introduce __blk_segment_map_sg() helper
fs/block-dev.c:fix performance regression in O_DIRECT writes to md block devices
block: split discard into aligned requests
block: reorganize rounding of max_discard_sectors
2) additional or corrected drive quirks for ata_blacklist
3) Kconfig text tweaking
4) new PCI IDs
5) pata_atiixp: quirk for MSI motherboard
6) export ahci_dev_classify for an ahci_platform driver
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQIVAwUAUDjfByWzCDIBeCsvAQJSrA//UXV3Vfg6nnSXb714m3I1zaqBPCyl/eZF
42dt62xsykfhVJGy10Ld1Ss8glxomSHzRWgIw7PG32VOTlJ/5KWvp1ZkjtqJnJG5
7XLIhjBQWVdtiZ1RMRUZRJRCrcJ1UMXZfCEpW8PPVqjdqKrK3gxSdOJtdxfQKkS6
imRv3SHytUsxfUKtJCsIr8YnSpRbbYNz/b3qYQxYVIw6MILVQLJYP2+DHvtrDaMP
XFdoCT4PI1sIrgn2TIV1nNRP/KONHSlDKQ9IMeDGO7C0C++WYWKf9F0COnIDem3j
Oc8zAjbIRBG8pWo16qTN6D01kcww5Wqqzv0xwW8SGxFqhYeOmJyexGuya8nch5+T
GUihCo2dqsj3SiGwVnHGzTpo2QDnxpGKY6gPPmJubpkU2czCc6Fl4w/XhO1S+saC
olm2BqMNpfWErD7pDIuZIrBrCxv8UtnFK9HNfUF4OQsyHPXcDjlFfeiZBmvHolgU
+tjSSMsgBf0+QLGLgjAnA7DZ0mv5v/lQ4NlyGd+HfXWM/vyjLdU4I2zYYol2LlYg
YtEUJwgT7Qk7iDW6QSP47NzVS7f10ejdTEfE5aDmiaFfDCticfUuJ3704eAUI08E
BgoGYqIyHShOAIAZFfc8ILBvMBjPZ6SXk7jwq4Nf5zTtUP3sgy96dnWOBEl0YFLP
VXPUcodWouM=
=pjUK
-----END PGP SIGNATURE-----
Merge tag 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev
Pull libata fixes from Jeff Garzik:
- libata-acpi regression fix
- additional or corrected drive quirks for ata_blacklist
- Kconfig text tweaking
- new PCI IDs
- pata_atiixp: quirk for MSI motherboard
- export ahci_dev_classify for an ahci_platform driver
* tag 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
libata: Add a space to " 2GB ATA Flash Disk" DMA blacklist entry
[libata] new quirk, lift bridge limits for Buffalo DriveStation Quattro
[libata] Kconfig: Elaborate that SFF is meant for legacy and PATA stuff
[libata] acpi: call ata_acpi_gtm during ata port init time
ata_piix: Add Device IDs for Intel Lynx Point-LP PCH
ahci: Add Device IDs for Intel Lynx Point-LP PCH
pata_atiixp: override cable detection on MSI E350DM-E33
ahci: un-staticize ahci_dev_classify
commit d70e551c8e, Add " 2GB ATA Flash
Disk"/"ADMA428M" to DMA blacklist, should have added a space before 2GB.
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
The GPIO buttons are named SW3, SW4, SW5 and SW6 on the board
silkscreen. Update the buttons descriptions accordingly.
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Acked-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
Ben Hutchings says:
====================
Simple fix for a braino. Please also queue this for the 3.4 and 3.5
stable series.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Marc Kleine-Budde says:
====================
here are two fixes for the v3.6 release cycle. Alexey Khoroshilov submitted a
fix for a memory leak in the softing driver (in softing_load_fw()) in case a
krealloc() fails. Sven Schmitt fixed the misuse of the IRQF_SHARED flag in the
irq resouce of the sja1000 platform driver, now the correct flag is used. There
are no mainline users of this feature which need to be converted.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
John W. Linville says:
====================
This batch of fixes is intended for 3.6...
Johannes Berg gives us a pair of iwlwifi fixes. One corrects some
improperly defined ifdefs that lead to crashes and BUG_ONs. The other
prevents attempts to read SRAM for devices that aren't actually started.
Julia Lawall provides an ipw2100 fix to properly set the return code
from a function call before testing it! :-)
Thomas Huehn corrects the improper use of a constant related to a power
setting in ath5k.
Thomas Pedersen offers a mac80211 fix to properly handle destination
addresses of unicast frames passing though a mesh gate.
Vladimir Zapolskiy provides a brcmsmac fix to properly mark the
interface state when the device goes down.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The cwnd reduction in fast recovery is based on the number of packets
newly delivered per ACK. For non-sack connections every DUPACK
signifies a packet has been delivered, but the sender mistakenly
skips counting them for cwnd reduction.
The fix is to compute newly_acked_sacked after DUPACKs are accounted
in sacked_out for non-sack connections.
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Nandita Dukkipati <nanditad@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Non-root user-space processes can send Netlink messages to other
processes that are well-known for being subscribed to Netlink
asynchronous notifications. This allows ilegitimate non-root
process to send forged messages to Netlink subscribers.
The userspace process usually verifies the legitimate origin in
two ways:
a) Socket credentials. If UID != 0, then the message comes from
some ilegitimate process and the message needs to be dropped.
b) Netlink portID. In general, portID == 0 means that the origin
of the messages comes from the kernel. Thus, discarding any
message not coming from the kernel.
However, ctnetlink sets the portID in event messages that has
been triggered by some user-space process, eg. conntrack utility.
So other processes subscribed to ctnetlink events, eg. conntrackd,
know that the event was triggered by some user-space action.
Neither of the two ways to discard ilegitimate messages coming
from non-root processes can help for ctnetlink.
This patch adds capability validation in case that dst_pid is set
in netlink_sendmsg(). This approach is aggressive since existing
applications using any Netlink bus to deliver messages between
two user-space processes will break. Note that the exception is
NETLINK_USERSOCK, since it is reserved for netlink-to-netlink
userspace communication.
Still, if anyone wants that his Netlink bus allows netlink-to-netlink
userspace, then they can set NL_NONROOT_SEND. However, by default,
I don't think it makes sense to allow to use NETLINK_ROUTE to
communicate two processes that are sending no matter what information
that is not related to link/neighbouring/routing. They should be using
NETLINK_USERSOCK instead for that.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds "#ifndef __<header>_H" for protecting header from double
inclusion.
Signed-off-by: Rayagond Kokatanur <rayagond@vayavyalabs.com>
Hacked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Multicast traffic allocates dst with DST_NOCACHE, but dst is
not inserted into rt_uncached_list.
This slowdown multicast workloads on SMP because rt_uncached_lock is
contended.
Change the test before taking the lock to actually check the dst
was inserted into rt_uncached_list.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This board is incorrectly detected as having an LVDS connector,
resulting in the VGA output (the only available output on the board)
showing the console only in the top-left 1024x768 pixels, and an extra
LVDS connector appearing in X.
It's a desktop Mini-ITX board using an Atom D525 CPU with an NM10
chipset.
I've had this board for about a year, but this is the first time I
noticed the issue because I've been running it headless for most of its
life.
Signed-off-by: Calvin Walton <calvin.walton@kepstin.ca>
This reverts commit b1acf1bb54.
Something went horribly wrong when I did savedefconfig, not sure what,
but what's in there is busted so let's revert it.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
For certain speculative events on Power7, 'perf stat' reports far higher
event count than 'perf record' for the same event.
As described in following commit, a performance monitor exception is raised
even when the the performance events are rolled back.
commit 0837e3242c
Author: Anton Blanchard <anton@samba.org>
Date: Wed Mar 9 14:38:42 2011 +1100
perf_event_interrupt() records an event only when an overflow occurs. But
this check for overflow is a simple 'if (val < 0)'.
Because the events are rolled back, this check for overflow fails and the
event is not recorded. perf_event_interrupt() later uses pmc_overflow() to
detect the overflow and resets the counters and the events are lost completely.
To properly detect the overflow of rolled back events, use pmc_overflow()
even when recording events.
To reproduce:
$ cat strcpy.c
#include <stdio.h>
#include <string.h>
main()
{
char buf[256];
alarm(5);
while(1)
strcpy(buf, "string1");
}
$ perf record -e r20014 ./strcpy
$ perf report -n > report.1
$ perf stat -e r20014 > report.2
# Compare report.1 and report.2
Reported-by: Maynard Johnson <mpjohn@us.ibm.com>
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The enhanced prefetch hint patches corrupt the condition register
that was used to check if we are in interrupt. Fix this by using cr1.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
"powerpc: Use enhanced touch instructions in POWER7
copy_to_user/copy_from_user" was applied twice. Remove one.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Directly comparing current->personality against PER_LINUX32 doesn't work
in cases when any of the personality flags stored in the top three bytes
are used.
Directly forcefully setting personality to PER_LINUX32 or PER_LINUX
discards any flags stored in the top three bytes
Use personality() macro to compare only PER_MASK bytes and make sure that
we are setting only the bits that should be set, instead of overwriting
the whole value.
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Checking for device mask to cover the whole IOMMU table is too strict.
IOMMU allocators should handle mask constraint properly for each
allocation.
The patch enables to use old AirPort Extreme cards on PowerMacs with
more than 1GB of memory; without the patch the driver init fails with:
b43-pci-bridge 0001:01:01.0: Warning: IOMMU window too big for device mask
b43-pci-bridge 0001:01:01.0: mask: 0x3fffffff, table end: 0x80000000
b43-phy0 ERROR: The machine/kernel does not support the required 30-bit DMA mask
Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
For powerpc BooKE and e200, singlestep is handled on the critical/dbg
exception stack. This causes current_thread_info() to fail for kgdb
internal, so previously We work around this issue by copying
the thread_info from the kernel stack before calling kgdb_handle_exception,
and copying it back afterwards.
But actually we don't do this properly. We should backup current_thread_info
then restore that when exit.
Signed-off-by: Tiejun Chen <tiejun.chen@windriver.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We need to skip a breakpoint exception when it occurs after
a breakpoint has already been removed.
Signed-off-by: Tiejun Chen <tiejun.chen@windriver.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The kgdb_single_step flag has the possibility to indefinitely
hang the system on an SMP system.
The x86 arch have the same problem, and that problem was fixed by
commit 8097551d9ab9b9e3630(kgdb,x86: do not set kgdb_single_step
on x86). This patch does the same behaviors as x86's patch.
Signed-off-by: Dongdong Deng <dongdong.deng@windriver.com>
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Add several #includes that mpic_msgr relies on being pulled implicitly,
which only happens on certain configs.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Cc: Meador Inge <meador_inge@mentor.com>
Cc: Jia Hongtao <B38951@freescale.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Currently if you are doing a global perf recording with hardware
breakpoints (ie perf record -e mem:0xdeadbeef -a), you can oops with:
Faulting instruction address: 0xc000000000738890
cpu 0xc: Vector: 300 (Data Access) at [c0000003f76af8d0]
pc: c000000000738890: .hw_breakpoint_handler+0xa0/0x1e0
lr: c000000000738830: .hw_breakpoint_handler+0x40/0x1e0
sp: c0000003f76afb50
msr: 8000000000001032
dar: 6f0
dsisr: 42000000
current = 0xc0000003f765ac00
paca = 0xc00000000f262a00 softe: 0 irq_happened: 0x01
pid = 6810, comm = loop-read
enter ? for help
[c0000003f76afbe0] c00000000073cd04 .notifier_call_chain.isra.0+0x84/0xe0
[c0000003f76afc80] c00000000073cdbc .notify_die+0x3c/0x60
[c0000003f76afd20] c0000000000139f0 .do_dabr+0x40/0xf0
[c0000003f76afe30] c000000000005a9c handle_dabr_fault+0x14/0x48
--- Exception: 300 (Data Access) at 0000000010000480
SP (ff8679e0) is in userspace
This is because we don't check to see if the break point is associated
with task before we deference the task_struct pointer.
This changes the update to use current.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
There are a few whitespace goolies in xmon.c, some of them appear to
be my fault. Fix them all in one go.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Since the printk internals were reworked the xmon 'dl' command which
dumps the content of __log_buf has stopped working.
It is now a structured buffer, so just dumping it doesn't really work.
Use the helpers added for kgdb to print out the content.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The current layout is to place the per-process tables at the end of the
GTT. However, this is currently using a hardcoded maximum size for the GTT
and not taking in account limitations imposed by the BIOS. Use the value
for the total number of entries allocated in the table as provided by
the configuration registers.
Reported-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Ben Widawsky <ben@bwidawsk.net>
Cc: Matthew Garret <mjg@redhat.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The sja1000 platform driver wrongly assumes that a shared IRQ is indicated
with the IRQF_SHARED flag in irq resource flags. This patch changes the
driver to handle the correct flag IORESOURCE_IRQ_SHAREABLE instead.
There are no mainline users of the platform driver which wrongly make use
of IRQF_SHARED.
Signed-off-by: Sven Schmitt <sven.schmitt@volkswagen.de>
Acked-by: Yegor Yefremov <yegorslists@googlemail.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Do not leak memory by updating pointer with potentially NULL realloc return value.
Found by Linux Driver Verification project (linuxtesting.org).
Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Pull crypto fixes from Herbert Xu:
"This push fixes a build error on 32-bit archs in the hifn driver as
well as a potential deadlock in the caam driver."
* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: caam - fix possible deadlock condition
crypto: hifn_795x - fix 64bit division and undefined __divdi3 on 32bit archs
Pull UDF, ext3 & reiserfs fixes from Jan Kara:
"A couple of fixes (udf, reiserfs, ext3) that accumulated over my
vacation."
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
udf: fix retun value on error path in udf_load_logicalvol
jbd: don't write superblock when unmounting an ro filesystem
reiserfs: fix deadlocks with quotas
quota: Move down dqptr_sem read after initializing default warn[] type at __dquot_alloc_space().
UDF: During mount free lvid_bh before rescanning with different blocksize
udf: fix udf_setsize() for file data in ICB
Pull IDE power management bugfix from David S. Miller.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/ide:
ide: fix generic_ide_suspend/resume Oops
Pull x86 fixes from Ingo Molnar:
"This tree contains assorted fixlets: an alternatives patching crash
fix, an irq migration/hotplug interaction fix, a fix for large AMD
microcode images and a comment fixlet."
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86, microcode, AMD: Fix broken ucode patch size check
x86/alternatives: Fix p6 nops on non-modular kernels
x86/fixup_irq: Use cpu_online_mask instead of cpu_all_mask
x86/spinlocks: Fix comment in spinlock.h
Pull timer fixes from Thomas Gleixner:
"Mostly small fixes for the fallout of the timekeeping overhaul in 3.6
along with stable fixes to address an accumulation problem and missing
sanity checks for RTC readouts and user space provided values."
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
time: Avoid making adjustments if we haven't accumulated anything
time: Avoid potential shift overflow with large shift values
time: Fix casting issue in timekeeping_forward_now
time: Ensure we normalize the timekeeper in tk_xtime_add
time: Improve sanity checking of timekeeping inputs
Pull HID fix from Jiri Kosina:
"Fix for one particular device not being properly claimed by
hid-multitouch driver"
* 'upstream-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
HID: Remove QUANTA from special drivers list
ETHTOOL_GRXCLSRULE returns filters for a TCP/IPv4 or UDP/IPv4 4-tuple
with source and destination swapped.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
DRM_MODE() macro doesn't initialize the type of the base drm object.
When a copy is made of the mode, the object type is overwritten with
zero, and the the object can no longer be found by
drm_mode_object_find() due to the type check failing.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
If range.start or range.minlen is bigger than filesystem size, return
invalid value error. This fixes possible overflow in BTOBB macro when
passed value was nearly ULLONG_MAX.
Signed-off-by: Tomas Racek <tracek@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
Also update some commens in the area to make the code easier to read.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
If we try to print to the console device while its decoding is disabled,
the system will hang.
Reported-and-tested-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Olof Johansson <olof@lixom.net>
If the P2M revectoring would fail, we would try to continue on by
cleaning the PMD for L1 (PTE) page-tables. The xen_cleanhighmap
is greedy and erases the PMD on both boundaries. Since the P2M
array can share the PMD, we would wipe out part of the __ka
that is still used in the P2M tree to point to P2M leafs.
This fixes it by bypassing the revectoring and continuing on.
If the revector fails, a nice WARN is printed so we can still
troubleshoot this.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
When we free the PFNs and then subsequently populate them back
during bootup:
Freeing 20000-20200 pfn range: 512 pages freed
1-1 mapping on 20000->20200
Freeing 40000-40200 pfn range: 512 pages freed
1-1 mapping on 40000->40200
Freeing bad80-badf4 pfn range: 116 pages freed
1-1 mapping on bad80->badf4
Freeing badf6-bae7f pfn range: 137 pages freed
1-1 mapping on badf6->bae7f
Freeing bb000-100000 pfn range: 282624 pages freed
1-1 mapping on bb000->100000
Released 283999 pages of unused memory
Set 283999 page(s) to 1-1 mapping
Populating 1acb8a-1f20e9 pfn range: 283999 pages added
We end up having the P2M array (that is the one that was
grafted on the P2M tree) filled with IDENTITY_FRAME or
INVALID_P2M_ENTRY) entries. The patch titled
"xen/p2m: Reuse existing P2M leafs if they are filled with 1:1 PFNs or INVALID."
recycles said slots and replaces the P2M tree leaf's with
&mfn_list[xx] with p2m_identity or p2m_missing.
And re-uses the P2M array sections for other P2M tree leaf's.
For the above mentioned bootup excerpt, the PFNs at
0x20000->0x20200 are going to be IDENTITY based:
P2M[0][256][0] -> P2M[0][257][0] get turned in IDENTITY_FRAME.
We can re-use that and replace P2M[0][256] to point to p2m_identity.
The "old" page (the grafted P2M array provided by Xen) that was at
P2M[0][256] gets put somewhere else. Specifically at P2M[6][358],
b/c when we populate back:
Populating 1acb8a-1f20e9 pfn range: 283999 pages added
we fill P2M[6][358][0] (and P2M[6][358], P2M[6][359], ...) with
the new MFNs.
That is all OK, except when we revector we assume that the PFN
count would be the same in the grafted P2M array and in the
newly allocated. Since that is no longer the case, as we have
holes in the P2M that point to p2m_missing or p2m_identity we
have to take that into account.
[v2: Check for overflow]
[v3: Move within the __va check]
[v4: Fix the computation]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
We call memblock_reserve for [start of mfn list] -> [PMD aligned end
of mfn list] instead of <start of mfn list> -> <page aligned end of mfn list].
This has the disastrous effect that if at bootup the end of mfn_list is
not PMD aligned we end up returning to memblock parts of the region
past the mfn_list array. And those parts are the PTE tables with
the disastrous effect of seeing this at bootup:
Write protecting the kernel read-only data: 10240k
Freeing unused kernel memory: 1860k freed
Freeing unused kernel memory: 200k freed
(XEN) mm.c:2429:d0 Bad type (saw 1400000000000002 != exp 7000000000000000) for mfn 116a80 (pfn 14e26)
...
(XEN) mm.c:908:d0 Error getting mfn 116a83 (pfn 14e2a) from L1 entry 8000000116a83067 for l1e_owner=0, pg_owner=0
(XEN) mm.c:908:d0 Error getting mfn 4040 (pfn 5555555555555555) from L1 entry 0000000004040601 for l1e_owner=0, pg_owner=0
.. and so on.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>