Commit graph

286587 commits

Author SHA1 Message Date
Greg Kroah-Hartman
e032d80774 mce: fix warning messages about static struct mce_device
When suspending, there was a large list of warnings going something like:

	Device 'machinecheck1' does not have a release() function, it is broken and must be fixed

This patch turns the static mce_devices into dynamically allocated, and
properly frees them when they are removed from the system.  It solves
the warning messages on my laptop here.

Reported-by: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Tested-by: Djalal Harouni <tixxdz@opendz.org>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Borislav Petkov <bp@amd64.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-01-16 17:08:42 -08:00
Linus Torvalds
5b3fcfed35 Merge branch 'fixes' of git://ftp.arm.linux.org.uk/pub/linux/arm/kernel/git-cur/linux-arm
* 'fixes' of git://ftp.arm.linux.org.uk/pub/linux/arm/kernel/git-cur/linux-arm:
  ARM: sa11x0: assabet: fix build warning
  ARM: Add arm_memblock_steal() to allocate memory away from the kernel
  ARM: 7275/1: LPAE: Check the CPU support for the long descriptor format
  ARM: 7274/1: NUC900: Rename nuc900-audio platform device to nuc900-ac97
  ARM: 7272/1: S3C24XX: Fix build error for missing <mach/system-reset.h>
  ARM: 7271/1: Fix typo in conversion of ARCH_NR_GPIOS to Kconfig
2012-01-16 15:34:44 -08:00
Linus Torvalds
a12587b003 NFS client bugfixes and cleanups for Linux 3.3 (pull 2)
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJPFJ45AAoJEGcL54qWCgDyy9QP/2LmBpYyu3L6vJKRDPxHO9/A
 geLuJsc+m+NU7V7k2H2PWj5TfLPyY33+FEdR0+iqmYVdCK+cb/fvXjeRmhwaFE78
 uoOs9x4LYtMUN+OCOdPJaNc8vosBxHfD9GE3Qb6KFKT4/2pjRu7kjbZvYEyufDhO
 FrPkSiJv5VUgV+xWsgp2C0FSbBFUQ4iGZ3nPeqJCLAT+1QEaJd3PIaJdUepGCi/X
 wakUGAWGKA/sWDcJgBbovVddB8j6FINmTrFme3ZSYqkOOF9mdmzxa/tvhFMjr/17
 sE2L3BHqQioRNXfjY4Ex3kGdVbuIVIIOSNAYeUysuunTSGsub0Ou+UJpj/CtNhur
 auZhjeZQrQUEd1j4Zb4knE3PyjF/GT1RV1CstmdDqty9+BPrzZdx8Rfj61IlLWEz
 b4DYT2HYxXifH5ZbDwA/uyApNKg95kjkp/OLGxwF640gIKXoeKo3xfJO+syQ5wKH
 AdBuTV0vPvkMwdY0H64IHghs47y7A3M/ZbK1jyAkaTzkq15GPqp49ojwFLc3UESb
 JiV/66hmmIJCAqEMu8OncHNtzsrZ2ZO5fqPSjhcrc4CEVCeqNy8WwP8she6BeR2U
 EwaWIqBs1zt5A6dZE8yxdu2Mvai3oRo/ioa6bPwxCAuy7o6/p2x/39TLe27fL18z
 1Qss51gQfFWe8lM8Rhj+
 =beF6
 -----END PGP SIGNATURE-----

Merge tag 'nfs-for-3.3-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

NFS client bugfixes and cleanups for Linux 3.3 (pull 2)

* tag 'nfs-for-3.3-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
  pnfsblock: alloc short extent before submit bio
  pnfsblock: remove rpc_call_ops from struct parallel_io
  pnfsblock: move find lock page logic out of bl_write_pagelist
  pnfsblock: cleanup bl_mark_sectors_init
  pnfsblock: limit bio page count
  pnfsblock: don't spinlock when freeing block_dev
  pnfsblock: clean up _add_entry
  pnfsblock: set read/write tk_status to pnfs_error
  pnfsblock: acquire im_lock in _preload_range
  NFS4: fix compile warnings in nfs4proc.c
  nfs: check for integer overflow in decode_devicenotify_args()
  NFS: cleanup endian type in decode_ds_addr()
  NFS: add an endian notation
2012-01-16 15:08:13 -08:00
Linus Torvalds
adfeb6e9f4 Merge branch 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging
* 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging:
  hwmon: (sysfs-interface) Update tempX_type attribute to be more generic
  hwmon: (adm1031) Fix coding style issues
  hwmon: (it87) Add IT8728F support
  hwmon: (coretemp) Add missing section annotations
  hwmon: (lm90) Add range check to set_update_interval
  hwmon: (lm63) Support extended lookup table of LM96163
  hwmon: (lm63) Expose automatic fan speed control lookup table
  hwmon: (lm63) Fix incorrect comment about I2C address
  hwmon: (lm63) LM64 has a dedicated pin for tachometer
  hwmon: (lm63) Add sensor type attribute for external sensor on LM96163
  hwmon: (lm63) Add support for update_interval sysfs attribute
  hwmon: (lm63) Add support for writing the external critical temperature
  hwmon: (lm63) Add support for unsigned upper temperature limits
  hwmon: (lm63) Add support for LM96163
  hwmon: (lm63) Add support for external temperature offset register
  hwmon: (lm63) Fix checkpatch errors
  hwmon: (max1111) Change sysfs interface to in[0-3]_input in millivolts
2012-01-16 15:07:46 -08:00
Linus Torvalds
97740400bc Merge branch 'pm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
* 'pm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  PM / Hibernate: Drop the check of swap space size for compressed image
  PM / shmobile: fix A3SP suspend method
  PM / Domains: Skip governor functions for CONFIG_PM_RUNTIME unset
  PM / Domains: Fix build for CONFIG_PM_SLEEP unset
  PM: Make sysrq-o be available for CONFIG_PM unset
2012-01-16 15:02:30 -08:00
Linus Torvalds
408e057870 Merge branch 'misc' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild
* 'misc' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
  scripts/coccinelle: improve the coverage of some semantic patches
  coccinelle: semantic patches related to devm_ functions (part 2)
  coccinelle: semantic patches related to devm_ functions (part 1)
  coccinelle.txt: update documentation to include M= option
  coccicheck: add M= option to control which dir is processed
  ctags: remove struct forward declarations
  scripts/tags.sh: Add Page flag function magic
2012-01-16 14:36:07 -08:00
Linus Torvalds
287b901dca Merge branch 'kconfig' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild
* 'kconfig' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
  menuconfig: fix a regression when canceling the prompt dialog at exit
  kbuild: Fix compiler warning with assertion when calling 'fwrite'
  Improve update-po-config output
  menuconfig: let make not report error when not save configuration
  merge_config.sh: fix bug in final check
  merge_config.sh: whitespace cleanup
  merge_config.sh: use signal names compatible with dash and bash
  kconfig: add merge_config.sh script
  kconfig: use xfwrite wrapper function to silence warnings
  kconfig: fix set but not used warnings
  kconfig: fix warnings by specifing format arguments
2012-01-16 14:35:34 -08:00
Linus Torvalds
c63dbbd526 Merge branch 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild
* 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
  Kbuild: Use dtc's -d (dependency) option
  dtc: Implement -d option to write out a dependency file
  kbuild: Fix comment in Makefile.lib
  scripts/genksyms: clean lex/yacc generated files
  kbuild: Correctly deal with make options which contain an "s"
2012-01-16 14:34:54 -08:00
Russell King
a61c2332f8 ARM: sa11x0: assabet: fix build warning
Since a32618d2 (ARM: pgtable: switch to use pgtable-nopud.h), assabet
warns as follows:

arch/arm/mach-sa1100/assabet.c: In function 'map_sa1100_gpio_regs':
arch/arm/mach-sa1100/assabet.c:264: warning: passing argument 1 of 'pmd_offset' from incompatible pointer type

Fix this by adding the necessary pud_offset() macro.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2012-01-16 22:25:29 +00:00
Guenter Roeck
5f8b1f877e hwmon: (sysfs-interface) Update tempX_type attribute to be more generic
The temp[1-*]_type attribute reports the temperature sensor type. Sensor type 1
is described as "PII/Celeron Diode", which is quite restrictive; other CPUs
may also have an embedded temperature sensor diode with similar characteristics.
Change description to "CPU embedded diode" to be more generic.

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
2012-01-16 22:51:48 +01:00
Guenter Roeck
1c720093f6 hwmon: (adm1031) Fix coding style issues
Fix almost all coding style issues except for the multi-line macro errors,
which do not really apply since the macros are not multi-line statements
but declarations.

Based on merged patch series from Zac Storer; fixed remaining checkpatch
errors and warnings.

Cc: Zac Storer <zac.3.14159@gmail.com>
Signed-off-by: Guenter Roeck <guenter.roeck@ericsson.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
2012-01-16 22:51:48 +01:00
Jean Delvare
16b5dda22e hwmon: (it87) Add IT8728F support
Until we get a datasheet for the IT8728F, treat it as fully compatible
with the IT8721F, as it seems to work reasonably well.

This closes kernel bug #27262.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Acked-by: Guenter Roeck <guenter.roeck@ericsson.com>
2012-01-16 22:51:48 +01:00
Jean Delvare
d6db23c7ce hwmon: (coretemp) Add missing section annotations
Many functions in the coretemp driver lack a proper section
annotation. Add them to let the kernel free the memory after
initialization when possible.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: Durgadoss R <durgadoss.r@intel.com>
Acked-by: Guenter Roeck <guenter.roeck@ericsson.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
2012-01-16 22:51:47 +01:00
Guenter Roeck
6b101116ae hwmon: (lm90) Add range check to set_update_interval
When writing the update_interval attribute, the parameter value was
not range checked, which could cause an integer overflow and result
in an arbitrary update interval. Fix by limiting the value range to
<0, 100000>.

Reported-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
2012-01-16 22:51:47 +01:00
Jean Delvare
2fe28ab51d hwmon: (lm63) Support extended lookup table of LM96163
The LM96163 has an extended lookup table with 12 entries instead of 8,
add support for that.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Tested-by: Guenter Roeck <guenter.roeck@ericsson.com>
Acked-by: Guenter Roeck <guenter.roeck@ericsson.com>
2012-01-16 22:51:47 +01:00
Jean Delvare
d216f6809e hwmon: (lm63) Expose automatic fan speed control lookup table
The LM63 and compatible devices have a lookup table to control the fan
speed automatically. Expose it in sysfs. Values are cached for 5
seconds, independently of the other register values to avoid slowing
down "sensors". We might make the table values writable in the future.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Tested-by: Guenter Roeck <guenter.roeck@ericsson.com>
Acked-by: Guenter Roeck <guenter.roeck@ericsson.com>
2012-01-16 22:51:47 +01:00
Jean Delvare
d93ab78070 hwmon: (lm63) Fix incorrect comment about I2C address
What was true of the LM63 doesn't apply to the LM64.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Acked-by: Guenter Roeck <guenter.roeck@ericsson.com>
2012-01-16 22:51:47 +01:00
Jean Delvare
409c0b5bdf hwmon: (lm63) LM64 has a dedicated pin for tachometer
On the LM64, the tachometer function has a dedicated pin and fan speed
monitoring is always enabled.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Acked-by: Guenter Roeck <guenter.roeck@ericsson.com>
2012-01-16 22:51:46 +01:00
Guenter Roeck
f496b2d4f1 hwmon: (lm63) Add sensor type attribute for external sensor on LM96163
On LM96163, the external temperature sensor type is configurable to
either a thermal diode or a 3904 transistor. The chip reports a wrong
temperature if misconfigured. Add writable attribute to support it.

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
2012-01-16 22:51:46 +01:00
Guenter Roeck
04738b2b2f hwmon: (lm63) Add support for update_interval sysfs attribute
The update interval is configurable on LM63 and compatibles. Add
support for it.

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
2012-01-16 22:51:46 +01:00
Guenter Roeck
94e55df48a hwmon: (lm63) Add support for writing the external critical temperature
On LM64, the external critical temperature limit is always writable. On LM96163,
it is writable if the chip is configured for it. Add conditional support for
writing the register dependent on chip type and configuration.

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
2012-01-16 22:51:46 +01:00
Guenter Roeck
e872c91e72 hwmon: (lm63) Add support for unsigned upper temperature limits
LM96163 supports unsigned upper limits for the external temperature sensor.
Add support for it.

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
2012-01-16 22:51:46 +01:00
Guenter Roeck
210961c436 hwmon: (lm63) Add support for LM96163
LM96163 is an enhanced version of LM63 with improved PWM resolution. Add chip
detection code as well as support for improved PWM resolution if the chip is
configured to use it.

Signed-off-by: Guenter Roeck <guenter.roeck@ericsson.com>
Tested-by: Thierry Reding <thierry.reding@avionic-design.de>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
2012-01-16 22:51:45 +01:00
Guenter Roeck
786375f729 hwmon: (lm63) Add support for external temperature offset register
LM63 and compatibles support a temperature offset register for the external
temperature sensor. Add support for it.

Signed-off-by: Guenter Roeck <guenter.roeck@ericsson.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
2012-01-16 22:51:45 +01:00
Guenter Roeck
662bda2832 hwmon: (lm63) Fix checkpatch errors
Signed-off-by: Guenter Roeck <guenter.roeck@ericsson.com>
Tested-by: Thierry Reding <thierry.reding@avionic-design.de>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
2012-01-16 22:51:45 +01:00
Eric Miao
012f3b9118 hwmon: (max1111) Change sysfs interface to in[0-3]_input in millivolts
This patch fixed the inconsistent max1111 sysfs interface as pointed
out by Jean Delvare:

    It was pointed to me that the max1111 driver doesn't implement the
    standard sysfs interface for hwmon drivers (as described in
    Documentation/hwmon/sysfs-interface). It exports files adc[0-3]_in,
    which
    aren't part of the standard interface. Presumably these should be
    renamed to in[0-3]_input. Renaming them is probably not sufficient
    though, as I see no scaling done in the driver. As the MAX1111 chip has
    a documented full scale of 2.048V, I take it that the LSB of the ADC
    has a weight of 8 mV. Exporting raw register values to user-space is
    not OK.

Reported-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Eric Miao <eric.y.miao@gmail.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
2012-01-16 22:51:45 +01:00
Chris Mason
96bdc7dc61 Btrfs: use larger system chunks
system chunks by default are very small.  This makes them slightly
larger and also fixes the conditional checks to make sure we don't
allocate a billion of them at once.

Signed-off-by: Chris Mason <chris.mason@oracle.com>
2012-01-16 15:38:24 -05:00
Josef Bacik
f248679e86 Btrfs: add a delalloc mutex to inodes for delalloc reservations
I was using i_mutex for this, but we're getting bogus lockdep warnings by doing
that and theres no real way to get rid of those, so just stop using i_mutex to
protect delalloc metadata reservations and use a delalloc mutex instead.  This
shouldn't be contended often at all, only if you are writing and mmap writing to
the file at the same time.  Thanks,

Signed-off-by: Josef Bacik <josef@redhat.com>
2012-01-16 15:29:43 -05:00
Josef Bacik
8c2a3ca20f Btrfs: space leak tracepoints
This in addition to a script in my btrfs-tracing tree will help track down space
leaks when we're getting space left over in block groups on umount.  Thanks,

Signed-off-by: Josef Bacik <josef@redhat.com>
2012-01-16 15:29:43 -05:00
Josef Bacik
90290e1982 Btrfs: protect orphan block rsv with spin_lock
We've been seeing warnings coming out of the orphan commit stuff forever from
ceph.  Turns out it's because we're racing with checking if the orphan block
reserve is set, because we clear it outside of the spin_lock.  So leave the
normal fastpath checks where they are, but take the spin_lock and _recheck_ to
make sure we haven't had an orphan block rsv added in the meantime.  Then clear
the root's orphan block rsv and release the lock.  With this patch a user said
the warnings went away and they usually showed up pretty soon after he started
ceph.  Thanks,

Signed-off-by: Josef Bacik <josef@redhat.com>
2012-01-16 15:29:42 -05:00
Josef Bacik
3f7de037fb Btrfs: add allocator tracepoints
I used these tracepoints when figuring out what the cluster stuff was doing, so
add them to mainline in case we need to profile this stuff again.  Thanks,

Signed-off-by: Josef Bacik <josef@redhat.com>
2012-01-16 15:29:42 -05:00
Josef Bacik
45a8090e62 Btrfs: don't call btrfs_throttle in file write
Btrfs_throttle will make us wait if there is a currently committing transaction
until we can open new transactions, which is ridiculous since we don't actually
start any transactions within the file write path anyway, so all this does is
introduce big latencies if we have a sync/fsync heavy workload going on while
somebody else is trying to do work.  Thanks,

Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
2012-01-16 15:28:55 -05:00
Josef Bacik
ec39e180fd Btrfs: release space on error in page_mkwrite
If updating the inode gave us an ENOSPC we were just returning in page_mkwrite,
which is a problem since we make our reservation right before trying to update
the inode, so fix the out label so that we actually free our reservation.
Thanks,

Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
2012-01-16 15:28:54 -05:00
Miao Xie
f70a9a6b94 Btrfs: fix btrfsck error 400 when truncating a compressed
Reproduce steps:
 # mkfs.btrfs /dev/sdb5
 # mount /dev/sdb5 -o compress=lzo /mnt
 # dd if=/dev/zero of=/mnt/tmpfile bs=128K count=1
 # sync
 # truncate -s 64K /mnt/tmpfile
 root 5 inode 257 errors 400

This is because of the wrong if condition, which is used to check if we should
subtract the bytes of the dropped range from i_blocks/i_bytes of i-node or not.
When we truncate a compressed extent, btrfs substracts the bytes of the whole
extent, it's wrong. We should substract the real size that we truncate, no
matter it is a compressed extent or not. Fix it.

Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
2012-01-16 15:28:54 -05:00
Josef Bacik
7ad85bb76a Btrfs: do not use btrfs_end_transaction_throttle everywhere
A user reported a problem where things like open with O_CREAT would take up to
30 seconds when he had nfs activity on the same mount.  This is because all of
our quick metadata operations, like create, symlink etc all do
btrfs_end_transaction_throttle, which if the transaction is blocked will wait
for the commit to complete before it returns.  This adds a ridiculous amount of
latency and isn't really needed.  The normal btrfs_end_transaction will mark the
transaction as blocked and wake the transaction kthread up if it thinks the
transaction needs to end (this being in the running out of global reserve space
scenario), and this is all that is really needed since we've already done
everything we're going to do, we just need to return.  This should help people
with the latency they were seeing when using synchronous heavy workloads.
Thanks,

Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
2012-01-16 15:28:54 -05:00
Chris Mason
c126dea771 Merge branch 'integrity-check-patch-v2' of git://btrfs.giantdisaster.de/git/btrfs into integration
Conflicts:
	fs/btrfs/ctree.h
	fs/btrfs/super.c

Signed-off-by: Chris Mason <chris.mason@oracle.com>
2012-01-16 15:27:58 -05:00
Chris Mason
9785dbdf26 Merge branch 'for-chris' of git://git.jan-o-sch.net/btrfs-unstable into integration 2012-01-16 15:26:31 -05:00
Chris Mason
d756bd2d93 Merge branch 'for-chris' of git://repo.or.cz/linux-btrfs-devel into integration
Conflicts:
	fs/btrfs/volumes.c

Signed-off-by: Chris Mason <chris.mason@oracle.com>
2012-01-16 15:26:17 -05:00
Chris Mason
27263e2832 Merge branch 'restriper' of git://github.com/idryomov/btrfs-unstable into integration 2012-01-16 15:26:02 -05:00
Chris Mason
64e05503ab Merge branch 'allocation-fixes' into integration 2012-01-16 15:25:42 -05:00
Ilya Dryomov
19a39dce3b Btrfs: add balance progress reporting
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2012-01-16 22:04:49 +02:00
Ilya Dryomov
de322263d3 Btrfs: allow for resuming restriper after it was paused
Recognize BTRFS_BALANCE_RESUME flag passed from userspace.  We use the
same heuristics used when recovering balance after a crash to try to
start where we left off last time.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2012-01-16 22:04:49 +02:00
Ilya Dryomov
a7e99c691a Btrfs: allow for canceling restriper
Implement an ioctl for canceling restriper.  Currently we wait until
relocation of the current block group is finished, in future this can be
done by triggering a commit.  Balance item is deleted and no memory
about the interrupted balance is kept.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2012-01-16 22:04:49 +02:00
Ilya Dryomov
837d5b6e46 Btrfs: allow for pausing restriper
Implement an ioctl for pausing restriper.  This pauses the relocation,
but balance is still considered to be "in progress": balance item is
not deleted, other volume operations cannot be started, etc.  If paused
in the middle of profile changing operation we will continue making
allocations with the target profile.

Add a hook to close_ctree() to pause restriper and free its data
structures on unmount.  (It's safe to unmount when restriper is in
"paused" state, we will resume with the same parameters on the next
mount)

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2012-01-16 22:04:49 +02:00
Ilya Dryomov
9555c6c180 Btrfs: add skip_balance mount option
Since restriper kthread starts involuntarily on mount and can suck cpu
and memory bandwidth add a mount option to forcefully skip it.  The
restriper in that case hangs around in paused state and can be resumed
from userspace when it's convenient.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2012-01-16 22:04:48 +02:00
Ilya Dryomov
596410151e Btrfs: recover balance on mount
On mount, if balance item is found, resume balance in a separate
kernel thread.

Try to be smart to continue roughly where previous balance (or convert)
was interrupted.  For chunk types that were being converted to some
profile we turn on soft convert, in case of a simple balance we turn on
usage filter and relocate only less-than-90%-full chunks of that type.
These are just heuristics but they help quite a bit, and can be improved
in future.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2012-01-16 22:04:48 +02:00
Ilya Dryomov
0940ebf6b9 Btrfs: save balance parameters to disk
Introduce a new btree objectid for storing balance item.  The reason is
to be able to resume restriper after a crash with the same parameters.
Balance item has a very high objectid and goes into tree of tree roots.

The key for the new item is as follows:

	[ BTRFS_BALANCE_OBJECTID ; BTRFS_BALANCE_ITEM_KEY ; 0 ]

Older kernels simply ignore it so it's safe to mount with an older
kernel and then go back to the newer one.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2012-01-16 22:04:48 +02:00
Ilya Dryomov
cfa4c961cc Btrfs: soft profile changing mode (aka soft convert)
When doing convert from one profile to another if soft mode is on
restriper won't touch chunks that already have the profile we are
converting to.  This is useful if e.g. half of the FS was converted
earlier.

The soft mode switch is (like every other filter) per-type.  This means
that we can convert for example meta chunks the "hard" way while
converting data chunks selectively with soft switch.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2012-01-16 22:04:48 +02:00
Ilya Dryomov
e4d8ec0f65 Btrfs: implement online profile changing
Profile changing is done by launching a balance with
BTRFS_BALANCE_CONVERT bits set and target fields of respective
btrfs_balance_args structs initialized.  Profile reducing code in this
case will pick restriper's target profile if it's available instead of
doing a blind reduce.  If target profile is not yet available it goes
back to a plain reduce.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2012-01-16 22:04:48 +02:00
Ilya Dryomov
70922617b0 Btrfs: do not reduce profile in do_chunk_alloc()
Every caller of do_chunk_alloc() feeds it the reduced allocation
profile, so stop trying to reduce it one more time.  Instead check the
validity of the passed profile.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2012-01-16 22:04:48 +02:00