1908 commits

Author SHA1 Message Date
Philipp Reisner
48acf86898 drbd: Microfix: Assigning sector once is sufficient
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 18:38:21 +02:00
Lars Ellenberg
0f0601f4ea drbd: new configuration parameter c-min-rate
We now track the data rate of locally submitted resync related requests,
and can thus detect non-resync activity on the lower level device.

If the current sync rate is above c-min-rate, and the lower level device
appears to be busy, we throttle the resyncer.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 18:38:20 +02:00
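The throttling rule in 0f0601f4ea can be sketched as a small predicate. This is an illustration only, not the driver's code: the struct, the function names, and the `slack` threshold are assumptions made for the example. "Busy" is taken to mean that the lower-level device saw more non-resync sectors since the last check than the slack allows.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bookkeeping; the field name is illustrative only. */
struct rs_throttle {
    unsigned long last_other_io;  /* non-resync sectors seen at the last check */
};

/*
 * Non-resync activity = everything the device did minus what the
 * resyncer itself submitted. The device counts as busy if that
 * figure grew by more than `slack` sectors since the last check.
 */
static bool lower_dev_busy(struct rs_throttle *t, unsigned long dev_sectors,
                           unsigned long rs_sectors, unsigned long slack)
{
    unsigned long other;
    bool busy;

    assert(dev_sectors >= rs_sectors);
    other = dev_sectors - rs_sectors;
    busy = other - t->last_other_io > slack;
    t->last_other_io = other;
    return busy;
}

/* Throttle only when resync runs above c-min-rate AND the device is busy. */
static bool should_throttle(unsigned long sync_rate, unsigned long c_min_rate,
                            bool dev_busy)
{
    return sync_rate > c_min_rate && dev_busy;
}
```

With this shape the resyncer is never slowed below the configured minimum: once the current rate is at or below `c_min_rate`, the predicate is false regardless of other activity.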
Lars Ellenberg
80a40e439e drbd: reduce code duplication when receiving data requests
also canonicalize the return values of read_for_csum
and drbd_rs_begin_io to return -ESOMETHING, or 0 for success.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 18:38:19 +02:00
Lars Ellenberg
1d7734a0df drbd: use rolling marks for resync speed calculation
The current resync speed as displayed in /proc/drbd fluctuates a lot.
Using an array of rolling marks makes this calculation much more stable.
We used to have this (a long time ago with 0.7), but it got lost somehow.

If "stalled", do not discard the rest of the information, just add a
" (stalled)" tag to the progress line.

This patch also shortens a spinlock critical section somewhat, and
reduces the number of atomic operations in put_ldev.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 18:38:18 +02:00
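The rolling-marks idea in 1d7734a0df can be sketched as a small ring buffer of (blocks left, timestamp) pairs, with the speed measured against the oldest mark rather than the most recent one. This is a minimal sketch under assumptions: the struct, the function names, and the mark count of 8 are hypothetical and not taken from the driver.

```c
#include <assert.h>
#include <stddef.h>

#define SYNC_MARKS 8  /* assumed number of marks kept */

struct rs_marks {
    unsigned long left[SYNC_MARKS];  /* blocks still out of sync at each mark */
    unsigned long time[SYNC_MARKS];  /* timestamp of each mark, in seconds */
    size_t last;                     /* index of the most recent mark */
};

/* Start of resync: every mark holds the initial state. */
static void rs_marks_init(struct rs_marks *m, unsigned long total,
                          unsigned long now)
{
    for (size_t i = 0; i < SYNC_MARKS; i++) {
        m->left[i] = total;
        m->time[i] = now;
    }
    m->last = 0;
}

/* Record a new mark, overwriting the oldest one. */
static void rs_marks_advance(struct rs_marks *m, unsigned long left,
                             unsigned long now)
{
    m->last = (m->last + 1) % SYNC_MARKS;
    m->left[m->last] = left;
    m->time[m->last] = now;
}

/* Blocks per second, averaged over the window back to the oldest mark. */
static unsigned long rs_speed(const struct rs_marks *m, unsigned long left,
                              unsigned long now)
{
    size_t oldest = (m->last + 1) % SYNC_MARKS;
    unsigned long dt = now - m->time[oldest];

    assert(m->left[oldest] >= left);
    if (dt == 0)
        dt = 1;  /* avoid dividing by zero right after a mark */
    return (m->left[oldest] - left) / dt;
}
```

Averaging over the whole window is what smooths the displayed value: a single slow or fast interval shifts the result by only a fraction of its size.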
Lars Ellenberg
0bb70bf601 drbd: remove outdated comment and dead code
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 18:38:17 +02:00
Lars Ellenberg
c36c3ced69 drbd: let drbd_free_ee implicitly free any digest
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 18:38:16 +02:00
Philipp Reisner
85719573dd drbd: Replaced some casts with a union. Improved comments
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 18:38:15 +02:00
Philipp Reisner
d207450cf2 drbd: Bugfix: rs_in_flight could become wrong if read_for_csum() requested reschedule later
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 18:38:14 +02:00
Philipp Reisner
778f271dfe drbd: The new, smarter resync speed controller
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 18:38:14 +02:00
Philipp Reisner
8e26f9ccb9 drbd: New sync_param packet, that includes the parameters of the new controller
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 18:38:13 +02:00
Philipp Reisner
9a31d7164d drbd: New sync parameters for the smart resync rate controller
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 18:38:12 +02:00
Lars Ellenberg
d28fd092a5 drbd: fix list corruption (recent regression)
The commit 288f422ec1
 drbd: Track all IO requests on the TL, not writes only
moved a list_add_tail(req, ) into a region where req
may have just been freed due to conflict detection.

Fix this by adding a proper cleanup section for that code path.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 18:31:43 +02:00
Philipp Reisner
e756414f7d drbd: Initialize all members of sync_conf to their defaults [Bugz 315]
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 15:12:07 +02:00
Philipp Reisner
6709893059 drbd: Make sure tl_restart(, resend) can not get called multiple times for a new connection
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 15:09:09 +02:00
Philipp Reisner
f70b351159 drbd: Do not try to free tl_hash in drbd_disconnect() when IO is suspended
We may not free tl_hash when IO is suspended, since we can not wait
until ap_bio_cnt reaches zero.

We can do this after susp reached 0, since by then tl_clear was called.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 15:08:27 +02:00
Philipp Reisner
8f488156c0 drbd: Allow attach while IO is suspended
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 15:05:32 +02:00
Philipp Reisner
cfa03415a1 drbd: Allow tl_restart() to do IO completion while IO is suspended
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 15:05:08 +02:00
Philipp Reisner
84dfb9f564 drbd: Fixed a deadlock, probably only affected UP machines
After a disconnect (most likely mdev->net_cnt == 0) we may still be
in an unstable state (!drbd_state_is_stable()). When we then get
an IO request, drbd_get_max_buffers() (called from
__inc_ap_bio_cond(), called from inc_ap_bio()) wakes up
misc_wait. misc_wait is also used in inc_ap_bio() to sleep
until the outcome of __inc_ap_bio_cond() changes. => Busy loop!

Solution: Have a dedicated wait queue for get_net_conf() and
put_net_conf().

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 15:04:46 +02:00
Philipp Reisner
65d922c33e drbd: Do not do a hard state change when establishing a connection [bugz 304]
Make sure the state engine can refuse a connection between two primaries

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 15:04:10 +02:00
Philipp Reisner
481c6f5032 drbd: Ensure that the peer was not rebooted in the meantime before resending TL
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 15:01:37 +02:00
Philipp Reisner
43a5182ccc drbd: Delayed creation of current-UUID
When a fencing policy of "resource-and-stonith" is configured,
and DRBD loses connection to its peer, we can delay the
creation of a new current-UUID until IO gets thawed.

That allows one to deploy fence-peer handlers that actually
commit suicide on the machine they are started on.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 14:59:21 +02:00
Philipp Reisner
87f7be4cf8 drbd: Run the fence-peer helper asynchronously
Since we can not thaw the transfer log, the next logical step is
to allow reconnects while the fence-peer handler runs.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 14:58:36 +02:00
Philipp Reisner
1616a25493 drbd: Reduce the verbosity of some state transitions
State transitions in the space of non-allowed states used
to be very noisy. Reduce that, since that has little value
for the majority of the user base.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 14:57:22 +02:00
Philipp Reisner
999122bc18 drbd: Removing a by now obsolete clause in the state sanitizing
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 14:56:50 +02:00
Philipp Reisner
18a50fa213 drbd: Now we need to handle the ed_uuid of a diskless, unconnected primary correctly
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 14:56:00 +02:00
Philipp Reisner
894c6a9461 drbd: Disabled the crashed_primary detection for re-attach of last data while IO is frozen
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 14:55:11 +02:00
Philipp Reisner
47ff2d0a8e drbd: Do not allow a fencing-policy of resource-and-stonith with protocol A
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 14:53:42 +02:00
Philipp Reisner
265be2d098 drbd: Finished the "on-no-data-accessible suspend-io;" functionality
When no data is accessible (no connection to the peer, nor a local disk)
allow the user to select to freeze all IO operations instead of getting
IO errors.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 14:52:53 +02:00
Philipp Reisner
905cd7d8ac drbd: Removed redundant error checks in the request code path
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 14:39:38 +02:00
Philipp Reisner
5ba82308ea drbd: factored drbd_req_make_private_bio() out of drbd_req_new()
Preparing tl_thaw_dio()

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 14:37:33 +02:00
Philipp Reisner
b9b98716f8 drbd: Do not send two barriers without any writes between them
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 14:36:51 +02:00
Philipp Reisner
11b58e73a3 drbd: factored tl_restart() out of tl_clear().
If IO was frozen for a temporary network outage, resend the
content of the transfer-log into the newly established connection.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 14:35:58 +02:00
Philipp Reisner
2a80699f80 drbd: mod_req has now a return value
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 14:26:45 +02:00
Philipp Reisner
288f422ec1 drbd: Track all IO requests on the TL, not writes only
With that the drbd_fail_pending_reads() function becomes obsolete.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 14:25:20 +02:00
Philipp Reisner
7e602c0aaf drbd: renamed drbd_tl_epoch.n_req to drbd_tl_epoch.n_writes
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 14:23:45 +02:00
Dan Carpenter
93055c3104 ps3disk: passing wrong variable to bvec_kunmap_irq()
This should pass "buf" to bvec_kunmap_irq() instead of "bv".  The api is
like kmap_atomic() instead of kmap().

Signed-off-by: Dan Carpenter <error27@gmail.com>
Acked-by: Geoff Levand <geoff@infradead.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-10-12 18:56:33 +02:00
Mike Snitzer
e4c4776dea virtio-blk: fix request leak.
Must drop reference taken by blk_make_request().

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: stable@kernel.org # .35.x
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-09 11:42:37 -07:00
Arnd Bergmann
2a48fc0ab2 block: autoconvert trivial BKL users to private mutex
The block device drivers have all gained new lock_kernel
calls from a recent pushdown, and some of the drivers
were already using the BKL before.

This turns the BKL into a set of per-driver mutexes.
Still need to check whether this is safe to do.

file=$1
name=$2
if grep -q lock_kernel ${file} ; then
    if grep -q 'include.*linux.mutex.h' ${file} ; then
            sed -i '/include.*<linux\/smp_lock.h>/d' ${file}
    else
            sed -i 's/include.*<linux\/smp_lock.h>.*$/include <linux\/mutex.h>/g' ${file}
    fi
    sed -i ${file} \
        -e "/^#include.*linux.mutex.h/,$ {
                1,/^\(static\|int\|long\)/ {
                     /^\(static\|int\|long\)/istatic DEFINE_MUTEX(${name}_mutex);

} }"  \
    -e "s/\(un\)*lock_kernel\>[ ]*()/mutex_\1lock(\&${name}_mutex)/g" \
    -e '/[      ]*cycle_kernel_lock();/d'
else
    sed -i -e '/include.*\<smp_lock.h\>/d' ${file}  \
                -e '/cycle_kernel_lock()/d'
fi

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2010-10-05 15:01:10 +02:00
Arnd Bergmann
613655fa39 drivers: autoconvert trivial BKL users to private mutex
All these files use the big kernel lock in a trivial
way to serialize their private file operations,
typically resulting from an earlier semi-automatic
pushdown from VFS.

None of these drivers appears to want to lock against
other code, and they all use the BKL as the top-level
lock in their file operations, meaning that there
is no lock-order inversion problem.

Consequently, we can remove the BKL completely,
replacing it with a per-file mutex in every case.
Using a scripted approach means we can avoid
typos.

These drivers do not seem to be under active
maintenance from my brief investigation. Apologies
to those maintainers that I have missed.

file=$1
name=$2
if grep -q lock_kernel ${file} ; then
    if grep -q 'include.*linux.mutex.h' ${file} ; then
            sed -i '/include.*<linux\/smp_lock.h>/d' ${file}
    else
            sed -i 's/include.*<linux\/smp_lock.h>.*$/include <linux\/mutex.h>/g' ${file}
    fi
    sed -i ${file} \
        -e "/^#include.*linux.mutex.h/,$ {
                1,/^\(static\|int\|long\)/ {
                     /^\(static\|int\|long\)/istatic DEFINE_MUTEX(${name}_mutex);

} }"  \
    -e "s/\(un\)*lock_kernel\>[ ]*()/mutex_\1lock(\&${name}_mutex)/g" \
    -e '/[      ]*cycle_kernel_lock();/d'
else
    sed -i -e '/include.*\<smp_lock.h\>/d' ${file}  \
                -e '/cycle_kernel_lock()/d'
fi

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2010-10-05 15:01:04 +02:00
Dan Rosenberg
252a52aa4f Fix pktcdvd ioctl dev_minor range check
The PKT_CTRL_CMD_STATUS device ioctl retrieves a pointer to a
pktcdvd_device from the global pkt_devs array.  The index into this
array is provided directly by the user and is a signed integer, so the
comparison to ensure that it falls within the bounds of this array will
fail when provided with a negative index.

This can be used to read arbitrary kernel memory or cause a crash due to
an invalid pointer dereference.  This can be exploited by users with
permission to open /dev/pktcdvd/control (on many distributions, this is
readable by group "cdrom").

Signed-off-by: Dan Rosenberg <dan.j.rosenberg@gmail.com>
[ Rather than add a cast, just make the function take the right type -Linus ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-09-27 16:29:06 -07:00
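The failure mode described in 252a52aa4f can be reproduced in a few lines of user-space C. The names and the table size below are made up for the illustration and are not the pktcdvd code; the point is only that an upper-bound check on a signed index accepts negative values, while the same check on an unsigned parameter rejects them, which is the shape of the fix noted by Linus.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DEVS 8  /* assumed table size for the illustration */

static int dev_table[MAX_DEVS];

/* Buggy shape: a negative index passes an upper-bound-only check. */
static int index_ok_signed(int dev_minor)
{
    return dev_minor < MAX_DEVS;
}

/*
 * Fixed shape: with an unsigned parameter, a negative value passed by
 * the caller converts to a very large number and fails the same check.
 */
static int index_ok_unsigned(unsigned int dev_minor)
{
    return dev_minor < MAX_DEVS;
}

/* Lookup that returns NULL instead of indexing outside the table. */
static int *find_dev(unsigned int dev_minor)
{
    return index_ok_unsigned(dev_minor) ? &dev_table[dev_minor] : NULL;
}
```

Making the function take the right type keeps the bounds check a single comparison and removes the need for a cast at every call site.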
Vivek Goyal
504c6d1b44 amiga floppy: Compile failure fixes
o Compile fixes for amiga floppy driver.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-09-26 12:23:25 +09:00
Vivek Goyal
639e2f2aa7 atari floppy: Stop sharing request queue across multiple gendisks
o Use one request queue per gendisk instead of sharing the queue.

o Don't have hardware. No compile testing or run time testing done. Completely
  untested.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-09-24 20:35:45 +02:00
Vivek Goyal
786029ff81 amiga floppy: Stop sharing request queue across multiple gendisks
o Use one request queue per gendisk instead of sharing request queue

o Don't have hardware. No compile testing or run time testing done. Completely
  untested.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-09-24 20:35:44 +02:00
Jens Axboe
488211844e floppy: switch to one queue per drive instead of sharing a queue
Pretty straightforward conversion. Note that we now round-robin
between the drives that have available requests; before, we simply
used the drive that the IO scheduler told us to. Since the IO
scheduler doesn't care about multiple devices per queue, the resulting
sort would not have made sense.

Fixed by Vivek to get rid of a double lock problem in set_next_request()

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
2010-09-22 09:32:36 +02:00
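The round-robin selection mentioned in 488211844e can be sketched as follows. This is a simplified stand-in, not the floppy driver's code: the function name and the `pending` array (a count of queued requests per drive) are assumptions for the example.

```c
#include <assert.h>

/*
 * Return the next drive after `last` that has pending requests,
 * wrapping around, or -1 if no drive has work. Starting the scan
 * one past `last` is what gives each drive its turn.
 */
static int pick_next_drive(const int pending[], int n, int last)
{
    assert(n > 0 && last >= -1 && last < n);

    for (int step = 1; step <= n; step++) {
        int d = (last + step) % n;

        if (pending[d] > 0)
            return d;
    }
    return -1;
}
```

Passing `last = -1` starts the scan at drive 0, which is convenient for the very first request.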
Dan Carpenter
b0722cb1ac cciss: freeing uninitialized data on error path
The "h->scatter_list" is allocated inside a for loop.  If any of those
allocations fail, then the rest of the list is uninitialized data.  When
we free it we should start from the top and free backwards so that we
don't call kfree() on uninitialized pointers.

Also if the allocation for "h->scatter_list" fails then we would get an
Oops here.  I should have noticed this when I sent: 4ee69851c "cciss:
handle allocation failure."  but I didn't.  Sorry about that.

Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-09-21 11:49:17 +02:00
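The cleanup pattern described in b0722cb1ac, freeing backwards from the point of failure so that uninitialized slots are never touched, can be sketched in user-space C. The allocator hook (`demo_alloc`, `fail_at`) exists only so the error path can be exercised; it and the other names are hypothetical, not the cciss code.

```c
#include <assert.h>
#include <stdlib.h>

static int fail_at = -1;  /* hypothetical hook: which allocation should fail */
static int alloc_count;   /* number of allocations attempted so far */

static void *demo_alloc(size_t size)
{
    if (alloc_count++ == fail_at)
        return NULL;
    return malloc(size);
}

/* Allocate n buffers; on failure, free only the ones already allocated. */
static void **alloc_lists(size_t n, size_t size)
{
    void **lists = demo_alloc(n * sizeof(*lists));
    size_t i;

    if (!lists)
        return NULL;  /* the array itself failed: nothing to walk */

    for (i = 0; i < n; i++) {
        lists[i] = demo_alloc(size);
        if (!lists[i])
            goto cleanup;
    }
    return lists;

cleanup:
    /* Slots i..n-1 were never written; walk back from i-1 down to 0. */
    while (i-- > 0)
        free(lists[i]);
    free(lists);
    return NULL;
}

static void free_lists(void **lists, size_t n)
{
    if (!lists)
        return;
    while (n-- > 0)
        free(lists[n]);
    free(lists);
}
```

The outer array is deliberately allocated without zeroing, so the slots past the failure point hold indeterminate values, which is the situation the commit message warns about.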
Christoph Hellwig
dd3932eddf block: remove BLKDEV_IFL_WAIT
All the blkdev_issue_* helpers can only sanely be used for synchronous
callers.  To issue cache flushes or barriers asynchronously the caller needs
to set up a bio by itself with a completion callback to move the asynchronous
state machine ahead.  So drop the BLKDEV_IFL_WAIT flag that is always
specified when calling blkdev_issue_* and also remove the now unused flags
argument to blkdev_issue_flush and blkdev_issue_zeroout.  For
blkdev_issue_discard we need to keep it for the secure discard flag, which
gains a more descriptive name and loses the bitops vs flag confusion.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-09-16 20:52:58 +02:00
Martin K. Petersen
c8bf133682 Consolidate min_not_zero
We have several users of min_not_zero, each of them using their own
definition.  Move the define to kernel.h.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <axboe@carl.home.kernel.dk>
2010-09-10 20:07:38 +02:00
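The semantics of min_not_zero from c8bf133682 can be shown with a plain function. The kernel helper is a type-generic macro; the fixed-type function below, with its hypothetical name `min_not_zero_ul`, is a simplified stand-in that only illustrates the behaviour: zero means "not set", so it never wins against a non-zero value.

```c
#include <assert.h>

/*
 * Smaller of x and y, ignoring a zero operand.
 * If both are zero the result is zero.
 */
static unsigned long min_not_zero_ul(unsigned long x, unsigned long y)
{
    if (x == 0)
        return y;
    if (y == 0)
        return x;
    return x < y ? x : y;
}
```

This is useful when combining limits from two sources where 0 stands for "no limit configured": the result is the tighter of the limits that were actually set.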
Linus Torvalds
ff3cb3fec3 Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
  block: Range check cpu in blk_cpu_to_group
  scatterlist: prevent invalid free when alloc fails
  writeback: Fix lost wake-up shutting down writeback thread
  writeback: do not lose wakeup events when forking bdi threads
  cciss: fix reporting of max queue depth since init
  block: switch s390 tape_block and mg_disk to elevator_change()
  block: add function call to switch the IO scheduler from a driver
  fs/bio-integrity.c: return -ENOMEM on kmalloc failure
  bio-integrity.c: remove dependency on __GFP_NOFAIL
  BLOCK: fix bio.bi_rw handling
  block: put dev->kobj in blk_register_queue fail path
  cciss: handle allocation failure
  cfq-iosched: Documentation help for new tunables
  cfq-iosched: blktrace print per slice sector stats
  cfq-iosched: Implement tunable group_idle
  cfq-iosched: Do group share accounting in IOPS when slice_idle=0
  cfq-iosched: Do not idle if slice_idle=0
  cciss: disable doorbell reset on reset_devices
  blkio: Fix return code for mkdir calls
2010-09-10 07:26:27 -07:00
Tejun Heo
02c42b7a68 virtio_blk: drop REQ_HARDBARRIER support
Remove now unused REQ_HARDBARRIER support.  virtio_blk already
supports REQ_FLUSH and the usefulness of REQ_FUA for virtio_blk is
questionable at this point, so there's nothing else to do to support
new REQ_FLUSH/FUA interface.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-09-10 12:35:37 +02:00
Tejun Heo
6259f28459 block/loop: implement REQ_FLUSH/FUA support
Deprecate REQ_HARDBARRIER and implement REQ_FLUSH/FUA instead.  Also,
instead of checking file->f_op->fsync() directly, look at the return value
of vfs_fsync() and ignore an -EINVAL return.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-09-10 12:35:37 +02:00