linux/drivers/md
Guoqing Jiang f6766ff6af md: don't flush workqueue unconditionally in md_open
We need to check mddev->del_work before flush workqueu since the purpose
of flush is to ensure the previous md is disappeared. Otherwise the similar
deadlock appeared if LOCKDEP is enabled, it is due to md_open holds the
bdev->bd_mutex before flush workqueue.

kernel: [  154.522645] ======================================================
kernel: [  154.522647] WARNING: possible circular locking dependency detected
kernel: [  154.522650] 5.6.0-rc7-lp151.27-default #25 Tainted: G           O
kernel: [  154.522651] ------------------------------------------------------
kernel: [  154.522653] mdadm/2482 is trying to acquire lock:
kernel: [  154.522655] ffff888078529128 ((wq_completion)md_misc){+.+.}, at: flush_workqueue+0x84/0x4b0
kernel: [  154.522673]
kernel: [  154.522673] but task is already holding lock:
kernel: [  154.522675] ffff88804efa9338 (&bdev->bd_mutex){+.+.}, at: __blkdev_get+0x79/0x590
kernel: [  154.522691]
kernel: [  154.522691] which lock already depends on the new lock.
kernel: [  154.522691]
kernel: [  154.522694]
kernel: [  154.522694] the existing dependency chain (in reverse order) is:
kernel: [  154.522696]
kernel: [  154.522696] -> #4 (&bdev->bd_mutex){+.+.}:
kernel: [  154.522704]        __mutex_lock+0x87/0x950
kernel: [  154.522706]        __blkdev_get+0x79/0x590
kernel: [  154.522708]        blkdev_get+0x65/0x140
kernel: [  154.522709]        blkdev_get_by_dev+0x2f/0x40
kernel: [  154.522716]        lock_rdev+0x3d/0x90 [md_mod]
kernel: [  154.522719]        md_import_device+0xd6/0x1b0 [md_mod]
kernel: [  154.522723]        new_dev_store+0x15e/0x210 [md_mod]
kernel: [  154.522728]        md_attr_store+0x7a/0xc0 [md_mod]
kernel: [  154.522732]        kernfs_fop_write+0x117/0x1b0
kernel: [  154.522735]        vfs_write+0xad/0x1a0
kernel: [  154.522737]        ksys_write+0xa4/0xe0
kernel: [  154.522745]        do_syscall_64+0x64/0x2b0
kernel: [  154.522748]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
kernel: [  154.522749]
kernel: [  154.522749] -> #3 (&mddev->reconfig_mutex){+.+.}:
kernel: [  154.522752]        __mutex_lock+0x87/0x950
kernel: [  154.522756]        new_dev_store+0xc9/0x210 [md_mod]
kernel: [  154.522759]        md_attr_store+0x7a/0xc0 [md_mod]
kernel: [  154.522761]        kernfs_fop_write+0x117/0x1b0
kernel: [  154.522763]        vfs_write+0xad/0x1a0
kernel: [  154.522765]        ksys_write+0xa4/0xe0
kernel: [  154.522767]        do_syscall_64+0x64/0x2b0
kernel: [  154.522769]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
kernel: [  154.522770]
kernel: [  154.522770] -> #2 (kn->count#253){++++}:
kernel: [  154.522775]        __kernfs_remove+0x253/0x2c0
kernel: [  154.522778]        kernfs_remove+0x1f/0x30
kernel: [  154.522780]        kobject_del+0x28/0x60
kernel: [  154.522783]        mddev_delayed_delete+0x24/0x30 [md_mod]
kernel: [  154.522786]        process_one_work+0x2a7/0x5f0
kernel: [  154.522788]        worker_thread+0x2d/0x3d0
kernel: [  154.522793]        kthread+0x117/0x130
kernel: [  154.522795]        ret_from_fork+0x3a/0x50
kernel: [  154.522796]
kernel: [  154.522796] -> #1 ((work_completion)(&mddev->del_work)){+.+.}:
kernel: [  154.522800]        process_one_work+0x27e/0x5f0
kernel: [  154.522802]        worker_thread+0x2d/0x3d0
kernel: [  154.522804]        kthread+0x117/0x130
kernel: [  154.522806]        ret_from_fork+0x3a/0x50
kernel: [  154.522807]
kernel: [  154.522807] -> #0 ((wq_completion)md_misc){+.+.}:
kernel: [  154.522813]        __lock_acquire+0x1392/0x1690
kernel: [  154.522816]        lock_acquire+0xb4/0x1a0
kernel: [  154.522818]        flush_workqueue+0xab/0x4b0
kernel: [  154.522821]        md_open+0xb6/0xc0 [md_mod]
kernel: [  154.522823]        __blkdev_get+0xea/0x590
kernel: [  154.522825]        blkdev_get+0x65/0x140
kernel: [  154.522828]        do_dentry_open+0x1d1/0x380
kernel: [  154.522831]        path_openat+0x567/0xcc0
kernel: [  154.522834]        do_filp_open+0x9b/0x110
kernel: [  154.522836]        do_sys_openat2+0x201/0x2a0
kernel: [  154.522838]        do_sys_open+0x57/0x80
kernel: [  154.522840]        do_syscall_64+0x64/0x2b0
kernel: [  154.522842]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
kernel: [  154.522844]
kernel: [  154.522844] other info that might help us debug this:
kernel: [  154.522844]
kernel: [  154.522846] Chain exists of:
kernel: [  154.522846]   (wq_completion)md_misc --> &mddev->reconfig_mutex --> &bdev->bd_mutex
kernel: [  154.522846]
kernel: [  154.522850]  Possible unsafe locking scenario:
kernel: [  154.522850]
kernel: [  154.522852]        CPU0                    CPU1
kernel: [  154.522853]        ----                    ----
kernel: [  154.522854]   lock(&bdev->bd_mutex);
kernel: [  154.522856]                                lock(&mddev->reconfig_mutex);
kernel: [  154.522858]                                lock(&bdev->bd_mutex);
kernel: [  154.522860]   lock((wq_completion)md_misc);
kernel: [  154.522861]
kernel: [  154.522861]  *** DEADLOCK ***
kernel: [  154.522861]
kernel: [  154.522864] 1 lock held by mdadm/2482:
kernel: [  154.522865]  #0: ffff88804efa9338 (&bdev->bd_mutex){+.+.}, at: __blkdev_get+0x79/0x590
kernel: [  154.522868]
kernel: [  154.522868] stack backtrace:
kernel: [  154.522873] CPU: 1 PID: 2482 Comm: mdadm Tainted: G           O      5.6.0-rc7-lp151.27-default #25
kernel: [  154.522875] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
kernel: [  154.522878] Call Trace:
kernel: [  154.522881]  dump_stack+0x8f/0xcb
kernel: [  154.522884]  check_noncircular+0x194/0x1b0
kernel: [  154.522888]  ? __lock_acquire+0x1392/0x1690
kernel: [  154.522890]  __lock_acquire+0x1392/0x1690
kernel: [  154.522893]  lock_acquire+0xb4/0x1a0
kernel: [  154.522895]  ? flush_workqueue+0x84/0x4b0
kernel: [  154.522898]  flush_workqueue+0xab/0x4b0
kernel: [  154.522900]  ? flush_workqueue+0x84/0x4b0
kernel: [  154.522905]  ? md_open+0xb6/0xc0 [md_mod]
kernel: [  154.522908]  md_open+0xb6/0xc0 [md_mod]
kernel: [  154.522910]  __blkdev_get+0xea/0x590
kernel: [  154.522912]  ? bd_acquire+0xc0/0xc0
kernel: [  154.522914]  blkdev_get+0x65/0x140
kernel: [  154.522916]  ? bd_acquire+0xc0/0xc0
kernel: [  154.522918]  do_dentry_open+0x1d1/0x380
kernel: [  154.522921]  path_openat+0x567/0xcc0
kernel: [  154.522923]  ? __lock_acquire+0x380/0x1690
kernel: [  154.522926]  do_filp_open+0x9b/0x110
kernel: [  154.522929]  ? __alloc_fd+0xe5/0x1f0
kernel: [  154.522935]  ? kmem_cache_alloc+0x28c/0x630
kernel: [  154.522939]  ? do_sys_openat2+0x201/0x2a0
kernel: [  154.522941]  do_sys_openat2+0x201/0x2a0
kernel: [  154.522944]  do_sys_open+0x57/0x80
kernel: [  154.522946]  do_syscall_64+0x64/0x2b0
kernel: [  154.522948]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
kernel: [  154.522951] RIP: 0033:0x7f98d279d9ae

And md_alloc also flushed the same workqueue, but the thing is different
here. Because all the paths call md_alloc don't hold bdev->bd_mutex, and
the flush is necessary to avoid race condition, so leave it as it is.

Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
Signed-off-by: Song Liu <songliubraving@fb.com>
2020-05-13 11:22:31 -07:00
..
bcache bcache: remove a duplicate ->make_request_fn assignment 2020-04-25 09:45:43 -06:00
persistent-data dm space map common: fix to ensure new block isn't already in use 2020-01-14 20:15:53 -05:00
dm-bio-prison-v1.c dm bio prison: replace spin_lock_irqsave with spin_lock_irq 2019-11-05 14:53:03 -05:00
dm-bio-prison-v1.h
dm-bio-prison-v2.c dm bio prison v2: use true/false for bool variable 2020-01-07 12:07:08 -05:00
dm-bio-prison-v2.h
dm-bio-record.h dm bio record: save/restore bi_end_io and bi_integrity 2020-03-03 10:02:46 -05:00
dm-bufio.c dm bufio: introduce a global cache replacement 2019-09-13 17:00:21 -04:00
dm-builtin.c
dm-cache-background-tracker.c
dm-cache-background-tracker.h
dm-cache-block-types.h
dm-cache-metadata.c
dm-cache-metadata.h
dm-cache-policy-internal.h
dm-cache-policy-smq.c
dm-cache-policy.c
dm-cache-policy.h
dm-cache-target.c dm: bump version of core and various targets 2020-03-03 11:10:21 -05:00
dm-clone-metadata.c dm clone metadata: Fix return type of dm_clone_nr_of_hydrated_regions() 2020-03-27 14:42:51 -04:00
dm-clone-metadata.h dm clone metadata: Fix return type of dm_clone_nr_of_hydrated_regions() 2020-03-27 14:42:51 -04:00
dm-clone-target.c dm clone metadata: Fix return type of dm_clone_nr_of_hydrated_regions() 2020-03-27 14:42:51 -04:00
dm-core.h
dm-crypt.c dm crypt: use crypt_integrity_aead() helper 2020-03-24 11:17:33 -04:00
dm-delay.c
dm-dust.c dm dust: change ret to r in dust_map_write 2020-01-07 11:43:36 -05:00
dm-era-target.c
dm-exception-store.c
dm-exception-store.h
dm-flakey.c block: rework zone reporting 2019-11-12 19:12:07 -07:00
dm-init.c docs: device-mapper: move it to the admin-guide 2019-07-15 11:03:01 -03:00
dm-integrity.c dm integrity: fix logic bug in integrity tag testing 2020-04-03 13:07:41 -04:00
dm-io.c
dm-ioctl.c dm: introduce DM_GET_TARGET_VERSION 2019-09-16 10:18:01 -04:00
dm-kcopyd.c dm kcopyd: always complete failed jobs 2019-08-15 15:57:39 -04:00
dm-linear.c dm,dax: Add dax zero_page_range operation 2020-04-02 19:15:03 -07:00
dm-log-userspace-base.c
dm-log-userspace-transfer.c
dm-log-userspace-transfer.h
dm-log-writes.c dm,dax: Add dax zero_page_range operation 2020-04-02 19:15:03 -07:00
dm-log.c
dm-mpath.c dm: bump version of core and various targets 2020-03-03 11:10:21 -05:00
dm-mpath.h dm mpath: remove is_active from struct dm_path 2008-10-10 13:36:58 +01:00
dm-path-selector.c
dm-path-selector.h
dm-queue-length.c
dm-raid.c dm raid: table line rebuild status fixes 2020-01-07 11:43:37 -05:00
dm-raid1.c dm raid1: use struct_size() with kzalloc() 2019-08-26 11:05:32 -04:00
dm-region-hash.c
dm-round-robin.c
dm-rq.c block: Delay default elevator initialization 2019-09-05 19:52:34 -06:00
dm-rq.h
dm-service-time.c
dm-snap-persistent.c block: fix an integer overflow in logical block size 2020-01-15 21:43:09 -07:00
dm-snap-transient.c
dm-snap.c dm snapshot: use true/false for bool variable 2020-01-07 12:07:17 -05:00
dm-stats.c dm stats: use struct_size() helper 2019-09-04 09:39:22 -04:00
dm-stats.h
dm-stripe.c dm,dax: Add dax zero_page_range operation 2020-04-02 19:15:03 -07:00
dm-switch.c dm switch: use struct_size() in kzalloc() 2019-03-05 14:48:51 -05:00
dm-sysfs.c
dm-table.c dm: remove the make_request_fn check in device_area_is_invalid 2020-04-25 09:45:43 -06:00
dm-target.c dm mpath: fix missing call of path selector type->end_io 2019-04-25 15:38:52 -04:00
dm-thin-metadata.c dm thin metadata: fix lockdep complaint 2020-02-27 12:00:53 -05:00
dm-thin-metadata.h dm thin metadata: Add support for a pre-commit callback 2019-12-05 17:05:24 -05:00
dm-thin.c dm thin: change data device's flush_bio to be member of struct pool 2020-01-14 20:23:13 -05:00
dm-uevent.c
dm-uevent.h
dm-unstripe.c
dm-verity-fec.c dm verity fec: fix memory leak in verity_fec_dtr 2020-03-24 12:00:29 -04:00
dm-verity-fec.h
dm-verity-target.c dm: bump version of core and various targets 2020-03-03 11:10:21 -05:00
dm-verity-verify-sig.c dm verity: add root hash pkcs#7 signature verification 2019-08-23 10:13:14 -04:00
dm-verity-verify-sig.h dm verity: add root hash pkcs#7 signature verification 2019-08-23 10:13:14 -04:00
dm-verity.h dm verity: add root hash pkcs#7 signature verification 2019-08-23 10:13:14 -04:00
dm-writecache.c dm writecache: add cond_resched to avoid CPU hangs 2020-03-27 14:36:50 -04:00
dm-zero.c
dm-zoned-metadata.c dm zoned: remove duplicate nr_rnd_zones increase in dmz_init_zone() 2020-03-24 12:21:48 -04:00
dm-zoned-reclaim.c dm zoned: reduce overhead of backing device checks 2019-11-07 10:08:36 -05:00
dm-zoned-target.c dm: bump version of core and various targets 2020-03-03 11:10:21 -05:00
dm-zoned.h dm zoned: reduce overhead of backing device checks 2019-11-07 10:08:36 -05:00
dm.c block: bypass ->make_request_fn for blk-mq drivers 2020-04-25 09:45:44 -06:00
dm.h dm: make dm_table_find_target return NULL 2019-08-23 10:13:12 -04:00
Kconfig dm: Fix Kconfig indentation 2019-11-20 10:35:31 -05:00
Makefile dm: add clone target 2019-09-12 09:32:31 -04:00
md-bitmap.c Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2020-02-08 13:04:49 -08:00
md-bitmap.h
md-cluster.c
md-cluster.h
md-faulty.c
md-linear.c md: improve handling of bio with REQ_PREFLUSH in md_flush_request() 2019-10-24 15:22:40 -07:00
md-linear.h Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md 2017-11-14 16:07:26 -08:00
md-multipath.c md: improve handling of bio with REQ_PREFLUSH in md_flush_request() 2019-10-24 15:22:40 -07:00
md-multipath.h
md.c md: don't flush workqueue unconditionally in md_open 2020-05-13 11:22:31 -07:00
md.h md: introduce a new struct for IO serialization 2020-01-13 11:44:10 -08:00
raid0.c block: fix an integer overflow in logical block size 2020-01-15 21:43:09 -07:00
raid0.h md/raid0: avoid RAID0 data corruption due to layout confusion. 2019-09-13 13:10:05 -07:00
raid1-10.c md: raid1-10: Unify r{1,10}bio_pool_free 2019-06-15 01:37:35 -06:00
raid1.c md/raid1: introduce wait_for_serialization 2020-01-13 11:44:10 -08:00
raid1.h
raid5-cache.c
raid5-log.h raid5: set write hint for PPL 2019-03-12 10:15:18 -07:00
raid5-ppl.c treewide: Use sizeof_field() macro 2019-12-09 10:36:44 -08:00
raid5.c raid5: remove worker_cnt_per_group argument from alloc_thread_groups 2020-01-13 11:44:09 -08:00
raid5.h raid5: use bio_end_sector in r5_next_bio 2019-09-13 13:14:43 -07:00
raid10.c md/raid10: prevent access of uninitialized resync_pages offset 2019-11-11 16:47:39 -08:00
raid10.h