block: fix q->flush_rq NULL pointer crash on dm-mpath flush
Commit 1874198 ("blk-mq: rework flush sequencing logic") switched
->flush_rq from being an embedded member of the request_queue structure
to being dynamically allocated in blk_init_queue_node().
Request-based DM multipath doesn't use blk_init_queue_node(), instead it
uses blk_alloc_queue_node() + blk_init_allocated_queue(). Because
commit 1874198 placed the dynamic allocation of ->flush_rq in
blk_init_queue_node() any flush issued to a dm-mpath device would crash
with a NULL pointer, e.g.:
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<ffffffff8125037e>] blk_rq_init+0x1e/0xb0
PGD bb3c7067 PUD bb01d067 PMD 0
Oops: 0002 [#1] SMP
...
CPU: 5 PID: 5028 Comm: dt Tainted: G W O 3.14.0-rc3.snitm+ #10
...
task: ffff88032fb270e0 ti: ffff880079564000 task.ti: ffff880079564000
RIP: 0010:[<ffffffff8125037e>] [<ffffffff8125037e>] blk_rq_init+0x1e/0xb0
RSP: 0018:ffff880079565c98 EFLAGS: 00010046
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000030
RDX: ffff880260c74048 RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffff880079565ca8 R08: ffff880260aa1e98 R09: 0000000000000001
R10: ffff88032fa78500 R11: 0000000000000246 R12: 0000000000000000
R13: ffff880260aa1de8 R14: 0000000000000650 R15: 0000000000000000
FS: 00007f8d36a2a700(0000) GS:ffff88033fca0000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 0000000079b36000 CR4: 00000000000007e0
Stack:
0000000000000000 ffff880260c74048 ffff880079565cd8 ffffffff81257a47
ffff880260aa1de8 ffff880260c74048 0000000000000001 0000000000000000
ffff880079565d08 ffffffff81257c2d 0000000000000000 ffff880260aa1de8
Call Trace:
[<ffffffff81257a47>] blk_flush_complete_seq+0x2d7/0x2e0
[<ffffffff81257c2d>] blk_insert_flush+0x1dd/0x210
[<ffffffff8124ec59>] __elv_add_request+0x1f9/0x320
[<ffffffff81250681>] ? blk_account_io_start+0x111/0x190
[<ffffffff81253a4b>] blk_queue_bio+0x25b/0x330
[<ffffffffa0020bf5>] dm_request+0x35/0x40 [dm_mod]
[<ffffffff812530c0>] generic_make_request+0xc0/0x100
[<ffffffff81253173>] submit_bio+0x73/0x140
[<ffffffff811becdd>] submit_bio_wait+0x5d/0x80
[<ffffffff81257528>] blkdev_issue_flush+0x78/0xa0
[<ffffffff811c1f6f>] blkdev_fsync+0x3f/0x60
[<ffffffff811b7fde>] vfs_fsync_range+0x1e/0x20
[<ffffffff811b7ffc>] vfs_fsync+0x1c/0x20
[<ffffffff811b81f1>] do_fsync+0x41/0x80
[<ffffffff8118874e>] ? SyS_lseek+0x7e/0x80
[<ffffffff811b8260>] SyS_fsync+0x10/0x20
[<ffffffff8154c2d2>] system_call_fastpath+0x16/0x1b
Fix this by moving the ->flush_rq allocation from blk_init_queue_node()
to blk_init_allocated_queue(). blk_init_queue_node() also calls
blk_init_allocated_queue() so this change is functionality equivalent
for all blk_init_queue_node() callers.
Reported-by: Hannes Reinecke <hare@suse.de>
Reported-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
1 file changed