ext4: Fix potential reclaim deadlock when truncating partial block
The ext4_block_truncate_page() function previously called
grab_cache_page(), which called find_or_create_page() with the
__GFP_FS flag potentially set. This could cause a deadlock if the
system is low on memory and it attempts a memory reclaim, which could
potentially call back into ext4. So we need to call
find_or_create_page() directly, and remove the __GFP_FP flag to avoid
this potential deadlock.
Thanks to Roland Dreier for reporting a lockdep warning which showed
this problem.
[20786.363249] =================================
[20786.363257] [ INFO: inconsistent lock state ]
[20786.363265] 2.6.31-2-generic #14~rbd4gitd960eea9
[20786.363270] ---------------------------------
[20786.363276] inconsistent {IN-RECLAIM_FS-W} -> {RECLAIM_FS-ON-W} usage.
[20786.363285] http/8397 [HC0[0]:SC0[0]:HE1:SE1] takes:
[20786.363291] (jbd2_handle){+.+.?.}, at: [<ffffffff812008bb>] jbd2_journal_start+0xdb/0x150
[20786.363314] {IN-RECLAIM_FS-W} state was registered at:
[20786.363320] [<ffffffff8108bef6>] mark_irqflags+0xc6/0x1a0
[20786.363334] [<ffffffff8108d347>] __lock_acquire+0x287/0x430
[20786.363345] [<ffffffff8108d595>] lock_acquire+0xa5/0x150
[20786.363355] [<ffffffff812008da>] jbd2_journal_start+0xfa/0x150
[20786.363365] [<ffffffff811d98a8>] ext4_journal_start_sb+0x58/0x90
[20786.363377] [<ffffffff811cce85>] ext4_delete_inode+0xc5/0x2c0
[20786.363389] [<ffffffff81146fa3>] generic_delete_inode+0xd3/0x1a0
[20786.363401] [<ffffffff81147095>] generic_drop_inode+0x25/0x30
[20786.363411] [<ffffffff81145ce2>] iput+0x62/0x70
[20786.363420] [<ffffffff81142878>] dentry_iput+0x98/0x110
[20786.363429] [<ffffffff81142a00>] d_kill+0x50/0x80
[20786.363438] [<ffffffff811444c5>] dput+0x95/0x180
[20786.363447] [<ffffffff8120de4b>] ecryptfs_d_release+0x2b/0x70
[20786.363459] [<ffffffff81142978>] d_free+0x28/0x60
[20786.363468] [<ffffffff81142a18>] d_kill+0x68/0x80
[20786.363477] [<ffffffff81142ad3>] prune_one_dentry+0xa3/0xc0
[20786.363487] [<ffffffff81142d61>] __shrink_dcache_sb+0x271/0x290
[20786.363497] [<ffffffff81142e89>] prune_dcache+0x109/0x1b0
[20786.363506] [<ffffffff81142f6f>] shrink_dcache_memory+0x3f/0x50
[20786.363516] [<ffffffff810f6d3d>] shrink_slab+0x12d/0x190
[20786.363527] [<ffffffff810f97d7>] balance_pgdat+0x4d7/0x640
[20786.363537] [<ffffffff810f9a57>] kswapd+0x117/0x170
[20786.363546] [<ffffffff810773ce>] kthread+0x9e/0xb0
[20786.363558] [<ffffffff8101430a>] child_rip+0xa/0x20
[20786.363569] [<ffffffffffffffff>] 0xffffffffffffffff
[20786.363598] irq event stamp: 15997
[20786.363603] hardirqs last enabled at (15997): [<ffffffff81125f9d>] kmem_cache_alloc+0xfd/0x1a0
[20786.363617] hardirqs last disabled at (15996): [<ffffffff81125f01>] kmem_cache_alloc+0x61/0x1a0
[20786.363628] softirqs last enabled at (15966): [<ffffffff810631ea>] __do_softirq+0x14a/0x220
[20786.363641] softirqs last disabled at (15861): [<ffffffff8101440c>] call_softirq+0x1c/0x30
[20786.363651]
[20786.363653] other info that might help us debug this:
[20786.363660] 3 locks held by http/8397:
[20786.363665] #0: (&sb->s_type->i_mutex_key#8){+.+.+.}, at: [<ffffffff8112ed24>] do_truncate+0x64/0x90
[20786.363685] #1: (&sb->s_type->i_alloc_sem_key#5){+++++.}, at: [<ffffffff81147f90>] notify_change+0x250/0x350
[20786.363707] #2: (jbd2_handle){+.+.?.}, at: [<ffffffff812008bb>] jbd2_journal_start+0xdb/0x150
[20786.363724]
[20786.363726] stack backtrace:
[20786.363734] Pid: 8397, comm: http Tainted: G C 2.6.31-2-generic #14~rbd4gitd960eea9
[20786.363741] Call Trace:
[20786.363752] [<ffffffff8108ad7c>] print_usage_bug+0x18c/0x1a0
[20786.363763] [<ffffffff8108b0c0>] ? check_usage_backwards+0x0/0xb0
[20786.363773] [<ffffffff8108bad2>] mark_lock_irq+0xf2/0x280
[20786.363783] [<ffffffff8108bd97>] mark_lock+0x137/0x1d0
[20786.363793] [<ffffffff8108c03c>] mark_held_locks+0x6c/0xa0
[20786.363803] [<ffffffff8108c11f>] lockdep_trace_alloc+0xaf/0xe0
[20786.363813] [<ffffffff810efbac>] __alloc_pages_nodemask+0x7c/0x180
[20786.363824] [<ffffffff810e9411>] ? find_get_page+0x91/0xf0
[20786.363835] [<ffffffff8111d3b7>] alloc_pages_current+0x87/0xd0
[20786.363845] [<ffffffff810e9827>] __page_cache_alloc+0x67/0x70
[20786.363856] [<ffffffff810eb7df>] find_or_create_page+0x4f/0xb0
[20786.363867] [<ffffffff811cb3be>] ext4_block_truncate_page+0x3e/0x460
[20786.363876] [<ffffffff812008da>] ? jbd2_journal_start+0xfa/0x150
[20786.363885] [<ffffffff812008bb>] ? jbd2_journal_start+0xdb/0x150
[20786.363895] [<ffffffff811c6415>] ? ext4_meta_trans_blocks+0x75/0xf0
[20786.363905] [<ffffffff811e8d8b>] ext4_ext_truncate+0x1bb/0x1e0
[20786.363916] [<ffffffff811072c5>] ? unmap_mapping_range+0x75/0x290
[20786.363926] [<ffffffff811ccc28>] ext4_truncate+0x498/0x630
[20786.363938] [<ffffffff8129b4ce>] ? _raw_spin_unlock+0x5e/0xb0
[20786.363947] [<ffffffff81107306>] ? unmap_mapping_range+0xb6/0x290
[20786.363957] [<ffffffff8108c3ad>] ? trace_hardirqs_on+0xd/0x10
[20786.363966] [<ffffffff811ffe58>] ? jbd2_journal_stop+0x1f8/0x2e0
[20786.363976] [<ffffffff81107690>] vmtruncate+0xb0/0x110
[20786.363986] [<ffffffff81147c05>] inode_setattr+0x35/0x170
[20786.363995] [<ffffffff811c9906>] ext4_setattr+0x186/0x370
[20786.364005] [<ffffffff81147eab>] notify_change+0x16b/0x350
[20786.364014] [<ffffffff8112ed30>] do_truncate+0x70/0x90
[20786.364021] [<ffffffff8112f48b>] T.657+0xeb/0x110
[20786.364021] [<ffffffff8112f4be>] sys_ftruncate+0xe/0x10
[20786.364021] [<ffffffff81013132>] system_call_fastpath+0x16/0x1b
Reported-by: Roland Dreier <roland@digitalvampire.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
1 file changed