2.1.7 Deadlock on buf_hash_table #17453

@sSsXzZz

System information

| Type | Version/Name |
| --- | --- |
| Distribution Name | Amazon Linux |
| Distribution Version | 2 |
| Kernel Version | 5.10.236-228.935.amzn2.aarch64 |
| Architecture | aarch64 |
| OpenZFS Version | 2.1.7 |

Describe the problem you're observing

We are running ZFS under Lustre v2.15.5. Occasionally, when memory pressure is high, we see deadlocks in the ZFS layer. Threads inside arc_read try to allocate memory, which triggers memory reclamation. At the same time, kswapd is stuck in arc_buf_destroy waiting on a hash lock (buf_hash_table). Our guess is that one of the arc_read threads holds the hash lock while it tries to reclaim memory, and that kswapd blocking on the same lock stalls the whole system.
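To make the suspected cycle concrete, here is a minimal user-space model of it. This is only a sketch of the guess described above, not ZFS code: `hash_lock`, `memory_freed`, `reader`, and `reclaimer` are hypothetical stand-ins for a buf_hash_table mutex, reclaim progress, the arc_read path, and kswapd. The timeouts exist only so the model terminates; the kernel paths wait without one.

```python
import threading


def run_model(reader_wait=0.5, reclaim_wait=0.1):
    """Model the suspected cycle; all names here are illustrative only."""
    hash_lock = threading.Lock()       # stand-in for one hash-table mutex
    lock_held = threading.Event()      # signals that the reader owns the lock
    memory_freed = threading.Event()   # stand-in for reclaim making progress
    outcome = {}

    def reader():
        # Models arc_read: take the hash lock, then try to allocate.
        with hash_lock:
            lock_held.set()
            # The allocation can only succeed once reclaim frees memory.
            outcome["reader_got_memory"] = memory_freed.wait(reader_wait)

    def reclaimer():
        # Models kswapd: evicting a buffer needs the same hash lock.
        lock_held.wait()
        got = hash_lock.acquire(timeout=reclaim_wait)
        outcome["reclaimer_got_lock"] = got
        if got:
            memory_freed.set()
            hash_lock.release()

    threads = [threading.Thread(target=reader),
               threading.Thread(target=reclaimer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return outcome


if __name__ == "__main__":
    print(run_model())
```

In this model neither side makes progress: the reclaimer never gets the lock, so the reader never gets its memory. Whether the real arc_read path actually holds the hash lock across the allocation in 2.1.7 is exactly the open question in this report.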

Describe how to reproduce the problem

Drive I/O while memory usage on the system is high.

Include any warning/errors/backtraces from the system logs

Lustre IO thread

[372109.472745] Lustre: ll_ost_io00_006: service thread pid 4975 was inactive for 202.714 seconds. The thread might be hung
[372109.472747] task:ll_ost00_009    state:R
[372109.472748] task:ll_ost00_011    state:R
[372109.472751]   running task
[372109.472751]   running task     stack:    0 pid: 4934 ppid:     2 flags:0x00000228
[372109.472754] Call trace:
[372109.472759]  __switch_to+0xbc/0xfc
[372109.472763]  __schedule+0x28c/0x718
[372109.472765]  _cond_resched+0x48/0x60
[372109.472768]  shrink_page_list+0x6c/0xc70
[372109.472769]  shrink_inactive_list+0x160/0x510
[372109.472771]  shrink_lruvec+0x26c/0x300
[372109.472772]  shrink_node_memcgs+0x1c0/0x230
[372109.472773]  shrink_node+0x150/0x5e0
[372109.472775]  shrink_zones+0x98/0x220
[372109.472776]  do_try_to_free_pages+0xac/0x2e0
[372109.472778]  try_to_free_pages+0x120/0x25c
[372109.472780]  __alloc_pages_slowpath.constprop.0+0x400/0x82c
[372109.472781]  __alloc_pages_nodemask+0x2b4/0x310
[372109.472831]  abd_alloc_chunks+0x184/0x470 [zfs]
[372109.472881]  abd_alloc+0x90/0x120 [zfs]
[372109.472924]  arc_hdr_alloc_abd+0x134/0x22c [zfs]
[372109.472966]  arc_read+0x4a8/0x10a0 [zfs]
[372109.473008]  dbuf_read_impl.constprop.0+0x23c/0x3dc [zfs]
[372109.473050]  dbuf_read+0xd4/0x65c [zfs]
[372109.473092]  dmu_buf_hold_by_dnode+0xa0/0x124 [zfs]
[372109.473136]  zap_get_leaf_byblk+0x68/0x154 [zfs]
[372109.473178]  zap_deref_leaf+0xb4/0x148 [zfs]
[372109.473221]  fzap_lookup+0x80/0x1b8 [zfs]
[372109.473263]  zap_lookup_impl+0x6c/0x1c8 [zfs]
[372109.473305]  zap_lookup+0xc8/0x10c [zfs]
[372109.473314]  osd_fid_lookup+0x27c/0x4d4 [osd_zfs]
[372109.473322]  osd_object_init+0x314/0xaec [osd_zfs]
[372109.473355]  lu_object_start+0x84/0x154 [obdclass]
[372109.473385]  lu_object_find_at+0x37c/0x72c [obdclass]
[372109.473415]  lu_object_find+0x1c/0x24 [obdclass]
[372109.473423]  ofd_object_find+0x6c/0x18c [ofd]
[372109.473430]  ofd_lvbo_init+0x294/0x938 [ofd]
[372109.473481]  ldlm_lvbo_init+0x70/0x2e0 [ptlrpc]
[372109.473529]  ldlm_handle_enqueue0+0x540/0x1b24 [ptlrpc]
[372109.473577]  tgt_enqueue+0x84/0x2c0 [ptlrpc]
[372109.473624]  tgt_handle_request0+0x2b4/0x658 [ptlrpc]
[372109.473671]  tgt_request_handle+0x268/0xaac [ptlrpc]
[372109.473718]  ptlrpc_server_handle_request.isra.0+0x460/0xf20 [ptlrpc]
[372109.473765]  ptlrpc_main+0xd24/0x15bc [ptlrpc]
[372109.473768]  kthread+0x118/0x120

kswapd

[372073.633195] INFO: task kswapd0:188 blocked for more than 122 seconds.
[372073.634293]       Tainted: P           OE     5.10.236-228.935.amzn2.aarch64 #1
[372073.635482] echo 0 > /proc/sys/kernel/hung_task_timeout_secs disables this message.
[372073.636758] task:kswapd0         state:D stack:    0 pid:  188 ppid:     2 flags:0x00000028
[372073.638118] Call trace:
[372073.638546]  __switch_to+0xbc/0xfc
[372073.639123]  __schedule+0x28c/0x718
[372073.639712]  schedule+0x4c/0xcc
[372073.640248]  schedule_preempt_disabled+0x14/0x1c
[372073.641015]  __mutex_lock.constprop.0+0x190/0x640
[372073.641796]  __mutex_lock_slowpath+0x18/0x20
[372073.642506]  mutex_lock+0x74/0x80
[372073.643112]  arc_buf_destroy+0x84/0x178 [zfs]
[372073.643884]  dbuf_destroy+0x38/0x3dc [zfs]
[372073.644607]  dbuf_evict_one+0x168/0x188 [zfs]
[372073.645376]  dbuf_evict_notify+0xe0/0xf0 [zfs]
[372073.646153]  dbuf_rele_and_unlock+0x61c/0x708 [zfs]
[372073.646998]  dmu_buf_rele+0x44/0x58 [zfs]
[372073.647710]  sa_handle_destroy+0x7c/0x120 [zfs]
[372073.648470]  osd_object_delete+0x64/0x17c [osd_zfs]
[372073.649312]  lu_object_free.isra.0+0x88/0x1fc [obdclass]
[372073.650215]  lu_site_purge_objects+0x338/0x47c [obdclass]
[372073.651127]  lu_cache_shrink_scan+0xa0/0x18c [obdclass]
[372073.651989]  do_shrink_slab+0x194/0x394
[372073.652635]  shrink_slab+0xbc/0x13c
[372073.653234]  shrink_node_memcgs+0x1d4/0x230
[372073.653943]  shrink_node+0x150/0x5e0
[372073.654554]  balance_pgdat+0x260/0x524
[372073.655192]  kswapd+0x124/0x208
[372073.655732]  kthread+0x118/0x120
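For anyone triaging similar dumps, the two traces above can be classified mechanically. The following is a rough sketch that assumes the frame format shown in this report (`[timestamp]  symbol+0xOFF/0xSIZE [module]`); the helper names `parse_frames` and `summarize` are made up for illustration, and the excerpt is copied from the kswapd trace.

```python
import re

# Matches frames such as "[372109.472966]  arc_read+0x4a8/0x10a0 [zfs]".
# The trailing "[module]" is optional (core kernel frames have none).
FRAME_RE = re.compile(
    r"^\[\s*[\d.]+\]\s+(?P<sym>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<mod>\w+)\])?\s*$"
)


def parse_frames(trace_text):
    """Return (symbol, module) pairs, innermost frame first."""
    frames = []
    for line in trace_text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            frames.append((match.group("sym"), match.group("mod") or "kernel"))
    return frames


def summarize(trace_text):
    """Flag the properties that matter for this report."""
    frames = parse_frames(trace_text)
    symbols = [sym for sym, _ in frames]
    return {
        "in_direct_reclaim": "try_to_free_pages" in symbols,
        "in_kswapd": "kswapd" in symbols,
        "waiting_on_mutex": "mutex_lock" in symbols,
        "zfs_frames": [sym for sym, mod in frames if mod == "zfs"],
    }


# Excerpt of the kswapd trace from this report.
KSWAPD_EXCERPT = """\
[372073.642506]  mutex_lock+0x74/0x80
[372073.643112]  arc_buf_destroy+0x84/0x178 [zfs]
[372073.644607]  dbuf_evict_one+0x168/0x188 [zfs]
[372073.654554]  balance_pgdat+0x260/0x524
[372073.655192]  kswapd+0x124/0x208
"""

if __name__ == "__main__":
    print(summarize(KSWAPD_EXCERPT))
```

Run against the full traces, this would flag the Lustre service thread as being in direct reclaim below arc_read, and kswapd as waiting on a mutex with arc_buf_destroy as its innermost ZFS frame. It does not show which thread owns the mutex.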

Labels: Type: Defect (incorrect behavior, e.g. crash, hang)