zpl: handle suspend from two remaining calls to txg_wait_synced() #17413

Merged
amotin merged 2 commits into openzfs:master from robn:zpl-txg-suspend-break on Jun 5, 2025

robn commented Jun 2, 2025

Motivation and Context

Following #17355, there are two remaining ZPL ops that call txg_wait_synced() directly. This converts them to handle a pool suspend.

(This came out of review on #17398, but is not really related to it.)

Description

Convert the two call sites to txg_wait_synced_flags(), setting TXG_WAIT_SUSPEND when failmode=continue, and handling a suspend (ESHUTDOWN) by returning EIO.

(Aside: this particular pattern of checking spa_failmode, choosing the right suspend flags and making the call could probably be a macro, but I couldn't think of a great name right now, and I can live with it for two calls. It's probably ok to wait for the moment, at least until a similar question in #17398 is resolved, and until an update to #11082 is posted, as I know @ihoro has some "convert the errors" ideas in there).
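To illustrate the pattern described above, here is a minimal, self-contained sketch. It does not use the real OpenZFS headers: every `sketch_` type, constant and function is a hypothetical stand-in, the flag values are arbitrary, and the mock wait function returns 0 where the real txg_wait_synced_flags() would block. It only shows the shape of the logic: choose the suspend flag when failmode=continue, and turn ESHUTDOWN into EIO.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the OpenZFS types; not the real definitions. */
typedef enum {
	SKETCH_FAILMODE_WAIT,
	SKETCH_FAILMODE_CONTINUE,
	SKETCH_FAILMODE_PANIC
} sketch_failmode_t;

typedef enum {
	SKETCH_WAIT_NONE = 0,
	SKETCH_WAIT_SUSPEND = 1 << 0	/* models TXG_WAIT_SUSPEND */
} sketch_wait_flag_t;

typedef struct {
	sketch_failmode_t failmode;	/* models the pool's failmode */
	bool suspended;			/* models a suspended pool */
} sketch_pool_t;

/*
 * Mock of txg_wait_synced_flags(). If the pool is suspended and the caller
 * asked to be told about it, return ESHUTDOWN. Otherwise return 0. The real
 * function would block on a suspended pool here; the mock cannot, so it
 * simply reports success as if the pool had resumed.
 */
static int
sketch_txg_wait_synced_flags(const sketch_pool_t *pool, uint64_t txg,
    sketch_wait_flag_t flags)
{
	(void) txg;	/* unused by the mock */
	if (pool->suspended && (flags & SKETCH_WAIT_SUSPEND) != 0)
		return (ESHUTDOWN);
	return (0);
}

/* Request an error return on suspend only when failmode=continue. */
static sketch_wait_flag_t
sketch_wait_flags(const sketch_pool_t *pool)
{
	return (pool->failmode == SKETCH_FAILMODE_CONTINUE ?
	    SKETCH_WAIT_SUSPEND : SKETCH_WAIT_NONE);
}

/* The call-site pattern: wait for the txg, and report a suspend as EIO. */
static int
sketch_zpl_wait_synced(const sketch_pool_t *pool, uint64_t txg)
{
	int error = sketch_txg_wait_synced_flags(pool, txg,
	    sketch_wait_flags(pool));
	if (error != 0) {
		/* A suspend is the only failure the mock can report. */
		assert(error == ESHUTDOWN);
		error = EIO;
	}
	return (error);
}
```

With this sketch, a suspended pool with failmode=continue makes `sketch_zpl_wait_synced()` return EIO, while any other failmode takes the (here simulated) blocking path and returns 0 once the wait completes.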

How Has This Been Tested?

Compile checked on Linux and FreeBSD. Will rely on CI tests to make sure I haven't broken anything. Nothing tests these codepaths directly though.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Quality assurance (non-breaking change which makes the code more robust against bugs)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
  • Documentation (a change to man pages or other documentation)

Comment thread module/zfs/zfs_vnops.c
Comment thread module/os/linux/zfs/zfs_vnops_os.c Outdated
robn added 2 commits June 4, 2025 20:32

zfs_link: allow tempfile sync to fail if pool suspends

4653e2f (openzfs#17355) allows dmu_tx_assign() to fail if the pool suspends
when failmode=continue, but zfs_link() can fall back to
txg_wait_synced() if it has to wait for a tempfile to be fully created
before continuing, which will block if the pool suspends.

Handle this by requesting an error return if the pool suspends when
failmode=continue, and if that happens, return EIO.

Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Signed-off-by: Rob Norris <rob.norris@klarasystems.com>

zfs_clone_range: allow dirty wait to fail if pool suspends

4653e2f (openzfs#17355) allows dmu_tx_assign() to fail if the pool suspends
when failmode=continue, but zfs_clone_range() can fall back to
txg_wait_synced() if it has to wait for a dirty block to be written out,
which will block if the pool suspends.

Handle this by requesting an error return if the pool suspends when
failmode=continue, and if that happens, return EIO.

Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Signed-off-by: Rob Norris <rob.norris@klarasystems.com>

robn force-pushed the zpl-txg-suspend-break branch from da29c78 to a13f25e on June 4, 2025 10:33
amotin added the "Status: Accepted" (ready to integrate: reviewed, tested) label on Jun 4, 2025
amotin merged commit af7d609 into openzfs:master on Jun 5, 2025
23 checks passed
spauka pushed a commit to spauka/zfs that referenced this pull request Aug 30, 2025
* zfs_link: allow tempfile sync to fail if pool suspends

4653e2f (openzfs#17355) allows dmu_tx_assign() to fail if the pool suspends
when failmode=continue, but zfs_link() can fall back to
txg_wait_synced() if it has to wait for a tempfile to be fully created
before continuing, which will block if the pool suspends.

Handle this by requesting an error return if the pool suspends when
failmode=continue, and if that happens, return EIO.

* zfs_clone_range: allow dirty wait to fail if pool suspends

4653e2f (openzfs#17355) allows dmu_tx_assign() to fail if the pool suspends
when failmode=continue, but zfs_clone_range() can fall back to
txg_wait_synced() if it has to wait for a dirty block to be written out,
which will block if the pool suspends.

Handle this by requesting an error return if the pool suspends when
failmode=continue, and if that happens, return EIO.

Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Rob Norris <rob.norris@klarasystems.com>
Closes openzfs#17413
lundman pushed a commit to openzfsonosx/openzfs-fork that referenced this pull request Jan 30, 2026
lundman pushed a commit to openzfsonwindows/openzfs that referenced this pull request Feb 16, 2026