syncfs: don't erroneously return success when the pool is suspended#17420
Closed
robn wants to merge 4 commits intoopenzfs:masterfrom
Closed
syncfs: don't erroneously return success when the pool is suspended#17420robn wants to merge 4 commits intoopenzfs:masterfrom
syncfs: don't erroneously return success when the pool is suspended#17420robn wants to merge 4 commits intoopenzfs:masterfrom
Conversation
syncfs: don't erroneously return success when the pool is suspended
14 tasks
pcd1193182
approved these changes
Jun 4, 2025
Contributor
pcd1193182
left a comment
There was a problem hiding this comment.
The solution is sort of gross, but so it the problem space. I can't think of a better approach than this one, given the constraints we have. I would say you could close the github issue, though I would consider opening a new one specifically for "We can't tell if the system is shutting down in the syncfs() path"
amotin
reviewed
Jun 4, 2025
4b13dd1 to
6876437
Compare
behlendorf
approved these changes
Jun 11, 2025
Fairly coarse, but if it returns while the pool suspends, it must be with an error. Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Signed-off-by: Rob Norris <rob.norris@klarasystems.com>
The superblock pointer will always be set, as will z_log, so remove code supporting cases that can't occur (on Linux at least). Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Signed-off-by: Rob Norris <rob.norris@klarasystems.com>
If the pool is suspended, we'll just block in zil_commit(). If the system is shutting down, blocking wouldn't help anyone. So, we should keep this test for now, but at least return an error for anyone who is actually interested. Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Signed-off-by: Rob Norris <rob.norris@klarasystems.com>
If the kernel will honour our error returns, use them. If not, fool it by setting a writeback error on the superblock, if available. Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Signed-off-by: Rob Norris <rob.norris@klarasystems.com>
behlendorf
approved these changes
Jun 11, 2025
behlendorf
pushed a commit
that referenced
this pull request
Jun 12, 2025
The superblock pointer will always be set, as will z_log, so remove code supporting cases that can't occur (on Linux at least). Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Reviewed-by: Paul Dagnelie <paul.dagnelie@klarasystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Rob Norris <rob.norris@klarasystems.com> Closes #17420
behlendorf
pushed a commit
that referenced
this pull request
Jun 12, 2025
If the pool is suspended, we'll just block in zil_commit(). If the system is shutting down, blocking wouldn't help anyone. So, we should keep this test for now, but at least return an error for anyone who is actually interested. Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Reviewed-by: Paul Dagnelie <paul.dagnelie@klarasystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Rob Norris <rob.norris@klarasystems.com> Closes #17420
behlendorf
pushed a commit
that referenced
this pull request
Jun 12, 2025
If the kernel will honour our error returns, use them. If not, fool it by setting a writeback error on the superblock, if available. Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Reviewed-by: Paul Dagnelie <paul.dagnelie@klarasystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Rob Norris <rob.norris@klarasystems.com> Closes #17420
behlendorf
pushed a commit
to behlendorf/zfs
that referenced
this pull request
Jun 13, 2025
Fairly coarse, but if it returns while the pool suspends, it must be with an error. Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Reviewed-by: Paul Dagnelie <paul.dagnelie@klarasystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Rob Norris <rob.norris@klarasystems.com> Closes openzfs#17420
behlendorf
pushed a commit
to behlendorf/zfs
that referenced
this pull request
Jun 13, 2025
The superblock pointer will always be set, as will z_log, so remove code supporting cases that can't occur (on Linux at least). Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Reviewed-by: Paul Dagnelie <paul.dagnelie@klarasystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Rob Norris <rob.norris@klarasystems.com> Closes openzfs#17420
behlendorf
pushed a commit
to behlendorf/zfs
that referenced
this pull request
Jun 13, 2025
If the pool is suspended, we'll just block in zil_commit(). If the system is shutting down, blocking wouldn't help anyone. So, we should keep this test for now, but at least return an error for anyone who is actually interested. Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Reviewed-by: Paul Dagnelie <paul.dagnelie@klarasystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Rob Norris <rob.norris@klarasystems.com> Closes openzfs#17420
behlendorf
pushed a commit
to behlendorf/zfs
that referenced
this pull request
Jun 13, 2025
If the kernel will honour our error returns, use them. If not, fool it by setting a writeback error on the superblock, if available. Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Reviewed-by: Paul Dagnelie <paul.dagnelie@klarasystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Rob Norris <rob.norris@klarasystems.com> Closes openzfs#17420
behlendorf
pushed a commit
that referenced
this pull request
Jun 17, 2025
Fairly coarse, but if it returns while the pool suspends, it must be with an error. Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Reviewed-by: Paul Dagnelie <paul.dagnelie@klarasystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Rob Norris <rob.norris@klarasystems.com> Closes #17420
behlendorf
pushed a commit
that referenced
this pull request
Jun 17, 2025
The superblock pointer will always be set, as will z_log, so remove code supporting cases that can't occur (on Linux at least). Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Reviewed-by: Paul Dagnelie <paul.dagnelie@klarasystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Rob Norris <rob.norris@klarasystems.com> Closes #17420
behlendorf
pushed a commit
that referenced
this pull request
Jun 17, 2025
If the pool is suspended, we'll just block in zil_commit(). If the system is shutting down, blocking wouldn't help anyone. So, we should keep this test for now, but at least return an error for anyone who is actually interested. Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Reviewed-by: Paul Dagnelie <paul.dagnelie@klarasystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Rob Norris <rob.norris@klarasystems.com> Closes #17420
behlendorf
pushed a commit
that referenced
this pull request
Jun 17, 2025
If the kernel will honour our error returns, use them. If not, fool it by setting a writeback error on the superblock, if available. Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Reviewed-by: Paul Dagnelie <paul.dagnelie@klarasystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Rob Norris <rob.norris@klarasystems.com> Closes #17420
behlendorf
pushed a commit
to behlendorf/zfs
that referenced
this pull request
Jun 17, 2025
Fairly coarse, but if it returns while the pool suspends, it must be with an error. Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Reviewed-by: Paul Dagnelie <paul.dagnelie@klarasystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Rob Norris <rob.norris@klarasystems.com> Closes openzfs#17420
behlendorf
pushed a commit
to behlendorf/zfs
that referenced
this pull request
Jun 17, 2025
The superblock pointer will always be set, as will z_log, so remove code supporting cases that can't occur (on Linux at least). Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Reviewed-by: Paul Dagnelie <paul.dagnelie@klarasystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Rob Norris <rob.norris@klarasystems.com> Closes openzfs#17420
behlendorf
pushed a commit
to behlendorf/zfs
that referenced
this pull request
Jun 17, 2025
If the pool is suspended, we'll just block in zil_commit(). If the system is shutting down, blocking wouldn't help anyone. So, we should keep this test for now, but at least return an error for anyone who is actually interested. Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Reviewed-by: Paul Dagnelie <paul.dagnelie@klarasystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Rob Norris <rob.norris@klarasystems.com> Closes openzfs#17420
behlendorf
pushed a commit
to behlendorf/zfs
that referenced
this pull request
Jun 17, 2025
If the kernel will honour our error returns, use them. If not, fool it by setting a writeback error on the superblock, if available. Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Reviewed-by: Paul Dagnelie <paul.dagnelie@klarasystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Rob Norris <rob.norris@klarasystems.com> Closes openzfs#17420
14 tasks
spauka
pushed a commit
to spauka/zfs
that referenced
this pull request
Aug 30, 2025
Fairly coarse, but if it returns while the pool suspends, it must be with an error. Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Reviewed-by: Paul Dagnelie <paul.dagnelie@klarasystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Rob Norris <rob.norris@klarasystems.com> Closes openzfs#17420
spauka
pushed a commit
to spauka/zfs
that referenced
this pull request
Aug 30, 2025
The superblock pointer will always be set, as will z_log, so remove code supporting cases that can't occur (on Linux at least). Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Reviewed-by: Paul Dagnelie <paul.dagnelie@klarasystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Rob Norris <rob.norris@klarasystems.com> Closes openzfs#17420
spauka
pushed a commit
to spauka/zfs
that referenced
this pull request
Aug 30, 2025
If the pool is suspended, we'll just block in zil_commit(). If the system is shutting down, blocking wouldn't help anyone. So, we should keep this test for now, but at least return an error for anyone who is actually interested. Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Reviewed-by: Paul Dagnelie <paul.dagnelie@klarasystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Rob Norris <rob.norris@klarasystems.com> Closes openzfs#17420
spauka
pushed a commit
to spauka/zfs
that referenced
this pull request
Aug 30, 2025
If the kernel will honour our error returns, use them. If not, fool it by setting a writeback error on the superblock, if available. Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Reviewed-by: Paul Dagnelie <paul.dagnelie@klarasystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Rob Norris <rob.norris@klarasystems.com> Closes openzfs#17420
lundman
pushed a commit
to openzfsonosx/openzfs-fork
that referenced
this pull request
Jan 30, 2026
Fairly coarse, but if it returns while the pool suspends, it must be with an error. Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Reviewed-by: Paul Dagnelie <paul.dagnelie@klarasystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Rob Norris <rob.norris@klarasystems.com> Closes openzfs#17420
lundman
pushed a commit
to openzfsonosx/openzfs-fork
that referenced
this pull request
Jan 30, 2026
The superblock pointer will always be set, as will z_log, so remove code supporting cases that can't occur (on Linux at least). Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Reviewed-by: Paul Dagnelie <paul.dagnelie@klarasystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Rob Norris <rob.norris@klarasystems.com> Closes openzfs#17420
lundman
pushed a commit
to openzfsonosx/openzfs-fork
that referenced
this pull request
Jan 30, 2026
If the pool is suspended, we'll just block in zil_commit(). If the system is shutting down, blocking wouldn't help anyone. So, we should keep this test for now, but at least return an error for anyone who is actually interested. Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Reviewed-by: Paul Dagnelie <paul.dagnelie@klarasystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Rob Norris <rob.norris@klarasystems.com> Closes openzfs#17420
lundman
pushed a commit
to openzfsonosx/openzfs-fork
that referenced
this pull request
Jan 30, 2026
If the kernel will honour our error returns, use them. If not, fool it by setting a writeback error on the superblock, if available. Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Reviewed-by: Paul Dagnelie <paul.dagnelie@klarasystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Rob Norris <rob.norris@klarasystems.com> Closes openzfs#17420
lundman
pushed a commit
to openzfsonwindows/openzfs
that referenced
this pull request
Feb 16, 2026
Fairly coarse, but if it returns while the pool suspends, it must be with an error. Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Reviewed-by: Paul Dagnelie <paul.dagnelie@klarasystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Rob Norris <rob.norris@klarasystems.com> Closes openzfs#17420
lundman
pushed a commit
to openzfsonwindows/openzfs
that referenced
this pull request
Feb 16, 2026
The superblock pointer will always be set, as will z_log, so remove code supporting cases that can't occur (on Linux at least). Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Reviewed-by: Paul Dagnelie <paul.dagnelie@klarasystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Rob Norris <rob.norris@klarasystems.com> Closes openzfs#17420
lundman
pushed a commit
to openzfsonwindows/openzfs
that referenced
this pull request
Feb 16, 2026
If the pool is suspended, we'll just block in zil_commit(). If the system is shutting down, blocking wouldn't help anyone. So, we should keep this test for now, but at least return an error for anyone who is actually interested. Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Reviewed-by: Paul Dagnelie <paul.dagnelie@klarasystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Rob Norris <rob.norris@klarasystems.com> Closes openzfs#17420
lundman
pushed a commit
to openzfsonwindows/openzfs
that referenced
this pull request
Feb 16, 2026
If the kernel will honour our error returns, use them. If not, fool it by setting a writeback error on the superblock, if available. Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Reviewed-by: Paul Dagnelie <paul.dagnelie@klarasystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Rob Norris <rob.norris@klarasystems.com> Closes openzfs#17420
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
[Sponsors: Klara, Inc., Wasabi Technology, Inc.]
Motivation and Context
See #17416. This PR is tackling the core problem described:
syncfs()can return success even when the pool has failed. There's a lot more we can do but this should address correctness, which is more important than anything else.I'm hesitant to say this closes #17416, since there's more we could do in there to improve the situation. On the other hand, it straight-up says "this is wrong" and this PR un-wrongs it, so idk, reviewer's call :)
Description
See individual commits, described below.
Add a test case to demonstrate the issue, and verify the fix. We don't generally care what
syncfs()returns when the pool is up, but we want to be sure that if the pool suspends, it either blocks until it returns, or returns an error. (we do care, of course, but it's not something we can address here; I'm working towards that in ZIL: "crash" the ZIL if the pool suspends during fallback #17398 and later).Simplify
zfs_sync()by removing code for scenarios that can't occur, that is, called with NULL superblock and called with no ZIL. Seesyncfs()returns success on suspended pools #17416 for more on this.Change the suspended check to return
EIOif the pool suspends. This is nod back to the original reason for this check, which is to not block on suspended pools during shutdown (see discussion insyncfs()returns success on suspended pools #17416). This will likely change in ZIL: "crash" the ZIL if the pool suspends during fallback #17398 once the ZIL can report a failure directly without blocking, but until then, this seems like a reasonable middle road to not block the call.Ensure safe error behaviour according to what
syncfs()can do. The comment explains more, but the short version is that before 5.17 the kernel ignores errors directly returned bysync_fs(), but a workaround is available back to 5.8. If the workaround is not available either, then the call will return success to userspace, and we have no choice but to block until the txg completes. Unamazing, but ensures correctness.How Has This Been Tested?
The new
syncfs_suspendtest passes on 5.4.293 (blocks, no fallbacks available) and 6.1.140 (sync_fsreturn works, immediate error return).It's difficult to test the superblock writeback error case on a release kernel, since the LTS kernels in the 5.8-5.17 range that have no other options all have the fix backported, and non-LTS 5.x kernels are difficult to build these days. To check, I used 6.1.140, commented out the version check and forced the returned error to 0, leaving the superblock writeback error as the only remaining error state. This went as expected -
syncfs()return an error on a suspended pool.Types of changes
Checklist:
Signed-off-by.