
Consistently encode DRR_BEGIN packed nvlist payloads with NV_ENCODE_XDR #18372

Open
GarthSnyder wants to merge 1 commit into openzfs:master from GarthSnyder:pr-xdr-nvlists



@GarthSnyder GarthSnyder commented Mar 25, 2026

This is a fix for #18360.

Currently, zfs send generates a mix of nvlist encodings in DRR_BEGIN records, some XDR and some in native byte order. The result is that many streams currently can't be zfs received on opposite-endian systems.

zfs send generates the outer wrappers for compound streams in userspace, and it explicitly requests NV_ENCODE_XDR format for those records. But the BEGIN records for individual datasets are generated on the kernel side, in dmu_send.c, where fnvlist_pack() is used for encoding. That routine hard-wires NV_ENCODE_NATIVE format.

This PR replaces the fnvlist_pack() call with a direct call to nvlist_pack() that specifies NV_ENCODE_XDR.
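The difference between the two encodings can be illustrated with a single 32-bit value. The following is a minimal sketch, not code from the PR: the function names are hypothetical, and it models only the byte-order aspect (native encoding copies the host's in-memory bytes, while XDR always writes the most significant byte first).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Native-style encoding (hypothetical helper): copy in-memory bytes as-is. */
static void
encode_u32_native(uint32_t v, unsigned char *out)
{
	assert(out != NULL);
	memcpy(out, &v, sizeof (v));
}

/* XDR-style encoding (hypothetical helper): most significant byte first. */
static void
encode_u32_xdr(uint32_t v, unsigned char *out)
{
	assert(out != NULL);
	out[0] = (unsigned char)(v >> 24);
	out[1] = (unsigned char)(v >> 16);
	out[2] = (unsigned char)(v >> 8);
	out[3] = (unsigned char)v;
}

/* Decode an XDR-style value; gives the same result on any host. */
static uint32_t
decode_u32_xdr(const unsigned char *in)
{
	assert(in != NULL);
	return (((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
	    ((uint32_t)in[2] << 8) | (uint32_t)in[3]);
}
```

Bytes produced by `encode_u32_xdr()` can be decoded on either kind of host, whereas bytes produced by `encode_u32_native()` are only meaningful to a reader that shares the writer's byte order.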

Motivation and Context

Currently, cross-endian zfs receives can fail because there is no facility within ZFS for byteswapping packed nvlists after the fact. When a DRR_BEGIN record with a cross-endian NV_ENCODE_NATIVE nvlist is received, the kernel rejects it with ENOTSUP, aborting the whole receive.
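The rejection described above can be modeled roughly as follows. This is a simplified sketch under stated assumptions, not the kernel's implementation: the struct, enum, and function names are hypothetical, and the only behavior taken from the description is that an opposite-endian native payload yields ENOTSUP while an XDR payload is accepted on any host.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the two encodings named in the PR. */
enum sketch_encoding { SKETCH_ENCODE_NATIVE = 0, SKETCH_ENCODE_XDR = 1 };

/* Simplified model of the metadata at the front of a packed nvlist. */
struct sketch_header {
	unsigned char encoding;		/* one of enum sketch_encoding */
	unsigned char little_endian;	/* 1 if packed on a little-endian host */
};

/* Report whether the running host is little-endian. */
static int
host_is_little_endian(void)
{
	const unsigned int one = 1;
	return (*(const unsigned char *)&one == 1);
}

/*
 * Return 0 if a payload with this header could be unpacked on a host of
 * the given byte order, ENOTSUP if it could not, EINVAL if the encoding
 * is unknown.
 */
static int
sketch_check_header(const struct sketch_header *h, int host_little_endian)
{
	assert(h != NULL);
	switch (h->encoding) {
	case SKETCH_ENCODE_XDR:
		/* Byte order is fixed by the format, so any host can read it. */
		return (0);
	case SKETCH_ENCODE_NATIVE:
		/* No byteswapping is available, so the orders must match. */
		return (h->little_endian == !!host_little_endian ?
		    0 : ENOTSUP);
	default:
		return (EINVAL);
	}
}
```

Passing `host_is_little_endian()` as the second argument checks a header against the machine the code is running on; passing a fixed 0 or 1 models the other kind of host.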

This PR is a step toward making any valid send stream readable and importable on any ZFS system. There are no doubt other stream encoding issues yet to be resolved, but in my limited testing, many opposite-endian-generated streams seem to be received just fine with this patch in place.

How Has This Been Tested?

This change likely affects the majority of nontrivial send streams, so the existing test suite is already a fairly comprehensive vetting, at least as far as same-endian functionality goes.

I have also built with this change on a big-endian system and generated several send streams that formerly were unimportable on little-endian systems. They now import fine.

Of note, this change requires no receiving-end support. All-XDR streams are already supported by the existing nvlist_unpack() infrastructure. There does not appear to be any stream-related code that does anything with packed nvlists other than passing them along to nvlist_unpack().

I will include cross-endian stream testing as part of a separate testing revamp for zstream. There are some interdependencies, so it would be helpful to have this PR in master before that PR is submitted.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Quality assurance (non-breaking change which makes the code more robust against bugs)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
  • Documentation (a change to man pages or other documentation)


This is a fix for openzfs#18360.

Currently, zfs send generates a mix of nvlist encodings in DRR_BEGIN
records, some XDR and some in native byte order. The result is that most
streams currently can't be zfs received on opposite-endian systems.

zfs send generates the outer wrappers for compound streams in userspace,
and it explicitly requests NV_ENCODE_XDR format for those records. But
the BEGIN records for individual datasets are generated on the kernel
side, in dmu_send.c, where fnvlist_pack() is used for encoding. That
routine hard-wires NV_ENCODE_NATIVE format.

This PR replaces the fnvlist_pack() call with a direct call to
nvlist_pack() that specifies NV_ENCODE_XDR.

Signed-off-by: Garth Snyder <garth@garthsnyder.com>
@github-actions github-actions bot added the Status: Work in Progress Not yet ready for general review label Mar 25, 2026
@GarthSnyder GarthSnyder marked this pull request as ready for review March 28, 2026 18:55
Copilot AI review requested due to automatic review settings March 28, 2026 18:55
@github-actions github-actions bot added Status: Code Review Needed Ready for review and testing and removed Status: Work in Progress Not yet ready for general review labels Mar 28, 2026

Copilot AI left a comment


Pull request overview

This PR fixes cross-endian zfs send | zfs recv incompatibility by ensuring packed nvlist payloads in kernel-generated DRR_BEGIN records are encoded using XDR, consistent with userspace-generated compound stream wrapper records.

Changes:

  • Replace fnvlist_pack() (native-endian) with an explicit nvlist_pack(..., NV_ENCODE_XDR, ...) for DRR_BEGIN payload encoding in dmu_send_impl().
  • Preserve existing behavior of including optional BEGIN nvlist fields (redaction/resume/crypto metadata), but now with a consistent on-wire encoding.


@GarthSnyder GarthSnyder marked this pull request as draft March 29, 2026 00:13
@github-actions github-actions bot added Status: Work in Progress Not yet ready for general review and removed Status: Code Review Needed Ready for review and testing labels Mar 29, 2026

GarthSnyder commented Mar 29, 2026

I think Copilot's comment regarding payloads not being rounded up to an 8-byte boundary is correct. Internally, NV_ENCODE_NATIVE seems to work with 8-byte rounding, but NV_ENCODE_XDR uses 4-byte rounding.
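The arithmetic behind the rounding mismatch is small enough to sketch. The helper names below are hypothetical and the lengths are made-up examples; the point is only that a length rounded to a multiple of 4 is not necessarily a multiple of 8.

```c
#include <assert.h>
#include <stddef.h>

/* Round len up to the next multiple of align (align must be a power of 2). */
static size_t
round_up_pow2(size_t len, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return ((len + align - 1) & ~(align - 1));
}

/* True if len is already a multiple of 8. */
static int
is_8byte_multiple(size_t len)
{
	return ((len & 7) == 0);
}
```

For instance, an 18-byte body rounds to 20 under 4-byte rounding, and 20 fails the 8-byte check; under 8-byte rounding the same body would occupy 24 bytes.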

In theory, send_prelim_records() in libzfs_sendrecv.c should raise this same issue, as it does not appear to do any rounding. However, that file has its own private implementation of dump_record() that does not double-check payload sizes for 8-byte granularity. The receiving side is also special-cased and does not run through the usual assertions. I have verified that zfs send does in fact generate non-rounded DRR_BEGIN payloads in some cases.

This PR does the stupidest possible thing: if a payload needs to be rounded up, a second, slightly longer buffer is allocated and the payload is copied into it, with the extra bytes being zeroed. The alternative would be to define a customized nv_alloc_t that does rounding on allocation, thus sparing the copy. That's how I did it at first, but it's inelegant. It is, in effect, a monkey patch, and requires a separate allocator function. It's also harder to make it compile in userspace since some of the normal nv_alloc_t infrastructure isn't available there.

I suggest staying with the copy. BEGIN record payloads are typically small and infrequent, there's at most one per dataset, and the allocation is only active momentarily. But that other version exists and I'd be happy to sub it in if that's preferable.
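A rough userspace sketch of the copy-and-pad approach follows. It is not the code in this PR: the function name is hypothetical, it uses `calloc()` rather than kernel allocators, and for simplicity it always copies, whereas the PR copies only when rounding is actually needed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Return a newly allocated copy of buf whose length is rounded up to a
 * multiple of 8, with the added tail bytes zeroed. The padded length is
 * stored in *padded_len. Returns NULL on allocation failure; the caller
 * releases the result with free().
 */
static unsigned char *
pad_payload_to_8(const unsigned char *buf, size_t len, size_t *padded_len)
{
	assert(buf != NULL && padded_len != NULL);

	size_t rounded = (len + 7) & ~(size_t)7;

	/* calloc zero-fills, so the padding bytes are already zero. */
	unsigned char *out = calloc(rounded ? rounded : 1, 1);
	if (out == NULL)
		return (NULL);

	memcpy(out, buf, len);
	*padded_len = rounded;
	return (out);
}
```

With a 20-byte input, the result is a 24-byte buffer whose first 20 bytes match the input and whose last 4 bytes are zero.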

@GarthSnyder GarthSnyder force-pushed the pr-xdr-nvlists branch 3 times, most recently from 161e7a4 to 774b697 Compare April 4, 2026 23:22
@GarthSnyder GarthSnyder marked this pull request as ready for review April 5, 2026 19:37
Copilot AI review requested due to automatic review settings April 5, 2026 19:37
@github-actions github-actions bot added Status: Code Review Needed Ready for review and testing and removed Status: Work in Progress Not yet ready for general review labels Apr 5, 2026

Copilot AI left a comment


Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 1 comment.



Copilot AI review requested due to automatic review settings April 6, 2026 21:37

Copilot AI left a comment


Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 1 comment.




@behlendorf behlendorf left a comment


Looks good, thanks for the fix.

@behlendorf behlendorf added Status: Accepted Ready to integrate (reviewed, tested) and removed Status: Code Review Needed Ready for review and testing labels Apr 7, 2026
@github-actions github-actions bot removed the Status: Accepted Ready to integrate (reviewed, tested) label Apr 7, 2026
@behlendorf
Contributor

Since you've already written the test case for this in #18355, we should include it in this PR.

@behlendorf behlendorf added the Status: Code Review Needed Ready for review and testing label Apr 8, 2026

Labels

Status: Code Review Needed Ready for review and testing


4 participants