Add TXG timestamp database #16853

Merged
behlendorf merged 1 commit into openzfs:master from oshogbo:oshogbo/scrub_data_range on Aug 6, 2025

@oshogbo
Contributor

@oshogbo oshogbo commented Dec 11, 2024

Motivation and Context

This feature enables tracking of when TXGs are committed to disk, providing an estimated timestamp for each TXG.

With this information, it becomes possible to perform scrubs based on specific date ranges, improving the granularity of data management and recovery operations.

Description

To achieve this, we implemented a round-robin database that records when transaction groups (TXGs) are committed. Tracking is split into three resolutions: minutes, days, and years. We believe this split provides the best resolution for tracking time. The database does not store the exact time of each TXG; it provides an estimate. It can also be used in other scenarios where dates must be mapped to transaction groups.
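
To illustrate the round-robin idea described above, here is a minimal sketch of a fixed-size ring that stores (time, TXG) samples and answers "which TXG was current at or before time T". It is not the code from this PR: the names `ts_ring_t`, `ts_ring_add`, `ts_ring_floor`, the `min_gap` field, and the capacity of 4 are all hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define	RING_CAP	4	/* hypothetical, tiny so wraparound is easy to see */

/* One sample: TXG `txg` was synced at about `time` (seconds). */
typedef struct ts_entry {
	uint64_t time;
	uint64_t txg;
} ts_entry_t;

/* Fixed-size ring: once full, each new sample overwrites the oldest. */
typedef struct ts_ring {
	ts_entry_t entries[RING_CAP];
	size_t head;		/* index of the next slot to write */
	size_t count;		/* number of valid entries, <= RING_CAP */
	uint64_t min_gap;	/* minimum seconds between stored samples */
} ts_ring_t;

static void
ts_ring_init(ts_ring_t *r, uint64_t min_gap)
{
	assert(r != NULL);
	r->head = 0;
	r->count = 0;
	r->min_gap = min_gap;
}

/*
 * Store a sample unless the newest stored sample is less than
 * min_gap seconds old. Returns 1 if stored, 0 if skipped.
 */
static int
ts_ring_add(ts_ring_t *r, uint64_t time, uint64_t txg)
{
	if (r->count > 0) {
		const ts_entry_t *last =
		    &r->entries[(r->head + RING_CAP - 1) % RING_CAP];
		if (time < last->time + r->min_gap)
			return (0);
	}
	r->entries[r->head].time = time;
	r->entries[r->head].txg = txg;
	r->head = (r->head + 1) % RING_CAP;
	if (r->count < RING_CAP)
		r->count++;
	return (1);
}

/* Floor lookup: the newest entry with entry.time <= time, or NULL. */
static const ts_entry_t *
ts_ring_floor(const ts_ring_t *r, uint64_t time)
{
	const ts_entry_t *best = NULL;

	for (size_t i = 0; i < r->count; i++) {
		const ts_entry_t *e = &r->entries[i];
		if (e->time <= time &&
		    (best == NULL || e->time > best->time))
			best = e;
	}
	return (best);
}
```

Because the ring has a fixed size and samples are rate-limited by `min_gap`, storage stays constant while the covered time span grows with the gap. That is why a lookup can only return an estimate: any TXG synced between two stored samples maps to the earlier sample.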

How Has This Been Tested?

  • Create a pool
  • Write data
  • Wait some time
  • Write data
  • Wait some time
  • Try to scrub different time ranges

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
  • Documentation (a change to man pages or other documentation)

@oshogbo force-pushed the oshogbo/scrub_data_range branch 7 times, most recently from 2a20b11 to 364f813, on December 11, 2024 14:01
@amotin added the "Status: Code Review Needed" (Ready for review and testing) label on Dec 11, 2024
@oshogbo force-pushed the oshogbo/scrub_data_range branch from 364f813 to 891c8f2 on December 11, 2024 15:50
@amotin
Member

amotin commented Dec 12, 2024

It crashes on VERIFY(!dmu_objset_is_dirty(dp->dp_meta_objset, txg)).

@amotin
Member

amotin commented Dec 12, 2024

This reminds me that we recently added ddp_class_start to the new dedup table entry format so that the DDT can be pruned based on time. I wonder whether we could have saved some space had this mechanism existed back then.

@amotin added the "Status: Revision Needed" (Changes are required for the PR to be accepted) label on Dec 12, 2024
@oshogbo force-pushed the oshogbo/scrub_data_range branch from 891c8f2 to ba5ee33 on January 31, 2025 10:31
@github-actions (bot) removed the "Status: Revision Needed" (Changes are required for the PR to be accepted) label on Jan 31, 2025
@oshogbo force-pushed the oshogbo/scrub_data_range branch 8 times, most recently from 963a5a3 to 33a7c27, on January 31, 2025 14:08
@oshogbo force-pushed the oshogbo/scrub_data_range branch 4 times, most recently from d214335 to 603bbfa, on July 9, 2025 17:21
@amotin
Member

amotin commented Jul 11, 2025

@oshogbo zpool_scrub/zpool_scrub_date_range_001 test failed on almalinux8.

@oshogbo force-pushed the oshogbo/scrub_data_range branch 3 times, most recently from d69c857 to f40bd7f, on July 14, 2025 19:30
@oshogbo force-pushed the oshogbo/scrub_data_range branch 5 times, most recently from cabf73b to dec4dbb, on July 27, 2025 07:01
@oshogbo
Contributor Author

oshogbo commented Jul 28, 2025

I think the problem with almalinux8 is finally solved. It was caused by using /dev/random instead of /dev/urandom, which resulted in empty files. The zinject tool injects a data error, but because there was no data, the second file wasn't detected as corrupted.

I also found an issue with timezone calculation, which has now been fixed.

Additionally, I changed the way we select the final time. Since we have three different groups of timestamps, we can't simply select the smallest TXG as the start date - doing so would always pick the one from the "lowest frequency group" (monthly). So instead, we still floor each group, but we now select the time that is closest overall. Hope that makes sense.
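
The selection rule described in this comment (floor within each group, then take the candidate closest to the requested time) can be sketched as follows. This is an illustration under assumptions, not the code from this PR: `ts_sample_t`, `group_floor`, and `closest_floor` are hypothetical names, and each group is modeled as a plain array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sample type: a timestamp and the TXG recorded with it. */
typedef struct ts_sample {
	uint64_t time;
	uint64_t txg;
} ts_sample_t;

/* Floor within one group: the newest sample with time <= target. */
static const ts_sample_t *
group_floor(const ts_sample_t *g, size_t n, uint64_t target)
{
	const ts_sample_t *best = NULL;

	for (size_t i = 0; i < n; i++) {
		if (g[i].time <= target &&
		    (best == NULL || g[i].time > best->time))
			best = &g[i];
	}
	return (best);
}

/*
 * Floor every group, then keep the candidate whose timestamp is
 * closest to the target. Taking the smallest TXG instead would
 * almost always pick the coarsest group, since it holds the
 * oldest samples.
 */
static const ts_sample_t *
closest_floor(const ts_sample_t *const groups[], const size_t sizes[],
    size_t ngroups, uint64_t target)
{
	const ts_sample_t *best = NULL;

	assert(groups != NULL && sizes != NULL);
	for (size_t i = 0; i < ngroups; i++) {
		const ts_sample_t *c =
		    group_floor(groups[i], sizes[i], target);
		if (c == NULL)
			continue;
		/* Both times are <= target, so the subtraction is safe. */
		if (best == NULL ||
		    target - c->time < target - best->time)
			best = c;
	}
	return (best);
}
```

With a fine-grained group holding recent samples and coarser groups holding older ones, a recent target resolves to the fine-grained group, while an old target falls back to whichever coarser group still has a sample at or before it.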

@behlendorf
Contributor

It looks like we had some unexpected failures in the centos-stream builders as well which need to be looked at:

https://github.com/openzfs/zfs/actions/runs/16548417061/job/46985751681?pr=16853

 Tests with results other than PASS that are unexpected:
    FAIL cli_root/zpool_scrub/zpool_scrub_date_range_001 (expected PASS)

@behlendorf added the "Status: Code Review Needed" (Ready for review and testing) label on Jul 30, 2025
@oshogbo force-pushed the oshogbo/scrub_data_range branch 6 times, most recently from 4203e80 to b2c2630, on August 1, 2025 11:49
@oshogbo force-pushed the oshogbo/scrub_data_range branch from b2c2630 to 7a3155f on August 4, 2025 13:28
This feature enables tracking of when TXGs are committed to disk,
providing an estimated timestamp for each TXG.

With this information, it becomes possible to perform scrubs based
on specific date ranges, improving the granularity of
data management and recovery operations.

Signed-off-by: Mariusz Zaborski <mariusz.zaborski@klarasystems.com>
Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Labels

Status: Accepted Ready to integrate (reviewed, tested)
