[proposal] remote storage refactor proposal #3750

andaaron wants to merge 2 commits into project-zot:main
Conversation
Codecov Report ✅ All modified and coverable lines are covered by tests.

```
@@           Coverage Diff           @@
##             main    #3750   +/-  ##
=======================================
  Coverage   91.46%   91.46%
=======================================
  Files         193      193
  Lines       27456    27456
=======================================
  Hits        25112    25112
  Misses       1520     1520
  Partials      824      824
```
### S3 Blob Path Structure

```
{rootDir}/storage/blobs/{algorithm}/{digest}   # Shared across all repos
```
I'm curious why we are using hard links for local storage, but are fine without any kind of reference-counting links in S3. i.e., if the implementation logic for blobs without links is possible, then would local storage also work fine without the hardlinks?
I'm wondering whether there are any drawbacks to not using hard links in local storage.
Using hardlinks saves space when you have multiple instances of the same blob under different paths on the same drive. If we removed hardlinks, disk usage would go up significantly.
For example: you push a base image and a derived image; because they live in different repositories, the base image layers would take twice as much disk space, once under the base image repo and once under the derived image repo.
Sure, we could implement a single instance of `{rootDir}/storage/blobs/{algorithm}/{digest}` for local storage as well, but that would break tools that read directly from disk and expect each repository to be a valid OCI layout of its own. One such case is the product we developed zot for in the first place.
```go
if is.storageDriver != nil {
	return path.Join(is.rootDir, "storage", ispec.ImageBlobsDir,
		digest.Algorithm().String(), digest.Encoded())
}
```
In newer code, we should probably not assume that a non-empty storage driver == S3. Zot may support other storage drivers in the future. Since we have the interfaces in place, we should check whether it is really S3 and then return the appropriate path.
Agreed. There's also the GCS PR we should be merging soon.
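The driver-name check being asked for could look something like the sketch below. The `storageDriver` interface and `usesSharedBlobs` helper are stand-ins invented for illustration (the real driver interface lives in the docker/distribution storage-driver package); the driver name `"s3aws"` is the one the distribution S3 driver reports, and treating `"gcs"` as shared is an assumption anticipating the GCS PR.

```go
package main

import "fmt"

// storageDriver mimics the Name() method exposed by remote
// storage drivers; this is a stand-in for the real interface in
// github.com/docker/distribution/registry/storage/driver.
type storageDriver interface {
	Name() string
}

type s3Driver struct{}

func (s3Driver) Name() string { return "s3aws" }

// usesSharedBlobs is a hypothetical check: rather than treating
// any non-nil driver as S3, inspect the driver's name so future
// backends (e.g. GCS) are handled explicitly.
func usesSharedBlobs(d storageDriver) bool {
	if d == nil {
		return false // local filesystem: keep hardlink dedupe
	}
	switch d.Name() {
	case "s3aws", "gcs": // known remote backends sharing one blob pool
		return true
	default:
		return false
	}
}

func main() {
	fmt.Println(usesSharedBlobs(nil))        // false
	fmt.Println(usesSharedBlobs(s3Driver{})) // true
}
```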
**Key Point:** No configuration flag needed. S3 detection is automatic via `storeDriver != nil`.
I'm not on board with this - if zot supports something new in the future, all this code would need to be updated; instead we should check whether the storage is really S3 before proceeding.
This reads "S3" because that's the only option besides local disk right now.
I would keep the existing dedupe logic only for local storage.
So it makes little difference what value storeDriver has, as long as it is not the local driver, which is the one that needs the dedupe.
S3 and any other possible future backend would use `{rootDir}/storage/blobs/{algorithm}/{digest}`.
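The split being described, per-repo paths with dedupe for local, one shared pool for everything remote, can be sketched as a single path helper. `blobPath` and its `remote` flag are hypothetical names for illustration, not zot's API:

```go
package main

import (
	"fmt"
	"path"
)

// blobPath is a hypothetical sketch of the path logic discussed
// above: local storage keeps per-repo blob paths (each repo stays
// a valid OCI layout and hardlink dedupe applies), while every
// remote backend shares one blob pool under
// {rootDir}/storage/blobs/{algorithm}/{digest}.
func blobPath(rootDir, repo, algorithm, digest string, remote bool) string {
	if remote {
		// Shared pool: the repo name does not appear in the path.
		return path.Join(rootDir, "storage", "blobs", algorithm, digest)
	}
	// Local: per-repo OCI layout.
	return path.Join(rootDir, repo, "blobs", algorithm, digest)
}

func main() {
	fmt.Println(blobPath("/var/lib/zot", "base", "sha256", "abc", false))
	// /var/lib/zot/base/blobs/sha256/abc
	fmt.Println(blobPath("/var/lib/zot", "base", "sha256", "abc", true))
	// /var/lib/zot/storage/blobs/sha256/abc
}
```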
3. Review results
4. Run migration: `zot migrate-s3-storage --config config.json`
5. **Restart zot** (automatically uses shared blob storage)
6. Verify and monitor
Since this is a destructive operation (migrating files to the new path), will the migration logic also keep track of what was moved so that it can be rolled back if required?
Out of curiosity, what would a rollback cost on S3?
Those are both good questions.
I think we can keep track of what was moved, but unless we keep it in the "cache DB", keeping it up to date once the service is back in use would be practically impossible. Even if we do track it in the cache DB, it complicates maintenance.
Maybe we should keep an entry there for the original blob location, same as we do now, and simply use that instead of `{rootDir}/storage/blobs/{algorithm}/{digest}` for blobs from before the transition. I think this could potentially help cut the cost of copying data back in case of a rollback.
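The idea of keeping the original location in the cache DB can be sketched as a lookup rule. `blobEntry` and `resolvePath` are hypothetical names for illustration; they are not zot's cache-DB schema:

```go
package main

import "fmt"

// blobEntry is a hypothetical cache-DB record: alongside the
// digest we keep the pre-migration path, so lookups for old blobs
// use the original location and a rollback only has to consult
// this table instead of copying objects back in S3.
type blobEntry struct {
	digest       string
	originalPath string // empty for blobs pushed after the migration
}

// resolvePath prefers the recorded pre-migration path; blobs
// pushed after the transition fall through to the shared layout.
func resolvePath(e blobEntry, rootDir, algorithm string) string {
	if e.originalPath != "" {
		return e.originalPath
	}
	return rootDir + "/storage/blobs/" + algorithm + "/" + e.digest
}

func main() {
	old := blobEntry{digest: "abc", originalPath: "/zot/base/blobs/sha256/abc"}
	fresh := blobEntry{digest: "def"}
	fmt.Println(resolvePath(old, "/zot", "sha256"))   // /zot/base/blobs/sha256/abc
	fmt.Println(resolvePath(fresh, "/zot", "sha256")) // /zot/storage/blobs/sha256/def
}
```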
Signed-off-by: Andrei Aaron <andreifdaaron@gmail.com>
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.