Merged
Pull request overview
This PR updates Garnet/Tsavorite’s native Linux storage device loading and build configuration to handle the Debian/Ubuntu t64 libaio transition without requiring system-wide symlinks in Docker images or CI.
Changes:
- Add Linux loader auto-repair for a missing libaio.so.1 by creating a local compat symlink next to libnative_device.so, and improve native library path resolution.
- Pin libaio symbol versions at link time via a new libaio_compat.h, and adjust native build/linker RPATH behavior to prefer $ORIGIN.
- Remove now-redundant libaio workaround steps from Dockerfiles and GitHub workflows, and relax Docker image validation to accept either libaio.so.1 or libaio.so.1t64.
Reviewed changes
Copilot reviewed 10 out of 11 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| libs/storage/Tsavorite/cs/src/core/Device/NativeStorageDevice.cs | Adds absolute-path resolution for the native library and Linux libaio t64 auto-repair with improved diagnostics. |
| libs/storage/Tsavorite/cc/src/device/libaio_compat.h | New header that forces versioned libaio symbol bindings to avoid performance regressions and hangs. |
| libs/storage/Tsavorite/cc/src/device/file_linux.h | Includes the new libaio compatibility header for Linux builds. |
| libs/storage/Tsavorite/cc/src/CMakeLists.txt | Adds $ORIGIN RPATH behavior and link options to ensure local dependency resolution for the native device. |
| libs/server/Resp/Vector/VectorManager.cs | Skips initialization/recovery paths when Vector Set preview is disabled. |
| test/docker-tests/validate_docker_images.py | Accepts either libaio.so.1 or libaio.so.1t64 since the symlink may now be created lazily. |
| Dockerfile | Removes build-time libaio symlink workaround. |
| Dockerfile.ubuntu | Removes build-time libaio symlink workaround. |
| .github/workflows/ci.yml | Removes Ubuntu libaio symlink workaround step so CI exercises loader repair. |
| .github/workflows/nightly.yml | Removes Ubuntu libaio symlink workaround step so nightly runs exercise loader repair. |
7de1020 to f47f903
…nsition)
The t64 ABI transition renamed libaio.so.1 to libaio.so.1t64, breaking
libnative_device.so which has a hard DT_NEEDED of libaio.so.1. Fix the
problem in three places so both Docker and non-Docker users on t64 hosts
get a working native device without manual intervention.
1) libaio_compat.h (new) pins the libaio entry points to specific symbol
versions at link time:
    io_setup     @LIBAIO_0.4
    io_destroy   @LIBAIO_0.4
    io_getevents @LIBAIO_0.4 (userspace ring fast path)
    io_submit    @LIBAIO_0.1
Older libaio-dev marked LIBAIO_0.4 as the default version so a plain
link picked these up automatically. On t64 (libaio1t64-dev) the default
is gone and libaio.h has no .symver redirects for x86_64, so a fresh
link produces UNVERSIONED references that at runtime resolve to the
slower LIBAIO_0.1 io_getevents - which always syscalls and blocks -
causing NativeStorageDevice probe/TryComplete paths to hang. With
libaio_compat.h included first, any future rebuild on any distro
reproduces the correct versioned bindings.
2) CMakeLists.txt sets RPATH=$ORIGIN (via INSTALL_RPATH +
BUILD_WITH_INSTALL_RPATH + --disable-new-dtags) so libnative_device.so
searches its own directory for dependencies. This enables the managed
loader's fallback (below).
3) NativeStorageDevice.ImportResolver resolves NativeLibraryPath to an
absolute path (fixing a latent bug where the relative path bypassed
.NET's runtimes/ probing) and, on Linux, catches DllNotFoundException
referencing libaio.so.1, locates libaio.so.1t64 in standard multiarch
paths, and drops a compat symlink next to libnative_device.so. The
symlink creation tolerates the race where multiple processes start
simultaneously and another process has already created a usable
symlink. If repair still fails, the loader throws a descriptive
DllNotFoundException explaining the t64 transition and offering three
remediation options. This path is primarily for non-Docker users
(developers running dotnet GarnetServer on their own Debian 13 /
Ubuntu 24.04 machines).
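
The repair flow in item 3 can be sketched roughly as follows. This is a Python illustration only; the PR implements the logic in C# inside NativeStorageDevice. The directory list and the names find_t64_libaio and ensure_compat_symlink are assumptions made for the example (the commit message only says "standard multiarch paths").

```python
import os
from pathlib import Path
from typing import Iterable, Optional

# Assumed search list for illustration; not taken from the PR.
MULTIARCH_DIRS = (
    "/usr/lib/x86_64-linux-gnu",
    "/lib/x86_64-linux-gnu",
    "/usr/lib/aarch64-linux-gnu",
    "/lib/aarch64-linux-gnu",
)


def find_t64_libaio(search_dirs: Iterable[str] = MULTIARCH_DIRS) -> Optional[Path]:
    """Return the first existing libaio.so.1t64 in search_dirs, or None."""
    for directory in search_dirs:
        candidate = Path(directory) / "libaio.so.1t64"
        if candidate.exists():
            return candidate
    return None


def ensure_compat_symlink(native_lib_dir: Path, target: Path) -> bool:
    """Create native_lib_dir/libaio.so.1 -> target, tolerating a lost race.

    Returns True when a usable libaio.so.1 is present afterwards.
    """
    link = native_lib_dir / "libaio.so.1"
    try:
        os.symlink(target, link)
        return True
    except FileExistsError:
        # Another process created the link first. Path.exists() follows the
        # symlink, so a dangling link is reported as unusable.
        return link.exists()
    except OSError:
        # For example a read-only filesystem or missing permissions.
        return False
```

Because the symlink lives next to libnative_device.so, the RPATH=$ORIGIN setting from item 2 is what lets the dynamic linker find it. When ensure_compat_symlink returns False, the caller would fall through to the descriptive error path.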
Also:
- VectorManager.Initialize() and ResumePostRecovery() now early-return
when IsEnabled is false. Vector Set preview is off by default; there
is no reason these paths should touch storage when the feature is
disabled.
- Dockerfile and Dockerfile.ubuntu still install libaio1t64 and
pre-create the libaio.so.1 -> libaio.so.1t64 symlink at build time
for maximum robustness (works on read-only filesystems and under
restrictive seccomp profiles that block symlink(2)). The managed
loader fallback is belt-and-braces for non-Docker users.
(Dockerfile.alpine and Dockerfile.azurelinux ship libaio.so.1
natively. Dockerfile.chiseled uses a restricted runtime image and
was not changed - it already stages libaio.so.1 from a build stage.)
- .github/workflows/ci.yml and nightly.yml drop the ubuntu-latest
libaio pre-step; the managed ImportResolver now handles repair
automatically on any host.
- validate_docker_images.py accepts either libaio.so.1 or
libaio.so.1t64 when checking library presence.
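
A relaxed presence check along the lines of the last bullet might look like this. It is a sketch, not the code in validate_docker_images.py; the function name libaio_present and the idea of checking a list of paths collected from the image are assumptions.

```python
import posixpath
from typing import Iterable

# Either name counts as "libaio is installed" after the t64 transition.
ACCEPTED_LIBAIO_NAMES = ("libaio.so.1", "libaio.so.1t64")


def libaio_present(paths: Iterable[str]) -> bool:
    """Return True if any path in the listing is one of the accepted sonames."""
    names = {posixpath.basename(p) for p in paths}
    return any(name in names for name in ACCEPTED_LIBAIO_NAMES)
```

The match is on the exact file name, so a listing that only contains an unrelated library is still reported as missing libaio.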
The bundled libnative_device.so has been rebuilt against the above
sources with '-O3 -g -DNDEBUG' (project Release defaults). Verified via
objdump -T that io_* references are correctly versioned.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
f47f903 to 70062be
- CMakeLists.txt: fix FNATIVE_DEVICE_HEADERS typo so file_linux.h and
libaio_compat.h are actually associated with the native_device target
(cosmetic, does not affect compiled binary).
- NativeStorageDevice: wrap Directory.GetCurrentDirectory() in a
TryGetCurrentDirectory helper so a deleted/inaccessible CWD cannot
block native library resolution when the library exists in the
assembly or AppContext directory.
- NativeStorageDevice.BuildLibaioDiagnostic: expand architecture mapping
(x64, Arm64, Arm) with a null fallback that emits a distro-agnostic
fix instruction, and correct the remediation advice to suggest a
valid DeviceType value ('RandomAccess') instead of the non-existent
'Managed'.
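
The TryGetCurrentDirectory idea from the second bullet, combined with absolute-path resolution, could be sketched as below. This is Python for illustration (the change itself is in C#); try_get_cwd and resolve_native_library are hypothetical names, and the order of base directories is an assumption.

```python
import os
from pathlib import Path
from typing import Iterable, Optional


def try_get_cwd() -> Optional[Path]:
    """Return the current directory, or None if it is deleted or inaccessible."""
    try:
        return Path(os.getcwd())
    except OSError:
        return None


def resolve_native_library(relative_path: str,
                           base_dirs: Iterable[Optional[Path]]) -> Optional[Path]:
    """Return the first absolute path under base_dirs where the library exists.

    None entries (for example a failed CWD lookup) are skipped, so one bad
    base directory cannot block resolution from the others.
    """
    for base in base_dirs:
        if base is None:
            continue
        candidate = (base / relative_path).resolve()
        if candidate.is_file():
            return candidate
    return None
```

A caller would pass something like [assembly_dir, app_context_dir, try_get_cwd()], so a missing CWD simply drops out of the candidate list.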
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
vazois approved these changes on Apr 16, 2026
The t64 ABI transition renamed libaio.so.1 to libaio.so.1t64, breaking libnative_device.so which has a hard DT_NEEDED of libaio.so.1. Previously we worked around this with system-wide symlinks in every Dockerfile and CI workflow.
Fix this properly in the loader itself:
- CMakeLists.txt now sets RPATH=$ORIGIN (via INSTALL_RPATH + BUILD_WITH_INSTALL_RPATH + --disable-new-dtags) so libnative_device.so searches its own directory for dependencies. This lets a managed-side compat symlink next to the native library satisfy the linker without any LD_LIBRARY_PATH contortions from the caller.
- libaio_compat.h (new) pins the libaio entry points to the specific symbol versions that make libaio's userspace fast paths kick in:

      io_setup     @LIBAIO_0.4
      io_destroy   @LIBAIO_0.4
      io_getevents @LIBAIO_0.4 (userspace ring fast path)
      io_submit    @LIBAIO_0.1

  Older libaio-dev marked LIBAIO_0.4 as the default version, so a plain link picked these up automatically. On t64 (libaio1t64-dev) the default is gone and libaio.h has no .symver redirects for x86_64, so a fresh link produces UNVERSIONED references. At runtime these resolve to the slower LIBAIO_0.1 io_getevents, which always syscalls and blocks, and that caused NativeStorageDevice probe/TryComplete paths to hang.
- NativeStorageDevice.ImportResolver now resolves NativeLibraryPath to an absolute path (fixing a latent bug where the relative path bypassed .NET's runtimes/ probing) and, on Linux, catches DllNotFoundException referencing libaio.so.1, locates libaio.so.1t64 in standard multiarch paths, and drops a compat symlink next to libnative_device.so. The symlink creation tolerates the race where multiple processes start simultaneously and another process has already created a usable symlink. If repair still fails, the loader throws a descriptive DllNotFoundException explaining the t64 transition and offering three remediation options.
- VectorManager.Initialize() and ResumePostRecovery() now early-return when IsEnabled is false. Vector Set preview is off by default; there is no reason these paths should touch storage when the feature is disabled.
With the loader + build fixes in place, remove the now-redundant workarounds:
- Dockerfile and Dockerfile.ubuntu: drop the ln -sf libaio.so.1 line. (Dockerfile.alpine and Dockerfile.azurelinux ship libaio.so.1 natively. Dockerfile.chiseled uses a restricted runtime and was not touched.)
- .github/workflows/ci.yml and nightly.yml: drop the ubuntu-latest libaio pre-step; the managed ImportResolver now handles repair automatically and the test suite actually exercises the repair path.
- validate_docker_images.py: accept either libaio.so.1 or libaio.so.1t64, since the former is only materialized lazily (on first native device init) for glibc images now.
The bundled libnative_device.so has been rebuilt against the above sources with '-O3 -g -DNDEBUG' (project Release defaults). Verified via objdump -T that io_* references are correctly versioned.
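
The objdump -T verification could be automated with a small script such as the one below. This is a sketch under assumptions: the expected versions come from the list above, but the parsing assumes each undefined-symbol line ends with an optional version token followed by the symbol name, and the exact column layout of objdump output varies between binutils releases. The function names are hypothetical.

```python
import re
from typing import Dict, Optional

# Expected bindings, as listed in the PR description.
EXPECTED_VERSIONS = {
    "io_setup": "LIBAIO_0.4",
    "io_destroy": "LIBAIO_0.4",
    "io_getevents": "LIBAIO_0.4",
    "io_submit": "LIBAIO_0.1",
}

_HEX = re.compile(r"^[0-9a-fA-F]+$")


def parse_undefined_versions(objdump_text: str) -> Dict[str, Optional[str]]:
    """Map each undefined dynamic symbol to its version (None if unversioned)."""
    found: Dict[str, Optional[str]] = {}
    for line in objdump_text.splitlines():
        tokens = line.split()
        if "*UND*" not in tokens or len(tokens) < 2:
            continue
        name = tokens[-1]
        previous = tokens[-2].strip("()")
        # If the token before the name is the section or the size field,
        # the reference carries no version.
        unversioned = previous == "*UND*" or _HEX.match(previous) is not None
        found[name] = None if unversioned else previous
    return found


def version_mismatches(objdump_text: str) -> Dict[str, Optional[str]]:
    """Return {symbol: version found} for every io_* symbol not bound as expected.

    A symbol that is missing from the output is reported with None as well.
    """
    found = parse_undefined_versions(objdump_text)
    return {
        name: found.get(name)
        for name, wanted in EXPECTED_VERSIONS.items()
        if found.get(name) != wanted
    }
```

Feeding it the text produced by running objdump -T on libnative_device.so should yield an empty dict when all four references are versioned as intended; an unversioned io_getevents would show up as {"io_getevents": None}.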