Add semi-automated code generation for RocksDB C API bindings #14572
xingbowang wants to merge 10 commits into facebook:main from
| Check | Count |
|---|---|
| cppcoreguidelines-pro-type-member-init | 1 |
| cppcoreguidelines-special-member-functions | 1 |
| modernize-make-shared | 3 |
| readability-braces-around-statements | 13 |
| readability-isolate-declaration | 2 |
| Total | 20 |
Details
db/c.cc (20 warning(s))
db/c.cc:313:21: warning: statement should be inside braces [readability-braces-around-statements]
db/c.cc:1066:22: warning: statement should be inside braces [readability-braces-around-statements]
db/c.cc:5300:28: warning: statement should be inside braces [readability-braces-around-statements]
db/c.cc:5303:26: warning: statement should be inside braces [readability-braces-around-statements]
db/c.cc:5311:30: warning: statement should be inside braces [readability-braces-around-statements]
db/c.cc:5314:28: warning: statement should be inside braces [readability-braces-around-statements]
db/c.cc:5327:28: warning: statement should be inside braces [readability-braces-around-statements]
db/c.cc:5330:26: warning: statement should be inside braces [readability-braces-around-statements]
db/c.cc:5338:30: warning: statement should be inside braces [readability-braces-around-statements]
db/c.cc:5341:28: warning: statement should be inside braces [readability-braces-around-statements]
db/c.cc:5427:28: warning: statement should be inside braces [readability-braces-around-statements]
db/c.cc:5430:26: warning: statement should be inside braces [readability-braces-around-statements]
db/c.cc:5438:47: warning: statement should be inside braces [readability-braces-around-statements]
db/c.cc:6082:29: warning: use std::make_shared instead [modernize-make-shared]
db/c.cc:7354:12: warning: use std::make_shared instead [modernize-make-shared]
db/c.cc:7361:12: warning: use std::make_shared instead [modernize-make-shared]
db/c.cc:8036:8: warning: constructor does not initialize these fields: rep_ [cppcoreguidelines-pro-type-member-init]
db/c.cc:8036:8: warning: class 'SliceTransformWrapper' defines a non-default destructor but does not define a copy constructor, a copy assignment operator, a move constructor or a move assignment operator [cppcoreguidelines-special-member-functions]
db/c.cc:8438:3: warning: multiple declarations in a single statement reduces readability [readability-isolate-declaration]
db/c.cc:8451:3: warning: multiple declarations in a single statement reduces readability [readability-isolate-declaration]
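The per-check counts in the table can be reproduced from the warning lines themselves. The following is a minimal sketch, assuming each warning ends with the check name in square brackets as in the listing above; the helper name `tally_checks` and the four-line sample are illustrative, not part of the PR's tooling.

```python
import re
from collections import Counter

# Matches clang-tidy lines such as
# "db/c.cc:313:21: warning: <message> [readability-braces-around-statements]"
WARNING_RE = re.compile(
    r"^(?P<file>[^:]+):(?P<line>\d+):(?P<col>\d+): warning: .* \[(?P<check>[\w.-]+)\]$"
)


def tally_checks(report_lines):
    """Count warnings per clang-tidy check name (hypothetical helper)."""
    counts = Counter()
    for line in report_lines:
        match = WARNING_RE.match(line.strip())
        if match:
            counts[match.group("check")] += 1
    return counts


# A small excerpt of the warnings listed above.
sample = [
    "db/c.cc:313:21: warning: statement should be inside braces [readability-braces-around-statements]",
    "db/c.cc:6082:29: warning: use std::make_shared instead [modernize-make-shared]",
    "db/c.cc:7354:12: warning: use std::make_shared instead [modernize-make-shared]",
    "db/c.cc:8438:3: warning: multiple declarations in a single statement reduces readability [readability-isolate-declaration]",
]

counts = tally_checks(sample)
for check, count in sorted(counts.items()):
    print(f"| {check} | {count} |")
print(f"| Total | {sum(counts.values())} |")
```

Feeding all 20 warning lines to the same function should give the counts shown in the table.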
✅ Claude Code Review
Auto-triggered after CI passed; reviewing commit 673d7a2
Code Review: Add Semi-Automated Code Generation for RocksDB C API Bindings
PR: Add semi-automated code generation for RocksDB C API bindings
Executive Summary
This PR introduces a semi-automated code generation system for the RocksDB C API, adding 725 new C API functions (68.9% increase). The architecture is sound and addresses a real maintenance burden. Approve with required changes.
HIGH Severity Findings
H1: Verify All Removed Hand-Written Functions Have Generated Replacements
The PR removes ~35 hand-written implementations (backup_engine_options, block_based_options, restore_options) from
H2:
|
|
@xingbowang has imported this pull request. If you are a Meta employee, you can view this in D99609629. |
|
Regarding the F1 finding (ABI risk on The auto-generator correctly emits In practice: the value domain is 0/1. On all common calling conventions (x86-64 SysV, Windows x64, ARM64), passing a small integer as After weighing the options, we've decided to keep the generated
(via Navi on behalf of xingbowang)
|
This is great! I've been wanting this for a long time! Thanks for working on this. Just a note:
This PR adds 11 quoted #include directives: In When libclang encounters This breaks any downstream consumer that processes on the For me, it's not a big deal, but I want to flag just in case because:
|
|
Thanks @zaidoon1! Good catch on the self-containedness contract. We explored a few options and landed on the following approach:
So bindgen invoked as We verified this end-to-end against the rust-rocksdb repo.
(via Navi on behalf of xingbowang)
Summary: Upgrade the Ubuntu 24 CI Docker image to include clang-21 (from clang-18), and bump all workflow references to the new image tags. Also adds ccache to both images and renames all clang-18 job references to clang-21. This is mainly for upgrading the clang version later used for generating C API automatically. See PR #14572

### Changes

**`build_tools/ubuntu24_image/Dockerfile`**
- Add clang-21 installation from LLVM snapshot repo (`apt.llvm.org/noble/llvm-toolchain-noble-21`)
- Add ccache
- Add comment pointing Meta employees to internal devvm build guide

**`build_tools/ubuntu22_image/Dockerfile`**
- Add ccache
- Add comment pointing Meta employees to internal devvm build guide

**`.github/workflows/pr-jobs.yml`**
- Rename `build-linux-clang-18-no_test_run` → `build-linux-clang-21-no_test_run`
- Rename `build-linux-clang18-asan-ubsan` → `build-linux-clang21-asan-ubsan`
- Rename `build-linux-clang18-mini-tsan` → `build-linux-clang21-mini-tsan`
- Update all `clang-18`/`clang++-18` references to `clang-21`/`clang++-21`
- Update ccache key prefixes: `clang18-asan-ubsan` → `clang21-asan-ubsan`, `clang18-tsan` → `clang21-tsan`
- Bump `rocksdb_ubuntu:24.0` → `rocksdb_ubuntu:24.1`

**`.github/workflows/nightly.yml`**
- Rename `build-linux-clang-18-asan-ubsan-with-folly` → `build-linux-clang-21-asan-ubsan-with-folly`
- Update clang-18 → clang-21 compiler references
- Bump `rocksdb_ubuntu:24.0` → `rocksdb_ubuntu:24.1`

**`.github/workflows/clang-tidy.yml`**
- Update clang-18 → clang-21
- Bump `rocksdb_ubuntu:24.0` → `rocksdb_ubuntu:24.1`

### New Docker Images

Both images have been built and tested locally. They will be pushed to `ghcr.io/facebook/rocksdb_ubuntu` before this PR is merged.
- `ghcr.io/facebook/rocksdb_ubuntu:22.2`: adds ccache, adds devvm build note
- `ghcr.io/facebook/rocksdb_ubuntu:24.1`: adds clang-21 from LLVM snapshot repo, adds ccache

Pull Request resolved: #14576

Test Plan: Built both images locally and verified CI job names/compiler flags are consistent.

Reviewed By: joshkang97

Differential Revision: D99694293

Pulled By: xingbowang

fbshipit-source-id: c23c27f5cf870fb2c8b4e3d1cba281d0ce63f9d6
Summary: RocksDB's C API (include/rocksdb/c.h / db/c.cc) is large and hand-maintained, with hundreds of repetitive setter/getter wrappers that follow strict naming and type conventions. Adding new C++ fields requires manually adding matching C wrappers, which is tedious and error-prone.

This PR introduces a semi-automated code generation system for the mechanical parts of the C API while keeping complex wrappers (callbacks, ownership-transfer, multi-get families) hand-written.

The system has two complementary generators:
1. Auto-discovery (auto_simple_bindings.py): Scans selected C++ public option/metadata structs and auto-generates getter/setter wrappers for simple scalar, enum, string, and chrono fields. Generated .inc files are checked in and compiled into c.h/c.cc via #include. Fields that cannot be auto-generated must be explicitly blocklisted in auto_simple_bindings_blocklist.json with a policy and reason.
2. Spec-driven generator (generate_c_api.py): Takes spec.json as input and emits consistent boilerplate for method-style wrappers whose C shape cannot be inferred from the C++ header alone (e.g. rocksdb_put, rocksdb_transaction_commit, WriteBatch methods).

Coverage improvement: This PR adds 725 new C API functions, growing the public C API surface from 1,053 to 1,778 exported functions, a 68.9% increase. The bulk of the new coverage (436 functions) comes from auto-discovered option struct setters/getters that were previously missing.

Coverage breakdown by family:
- Option structs (auto-discovered): 436 new functions
- Metadata structs (auto-discovered): 89 new functions
- ReadOptions (auto-discovered): 46 new functions
- JobInfo structs (auto-discovered): 46 new functions
- Spec-driven (subset wrappers): 56 new functions
- DB simple operations: 27 new functions
- Transaction/TransactionDB/WriteBatch: 29 new functions

The generated code emits the same style as today's hand-written wrappers (SaveError, Slice(), malloc-owned buffers, unsigned char booleans) and is organized in clearly marked generated sections within c.h/c.cc.

Test Plan:
- Existing db_test.c C API tests pass (1743 lines of tests extended/verified)
- python3 tools/c_api_gen/regen_all.py && git diff --exit-code -- include/rocksdb/c_api_gen db/c_api_gen verifies generated output is stable
- python3 tools/c_api_gen/validate_generated_equivalence.py --ref HEAD verifies equivalence with reference hand-written wrappers
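To make the auto-discovery idea in this commit message concrete, here is a minimal sketch of how a generator might turn a list of simple struct fields into C setter/getter declarations while honoring a blocklist. It is not the code in `auto_simple_bindings.py`: the `Field` class, the `C_TYPE_FOR` table, the `emit_accessors` function, and the exact shape of the emitted declarations are all assumptions made for illustration. Only the `unsigned char` boolean convention and the requirement that unsupported fields be blocklisted with a reason come from the description above.

```python
from dataclasses import dataclass

# Hypothetical mapping from C++ field types to C parameter types.
# Booleans map to unsigned char, matching the convention described above.
C_TYPE_FOR = {
    "bool": "unsigned char",
    "int": "int",
    "uint64_t": "uint64_t",
    "size_t": "size_t",
}


@dataclass(frozen=True)
class Field:
    name: str
    cpp_type: str


def emit_accessors(c_prefix, handle_type, fields, blocklist):
    """Return C declarations for each simple, non-blocklisted field.

    `blocklist` maps a field name to the reason it is skipped. A field that
    is neither supported nor blocklisted is an error, so nothing is dropped
    silently.
    """
    decls = []
    for field in fields:
        if field.name in blocklist:
            continue
        c_type = C_TYPE_FOR.get(field.cpp_type)
        if c_type is None:
            raise ValueError(
                f"{field.name}: unsupported type {field.cpp_type!r}; "
                "add it to the blocklist with a reason"
            )
        decls.append(
            f"extern void {c_prefix}_set_{field.name}({handle_type}* opt, {c_type} value);"
        )
        decls.append(
            f"extern {c_type} {c_prefix}_get_{field.name}({handle_type}* opt);"
        )
    return decls


# Illustrative input: two simple fields and one pointer field that is blocklisted.
fields = [
    Field("create_if_missing", "bool"),
    Field("max_open_files", "int"),
    Field("env", "Env*"),
]
blocklist = {"env": "pointer field; needs a hand-written wrapper"}

decls = emit_accessors("rocksdb_options", "rocksdb_options_t", fields, blocklist)
for decl in decls:
    print(decl)
```

Raising on an unsupported, non-blocklisted field mirrors the stated policy that every skipped field needs an explicit entry and reason.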
# Conflicts: # utilities/trie_index/trie_index_factory.cc
Summary
RocksDB's C API (`include/rocksdb/c.h` / `db/c.cc`) is large and hand-maintained, with hundreds of repetitive setter/getter wrappers that follow strict naming and type conventions. Adding new C++ fields requires manually adding matching C wrappers, which is tedious and error-prone.
This PR introduces a semi-automated code generation system for the mechanical parts of the C API while keeping complex wrappers (callbacks, ownership-transfer, multi-get families) hand-written.
Two complementary generators
1. Auto-discovery (`auto_simple_bindings.py`)
Scans selected C++ public option/metadata structs and auto-generates getter/setter wrappers for simple scalar, enum, string, and chrono fields. Generated `.inc` files are checked in and compiled into `c.h`/`c.cc` via `#include`. Fields that cannot be auto-generated must be explicitly blocklisted in `auto_simple_bindings_blocklist.json` with a policy and reason.
2. Spec-driven generator (`generate_c_api.py`)
Takes `spec.json` as input and emits consistent boilerplate for method-style wrappers whose C shape cannot be inferred from the C++ header alone (e.g. `rocksdb_put`, `rocksdb_transaction_commit`, WriteBatch methods).
Coverage improvement
This PR adds 725 new C API functions, growing the public C API surface from 1,053 to 1,778 exported functions — a 68.9% increase. The bulk of the new coverage (436 functions) comes from auto-discovered option struct setters/getters that were previously missing.
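The headline figures can be checked with a few lines of arithmetic. This sketch uses only the numbers quoted in the paragraph above; the variable names are illustrative.

```python
# Figures quoted in the PR description.
before = 1053          # exported C API functions before this PR
added = 725            # new C API functions
auto_options = 436     # auto-discovered option struct setters/getters

after = before + added
increase_pct = 100 * added / before
options_share_pct = 100 * auto_options / added

print(after)                        # 1778
print(f"{increase_pct:.1f}%")       # 68.9%
print(f"{options_share_pct:.1f}%")  # 60.1%
```

The last line shows why the description calls the option struct accessors "the bulk" of the new coverage: they account for roughly three fifths of the added functions.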
The generated code emits the same style as today's hand-written wrappers (`SaveError`, `Slice()`, malloc-owned buffers, `unsigned char` booleans) and is organized in clearly marked generated sections within `c.h`/`c.cc`.
What stays manual
Complex wrapper families remain hand-written:
- `MultiGet` / batched `MultiGet` families
- `std::shared_ptr` / `std::unique_ptr` ownership transfer
Test Plan
- Existing `db_test.c` C API tests pass (1,743 lines of tests extended/verified)
- `python3 tools/c_api_gen/regen_all.py && git diff --exit-code -- include/rocksdb/c_api_gen db/c_api_gen` verifies generated output is stable
- `python3 tools/c_api_gen/validate_generated_equivalence.py --ref HEAD` verifies equivalence with reference hand-written wrappers
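The regenerate-and-diff check in the test plan relies on a simple invariant: running the generators again must reproduce the checked-in files byte for byte. Below is a minimal, self-contained sketch of that invariant, assuming a stand-in generator. `generate_outputs`, `find_drift`, and the `options.inc` file name are hypothetical; the real check is the `regen_all.py` plus `git diff --exit-code` command shown above.

```python
import difflib
import tempfile
from pathlib import Path


def generate_outputs():
    """Stand-in for the real generators: returns {relative path: file text}."""
    return {
        "options.inc": "/* generated */\nextern void example_set_flag(unsigned char v);\n",
    }


def find_drift(checked_in_dir, generated):
    """Return unified-diff lines for every file whose checked-in text differs."""
    drift = []
    for rel_path, new_text in sorted(generated.items()):
        path = Path(checked_in_dir) / rel_path
        old_text = path.read_text() if path.exists() else ""
        if old_text != new_text:
            drift.extend(
                difflib.unified_diff(
                    old_text.splitlines(keepends=True),
                    new_text.splitlines(keepends=True),
                    fromfile=f"checked-in/{rel_path}",
                    tofile=f"regenerated/{rel_path}",
                )
            )
    return drift


with tempfile.TemporaryDirectory() as tmp:
    # Simulate a tree whose checked-in file matches the generator output.
    for rel_path, text in generate_outputs().items():
        (Path(tmp) / rel_path).write_text(text)
    clean = find_drift(tmp, generate_outputs())

    # Simulate a stale checked-in file that is missing a declaration.
    (Path(tmp) / "options.inc").write_text("/* generated */\n")
    stale = find_drift(tmp, generate_outputs())

print("clean tree drift lines:", len(clean))
print("stale tree has drift:", bool(stale))
```

A CI job built on this idea would fail whenever the drift list is non-empty, which is the same signal `git diff --exit-code` gives after regeneration.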