Skip to content

SY-4004: Use Oracle to Generate Retrieve Queries#2150

Open
emilbon99 wants to merge 101 commits intorcfrom
sy-4004-use-oracle-to-generate-retrieve-queries
Open

SY-4004: Use Oracle to Generate Retrieve Queries#2150
emilbon99 wants to merge 101 commits intorcfrom
sy-4004-use-oracle-to-generate-retrieve-queries

Conversation

@emilbon99
Copy link
Copy Markdown
Contributor

@emilbon99 emilbon99 commented Apr 2, 2026

Issue Pull Request

Linear Issue

SY-4004

Description

Add a new go/query Oracle plugin that generates gorp Retrieve query wrappers from schema definitions, replacing hand-written retrieve files across all core services.

New Oracle annotations:

  • @retrieve on structs to generate retrieve.gen.go with WhereKeys, Entry, Entries, and Exec methods
  • @retrieve custom for services (rack, ranger) that need additional hand-written query methods beyond the generated ones
  • @filter / @filter required on fields to generate Where<Field> filter methods
  • @search to generate Search method integration

Generated retrieve files for: arc, device, lineplot, log, rack, ranger, schematic, table, task, user, workspace. Each replaces a hand-written retrieve.go with a retrieve.gen.go.

Removed legacy Oracle plugins: migrate, marshal, snapshot, and their associated commands and tests. These were superseded by the current plugin architecture.

Basic Readiness

  • I have performed a self-review of my code.
  • I have added relevant, automated tests to cover the changes.
  • I have updated documentation to reflect the changes.

Greptile Summary

This PR introduces a new Oracle code-generation plugin (go/query) that auto-generates gorp Retrieve query wrappers from schema annotations (@retrieve, @filter, @search), replacing 11 hand-written retrieve.go files with generated retrieve.gen.go counterparts. Custom types (rack, ranger) retain their hand-written filter methods and only receive the generated boilerplate.

Key changes:

  • New oracle/plugin/go/query/query.go plugin with full template-based code generation; registered in oracle/cmd/sync.go
  • 11 generated retrieve.gen.go files replacing or supplementing hand-written retrieve.go files across arc, device, lineplot, log, rack, ranger, schematic, table, task, user, and workspace services
  • Oracle schema files updated with @retrieve, @filter, @search, and @retrieve custom annotations
  • Most generated files add previously missing Count and Exists methods, expanding the Retrieve API surface
  • Behavioral differences are minimal: gorp.OverrideTx is now applied after search (not before), which is semantically equivalent; WhereAuthor on workspace gains optional gorp.FilterOption support (backward-compatible)
  • The pluralize template function is simplistic and will produce incorrect method names for English words with irregular plurals (e.g., fields ending in "x", "ch", consonant+"y") — not a problem for current schemas but a latent bug for future field additions
  • No guard prevents two non-custom @retrieve types in the same package from generating a file with a duplicate type Retrieve declaration, which would cause a compilation error

Confidence Score: 4/5

  • Safe to merge with minor latent issues; no runtime regressions found in the generated code for current schemas.
  • The generated code is functionally equivalent to the hand-written code it replaces, and all behavioral differences (new Count/Exists methods, WhereAuthor gaining optional opts, removal of the empty-slice guard in arc/WhereNames) are either additive or harmless given how callers use the API. The two flagged issues — the simplistic pluralize function and the missing multi-type guard — are latent risks that don't affect current compilation or correctness.
  • oracle/plugin/go/query/query.go — the pluralize function and missing multi-type validation are both in this file.

Important Files Changed

Filename Overview
oracle/plugin/go/query/query.go New Oracle plugin that generates gorp Retrieve wrappers from schema annotations; contains a simplistic pluralize function that will mishandle some English field names, and lacks a guard against multiple non-custom @retrieve types sharing the same output package.
oracle/cmd/sync.go Registers the new go/query plugin in the plugin registry; change is minimal and correct.
core/pkg/service/task/retrieve.gen.go Generated retrieve file for task; adds Count/Exists methods not previously present; struct field order differs from old hand-written code but all constructors use named fields so no impact.
core/pkg/service/rack/retrieve.gen.go Generated methods for the custom rack Retrieve struct; adds Count/Exists previously absent, with proper execSearch delegation.
core/pkg/service/ranger/retrieve.gen.go Generated methods for the custom ranger Retrieve struct; adds Count/Exists/execSearch not present in the old hand-written file; WhereNames is functionally equivalent to the removed code.
core/pkg/service/arc/retrieve.gen.go Fully generated replacement for arc/retrieve.go; WhereNames now lacks the empty-slice guard present in the old code, but all callers guard before invoking it so no functional regression.
core/pkg/service/device/retrieve.gen.go Fully generated replacement for device/retrieve.go; all filter methods are functionally equivalent to the removed hand-written code.
core/pkg/service/workspace/retrieve.gen.go Generated replacement; WhereAuthor now accepts optional gorp.FilterOption args (backward-compatible superset of the old signature), and adds Count/Exists methods.
schemas/task.oracle Adds @retrieve, @search, and @filter annotations to Task fields; correctly maps to the generated retrieve code.
schemas/rack.oracle Adds @retrieve custom and @search to Rack; no @filter annotations so all filter methods remain hand-written in retrieve.go.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["Oracle Schema File (.oracle)"] --> B{"Has @retrieve?"}
    B -- "No" --> Z["Skip"]
    B -- "Yes" --> C{"@retrieve custom?"}
    C -- "No (standard)" --> D["generate full Retrieve struct + all methods"]
    C -- "Yes (custom)" --> E["generate methods only\n(struct stays in retrieve.go)"]
    D --> F["retrieve.gen.go"]
    E --> F
    A --> G{"Has @filter on field?"}
    G -- "bool / @filter scalar" --> H["Where[Field](v T, opts ...FilterOption)"]
    G -- "slice (default)" --> I["Where[Field]s(vals ...T)"]
    H --> F
    I --> F
    A --> J{"Has @search?"}
    J -- "Yes" --> K["Search(term string)\nexecSearch helper\nontology type lookup"]
    K --> F
    F -->|"gofmt -w"| L["Formatted Go source"]
Loading

Reviews (1): Last reviewed commit: "Merge branch 'rc' of https://github.com/..." | Re-trigger Greptile

Greptile also left 2 inline comments on this PR.

@codecov
Copy link
Copy Markdown

codecov Bot commented Apr 2, 2026

Codecov Report

❌ Patch coverage is 68.99225% with 360 lines in your changes missing coverage. Please review.
✅ Project coverage is 63.72%. Comparing base (16a084d) to head (4720510).

Files with missing lines Patch % Lines
core/pkg/distribution/channel/retrieve.gen.go 45.58% 33 Missing and 4 partials ⚠️
core/pkg/service/arc/retrieve.gen.go 30.61% 33 Missing and 1 partial ⚠️
core/pkg/service/task/retrieve.gen.go 49.18% 30 Missing and 1 partial ⚠️
core/pkg/service/device/retrieve.gen.go 53.96% 28 Missing and 1 partial ⚠️
core/pkg/service/workspace/retrieve.gen.go 44.89% 26 Missing and 1 partial ⚠️
core/pkg/distribution/channel/retrieve.go 38.09% 26 Missing ⚠️
oracle/plugin/go/query/query.go 88.35% 10 Missing and 7 partials ⚠️
core/pkg/service/view/retrieve.gen.go 67.34% 13 Missing and 3 partials ⚠️
core/pkg/service/lineplot/retrieve.gen.go 36.36% 14 Missing ⚠️
core/pkg/service/log/retrieve.gen.go 36.36% 14 Missing ⚠️
... and 20 more

❌ Your patch check has failed because the patch coverage (68.99%) is below the target coverage (80.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files
@@            Coverage Diff             @@
##               rc    #2150      +/-   ##
==========================================
+ Coverage   63.67%   63.72%   +0.05%     
==========================================
  Files        2105     2112       +7     
  Lines      105996   106457     +461     
  Branches     8228     8239      +11     
==========================================
+ Hits        67494    67843     +349     
- Misses      32600    32690      +90     
- Partials     5902     5924      +22     
Flag Coverage Δ
alamos-go 55.25% <ø> (ø)
arc-go 76.00% <ø> (ø)
aspen 67.54% <ø> (-0.36%) ⬇️
cesium 82.98% <ø> (-0.03%) ⬇️
client-py 85.94% <ø> (+0.02%) ⬆️
client-ts 90.17% <ø> (ø)
console 20.02% <ø> (ø)
core 69.50% <64.85%> (-0.02%) ⬇️
freighter-go 63.00% <ø> (+0.25%) ⬆️
freighter-integration 1.51% <ø> (ø)
freighter-py 79.96% <ø> (ø)
freighter-ts 73.87% <ø> (ø)
oracle 62.29% <83.87%> (+0.21%) ⬆️
pluto 55.20% <ø> (ø)
x-go 81.86% <86.20%> (+0.11%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Comment thread oracle/plugin/go/query/query.go Outdated
Comment thread oracle/plugin/go/query/query.go
emilbon99 and others added 18 commits April 2, 2026 17:12
Adds the top-level Codec type that wraps the existing ORC reader/writer
with magic header validation, sync.Pool reuse, and the SelfEncoder/SelfDecoder
interfaces that generated code will implement.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Encode and Decode used bare type assertions that would panic when values
didn't implement SelfEncoder/SelfDecoder. Use the two-value form and
return descriptive errors instead. Added tests for both error paths.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Simplifies the gorp API by removing per-table codec configuration (all
tables now use the DB's codec) and updating the Migration interface to
receive a typed Tx instead of raw kv.Tx. Query constructors no longer
accept codec parameters. Fixes callers in status and ranger/alias
services that used the old API. Keeps deprecated Migrator/MigrationSpec
for backward compatibility.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Test files were passing nil to gorp.NewRetrieve/NewCreate which no longer
accept codec parameters.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The full alias refactor belongs in the ranger PR. This commit only
removes nil codec parameters from gorp query constructors.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The sed replacement incorrectly removed a nil gorp.Tx argument from
NewReader, not a nil codec.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Staticcheck flags callers of deprecated APIs. Since ranger still uses
Migrator until its migration PR, remove the deprecation markers.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Improves the Go marshal encoder with better handling of generics, type
casts, and auto-copy field resolution. Updates the migration plugin for
the new gorp Migration interface signature.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds "err" and "ok" to reservedNames to prevent receiver name collisions
with generated error variables. Hoists orc.Reader allocation outside the
benchmark loop in the non-generic template to match the generic one.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds Oracle-generated EncodeOrc/DecodeOrc implementations and round-trip
tests for color, label, spatial, status, and telem types.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds Oracle-generated EncodeOrc/DecodeOrc implementations and round-trip
tests for compiler, graph, IR, program, text, and types packages.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds generated ORC codecs for ontology Resource/Relationship/ID types.
Adds RawFilter optimization for parent traversal that rejects non-matching
relationships without decoding. Switches ErrCycle to use graph.ErrCyclicDependency.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Rewrites ranger migrations to use the new gorp Migration interface,
replacing the deprecated Migrator. Adds ORC codecs for Range type and
wires migrations into OpenTable.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Brings in the alias writer/service/test changes that add channel service
dependency, fixing the build failure from partial inclusion.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…4-use-oracle-to-generate-retrieve-queries
Three additions land together so channel can adopt the generated retrieve
shape without losing its post-retrieve overflow check.

1. gorp.Retrieve.Validate(f) attaches a batch validator that runs once on
   the final bound result set after Exec. Non-nil error short-circuits
   Exec. Multiple validators accumulate and run in order. Wired through
   execKeys and execFilter; Count/Exists are unchanged since they do not
   produce a bound set.

2. The go/query oracle plugin accepts a struct-level @retrieve key "Type"
   override. Falls back to the existing @key field when absent. Required
   for channel.Channel because its GorpKey is computed from leaseholder +
   local_key rather than stored as a @key field.

3. channel.Channel is annotated with @retrieve custom, @retrieve key "Key",
   and @search type "channel". The hand-written retrieve.go collapses onto
   the generated Retrieve: the struct drops tx/validateRetrievedChannels
   in favor of baseTX, the Search/Where/WhereKeys/Entry/Entries/Limit/
   Offset/Exec/Exists methods are generated, and the MatchX filter funcs
   return the generated Filter type instead of gorp.Filter[Key, Channel].
   validateChannels moves off the custom Exec and onto
   NewRetrieve().Validate(...) so it propagates through the generated
   pipeline automatically.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- fix mustsucceedlint, HaveOccurred, and fmt.Errorf violations in gorp tests
- drop redundant type args flagged by infertypeargs
- remove accidentally-committed rbac policy/role migrate.gen.go files
None of these three retrieves carry service-bound state beyond what the
default @retrieve (plus @search for view) already generates, so the
hand-written retrieve.go wrappers are redundant.
…4-use-oracle-to-generate-retrieve-queries

# Conflicts:
#	core/pkg/service/ranger/ranger_test.go
…4-use-oracle-to-generate-retrieve-queries
…4-use-oracle-to-generate-retrieve-queries
Copy link
Copy Markdown
Contributor

@pjdotson pjdotson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a few comments. Generally think the API needs to be improved to make it easier to read and write (specifics in comments).

complexity: you need to define field ordering, handle partial prefix queries, and the
number of possible composites grows combinatorially. For our access patterns, chaining a
sorted index with a post-filter on the remaining fields is sufficient. We can revisit
composite indexes if profiling shows a real need.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we do add composite indexes would the API need to change at all?

populate(ctx context.Context, tx Tx) error
update(key any, oldEntry any, newEntry any)
remove(key any)
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason for this not to have type parameters for the key and entries?

Where(channel.ByDataType(telem.Float32)).
Where(channel.IsVirtual()).
Entries(&results).
Exec(ctx, tx)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a fundamental reason that we have to have ByDataType, IsVirtual be
functions instead of methods on the Retrieve type?

This matches the current behavior where `Retrieve` reads from the underlying KV store,
which (in Pebble) provides snapshot isolation at the transaction level. Adding
transaction-local index overlays is complex and not needed for our current access
patterns, where writes and reads in the same transaction are rare.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so if Retrieve takes in a transaction in it, it won't properly retrieve things because
the index hasn't been updated? I don't know if this really makes sense, even if its rare
it means any time we do have a read after a write in the same transaction that touches
the same thing, the read will be wrong, and is just basically a known bug we are putting
in here.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is a design flaw. I changed it in the actual gorp indexes PR to accurately reflect transactions. Will update RFC

returns the matches. This converts the query into the existing fast `execKeys` path
internally.
2. If the backing data is nil (index not registered), the `Eval` closure falls back to a
field comparison against each decoded entry, the same as any non-indexed filter.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what are examples where we call Eval and it won't hit an index?

func HasDomainFromField(f resolution.Field, domainName string) bool {
_, ok := f.Domains[domainName]
return ok
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add tests for all of these

@@ -0,0 +1,697 @@
// Copyright 2026 Synnax Labs, Inc.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so for tests like this we never test at all what the body of the code looks like, just
the function / method typing?

}

// And returns a filter that matches when all provided filters match.
func And(fs ...Filter) Filter {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we have to generate specific And, Or, and Nots for each type? so if i want to call
something that matches both names and data types I would have to call:

r.Where(
    channel.And(
        channel.MatchDataTypes(telem.Float32T, telem.Float64T),
        channel.MatchNames("name1", "name2-*"),
    ),
)

This is a really verbose API instead of the previous method based one:

r.WhereDataTypes(telem.Float32T, telem.Float64T).WhereNames("name1","name2-*").MatchAny()

where you have something like a MatchAny / MatchAll to give the AND / OR semantics.
It is also odd because we still have the .WhereKeys method as well, but everything
else needs to go through the mega r.Where. Finally, having them as methods instead of
functions is a bit better devX: you can get see in your IDE all of the methods on the
Retrieve for different filtering options directly, instead of having to call a
function in the channel package

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. I don't think that it's that verbose. If you do:

r.Where(channel.MatchDataTypes(telem.Float32, telem.Float64)).Where(channel.MatchNames("name1", "name2-*").Exec()

is really not that much longer than what you have. Total cost here is like 20 characters. Not that much.

  1. The key thing that your method based API ability to mix AND and OR logic. What happens if I want these two filters in AND and another filter in OR? Your API cannot express that.

What if I want to do:

channel.Where( channel.Or( channel.MatchNames("cat"), channel.And( channel.MatchDataTypes(yada), channel.MatchInternal(true) ) ) ))

Your query language cannot express that cleanly.


func (d dagWriter) deleteIncomingRelationships(ctx context.Context, id ID) error {
return d.relationshipTable.NewDelete().Where(func(ctx gorp.Context, rel *Relationship) (bool, error) {
return d.relationshipTable.NewDelete().Where(gorp.Match(func(ctx gorp.Context, rel *Relationship) (bool, error) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's a little odd that we have to wrap calls like this in gorp.Match now. Is there a
good reason the API has to be like this instead of the previous Where / WhereRaw?
Why can't Where wrap the closure as a gorp.Filter internally?

Copy link
Copy Markdown
Contributor Author

@emilbon99 emilbon99 Apr 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, so that matchers like And and Or can take in a mix of raw filters, prefix filters, and others. Makes complex query composition much easier.

// MatchVirtual returns a filter for channels that are virtual if virtual is true, or
// are not virtual if virtual is false. Calculated channels are excluded from the
// virtual bucket even though they are stored with Virtual=true.
func MatchVirtual(virtual bool) Filter {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These Match* functions are not auto-generated in the retrieve.gen.go files?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah they should be. Will correct.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants