Commit 7dfc0c4
authored
[OPIK-5973] [BE] perf: optimize thread list/count queries and push down thread_id filter (#6358)
* [OPIK-5973] [BE] perf: optimize thread list/count queries and push down thread_id filter
Thread list/count queries rewritten to use LIMIT 1 BY instead of FINAL on
traces, spans, and trace_threads. Narrow the traces_final projection and
add a separate traces_final_ids CTE so downstream CTEs (spans_deduped,
trace_threads_final) subset via IN(id) rather than wide-row deduplication.
Benchmark on production-equivalent data: ~3× faster, ~2.8× less memory.
Split TraceThreadField.ID=EQUAL filters out of the outer trace_thread_filters
and emit them against traces.thread_id inside traces_final_ids, so
idx_traces_thread_id_bf can prune granules. Parameter :thread_id_pushdown
is bound separately to avoid colliding with strategy-generated :filter{i}
names.
Rename ThreadService → TraceThreadQueryService and split the trace-thread
filter helpers out of DatabaseUtils into a dedicated FilterUtils class.
* [OPIK-5973] [BE] fix: update remaining callers after ThreadService/DatabaseUtils renames
Missed in the previous commit — OpikApplication, TracesResource, and
AssertionResultDAO all reference the old names and break compile.
Swap imports and callsites to match the renamed classes.
* [OPIK-5973] [BE] chore: add SETTINGS log_comment to thread queries
Attaches log_comment to list, count, findById, search, and stats thread
queries so they can be traced in ClickHouse query logs with the query name,
workspace_id, user_name, and per-call details. Follows the same pattern
used across the other DAOs.
Restructured each method to put makeMonoContextAware / makeFluxContextAware
at the outermost level so userName / workspaceId are available before
template rendering. countThreadTotal now takes those values as parameters
instead of wrapping in its own context lookup.
* Fix filters
* [OPIK-5973] [BE] fix: apply TraceThreadField.ID pushdown additively; drop filter strip
The strip helper (withoutTraceThreadIdPushdown) caused two problems:
- bindTraceThreadSearchCriteria still iterated the full filter list with
TRACE_THREAD strategy, trying to bind :filter0 for the ID filter that
newTraceThreadFindTemplate had emitted against the stripped list. The
SQL had no :filter0 placeholder → R2DBC "non-existing identifier" and
thread-stats endpoint 500'd when filtered by thread id.
- TraceDAO callers of the same helper silently lost ID filters (baz #3).
Since thread_id is stable per trace (the same invariant the pushdown
optimization relies on), applying the filter at both positions — the
outer aggregate (via the normal TRACE_THREAD strategy) and the inner
raw scan (via :thread_id_pushdown) — is idempotent. The inner predicate
still unlocks idx_traces_thread_id_bf; the outer becomes a no-op on an
already-single-thread result set.
Simpler code, no strip helper needed, stats test passes.
* [OPIK-5973] [BE] refactor: align thread list/count queries with TraceDAO prefilter pattern
Mirrors TraceDAO#shouldUseTraceIdPrefilter / trace_id_prefilter shape on
the thread list and count queries:
- traces_final_ids is now conditional (<if(traces_final_ids)>) — only
emitted when filters/search narrow beyond the workspace/project/uuid-
range scope, via new shouldUseTracesFinalIdsPrefilter() helper.
- Dropped the inner ORDER BY … LIMIT 1 BY id inside traces_final_ids;
it's now SELECT DISTINCT id, thread_id with filters inline. Accepts
phantom reads at the prefilter level — authoritative filtering happens
downstream.
- traces_final wraps in a two-level subquery so filters/search_text are
re-applied after the LIMIT 1 BY id dedup, dropping phantom rows whose
latest version no longer matches.
- Downstream CTEs (traces_final, spans_deduped, trace_threads_final) now
branch via <if(traces_final_ids)>…<else> <if(uuid_from_time)>…<endif>
<endif>, scoping to the ID set when the prefilter is active, otherwise
applying the uuid range directly.
- Applied to list and count queries. Stats query has different shape
(still on FINAL) and doesn't use traces_final_ids; unchanged here.
* [OPIK-5973] [BE] refactor: simplify count query, LIMIT 1 BY id, doc pushdown operator scope
- Count query: drop spans_deduped/spans_agg/feedback_scores_agg CTEs and
their joins, drop display-only columns (total_estimated_cost, usage,
feedback_scores_list, feedback_scores). Mirrors TraceDAO#COUNT_BY_PROJECT_ID
— only CTEs needed for filter evaluation remain. All filter paths
preserved (TRACE-level, trace_thread_filters, feedback_scores_filters,
feedback_scores_empty_filters, annotation_queue_filters).
- spans_deduped LIMIT 1 BY: switch from full sort-key tuple
(workspace_id, project_id, trace_id, parent_span_id, id) to id alone.
Span id is UUID v7 globally unique, so the shorter key is equivalent
and reads more clearly.
- Indent the <if(traces_final_ids)> / <else> branches one level deeper
inside traces_final, spans_deduped, and trace_threads_final so the
conditional structure is visually aligned with its nesting.
- FilterUtils.findTraceThreadIdPushdownFilter: add a comment explaining
why the check is scoped to Operator.EQUAL — the pushdown SQL template
is hardcoded to thread_id = :x, and only EQUAL benefits from the
traces.thread_id bloom filter. Locks intent for future contributors.
* [OPIK-5973] [BE] refactor: align stats query with list/count FINAL→LIMIT 1 BY pattern
SELECT_TRACE_THREADS_STATS now mirrors list/count structure:
- Replace FROM traces final / FROM spans final / FROM trace_threads FINAL
with LIMIT 1 BY id patterns.
- Add conditional traces_final_ids prefilter (gated by
shouldUseTracesFinalIdsPrefilter) with <if(traces_final_ids)>…<else>
<if(uuid_from_time)>…<endif> <endif> branches in traces_final,
spans_deduped, and trace_threads_final.
- Split spans_deduped → spans_agg so the narrow dedup runs before the
per-thread aggregation. LIMIT 1 BY id (consistent with spans_deduped
simplification elsewhere).
- traces_final wraps in a two-level subquery so <filters> / <search_text>
re-apply post-dedup (drops phantom reads).
- Wire shouldUseTracesFinalIdsPrefilter in getThreadStats.
* [OPIK-5973] [BE] fix: apply filters once in traces_final_ids prefilter
Restructures the traces_final_ids CTE in list/count/stats so filters and
search_text are applied on the outer SELECT DISTINCT of the prefilter rather
than being duplicated in both the prefilter and the traces_final outer
wrapper. traces_final now just dedups via LIMIT 1 BY id and projects; the
id set from traces_final_ids is authoritative for downstream spans_deduped
and trace_threads_final scans.
* [OPIK-5973] [BE] refactor: simplify trace_threads_final dedup to LIMIT 1 BY id
UUIDv7 id is globally unique, so LIMIT 1 BY id produces the same deduped
set as the full (workspace_id, project_id, thread_id, id) tuple. Aligns
list/count with the stats query which already uses LIMIT 1 BY id.
* [OPIK-5973] [BE] perf: gate feedback_scores and annotation_queue CTEs on filter usage in count query
feedback_scores_deduped → _grouped → _final and thread_annotation_queue_ids
were always materialized in the count query, even when no filter referenced
them. Their output (display-formatted category_name/reason concat,
value_by_author maps, annotation_queue_ids arrays) is not used by
count(DISTINCT id), only by the inner filter predicates that may or may not
be present.
Gated via <if(feedback_scores_needed)> (OR of feedback_scores_filters and
feedback_scores_empty_filters) and <if(annotation_queue_filters)>. The
common-path count (no feedback-score filter, no annotation-queue filter)
now skips the feedback_scores union-all + two-level groupArray pipeline
and the annotation_queue_items join entirely.1 parent df6cffa commit 7dfc0c4
21 files changed
Lines changed: 377 additions & 171 deletions
File tree
- apps/opik-backend/src/main/java/com/comet/opik
- api/resources/v1/priv
- domain
- experiments/aggregations
- threads
- infrastructure
Lines changed: 3 additions & 3 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
2 | 2 | | |
3 | 3 | | |
4 | 4 | | |
5 | | - | |
6 | 5 | | |
| 6 | + | |
7 | 7 | | |
8 | 8 | | |
9 | 9 | | |
| |||
82 | 82 | | |
83 | 83 | | |
84 | 84 | | |
85 | | - | |
| 85 | + | |
86 | 86 | | |
87 | 87 | | |
88 | 88 | | |
| |||
92 | 92 | | |
93 | 93 | | |
94 | 94 | | |
95 | | - | |
| 95 | + | |
96 | 96 | | |
97 | 97 | | |
98 | 98 | | |
| |||
Lines changed: 6 additions & 6 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
37 | 37 | | |
38 | 38 | | |
39 | 39 | | |
40 | | - | |
41 | 40 | | |
42 | 41 | | |
| 42 | + | |
43 | 43 | | |
44 | 44 | | |
45 | 45 | | |
| |||
109 | 109 | | |
110 | 110 | | |
111 | 111 | | |
112 | | - | |
| 112 | + | |
113 | 113 | | |
114 | 114 | | |
115 | 115 | | |
| |||
661 | 661 | | |
662 | 662 | | |
663 | 663 | | |
664 | | - | |
| 664 | + | |
665 | 665 | | |
666 | 666 | | |
667 | 667 | | |
| |||
712 | 712 | | |
713 | 713 | | |
714 | 714 | | |
715 | | - | |
| 715 | + | |
716 | 716 | | |
717 | 717 | | |
718 | 718 | | |
| |||
738 | 738 | | |
739 | 739 | | |
740 | 740 | | |
741 | | - | |
| 741 | + | |
742 | 742 | | |
743 | 743 | | |
744 | 744 | | |
| |||
905 | 905 | | |
906 | 906 | | |
907 | 907 | | |
908 | | - | |
| 908 | + | |
909 | 909 | | |
910 | 910 | | |
911 | 911 | | |
| |||
Lines changed: 1 addition & 1 deletion
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
23 | 23 | | |
24 | 24 | | |
25 | 25 | | |
26 | | - | |
| 26 | + | |
27 | 27 | | |
28 | 28 | | |
29 | 29 | | |
| |||
Lines changed: 1 addition & 1 deletion
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
39 | 39 | | |
40 | 40 | | |
41 | 41 | | |
42 | | - | |
| 42 | + | |
43 | 43 | | |
44 | 44 | | |
45 | 45 | | |
| |||
Lines changed: 1 addition & 1 deletion
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
61 | 61 | | |
62 | 62 | | |
63 | 63 | | |
64 | | - | |
| 64 | + | |
65 | 65 | | |
66 | 66 | | |
67 | 67 | | |
| |||
Lines changed: 1 addition & 1 deletion
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
51 | 51 | | |
52 | 52 | | |
53 | 53 | | |
54 | | - | |
| 54 | + | |
55 | 55 | | |
56 | 56 | | |
57 | 57 | | |
| |||
Lines changed: 1 addition & 1 deletion
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
35 | 35 | | |
36 | 36 | | |
37 | 37 | | |
38 | | - | |
| 38 | + | |
39 | 39 | | |
40 | 40 | | |
41 | 41 | | |
| |||
Lines changed: 1 addition & 1 deletion
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
70 | 70 | | |
71 | 71 | | |
72 | 72 | | |
73 | | - | |
| 73 | + | |
74 | 74 | | |
75 | 75 | | |
76 | 76 | | |
| |||
Lines changed: 1 addition & 1 deletion
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
31 | 31 | | |
32 | 32 | | |
33 | 33 | | |
34 | | - | |
| 34 | + | |
35 | 35 | | |
36 | 36 | | |
37 | 37 | | |
| |||
Lines changed: 2 additions & 2 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
32 | 32 | | |
33 | 33 | | |
34 | 34 | | |
35 | | - | |
36 | | - | |
| 35 | + | |
| 36 | + | |
37 | 37 | | |
38 | 38 | | |
39 | 39 | | |
| |||
0 commit comments