mosaic-core: latest-only stream scheduling for interactive updates #967
Datta-sai-vvn wants to merge 1 commit into uwdata:main
Datta-sai-vvn left a comment:
Hi! This PR is ready for review. Since it’s from a fork, GitHub Actions is currently blocked (“workflow awaiting approval”).
Could a maintainer please approve the workflow run so CI can execute, and then review when convenient?
Local core tests pass, and the change is limited to mosaic-core (Coordinator/QueryManager/types + test + short doc). Thanks!
Doesn't requestUpdate already handle this?
Good catch - requestUpdate throttles updates, but it doesn’t provide lifecycle control for already-queued requests or a ‘latest-only’ contract. The gap I measured comes from queued stale requests continuing to execute serially and dominating time-to-latest-result. My proposal introduces explicit stream semantics so that older pending requests are dropped deterministically and stale results can be suppressed, which is orthogonal to throttling.
Ahh cool. I thought about how to avoid sending unnecessary queries in #510 as well. I actually have a student working on a full-fledged dependency-graph solution. This code does not yet automatically set the stream id in vgplot. Maybe it makes sense to use it there so we can give people good guidance on when to use the stream id. Either way, this is cool. I am going OOO soon so I will not be able to review until March.
Problem
Interactive UI gestures (brush/drag/slider) can enqueue many similar queries in quick succession. With Mosaic’s current request queueing and socket protocol, those requests are executed serially and cannot be cancelled server-side. This causes “stale query storms”, where the system spends time executing queries that are already obsolete, increasing end-to-end latency for the user’s latest intent.
Change
Add latest-only stream scheduling at the Coordinator/QueryManager layer:
- New request options: stream?: string, latest?: boolean
- Semantics when stream + latest are set:
  - Increment the per-stream generation counter
  - Prune older queued requests for the same stream before they reach the connector
  - Suppress stale results using generation IDs when responses arrive
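
The three steps above can be sketched in isolation roughly as follows. This is a minimal illustration, not the code in this PR: the class `LatestOnlyQueue`, the `Pending` record, and the methods `enqueue`, `dequeue`, and `isStale` are hypothetical names chosen for the example. Only the option names `stream` and `latest` come from the description.

```typescript
// Hypothetical sketch of latest-only scheduling; names are illustrative only.
interface Pending {
  sql: string;
  stream?: string;
  latest?: boolean;
  generation: number; // generation of the stream when this request was enqueued
}

class LatestOnlyQueue {
  private queue: Pending[] = [];
  private generations = new Map<string, number>();

  // Enqueue a request. For latest-only streams, bump the stream's generation
  // and drop any older request of that stream that is still waiting.
  enqueue(sql: string, opts: { stream?: string; latest?: boolean } = {}): Pending {
    const { stream, latest } = opts;
    let generation = 0;
    if (stream !== undefined && latest) {
      generation = (this.generations.get(stream) ?? 0) + 1;
      this.generations.set(stream, generation);
      this.queue = this.queue.filter(p => !(p.latest && p.stream === stream));
    }
    const pending: Pending = { sql, stream, latest, generation };
    this.queue.push(pending);
    return pending;
  }

  // Take the next request that would be handed to the connector.
  dequeue(): Pending | undefined {
    return this.queue.shift();
  }

  // A response is stale if a newer request on the same latest-only stream
  // was enqueued after this one.
  isStale(p: Pending): boolean {
    if (p.stream === undefined || !p.latest) return false;
    return p.generation !== this.generations.get(p.stream);
  }

  get size(): number {
    return this.queue.length;
  }
}

const q = new LatestOnlyQueue();
const inFlight = q.enqueue("SELECT 1", { stream: "brush", latest: true });
q.dequeue(); // "SELECT 1" has left the queue and can no longer be pruned
q.enqueue("SELECT 2", { stream: "brush", latest: true });
q.enqueue("SELECT 3", { stream: "brush", latest: true }); // prunes "SELECT 2"
const other = q.enqueue("SELECT 4"); // no stream: untouched by pruning
// The queue now holds "SELECT 3" and "SELECT 4"; inFlight is stale.
```

In this sketch a request that has already been dequeued still runs to completion; its result is only ignored afterwards via `isStale`, which mirrors the pruning-plus-suppression split described above.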
What this is NOT
This does not cancel queries already executing in DuckDB/server (protocol + node-duckdb binding do not expose interruption). It reduces wasted work by preventing stale requests from being sent when still queued and by ignoring stale responses.
Evidence
Reproduced backlog: blasting 30 heavy queries yields ~7.6s tail latency due to serial execution.
With latest-only enabled, only the latest request resolves; stale requests are pruned/suppressed, reducing tail latency substantially.
Added a unit test proving the pruning behavior and that only a small number of connector calls are made.
Files
packages/mosaic/core/src/types.ts
packages/mosaic/core/src/Coordinator.ts
packages/mosaic/core/src/QueryManager.ts
packages/mosaic/core/test/querymanager-latest-only.test.ts
packages/mosaic/core/docs/latest-only-streams.md
How to use
Use stream to group interaction requests (e.g., "brush", "pan", "slider"). Set latest: true to keep only the most recent pending request for that stream.
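
As a rough illustration of that guidance, the sketch below tags each interaction with its own stream name. Only the option names `stream` and `latest` are taken from this PR; the `QueryTarget` interface, the `RecordingTarget` stub, the handler functions, and the SQL strings are assumptions made so the example runs without a database, and they are not Mosaic's API.

```typescript
// Option names come from the PR description; everything else is hypothetical.
interface StreamOptions {
  stream?: string;
  latest?: boolean;
}

// Stand-in for whatever object accepts requests (assumed shape).
interface QueryTarget {
  query(sql: string, options?: StreamOptions): void;
}

// Records calls instead of executing them, so the example is self-contained.
class RecordingTarget implements QueryTarget {
  calls: Array<{ sql: string; options?: StreamOptions }> = [];
  query(sql: string, options?: StreamOptions): void {
    this.calls.push({ sql, options });
  }
}

// One stream name per interaction, so a brush never displaces a slider request.
function onSliderInput(target: QueryTarget, value: number): void {
  target.query(`SELECT count(*) FROM t WHERE x < ${value}`, {
    stream: "slider",
    latest: true,
  });
}

function onBrush(target: QueryTarget, lo: number, hi: number): void {
  target.query(`SELECT count(*) FROM t WHERE x BETWEEN ${lo} AND ${hi}`, {
    stream: "brush",
    latest: true,
  });
}

const target = new RecordingTarget();
target.query("SELECT count(*) FROM t"); // initial load: no stream options
onSliderInput(target, 10);
onSliderInput(target, 11);
onBrush(target, 0, 5);
```

Requests issued without `stream` and `latest` are presumably left on the normal queueing path, since the description defines the new semantics only for requests that set both.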