
mosaic-core: latest-only stream scheduling for interactive updates #967

Open
Datta-sai-vvn wants to merge 1 commit into uwdata:main from Datta-sai-vvn:task7-latest-only
@Datta-sai-vvn

Problem
UI interactions (brush, drag, slider) can enqueue many similar queries in quick succession. With Mosaic’s current request queueing and socket protocol, those requests are executed serially and cannot be cancelled server-side. This causes “stale query storms” in which the system spends time executing queries that are already obsolete, increasing end-to-end latency for the user’s latest intent.

Change
Add latest-only stream scheduling at the Coordinator/QueryManager layer:

New request options: stream?: string, latest?: boolean

Semantics when stream + latest are set:

Increment per-stream generation counter

Prune older queued requests for the same stream before they reach the connector

Suppress stale results using generation IDs when responses arrive
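
The three steps above can be illustrated with a minimal, self-contained sketch. This is not the code in this PR: `LatestOnlyQueue`, `Pending`, and `connectorCalls` are hypothetical names, the queue is a plain serial queue standing in for QueryManager, and resolving pruned or stale requests with `undefined` is an assumption made only for the example.

```typescript
// Hypothetical sketch of latest-only scheduling; names are illustrative only.
type Pending = {
  stream?: string;
  latest?: boolean;
  generation: number;
  run: () => Promise<unknown>; // stands in for a connector call
  resolve: (value: any) => void;
  reject: (err: unknown) => void;
};

class LatestOnlyQueue {
  private queue: Pending[] = [];
  private generations = new Map<string, number>();
  private busy = false;
  connectorCalls = 0;

  request<T>(
    run: () => Promise<T>,
    opts: { stream?: string; latest?: boolean } = {}
  ): Promise<T | undefined> {
    return new Promise<T | undefined>((resolve, reject) => {
      let generation = 0;
      if (opts.stream !== undefined && opts.latest) {
        // Step 1: increment the per-stream generation counter.
        generation = (this.generations.get(opts.stream) ?? 0) + 1;
        this.generations.set(opts.stream, generation);
        // Step 2: prune older queued requests for the same stream.
        this.queue = this.queue.filter((r) => {
          const superseded = r.latest === true && r.stream === opts.stream;
          if (superseded) r.resolve(undefined); // assumption: settle as undefined
          return !superseded;
        });
      }
      this.queue.push({ ...opts, generation, run, resolve, reject });
      void this.drain();
    });
  }

  private isStale(r: Pending): boolean {
    return (
      r.latest === true &&
      r.stream !== undefined &&
      this.generations.get(r.stream) !== r.generation
    );
  }

  // Serial execution: one request in flight at a time.
  private async drain(): Promise<void> {
    if (this.busy) return;
    this.busy = true;
    while (this.queue.length > 0) {
      const r = this.queue.shift()!;
      try {
        this.connectorCalls++;
        const result = await r.run();
        // Step 3: suppress the result if a newer generation exists.
        r.resolve(this.isStale(r) ? undefined : result);
      } catch (err) {
        r.reject(err);
      }
    }
    this.busy = false;
  }
}
```

In this sketch, five requests issued back to back on the same stream lead to two `run` calls: the first is already in flight and its result is suppressed, the middle three are pruned while queued, and only the last one resolves with a value.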

What this is NOT

This does not cancel queries that are already executing in DuckDB or on the server (neither the socket protocol nor the node-duckdb binding exposes interruption). It reduces wasted work by preventing stale requests from being sent while they are still queued and by ignoring stale responses.

Evidence

Reproduced backlog: blasting 30 heavy queries yields ~7.6s tail latency due to serial execution.

With latest-only enabled, only the latest request resolves; stale requests are pruned/suppressed, reducing tail latency substantially.

Added a unit test that verifies the pruning behavior and that only a small number of connector calls are made.

Files

packages/mosaic/core/src/types.ts

packages/mosaic/core/src/Coordinator.ts

packages/mosaic/core/src/QueryManager.ts

packages/mosaic/core/test/querymanager-latest-only.test.ts

packages/mosaic/core/docs/latest-only-streams.md

How to use
Use stream to group interaction requests (e.g., "brush", "pan", "slider"). Set latest: true to keep only the most recent pending request for that stream.
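
As a rough illustration of the call shape, the sketch below wires a brush handler to a query function that accepts the two new options. Only `stream` and `latest` come from this PR; `QueryFn`, `makeBrushHandler`, and the `flights` table and `delay` column are hypothetical stand-ins, and the real entry point on the Coordinator may look different.

```typescript
// Option shape as described in this PR.
interface LatestOnlyOptions {
  stream?: string;
  latest?: boolean;
}

// Hypothetical stand-in for the request entry point.
type QueryFn = (sql: string, options?: LatestOnlyOptions) => Promise<unknown>;

// Builds a brush handler whose requests all share the "brush" stream,
// so only the most recent pending request is kept.
function makeBrushHandler(query: QueryFn) {
  return (lo: number, hi: number): Promise<unknown> =>
    query(
      // Hypothetical table and column names.
      `SELECT count(*) FROM flights WHERE delay BETWEEN ${lo} AND ${hi}`,
      { stream: "brush", latest: true }
    );
}
```

Using one stream name per interaction (rather than one shared name for everything) keeps a slider update from pruning a pending brush request.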

Author

@Datta-sai-vvn Datta-sai-vvn left a comment

Hi! This PR is ready for review. Since it’s from a fork, GitHub Actions is currently blocked (“workflow awaiting approval”).
Could a maintainer please approve the workflow run so CI can execute, and then review when convenient?
Local core tests pass, and the change is limited to mosaic-core (Coordinator/QueryManager/types + test + short doc). Thanks!

@derekperkins
Collaborator

Doesn't requestUpdate already handle this?

@Datta-sai-vvn
Author

Good catch - requestUpdate throttles updates, but it doesn’t provide lifecycle control for already-queued requests or a ‘latest-only’ contract. The gap I measured comes from queued stale requests continuing to execute serially and dominating time-to-latest-result. My proposal introduces explicit stream semantics so that older pending requests are dropped deterministically and stale results can be suppressed, which is orthogonal to throttling.

@domoritz
Member

domoritz commented Feb 9, 2026

Ahh cool. I thought about how to not send unnecessary queries in #510 as well. I actually have a student working on a full-fledged solution based on a dependency graph.

This code does not yet automatically set the stream id in vgplot. Maybe it makes sense to use it there so we can give people good guidance on when to use the stream id.

Either way, this is cool. I am going OOO soon so I will not be able to review until March.

3 participants