Releases: boettiger-lab/mcp-data-server
v0.5.1 — retry-rescued parents enqueue their sub-children
Patch fix for v0.5.0. If a parent collection with sub-children (e.g. `us-census`) fails the first-pass fetch and is rescued by the retry pass, its sub-children are now also fetched in the retry pool — previously they'd silently go missing from the catalog.
Observed on the 2026-04-18 dev rollout: v0.5.0 loaded 75 of the expected 81 collections because `us-census` retried successfully but its 6 census-year sub-collections were never enqueued.
Test added: `test_retry_rescued_parent_fetches_its_subchildren`. 96/96 tests passing.
v0.5.0 — STAC retry + exit on root failure
Retry pass for transient child failures
STAC catalog children that time out on the first attempt now get one automatic retry in a fresh pool with a longer per-child timeout.
- New env var: `STAC_CHILD_RETRY_TIMEOUT` (default 8s).
- A retry that succeeds clears the original error from `STAC_LOAD_ERRORS`; a retry that also fails leaves the error in place.
- Rescues the tail-latency case observed on v0.4.0's first real-world deploy (2 of 81 collections timed out at the 5s ceiling — both were recoverable with a slightly longer timeout).
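The two-pass shape described above can be sketched as follows. This is an illustrative sketch, not the server's actual code: `fetch_with_retry`, `fetch_one`, and the fixed 16-worker pool are hypothetical; only the env-var names and the error-clearing behavior come from the release notes.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def fetch_with_retry(child_ids, fetch_one):
    """First pass with the short per-child timeout, then one retry pass
    in a fresh pool with the longer STAC_CHILD_RETRY_TIMEOUT."""
    first_timeout = float(os.environ.get("STAC_CHILD_TIMEOUT", 5))
    retry_timeout = float(os.environ.get("STAC_CHILD_RETRY_TIMEOUT", 8))
    errors = {}

    def run_pass(ids, timeout):
        ok, failed = {}, []
        with ThreadPoolExecutor(max_workers=16) as pool:
            futures = {cid: pool.submit(fetch_one, cid, timeout) for cid in ids}
            for cid, fut in futures.items():
                try:
                    ok[cid] = fut.result()
                except Exception as exc:
                    failed.append(cid)
                    errors[cid] = str(exc)
        return ok, failed

    results, failed = run_pass(child_ids, first_timeout)
    if failed:
        rescued, _ = run_pass(failed, retry_timeout)
        results.update(rescued)
        for cid in rescued:
            errors.pop(cid, None)  # a successful retry clears the original error
    return results, errors
```

A child that fails both passes keeps its entry in the returned error dict, matching the `STAC_LOAD_ERRORS` semantics above.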
Exit on root-catalog failure
When the root STAC catalog JSON is unreachable at startup, the server now calls `sys.exit(1)` rather than starting uvicorn with an empty catalog. Kubernetes restarts the pod, and the next attempt retries against fresh S3 conditions.
Partial catalogs (some children failed) still serve — that's unchanged from v0.4.0. Only total failure triggers exit.
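The fail-fast / fail-partial split can be sketched in a few lines. The function and callback names here are hypothetical; the `sys.exit(1)`-on-root-failure and serve-partial-catalog behaviors are the documented ones.

```python
import sys


def load_catalog_or_exit(fetch_root, fetch_children):
    """Exit the process when the root catalog is unreachable;
    tolerate partial child failures and keep serving."""
    try:
        root = fetch_root()
    except Exception as exc:
        print(f"FATAL: root STAC catalog unreachable: {exc}", file=sys.stderr)
        sys.exit(1)  # let Kubernetes restart the pod
    datasets, errors = fetch_children(root)
    return datasets, errors  # partial catalogs still serve
```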
Default concurrency bumped
`STAC_FETCH_CONCURRENCY` default raised from 8 to 16 so the main pool plus the retry pass both fit within the readiness-probe budget. The full-load cold-start worst case (all 63 fetches succeed on retry) drops from ~40s to ~28s.
Upgrade notes
All changes are backwards-compatible for existing callers. New env var `STAC_CHILD_RETRY_TIMEOUT` is optional. Behavior change worth knowing about: pods that previously stayed up serving an empty catalog when the root fetch failed will now exit at startup; that's intentional (k8s retries faster than a broken-but-serving pod).
v0.4.0 — STAC catalog resilience
Resilient STAC catalog loader (Fixes #65)
`fetch_stac_catalog()` now survives slow / partially-failing S3:
- Split timeouts: `STAC_ROOT_TIMEOUT` (default 15s, hard prerequisite) and `STAC_CHILD_TIMEOUT` (default 5s, individually skippable). Back-compat: `STAC_TIMEOUT` alone still works as a single knob.
- Bounded parallelism via `ThreadPoolExecutor` with dynamic enqueue — parent and sub-child fetches share an 8-worker pool by default; tune with `STAC_FETCH_CONCURRENCY`. Sub-children are submitted as soon as their parent's JSON arrives, keeping the pool saturated without static wave boundaries.
- Partial-result fallback: per-child failures are recorded in a new module-level `STAC_LOAD_ERRORS` dict instead of aborting the whole load. `list_datasets()` appends a ⚠️ footer when errors exist so agents see "N collections could not be loaded" rather than treating a transient-failure collection as nonexistent.
- Cache preservation: cache-miss refetches no longer wipe previously-loaded state if the new load returns zero datasets — the old snapshot is kept until a subsequent load succeeds.
Worst-case startup wall-clock is bounded to ~40s (within the readiness probe budget) even under pathological S3 tail latency.
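The dynamic-enqueue pattern — submitting sub-children the moment a parent's JSON lands, rather than fetching in static waves — can be sketched with `concurrent.futures.wait`. A hedged sketch, not the loader's actual code: `walk_catalog` and the `fetch(cid) -> (payload, child_ids)` contract are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor, wait, FIRST_COMPLETED


def walk_catalog(root_children, fetch, max_workers=8):
    """Walk a catalog tree with one shared pool. `fetch(cid)` returns
    (payload, child_ids); sub-children are enqueued as soon as their
    parent completes, so the pool stays saturated."""
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        pending = {pool.submit(fetch, cid): cid for cid in root_children}
        while pending:
            done, _ = wait(pending, return_when=FIRST_COMPLETED)
            for fut in done:
                cid = pending.pop(fut)
                try:
                    payload, sub_ids = fut.result()
                    results[cid] = payload
                    for sub in sub_ids:  # enqueue immediately, no wave boundary
                        pending[pool.submit(fetch, sub)] = sub
                except Exception as exc:
                    errors[cid] = str(exc)  # record, don't abort the load
            # loop continues until nothing is in flight
    return results, errors
```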
Startup performance (#66)
- Dropped a duplicate catalog walk at module import — halves the startup S3 load.
- Removed the dead `fetch_stac_collections()` / `DATA_CATALOG` machinery that only tests consumed.
Internal / ops
- Dev deployment right-sizing proposal (PR #67; merged to main, cluster spec still pending a successful `kubectl apply`).
- Prod scaled from 4 to 2 replicas (#68).
Upgrade notes
No code changes required on the client side. New env vars (`STAC_ROOT_TIMEOUT`, `STAC_CHILD_TIMEOUT`, `STAC_FETCH_CONCURRENCY`) are all optional. The `STAC_TIMEOUT` env var still works as a back-compat single knob.
v0.3.0 — Dynamic hex tile endpoint, get_collection tool, programmatic access docs
What's New
Dynamic MVT tile endpoint (`register_hex_tiles`)
New MCP tool that materializes H3 hex data into a resolution pyramid on S3 and returns a MapLibre-compatible vector tile URL. For datasets too large to return as a table (~100k+ cells), agents can now generate interactive hex map layers on the fly.
- `register_hex_tiles(sql, finest_res, ...)` → writes partitioned parquet pyramid, returns `tile_url_template`
- `/tiles/hex/{hash}/{z}/{x}/{y}.pbf` endpoint serves MVT tiles via DuckDB `ST_AsMVT`
- Automatic resolution switching: coarser hexes at low zoom, finest at high zoom
- Content-addressed hashing: identical queries dedupe naturally
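The content-addressed dedup can be illustrated with a small sketch. The function name, key length, and canonicalization choices here are hypothetical — the release notes only establish that identical queries hash to the same pyramid.

```python
import hashlib
import json


def pyramid_key(sql: str, finest_res: int) -> str:
    """Derive a content-addressed S3 prefix for a tile pyramid.
    Identical (sql, finest_res) requests produce the same key,
    so repeat registrations dedupe naturally."""
    payload = json.dumps(
        {"sql": sql.strip(), "finest_res": finest_res},
        sort_keys=True,  # canonical ordering so equal inputs hash equally
    )
    return hashlib.sha256(payload.encode()).hexdigest()[:16]
```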
get_collection tool
Returns structured STAC collection metadata as JSON for programmatic use. Unlike `get_stac_details` (markdown for LLM consumption), this returns the raw collection dict with all assets, per-asset STAC extension fields, and child collection IDs. Intended for app code that builds map layers and system prompts programmatically.
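A sketch of the kind of programmatic consumer `get_collection` targets — turning the raw collection dict into map-layer specs. `layer_specs` is a hypothetical helper; the field names follow the general STAC Collection shape, not the server's exact output.

```python
def layer_specs(collection: dict) -> list:
    """Build minimal map-layer specs from a raw STAC collection dict,
    one per asset — the sort of app code get_collection exists for."""
    layers = []
    for name, asset in collection.get("assets", {}).items():
        layers.append({
            "id": f"{collection['id']}/{name}",
            "source": asset.get("href"),   # where the data lives
            "type": asset.get("type"),     # media type, e.g. parquet
        })
    return layers
```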
Programmatic access docs & examples
- R examples using `ellmer` (agent) and `httr2` (direct query)
- Python examples using `langchain-mcp-adapters` (agent) and `httpx` (direct query)
- VitePress docs page at `/guide/programmatic-access`
Bug fix
- Tile connection S3 secret: `build_tile_connection()` was missing the Ceph S3 secret configuration, causing `register_hex_tiles` to fail with `NoSuchBucket`. Now matches the query tool's S3 setup.
Test summary
- 105 unit/integration tests passing
- Live-tested on dev server: pyramid materialization, tile serving at z2–z12, no regressions on existing tools
v0.2.0 — Auth, query reliability, H3 spatial guidance
What's new since v0.1.0
Security & Auth
- Optional Bearer token auth via `MCP_AUTH_TOKEN` env var (#43) — set it to require authentication on all endpoints
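The opt-in semantics can be sketched as a small check. `check_auth` is a hypothetical helper, not the server's implementation; only the env-var name and the bearer-header requirement come from the release notes.

```python
import os


def check_auth(headers: dict) -> bool:
    """If MCP_AUTH_TOKEN is set, every request must carry a matching
    Authorization header; if unset, auth stays disabled."""
    token = os.environ.get("MCP_AUTH_TOKEN")
    if not token:
        return True  # auth is opt-in: no token, no check
    return headers.get("Authorization") == f"Bearer {token}"
```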
Query reliability
- Fixed GEOMETRY crash on GeoParquet queries — geometry columns are now silently dropped from tabular output (#48)
- Removed the `s3_allow_recursive_globbing=false` workaround — fixed upstream in DuckDB 1.5.1 (#37)
Performance & infrastructure
- Least-conn load balancing and 10-minute query timeout on the HAProxy ingress (#46)
H3 spatial guidance (agent prompt improvements)
- Prohibit DuckDB spatial ops; require H3 hash joins for all geographic filtering (#49)
- Strengthen cross-resolution join guidance to prevent `h3_cell_to_children` misuse (#40)
- Add resolution direction table and pre-computed parent-column join pattern (#45)
- Area column warning, multi-resolution table, cross-resolution joins (#35)
STAC catalog improvements
- `get_stac_details`: directory listing for parent collections, suppress parent columns (#36)
- Replace placeholder S3 paths to prevent path hallucination (#38)
Query optimization hints
- Case-insensitive text search and apostrophe escaping guidance
- S3 credentials: scoped per-request isolation with `s3_scope` support
Docs
- Fix Claude Code install instructions in README (#51)
Upgrade notes
No breaking changes. The server is fully backwards-compatible with v0.1.0 clients.
If you set `MCP_AUTH_TOKEN`, all requests must include `Authorization: Bearer <token>`.
v0.1.0
What's included
- MCP server with three tools: `browse_stac_catalog`, `get_stac_details`, and `query`
- DuckDB-powered SQL against S3 Parquet files with H3 spatial indexing
- STAC catalog integration for dynamic dataset discovery
- Client-supplied credentials for private STAC catalogs and S3 buckets
- `s3_scope` parameter for queries mixing private and public S3 endpoints
- Hosted endpoint at `https://duckdb-mcp.nrp-nautilus.io/mcp` on NRP Nautilus k8s
- MCP resources (`catalog://list`, `catalog://{dataset_id}`) and `geospatial-analyst` prompt
- VitePress documentation site
Datasets served
GLWD, Vulnerable Carbon, NCP, WDPA, Ramsar Sites, HydroBASINS, Countries & Regions, iNaturalist, Corruption Index 2024