fix(client): self-healing for permanently stuck expired shape handles (#4087)
## Summary
Expired shape handle entries in localStorage can get permanently stuck,
preventing data from ever loading for affected shapes. This adds a
self-healing retry mechanism that clears the poisoned entry and retries
once, allowing automatic recovery even when a proxy strips cache-buster
query parameters.
Based on #4085 by @evan-liveflow — refined with additional hardening
from code review.
## Root Cause
When a shape gets a 409 (handle rotation), the client stores the old
handle in `localStorage['electric_expired_shapes']`. On future requests,
if a response contains that handle, the client treats it as a stale
cached response and retries up to 3 times with cache-buster params.
The problem: if a proxy (e.g., phoenix_sync) strips query parameters,
the cache busters are ineffective. All 3 retries fail, `FetchError(502)`
is thrown to `onError`, and if `onError` doesn't retry, the stream dies.
The expired entry persists in localStorage, so the next session hits the
same wall — permanently.
Since the server never reuses handles (now documented as **SPEC.md
S0**), the expired entry becomes a false positive once the caching layer
clears — but the client has no way to discover this.
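
The failure mode can be reproduced with a toy model. The sketch below is not the client's code: `makeCachingLayer`, `staleRetriesNeeded`, the `/v1/shape` path, and the handle strings are all hypothetical. It assumes a caching layer that keys responses on the URL it sees, behind a proxy that may or may not strip the query string. Only the `cache_buster` parameter name and the retry limit of 3 come from this PR.

```typescript
type ShapeResponse = { handle: string }

// Toy caching layer. When `stripQuery` is true it behaves like a cache behind
// a proxy that drops query parameters: every URL maps to the same cache key.
function makeCachingLayer(stripQuery: boolean, origin: () => ShapeResponse) {
  const cache: Record<string, ShapeResponse | undefined> = {}
  const keyFor = (url: string): string => (stripQuery ? url.split("?")[0] : url)
  return {
    seed(url: string, res: ShapeResponse): void {
      cache[keyFor(url)] = res
    },
    get(url: string): ShapeResponse {
      const key = keyFor(url)
      const hit = cache[key]
      if (hit !== undefined) return hit // served from cache, origin never consulted
      const fresh = origin()
      cache[key] = fresh
      return fresh
    },
  }
}

// Returns the attempt number at which a non-expired handle arrives,
// or null if the original request and all 3 cache-buster retries were stale.
function staleRetriesNeeded(stripQuery: boolean): number | null {
  // Stand-in for the expired-handle entries kept in localStorage.
  const expired: Record<string, boolean> = { "old-handle": true }
  const layer = makeCachingLayer(stripQuery, () => ({ handle: "new-handle" }))
  layer.seed("/v1/shape", { handle: "old-handle" }) // stale cached response

  const maxStaleRetries = 3
  for (let attempt = 0; attempt <= maxStaleRetries; attempt++) {
    // Attempt 0 is the plain request; later attempts add a cache buster.
    const url = attempt === 0 ? "/v1/shape" : `/v1/shape?cache_buster=${attempt}`
    const res = layer.get(url)
    if (expired[res.handle] !== true) return attempt
  }
  return null
}

console.log(staleRetriesNeeded(false)) // 1: the first cache buster bypasses the cache
console.log(staleRetriesNeeded(true)) // null: every retry hits the same stale entry
```

In the second case nothing the client varies in the URL can change the outcome, which is why the retry budget alone cannot recover the stream.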
## Approach
After the stale-cache retries are exhausted (3 attempts), the client now:
1. **Always clears the expired entry** from localStorage — if cache
busters didn't work, keeping the entry only poisons future sessions
2. **Attempts one self-healing retry** — resets the stream and retries
without the `expired_handle` param. Since handles are never reused, the
fresh response will have a new handle and won't trigger stale detection
3. **Guards against infinite loops** via `#expiredShapeRecoveryKey`
(once per shape key, reset on up-to-date)
```typescript
if (transition.exceededMaxRetries) {
  if (shapeKey) {
    expiredShapesCache.delete(shapeKey) // always clear
    if (this.#expiredShapeRecoveryKey !== shapeKey) {
      this.#expiredShapeRecoveryKey = shapeKey // remember we tried
      this.#reset() // fresh start
      throw new StaleCacheError(...) // caught internally → retry
    }
  }
  throw new FetchError(502, ...) // truly give up
}
```
### Key Invariants
- **S0**: Server handles are unique and never reused (phash2 +
microsecond timestamp, SQLite UNIQUE INDEX, ETS insert_new)
- Self-healing fires at most once per shape per retry cycle
(`#expiredShapeRecoveryKey` guard)
- Guard resets on up-to-date, so long-lived streams can self-heal again
if CDN misbehaves later
- Expired entry is cleared on every exhaustion, regardless of whether
self-healing fires
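
The guard-related invariants can be checked against a small model. This is a sketch under assumptions, not the code in `src/client.ts`: `SelfHealingGuard`, `markExpired`, `onRetriesExhausted`, and `onUpToDate` are hypothetical names, a plain object stands in for the localStorage entry, and the two outcomes are returned as strings rather than thrown as `StaleCacheError` and `FetchError(502)`.

```typescript
type ExhaustionOutcome = "self-heal" | "give-up"

class SelfHealingGuard {
  // Stand-in for localStorage['electric_expired_shapes']: shape key -> expired handle.
  readonly expiredShapes: Record<string, string> = {}
  // Stand-in for #expiredShapeRecoveryKey.
  private recoveryKey: string | null = null

  // Models the bookkeeping done after a 409 (handle rotation).
  markExpired(shapeKey: string, handle: string): void {
    this.expiredShapes[shapeKey] = handle
  }

  // Models the decision taken once the stale-cache retries are exhausted.
  onRetriesExhausted(shapeKey: string): ExhaustionOutcome {
    delete this.expiredShapes[shapeKey] // cleared on every exhaustion
    if (this.recoveryKey !== shapeKey) {
      this.recoveryKey = shapeKey // remember that self-healing was tried
      return "self-heal"
    }
    return "give-up" // second exhaustion for the same key
  }

  // Models the stream reaching up-to-date, which re-arms the guard.
  onUpToDate(): void {
    this.recoveryKey = null
  }
}

const guard = new SelfHealingGuard()

guard.markExpired("shape-a", "handle-1")
const first = guard.onRetriesExhausted("shape-a") // "self-heal"

guard.markExpired("shape-a", "handle-1")
const second = guard.onRetriesExhausted("shape-a") // "give-up"

guard.onUpToDate()
const third = guard.onRetriesExhausted("shape-a") // "self-heal"

console.log(first, second, third)
```

The entry is deleted on both paths, so even the give-up case leaves no poisoned entry behind for the next session.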
### Non-goals
- TTL on expired cache entries — the self-healing mechanism handles the
failure mode without added complexity
- Changing `onError` contract — the fix works regardless of what the
user's `onError` callback does
## Verification
```bash
cd packages/typescript-client
pnpm vitest run --config vitest.unit.config.ts
# 312 tests pass
pnpm exec tsc --noEmit
# Clean
```
## Files changed
| File | Change |
|------|--------|
| `src/client.ts` | Self-healing logic in `#onInitialResponse`, recovery key cleared on up-to-date, updated catch block comment |
| `test/expired-shapes-cache.test.ts` | Updated 2 existing tests for self-healing flow, added test for CDN-always-stale scenario |
| `SPEC.md` | Added S0 (handle uniqueness guarantee), updated L3 loop-back entry and guard table |
| `.changeset/fix-expired-shapes-self-healing.md` | Changeset for patch release |
---------
Co-authored-by: Evan O'Brien <evan@liveflow.io>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Fix permanently stuck expired shape handles in localStorage by adding self-healing retry. When stale cache retries are exhausted (3 attempts with cache busters), the client now clears the expired entry from localStorage and retries once without the `expired_handle` parameter. Since the server never reuses handles (documented as SPEC.md S0), the fresh response will have a new handle and bypass stale detection. This prevents shapes from being permanently unloadable when a proxy strips cache-buster query parameters.
`packages/typescript-client/SPEC.md`: 37 additions & 16 deletions
@@ -66,6 +66,26 @@ Any ──markMustRefetch─► Initial (offset = -1)
 - `response` on Paused delegates to `previousState`, preserving the Paused wrapper for `accepted` and `stale-retry` transitions; `ignored` returns `this`
 - `response`/`messages`/`sseClose` on Error return `this` (ignored)

+## Server Assumptions
+
+Properties of the sync service that the client state machine depends on.
+
+### S0: Shape handles are unique and never reused
+
+The server generates handles as `{phash2_hash}-{microsecond_timestamp}`. Uniqueness
+is enforced by monotonic timestamps, a SQLite `UNIQUE INDEX` on the handle column,
+and ETS `insert_new` checks. Even after server restarts, old handles persist in
+SQLite and new ones receive fresh timestamps, so collisions cannot occur.
+
+**Implication for expired shapes cache**: Once a handle is marked expired (after a
+409 response), the server will never issue that handle again. If a response contains
+an expired handle, it must be coming from a caching layer (browser HTTP cache,
-| L6 | `fetchSnapshot` catch → `fetchSnapshot` | 1975 | HTTP 409 on snapshot fetch | New handle via `withHandle()`; or local retry cache buster if same/no handle | `#maxSnapshotRetries` (5) + cache buster on same handle |
+| # | Site | Line | Trigger | URL changes because | Guard |
+| L1 | `#requestShape` → `#requestShape` | 940 | Normal completion after `#fetchShape()` | Offset advances from response headers | `#checkFastLoop` (non-live) |
+| L2 | `#requestShape` catch → `#requestShape` | 874 | Abort with `FORCE_DISCONNECT_AND_REFRESH` or `SYSTEM_WAKE` | `isRefreshing` flag changes `canLongPoll`, affecting `live` param | Abort signals are discrete events |
+| L3 | `#requestShape` catch → `#requestShape` | 886 | `StaleCacheError` thrown by `#onInitialResponse` | `StaleRetryState` adds `cache_buster` param; after max retries, self-healing clears expired entry + resets stream | `maxStaleCacheRetries` counter + `#expiredShapeRecoveryKey` (once per shape) |
+| L4 | `#requestShape` catch → `#requestShape` | 924 | HTTP 409 (shape rotation) | `#reset()` sets offset=-1 + new handle; or request-scoped cache buster if no handle | New handle from 409 response or unique retry URL |
+| L6 | `fetchSnapshot` catch → `fetchSnapshot` | 1975 | HTTP 409 on snapshot fetch | New handle via `withHandle()`; or local retry cache buster if same/no handle | `#maxSnapshotRetries` (5) + cache buster on same handle |

-| `#checkFastLoop` | Non-live `#requestShape` only | Detects N requests at same offset within a time window. First: clears caches + resets. Persistent: exponential backoff → throws FetchError(502). |
-| `maxStaleCacheRetries` | Stale response path (L3) | State machine counts stale retries. Throws FetchError(502) after 3 consecutive stale responses. |
-| `#maxConsecutiveErrorRetries` | `#start` onError retry (L5) | Counts consecutive error retries. Sends error to subscribers and tears down after 50. Reset on successful message batch. |
-| Pause lock | `#requestShape` entry | Returns immediately if paused. Prevents fetches during snapshots. |
-| Up-to-date exit | `#requestShape` entry | Returns if `!subscribe` and `isUpToDate`. Breaks loop for one-shot syncs. |
+| `#checkFastLoop` | Non-live `#requestShape` only | Detects N requests at same offset within a time window. First: clears caches + resets. Persistent: exponential backoff → throws FetchError(502). |
+| `maxStaleCacheRetries` | Stale response path (L3) | State machine counts stale retries. After 3 consecutive stale responses, clears expired entry and attempts one self-healing retry. Throws FetchError(502) if self-healing also fails. |
+| `#expiredShapeRecoveryKey` | Self-healing (L3 extension) | Records shape key after first self-healing attempt. Second exhaustion on same key skips self-healing → FetchError(502). Cleared on up-to-date. |
+| `#maxConsecutiveErrorRetries` | `#start` onError retry (L5) | Counts consecutive error retries. Sends error to subscribers and tears down after 50. Reset on successful message batch. |
+| Pause lock | `#requestShape` entry | Returns immediately if paused. Prevents fetches during snapshots. |
+| Up-to-date exit | `#requestShape` entry | Returns if `!subscribe` and `isUpToDate`. Breaks loop for one-shot syncs. |