Commit 3f12a8f

bra1nDump, claude, and happy-otter committed
docs: add realtime-sync-and-rpc doc + cross-references
High-level overview of socket management, RPC routing, and known observability gaps. Cross-referenced from multi-process.md and README index.

Generated with [Claude Code](https://claude.ai/code) via [Happy](https://happy.engineering)

Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Happy <yesreply@happy.engineering>
1 parent a08c668 commit 3f12a8f

3 files changed

Lines changed: 79 additions & 0 deletions

docs/README.md

Lines changed: 2 additions & 0 deletions
@@ -4,11 +4,13 @@ This folder documents how Happy works internally, with a focus on protocol, back
44

55
## Index
66
- protocol.md: Wire protocol (WebSocket), payload formats, sequencing, and concurrency rules.
7+
- realtime-sync-and-rpc.md: High-level overview of realtime socket management and RPC control flow.
78
- api.md: HTTP endpoints and authentication flows.
89
- encryption.md: Encryption boundaries and on-wire encoding.
910
- backend-architecture.md: Internal backend structure, data flow, and key subsystems.
1011
- deployment.md: How to deploy the backend and required infrastructure.
1112
- cli-architecture.md: CLI and daemon architecture and how they interact with the server.
13+
- multi-process.md: Deeper multi-replica Socket.IO + Redis streams behavior, failure modes, and integration-test history.
1214
- dev-environments.md: Local `environments/data/` workflow, lab-rat project provisioning, `env:cli` passthrough behavior, and daemon usage.
1315
- session-protocol.md: Unified encrypted chat event protocol.
1416
- session-protocol-claude.md: Claude-specific session-protocol flow (local vs remote launchers, dedupe/restarts).

docs/multi-process.md

Lines changed: 2 additions & 0 deletions
@@ -4,6 +4,8 @@ How handy-server runs across multiple Kubernetes replicas: socket distribution,
44
room-based RPC routing, broadcast fan-out, daemon lifecycle, and what happens
55
during the messy cases (pod kill, brief reconnect, network partition).
66

7+
For the shorter high-level control-flow doc, see `realtime-sync-and-rpc.md`.
8+
79
> **Status:** the code in this doc is on `main` but `handy.yaml` ships
810
> `replicas: 1`. Flipping prod to multi-replica is a separate decision.
911

docs/realtime-sync-and-rpc.md

Lines changed: 75 additions & 0 deletions
@@ -0,0 +1,75 @@
1+
# Realtime Sync and RPC
2+
3+
This is the high-level doc for how Happy uses Socket.IO for realtime sync and point-to-point RPC.
4+
5+
Related docs:
6+
- `protocol.md`: wire contract, event names, and payload shapes
7+
- `multi-process.md`: deeper notes about cross-replica behavior, failure modes, and test history
8+
- `backend-architecture.md`: server subsystem overview
9+
- `cli-architecture.md`: daemon and client-side socket ownership
10+
11+
## Core Pieces
12+
13+
Happy uses one Socket.IO endpoint at `/v1/updates` and three connection scopes:
14+
- `user-scoped`: app/web clients and account-wide listeners
15+
- `session-scoped`: one live session process
16+
- `machine-scoped`: one daemon for one machine
17+
18+
On the server:
19+
- `socket.ts` authenticates the handshake, tags the socket with `userId` and scope metadata, and enables the Redis streams adapter when `REDIS_URL` is set.
20+
- `eventRouter.ts` handles fan-out for normal realtime updates.
21+
- `rpcHandler.ts` handles `rpc-register`, `rpc-unregister`, and `rpc-call`.
22+
23+
On the client side:
24+
- `ApiSessionClient` owns a long-lived session-scoped socket.
25+
- `ApiMachineClient` owns a long-lived machine-scoped socket.
26+
- the app's `apiSocket` owns a long-lived user-scoped socket.
27+
- `RpcHandlerManager` registers handlers and re-registers them on reconnect.
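
The re-registration behavior described for `RpcHandlerManager` can be sketched roughly as follows. This is a minimal illustration, not the actual class: `SocketLike`, `ReRegisteringRpcManager`, the `{ method }` payload shape, and the `<prefix>:<method>` naming are all assumptions made for the example.

```typescript
// Minimal socket surface this sketch needs (hypothetical interface).
interface SocketLike {
  on(event: "connect" | "disconnect", listener: () => void): void;
  emit(event: string, payload: unknown): void;
}

type RpcHandler = (params: unknown) => unknown;

// Hypothetical stand-in for RpcHandlerManager: it remembers handlers locally
// and announces every one of them again each time the socket (re)connects.
class ReRegisteringRpcManager {
  private readonly handlers = new Map<string, RpcHandler>();
  private readonly socket: SocketLike;
  private readonly prefix: string;
  private connected = false;

  constructor(socket: SocketLike, prefix: string) {
    this.socket = socket;
    this.prefix = prefix;
    socket.on("disconnect", () => {
      this.connected = false;
    });
    socket.on("connect", () => {
      this.connected = true;
      // Server-side registrations are tied to the old socket, so replay them.
      this.handlers.forEach((_handler, method) => this.announce(method));
    });
  }

  register(method: string, handler: RpcHandler): void {
    // ASSUMPTION: methods are namespaced as "<prefix>:<method>".
    const prefixed = `${this.prefix}:${method}`;
    this.handlers.set(prefixed, handler);
    if (this.connected) this.announce(prefixed);
  }

  private announce(prefixedMethod: string): void {
    // ASSUMPTION: the payload shape is illustrative, not taken from protocol.md.
    this.socket.emit("rpc-register", { method: prefixedMethod });
  }
}
```

Keeping the handler map on the client means a reconnect never depends on the server remembering anything about the previous socket.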
28+
29+
## Room Model
30+
31+
Normal fan-out rooms:
32+
- `user:<userId>`
33+
- `user:<userId>:user-scoped`
34+
- `user:<userId>:session:<sessionId>`
35+
- `user:<userId>:machine:<machineId>`
36+
37+
RPC registration rooms:
38+
- `rpc:<userId>:<prefixedMethod>`
39+
40+
The server uses room membership as the source of truth for who currently owns an RPC method.
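
As a small illustration of the naming scheme above, the room names can be built from plain string helpers. The function names here are hypothetical; only the room-name templates come from this doc.

```typescript
// Fan-out rooms.
const userRoom = (userId: string): string => `user:${userId}`;
const userScopedRoom = (userId: string): string => `user:${userId}:user-scoped`;
const sessionRoom = (userId: string, sessionId: string): string =>
  `user:${userId}:session:${sessionId}`;
const machineRoom = (userId: string, machineId: string): string =>
  `user:${userId}:machine:${machineId}`;

// RPC registration room. `prefixedMethod` is passed through unchanged;
// how the prefix is chosen is outside the scope of this sketch.
const rpcRoom = (userId: string, prefixedMethod: string): string =>
  `rpc:${userId}:${prefixedMethod}`;
```

Because every name starts with a fixed tag and the `userId`, rooms belonging to different accounts never collide.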
41+
42+
## Realtime Sync Flow
43+
44+
1. A client connects with a scope (`user-scoped`, `session-scoped`, or `machine-scoped`).
45+
2. The server adds that socket to the appropriate user/session/machine rooms.
46+
3. When durable state changes, `eventRouter` emits `update` events to the matching rooms.
47+
4. When transient presence changes, the server emits `ephemeral` events to the matching rooms.
48+
5. On reconnect, clients can re-fetch state if they missed anything while offline.
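
Steps 1 to 3 can be modeled with a toy in-memory registry. This is a sketch under stated assumptions, not the server code: `roomsFor` and `RoomRegistry` are hypothetical names, and the exact set of rooms each scope joins is an assumption (this doc only says "the appropriate" rooms).

```typescript
type ConnectionScope =
  | { type: "user-scoped" }
  | { type: "session-scoped"; sessionId: string }
  | { type: "machine-scoped"; machineId: string };

// ASSUMPTION: every socket joins the account-wide room plus one
// scope-specific room. The real mapping lives in the server code.
function roomsFor(userId: string, scope: ConnectionScope): string[] {
  const base = `user:${userId}`;
  switch (scope.type) {
    case "user-scoped":
      return [base, `${base}:user-scoped`];
    case "session-scoped":
      return [base, `${base}:session:${scope.sessionId}`];
    case "machine-scoped":
      return [base, `${base}:machine:${scope.machineId}`];
  }
}

// Toy stand-in for Socket.IO room bookkeeping: room name -> socket ids.
class RoomRegistry {
  private readonly members = new Map<string, Set<string>>();

  join(socketId: string, rooms: string[]): void {
    rooms.forEach((room) => {
      const set = this.members.get(room) ?? new Set<string>();
      set.add(socketId);
      this.members.set(room, set);
    });
  }

  leave(socketId: string): void {
    this.members.forEach((set) => set.delete(socketId));
  }

  // Union of the target rooms, so a socket that sits in two of them
  // is delivered to only once.
  recipients(rooms: string[]): string[] {
    const out = new Set<string>();
    rooms.forEach((room) => this.members.get(room)?.forEach((id) => out.add(id)));
    return Array.from(out).sort();
  }
}
```

An `update` for one session would then target only that session's room, while an account-wide change would target `user:<userId>`.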
49+
50+
## RPC Flow
51+
52+
1. A caller emits `rpc-call` with a method name and params.
53+
2. `rpcHandler.ts` resolves the room `rpc:<userId>:<prefixedMethod>`.
54+
3. The server looks for a target socket in that room.
55+
4. If no target is present, the server waits briefly for reconnect before failing.
56+
5. If a target is present, the server forwards the request with `rpc-request`.
57+
6. The target runs the handler through `RpcHandlerManager` and acks the result.
58+
7. If the target disappears mid-call, the server fails the call instead of waiting for the full timeout.
59+
60+
This is how Happy does point-to-point control traffic on top of the same transport used for normal realtime sync.
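
The routing steps above can be sketched as a small in-memory router. It is an illustration only: `RpcRouterSketch`, `graceMs`, the error strings, and the single-owner-per-room simplification are assumptions, and a plain function call stands in for the `rpc-request` emit and its ack.

```typescript
type RpcResult = { ok: true; result: unknown } | { ok: false; error: string };
type RpcTarget = (params: unknown) => Promise<unknown>;
type Waiters = Map<string, Set<() => void>>;

// Register a one-shot wake-up for `room`; `cancel` removes it if it never fired.
function once(waiters: Waiters, room: string) {
  let wake: () => void = () => {};
  const fired = new Promise<void>((resolve) => { wake = resolve; });
  const set = waiters.get(room) ?? new Set<() => void>();
  waiters.set(room, set);
  set.add(wake);
  return { fired, cancel: () => { set.delete(wake); } };
}

// Wake everything currently waiting on `room`.
function fire(waiters: Waiters, room: string): void {
  const set = waiters.get(room);
  waiters.delete(room);
  set?.forEach((wake) => wake());
}

class RpcRouterSketch {
  // SIMPLIFICATION: one owner per room instead of real room membership.
  private readonly owners = new Map<string, RpcTarget>();
  private readonly arrivals: Waiters = new Map();
  private readonly departures: Waiters = new Map();
  private readonly graceMs: number;

  constructor(graceMs: number) { this.graceMs = graceMs; }

  register(room: string, target: RpcTarget): void {
    this.owners.set(room, target);
    fire(this.arrivals, room);
  }

  unregister(room: string): void {
    this.owners.delete(room);
    fire(this.departures, room);
  }

  async call(userId: string, prefixedMethod: string, params: unknown): Promise<RpcResult> {
    const room = `rpc:${userId}:${prefixedMethod}`;
    if (!this.owners.has(room)) {
      // Step 4: wait up to graceMs for a target to (re)register.
      const arrival = once(this.arrivals, room);
      let timer: ReturnType<typeof setTimeout> | undefined;
      const timeout = new Promise<void>((resolve) => { timer = setTimeout(resolve, this.graceMs); });
      await Promise.race([arrival.fired, timeout]);
      clearTimeout(timer);
      arrival.cancel();
    }
    const target = this.owners.get(room);
    if (!target) return { ok: false, error: "no target registered" };

    // Steps 5-7: forward the request, but fail fast if the target leaves mid-call.
    const departure = once(this.departures, room);
    const gone = departure.fired.then((): RpcResult => ({ ok: false, error: "target disconnected" }));
    const done = target(params).then(
      (result): RpcResult => ({ ok: true, result }),
      (err): RpcResult => ({ ok: false, error: String(err) }),
    );
    try {
      return await Promise.race([done, gone]);
    } finally {
      departure.cancel();
    }
  }
}
```

Racing the handler result against a departure signal is what lets a caller learn about a vanished target immediately rather than after a full call timeout.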
61+
62+
## Current Sharp Edges
63+
64+
- `packages/happy-agent/src/machineRpc.ts` still creates one-off caller sockets for machine `spawn` and `resume` instead of reusing a long-lived caller connection.
65+
- `packages/happy-server/sources/app/api/socket/rpcHandler.ts` still mixes room lookup, reconnect grace, mid-call presence checking, and metric emission in one place.
66+
67+
## Debugging
68+
69+
If this path is flaky, the first things to check are:
70+
- RPC success/failure rate
71+
- RPC latency
72+
- WebSocket connection churn
73+
- Redis stream lag
74+
75+
Use `multi-process.md` for the deeper cross-replica and failure-mode details.
