| title | Architecture |
|---|---|
| description | Portal system architecture, transport model, and protocol details. |
| priority | P1 |
Portal publishes local services on public subdomains, optional dedicated TCP ports, and optional UDP ports through a relay. Backends connect outward to the relay. Stream traffic is routed by SNI, and tenant TLS remains end-to-end between the client and the SDK or tunnel endpoint for the stream path.
High-level path:
Stream client
-> Relay SNI listener (:443 by default)
-> Claimed reverse session
-> SDK / portal-tunnel
-> Local service
TCP port client
-> Relay lease TCP port (within configured MIN_PORT-MAX_PORT)
-> Claimed reverse session (raw TCP, no TLS)
-> SDK / portal-tunnel
-> Local service
UDP client
-> Relay lease UDP port (within configured MIN_PORT-MAX_PORT)
-> Internal QUIC tunnel
-> SDK / portal-tunnel
-> Local UDP service
- Raw TCP reverse-connect is the canonical stream transport.
- Do not introduce websocket or legacy compatibility paths by default.
- Derive lease hostnames from the full normalized
PORTAL_URLhost, not from apex extraction. - Preserve explicit root-host fallback through SNI no-route handling to the admin/API listener.
- Stream ingress is TLS-only. UDP exposure, when enabled, is raw UDP.
- Relay terminates admin/API TLS on the root host and exposes
/v1/signfor tenant-side keyless signing. - Control-plane HTTP (
/sdk/*), reverse-session establishment (/sdk/connect), and tenant TLS are separate connections with different trust boundaries. - Relay API TLS, SDK relay-client TLS, SDK tenant-server TLS, and QUIC tunnel TLS are distinct configs even when they reuse the same relay certificate material.
- Relay does not terminate tenant TLS. It peeks ClientHello for SNI and bridges raw encrypted bytes after routing.
- SDK/tunnel endpoints terminate tenant TLS locally with a keyless-backed signer that calls the relay.
- In keyless TLS, the relay performs certificate private-key signing through
/v1/sign, but the SDK/tunnel endpoint still runs the TLS server handshake and derives tenant TLS session keys locally. /sdk/connect,/sdk/renew, and/sdk/unregisterare authorized by lease existence plus a relay-issued lease access token./sdk/registeris authenticated by a SIWE challenge/response flow using the SDK identity secp256k1 key. On success, the relay issues a lease-scoped ES256K JWT access token signed by the relay identity key and used for the rest of the lease lifecycle.- Relay URLs must use
https://. - HTTP/2 stays disabled on the admin/API TLS listener because
/sdk/connectdepends on HTTP/1.1 hijacking semantics. - WireGuard, when enabled, is relay-to-relay overlay transport only. It carries multi-hop relay forwarding and overlay discovery, but it is not used for direct tenant TLS termination, public UDP ingress, or
/sdk/*control-plane traffic.
- SNI wildcard matching is one level only.
*.parent.example.commatchesfoo.parent.example.com, not deeper labels. - Reverse TCP marker bytes remain protocol state:
0x00= idle keepalive0x01= raw TCP activation (non-TLS port routing)0x02= TLS passthrough activation
/sdk/connectremains HTTP/1.1 only.
- All JSON control-plane responses use
APIEnvelope:{ ok, data?, error? }. - JSON handlers should write responses through the shared API helpers.
types/is reserved for shared wire/public types and cross-package constants only.- Shared control-plane and public route constants belong in
types/paths.go. - Relay-local frontend asset filenames stay local to
cmd/relay-server. - Do not import
portalfromcmd/*orsdkjust to reach shared DTOs or constants.
- For non-localhost deployments, relay TLS can run from manual certificate files in the relay
IDENTITY_PATHdirectory or from managed ACME. - When managed ACME is enabled, supported DNS providers are
cloudflare,gcloud, androute53. - ENS gasless automation reuses
ACME_DNS_PROVIDERfor DNSSEC and ENS TXT sync. - Relay stores its state under
IDENTITY_PATH, includingidentity.json,admin_settings.json, and certificate material. Tunnel and demo-app identities still useIDENTITY_PATH/--identity-pathas a direct JSON file path. - Managed non-localhost ACME keeps both root and wildcard DNS A records in sync.
- Relay certificate material lives under
IDENTITY_PATHasfullchain.pemandprivatekey.pem. - Localhost uses the development certificate path instead of public managed/manual certificate setup.
Portal has three distinct network roles:
- Control-plane HTTP requests
POST /sdk/register/challengePOST /sdk/registerPOST /sdk/renewPOST /sdk/unregisterGET /sdk/domain
- Reverse session connection
GET /sdk/connect- HTTP/1.1 only
- hijacked into a long-lived raw TCP session
- starts idle in the per-lease stream ready queue, then becomes the tenant data path when claimed
- Internal datagram tunnel
- QUIC to the relay URL host plus the relay-advertised
sni_portfromPOST /sdk/registerwith ALPNportal-tunnel - authenticated by a first-stream control message carrying
access_token - carries relay-to-SDK/tunnel datagram traffic only
- QUIC to the relay URL host plus the relay-advertised
That distinction matters because /sdk/connect stops being ordinary HTTP once hijacked, while the UDP backhaul is a separate internal QUIC carrier.
The relay runtime lives in portal/ (server, route table, transport runtimes, ACME, keyless, auth, discovery, WireGuard overlay, policy).
The SDK client library lives in sdk/ (listener, exposure, relay API client, MITM self-probe, transport clients).
CLI entry points live in cmd/relay-server and cmd/portal-tunnel; they import portal/ and sdk/ respectively but never each other.
Shared wire types, API envelope, error codes, path constants, and transport frame codec live in types/.
- SDK/tunnel registers one lease per relay through
POST /sdk/register/challengefollowed byPOST /sdk/register. - SDK opens one or more reverse sessions per registered lease with
GET /sdk/connect. - Each relay hijacks
/sdk/connectrequests and places the connection in the per-lease stream ready queue. - While idle, the relay writes
0x00keepalive markers. - A stream client connects to the relay SNI listener.
- Relay extracts SNI from ClientHello, resolves a lease, and waits up to
ClaimTimeoutfor one reverse session from that lease stream queue. - Relay writes
0x02to activate the claimed session. - SDK/tunnel receives
0x02, starts tenant TLS locally using the relay-backed keyless signer, and the relay bridges raw encrypted bytes end-to-end.
Result: the relay decides routing, but tenant TLS termination still happens at the SDK/tunnel side.
- After a real tenant connection begins I/O, the SDK may start one asynchronous self-probe for that listener if no probe is in flight and the 30-second cooldown has expired.
- The SDK opens a new TLS connection to its own public URL using the same tenant-facing TLS characteristics as normal traffic.
- The probe client exports TLS keying material (
ExportKeyingMaterial) from that probe connection and stores it under a random nonce. - The first encrypted probe payload is
16-byte nonce + random padding; there is no fixed probe magic or dedicated ALPN. - When the probe connection comes back through the relay and reaches the SDK-side tenant TLS terminator, the SDK peeks only the first 16 encrypted application bytes while a probe is pending.
- If those bytes match a pending nonce, the SDK exports keying material on the server side and compares it with the client-side exporter value.
- Matching exporter values mean the probe observed passthrough for that connection. A mismatch is logged as suspected relay-side TLS termination. A timeout is logged as probe failure, not proof of MITM.
Result: this is a detect-only signal by default. It raises the cost of adaptive relay-side TLS termination, but it does not prove passthrough for every user connection. Callers that need stricter behavior can opt into relay banning.
- SDK/tunnel requests a register challenge with
tcp_enabled=true, signs the returned SIWE message, and completes registration. - Relay validates that the TCP port plane is enabled, allocates a TCP port, and creates a per-lease TCP listener.
- Registration response includes
tcp_addr(public TCP endpoint). - An external TCP client connects to
tcp_addr. - The relay accepts the connection, claims a reverse session from the lease stream queue, and writes
0x01(raw TCP activation marker). - SDK-side receives
0x01and passes the raw connection directly without TLS handshake. - Data is copied bidirectionally between the external client and the reverse session.
Result: the relay allocates a dedicated TCP port per lease and bridges raw TCP without TLS. This is ideal for non-TLS protocols like Minecraft, game servers, or any raw TCP service.
- SDK/tunnel requests a register challenge with
udp_enabled=true, signs the returned SIWE message, and completes registration. - Relay validates that the datagram plane is enabled, allocates a UDP port, and creates a per-lease datagram runtime.
- Registration response includes
udp_addr,access_token, andsni_port. The SDK dials QUIC to the relay onsni_port. - SDK opens a QUIC connection with ALPN
portal-tunneland DATAGRAM support enabled. - Authentication: SDK sends
{access_token}JSON on the first QUIC stream; relay validates before accepting the tunnel. - External UDP client sends a packet to
udp_addr-> relay assigns a flow ID -> QUIC DATAGRAM frame to SDK. - SDK-side decodes frames and delivers to local UDP target.
- Return path: local response -> SDK -> QUIC DATAGRAM -> relay ->
WriteToUDPto the original client.
Result: raw public UDP exposure with an internal QUIC datagram backhaul. UDP and TCP port allocations are independent from the same MIN_PORT-MAX_PORT range.
- Discovery bootstraps from public HTTPS relay URLs, then expands through relay-to-relay
/discoverypolling and periodic self-announces to bootstrap relays through/discovery/announce. - SDK exposures consume relay discovery results to choose relays, but they do not announce themselves and do not serve
/discovery. - Discovery descriptors are signed relay self-descriptions. They bind relay routing metadata such as
api_https_addr,supports_overlay,wireguard_public_key, andwireguard_portto the relay identity. Lease access tokens remain separate and authorize tenant lease operations only. /discovery/announceaccepts only signed relay descriptors. Loopback or localhost relay descriptors are rejected because they cannot join the public discovery mesh.- The overlay peer API is plain HTTP on the WireGuard network, not public Internet HTTP. It serves the same discovery payload shape used by public
/discovery. - Overlay failure affects inter-relay discovery, mesh synchronization, and multi-hop relay forwarding. Direct tenant TLS routing, keyless TLS, register/renew/connect, and public UDP ingress do not depend on the WireGuard transport path.
POST /sdk/register/challengethenPOST /sdk/register.- Caller signs the returned SIWE message with the identity secp256k1 key (
personal_sign). namemust be a valid single DNS label; the relay publishes the lease at<name>.<root host>.- Registration reserves the hostname and publishes the route immediately; if no reverse session is ready yet, inbound SNI claims wait up to
ClaimTimeout. - On success, the relay issues a lease-scoped ES256K JWT access token signed by the relay identity key, used for the rest of the lease lifecycle.
- UDP registration requires server
UDP_ENABLED=true, a validMIN_PORT/MAX_PORTrange, and admin enablement. Failures:udp_disabled(403),udp_capacity_exceeded(503),udp_port_exhausted(503). - TCP port registration has equivalent three-condition gating. Failures:
tcp_port_disabled(403),tcp_port_capacity_exceeded(503),tcp_port_exhausted(503). PORTAL_URLis normalized to its host component only; path/query segments are ignored for routing.
GET /sdk/connect(HTTP/1.1 only,X-Portal-Access-Tokenheader).- Relay validates: lease exists and is not expired; access token signature, issuer, audience, identity, and expiry are all valid.
- After claim, relay writes
0x02before switching the session into tenant TLS passthrough. - After hijack, the connection becomes a broker-managed reverse session.
POST /sdk/renewwithaccess_token. Extends lease TTL and returns a refreshed token.
POST /sdk/unregisterwithaccess_token. Removes the lease, routes, and ready reverse sessions.
Route lookup order:
- Exact hostname match
- Single-label wildcard match (
*.example.com) - Root-host fallback to the admin/API listener
Notes:
- Wildcards are one level only.
- The exact root host is never served by the wildcard route.
- For non-apex
PORTAL_URLvalues such ashttps://portal.example.com:8443/admin, a lease nameddemois published atdemo.portal.example.com.
The admin surface is intentionally small: an HTML index, one JSON snapshot endpoint, and a small set of admin action/auth routes. Route paths are enumerated in types/paths.go and cmd/relay-server.
The relay signs handshake digests via /v1/sign but never receives tenant TLS traffic secrets. The SDK/tunnel endpoint runs the full TLS server handshake and derives session keys locally. Relay control-plane TLS and reverse-session setup terminate on the relay's admin/API listener and are not protected by the tenant keyless path.
- Reverse-only backend connectivity
- One canonical raw TCP reverse transport
- Dedicated TCP port allocation for non-TLS services with raw TCP bridging
- Raw public UDP exposure with an internal QUIC datagram backhaul
- Optional WireGuard relay overlay for relay discovery, peer synchronization, and multi-hop relay forwarding
- SNI-based routing with root-host fallback
- End-to-end tenant TLS with relay-backed keyless signing
- Traffic-triggered detect-only MITM self-probing for probable relay-side TLS termination
- SIWE identity proof for registration plus relay-issued ES256K JWT access tokens for the lease lifecycle
- Lease-local stream and datagram ownership through per-lease transport runtimes
- Optional QUIC/UDP datagram transport coexisting with TCP on the same lease
- Per-lease UDP and TCP port allocation with sticky name-based reservation
- QUIC tunnel authentication via control stream (
access_token)