
Deployment

Here's how to get Netclode running on your own server.

Prerequisites

  • Linux machine with nested virtualization (2 vCPU, 8GB RAM minimum)
  • S3-compatible storage (DigitalOcean Spaces, Cloudflare R2, etc.)
  • Tailscale account
  • At least one LLM API key (Anthropic, OpenAI, Mistral, etc.) - see SDK Support
  • Ansible installed locally

1. Clone the repo

git clone https://github.com/angristan/netclode.git
cd netclode

2. Provision a server

Requirements:

  • Debian or Ubuntu
  • Nested virtualization support
  • 2+ vCPU, 8GB+ RAM
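Kata Containers runs each sandbox in a microVM, so the server (even if it is itself a cloud VM) must expose hardware virtualization. A quick generic Linux check, not specific to Netclode:

```shell
# Count CPU virtualization flags: vmx (Intel) or svm (AMD); 0 means no hardware virt
grep -cE 'vmx|svm' /proc/cpuinfo || echo "no virtualization flags found"

# Nested virtualization: "Y" or "1" means enabled (module name depends on CPU vendor)
cat /sys/module/kvm_intel/parameters/nested 2>/dev/null \
  || cat /sys/module/kvm_amd/parameters/nested 2>/dev/null \
  || echo "kvm module not loaded"
```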

3. Setup server access

SSH into your server and:

  1. Add your SSH public key to ~/.ssh/authorized_keys
  2. Install Tailscale: curl -fsSL https://tailscale.com/install.sh | sh
  3. Connect to your tailnet: tailscale up --ssh

Your server is now accessible via its Tailscale hostname (e.g., my-server).

4. Configure Tailscale for k8s ingress

  1. In the Tailscale admin console, create an OAuth client with the Devices: Write and Auth Keys: Write scopes
  2. Enable MagicDNS for your tailnet

5. Configure secrets

Create .env at the repo root:

# LLM provider (at least one required - see docs/sdk-support.md)
ANTHROPIC_API_KEY=sk-ant-api03-xxx
# OPENAI_API_KEY=sk-xxx
# MISTRAL_API_KEY=xxx

# Tailscale (OAuth client from step 4)
TS_OAUTH_CLIENT_ID=your-oauth-client-id
TS_OAUTH_CLIENT_SECRET=your-oauth-client-secret

# Tailscale auth key for host access (optional)
# If not set, authenticate manually on the host: tailscale up --ssh
# TAILSCALE_AUTHKEY=tskey-auth-xxx

# JuiceFS / S3 storage
DO_SPACES_ACCESS_KEY=your-spaces-access-key
DO_SPACES_SECRET_KEY=your-spaces-secret-key
JUICEFS_BUCKET=https://fra1.digitaloceanspaces.com/your-bucket
JUICEFS_META_URL=redis://redis-juicefs.netclode.svc.cluster.local:6379/0

# Deployment target (Tailscale hostname from step 3)
DEPLOY_HOST=your-server

# GitHub App (optional - for repo picker and github-bot)
GITHUB_APP_ID=123456
GITHUB_APP_PRIVATE_KEY_B64=base64-encoded-pem-private-key
GITHUB_INSTALLATION_ID=12345678

# GitHub Bot webhook secret (required if using github-bot)
# GITHUB_WEBHOOK_SECRET=your-webhook-secret

# Kata VM resources (optional, defaults shown)
# KATA_VM_CPUS=4
# KATA_VM_MEMORY_MB=4096

# Datadog observability (optional)
# DD_API_KEY=your-datadog-api-key
# DD_APP_KEY=your-datadog-app-key
# DD_SITE=datadoghq.com

Also create an S3 bucket (e.g., netclode-juicefs) with read/write credentials matching the JuiceFS variables above.
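GITHUB_APP_PRIVATE_KEY_B64 expects the GitHub App's PEM private key as a single base64 line. One way to produce it (app.private-key.pem is a placeholder for the key file downloaded from your GitHub App's settings page):

```shell
# Encode the PEM on one line and append the variable to .env
printf 'GITHUB_APP_PRIVATE_KEY_B64=%s\n' \
  "$(base64 < app.private-key.pem | tr -d '\n')" >> .env
```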

6. Install Ansible dependencies

cd infra/ansible
ansible-galaxy collection install -r requirements.yaml

7. Deploy

cd infra/ansible

# Full infrastructure deployment (reads secrets from .env)
DEPLOY_HOST=<server-ip> ansible-playbook playbooks/site.yaml

This installs:

  • k3s (single-node Kubernetes)
  • Kata Containers (microVM runtime)
  • Cilium CNI (NetworkPolicy support)
  • Tailscale (secure networking)
  • JuiceFS CSI (S3-backed storage)
  • Control plane and warm pool

8. Fetch kubeconfig

cd infra/ansible
DEPLOY_HOST=<server-ip> ansible-playbook playbooks/fetch-kubeconfig.yaml

This merges the netclode context into ~/.kube/config. Use it with:

kubectl --context netclode get nodes

9. Verify

kubectl --context netclode -n netclode get pods

You should see control-plane, redis-sessions, and warm pool pods running.

Get the ingress hostname:

kubectl --context netclode -n netclode get ingress control-plane -o jsonpath='{.status.loadBalancer.ingress[0].hostname}'

10. Connect clients

Build and run the macOS app:

make run-macos

Then go to Settings → enter <ingress-hostname> → Connect.

For iOS, see clients/ios/README.md.

Configuration

Control plane

Variable              Default                    Description
PORT                  3000                       Server port
K8S_NAMESPACE         netclode                   Kubernetes namespace
REDIS_URL             redis://redis-sessions...  Redis URL
WARM_POOL_ENABLED     true                       Use warm pool
MAX_ACTIVE_SESSIONS   5                          Max concurrent sessions
IDLE_TIMEOUT_MINUTES  0 (disabled)               Auto-pause sessions after N minutes of inactivity
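Assuming these are exposed as ordinary environment variables in your deployment (adjust to however you inject config), overriding the defaults looks like:

```shell
# Hypothetical overrides of the defaults listed above
export MAX_ACTIVE_SESSIONS=10     # default: 5
export IDLE_TIMEOUT_MINUTES=30    # auto-pause after 30 idle minutes (default: disabled)
```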

Agent

Variable    Description
SESSION_ID  Session identifier
GIT_REPOS   Optional JSON array of repos to clone (URL or owner/repo)
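Both repo forms can be mixed in one GIT_REPOS array; a sketch using the Netclode repo itself as the example value:

```shell
# Full URL and owner/repo shorthand in the same JSON array (example values)
export GIT_REPOS='["https://github.com/angristan/netclode.git","angristan/netclode"]'
```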

For LLM API keys, see SDK Support.

Updating

Re-run Ansible to update infrastructure:

cd infra/ansible
DEPLOY_HOST=<server-ip> ansible-playbook playbooks/site.yaml

Or deploy only k8s manifests (faster):

cd infra/ansible
DEPLOY_HOST=<server-ip> ansible-playbook playbooks/k8s-only.yaml

To restart deployments after image updates:

make rollout-control-plane
make rollout-agent

Rollback

kubectl --context netclode -n netclode rollout undo deployment/control-plane

GPU Support (Optional)

For local model inference with Ollama, see GPU Setup in the Ansible README.

Troubleshooting

Pods stuck in Pending - check warm pool:

kubectl --context netclode -n netclode get sandboxclaim
kubectl --context netclode -n netclode get sandbox

JuiceFS mount failures - check CSI driver:

kubectl --context netclode -n kube-system logs -l app=juicefs-csi-driver

Tailscale services not getting IPs - check operator:

kubectl --context netclode -n tailscale logs -l app=operator

Kata pods not starting - verify Kata installation:

ssh root@<server> /opt/kata/bin/kata-runtime kata-env

For more troubleshooting, see infra/ansible/README.md.