Commit 2e4b5b6

Author: JiaDe
docs: update README — WORKSPACE_BUCKET_NAME, EC2 build, CRUD feature, Troubleshooting section
- deploy.sh: clarify Docker build happens on EC2 via SSM (no local Docker required)
- .env.example in Quick Start: document WORKSPACE_BUCKET_NAME for multi-stack deployments
- Flagship Features: add Org CRUD (Dept/Position/Employee with binding checks)
- New Troubleshooting section:
  - GuardDuty-managed VPC endpoints blocking stack deletion (with fix commands)
  - --skip-build on empty ECR repo
  - AgentCore HTTP 500 / openclaw version pin
1 parent 234713a commit 2e4b5b6

1 file changed: +63 -4 lines changed

enterprise/README.md

Lines changed: 63 additions & 4 deletions
@@ -83,6 +83,7 @@ Additional controls: no public ports (SSM only) · IAM roles throughout, no hard
 | **Position → Runtime Routing** | 3-tier routing chain: employee override → position rule → default. Assign positions to runtimes from Security Center UI, propagates to all members automatically |
 | **Per-Employee Model Config** | Override model, context window, compaction settings, and response language at position OR employee level from Agent Factory → Configuration tab |
 | **IM Channel Management** | Admin sees every employee's IM connections grouped by channel — when they paired, session count, last active, one-click disconnect |
+| **Org CRUD** | Full create/edit/delete for Departments, Positions, and Employees from Admin Console. Delete is guarded: blocks if employees or bindings exist, prompts force-cascade delete |
 | **Security Center** | Live AWS resource browser — ECR images, IAM roles, VPC security groups with console links. Configure runtime images and IAM roles from the UI |
 | **Three-Layer Memory Guarantee** | Per-turn S3 checkpoint (1-message sessions), SIGTERM flush (idle timeout), Gateway compaction (long sessions). Same memory across Discord, Telegram, Feishu, and Portal |
 | **Dynamic Config, Zero Redeploy** | Change model, tool permissions, SOUL content, or KB assignments → takes effect on next cold start. No container rebuild, no runtime update |
@@ -475,25 +476,29 @@ ADMIN_PASSWORD=your-password # admin console login password
 # Optional: use existing VPC instead of creating a new one
 # EXISTING_VPC_ID=vpc-0abc123
 # EXISTING_SUBNET_ID=subnet-0abc123
+
+# Optional: custom S3 bucket name — required when deploying multiple stacks in the same account
+# (e.g. staging + production in the same AWS account)
+# WORKSPACE_BUCKET_NAME=openclaw-tenants-123456789-staging
 ```
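A minimal sketch of one way to derive a distinct `WORKSPACE_BUCKET_NAME` per stack, following the `openclaw-tenants-<account-id>-<stage>` pattern of the example value above. The function name `workspace_bucket_name` and its `stage` argument are illustrative assumptions and are not part of `deploy.sh`. The validation reflects the general S3 naming rules (3 to 63 characters, lowercase letters, digits, and hyphens); dots are legal in S3 bucket names but this sketch deliberately rejects them.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of deploy.sh): build a per-stack bucket name
# and reject values that would not be valid S3 bucket names.
workspace_bucket_name() {
  local account_id="$1" stage="$2"
  local name="openclaw-tenants-${account_id}-${stage}"
  # 3-63 characters, starting and ending with a lowercase letter or digit.
  local re='^[a-z0-9][a-z0-9-]{1,61}[a-z0-9]$'
  if [[ ! "$name" =~ $re ]]; then
    echo "invalid bucket name: $name" >&2
    return 1
  fi
  printf '%s\n' "$name"
}

# Possible usage (the account ID lookup is a standard AWS CLI call):
#   ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
#   echo "WORKSPACE_BUCKET_NAME=$(workspace_bucket_name "$ACCOUNT_ID" staging)" >> .env
```

Including the stage in the name keeps staging and production stacks in the same account from colliding on one bucket.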
 
-Then run the deploy script — it handles everything:
+Then run the deploy script — it handles everything, **including the Docker build on the gateway EC2 (no local Docker required)**:
 
 ```bash
 bash deploy.sh
-# ~15 minutes total: CloudFormation → Docker build → AgentCore Runtime → DynamoDB seed
+# ~15 minutes total: CloudFormation → EC2 Docker build → AgentCore Runtime → DynamoDB seed
 ```
 
 To re-deploy after code changes without rebuilding the Docker image or re-seeding:
 
 ```bash
-bash deploy.sh --skip-build # update infra only
+bash deploy.sh --skip-build # update infra only, skip Docker build
 bash deploy.sh --skip-seed # update infra + image, skip DynamoDB
 ```
 
 **What `deploy.sh` does automatically:**
 1. Deploys CloudFormation (EC2, ECR, S3, IAM — creates or updates)
-2. Builds and pushes ARM64 agent container to ECR
+2. Packages source code → uploads to S3 → **triggers Docker build on the gateway EC2 via SSM** (ARM64 Graviton, no local Docker needed)
 3. Creates or updates AgentCore Runtime
 4. Creates DynamoDB table if it doesn't exist
 5. Seeds org data (employees, positions, departments, SOUL templates, knowledge docs)
@@ -983,6 +988,60 @@ sudo systemctl enable bedrock-proxy-h2 tenant-router
 sudo systemctl start bedrock-proxy-h2 tenant-router
 ```
 
+## Troubleshooting
+
+### CloudFormation stack deletion fails on PrivateSubnet
+
+**Symptom:** `aws cloudformation delete-stack` gets stuck, then reports `DELETE_FAILED` with:
+```
+The subnet 'subnet-xxx' has dependencies and cannot be deleted.
+```
+
+**Cause:** AWS GuardDuty automatically creates managed VPC endpoints in every subnet it monitors. These endpoints block subnet deletion.
+
+**Fix:** Find and delete the GuardDuty-managed endpoints before retrying:
+
+```bash
+# Find GuardDuty endpoints in the stack's VPC
+VPC_ID=$(aws ec2 describe-vpcs \
+  --filters "Name=tag:aws:cloudformation:stack-name,Values=${STACK_NAME}" \
+  --region $REGION --query 'Vpcs[0].VpcId' --output text)
+
+ENDPOINTS=$(aws ec2 describe-vpc-endpoints \
+  --filters "Name=vpc-id,Values=$VPC_ID" \
+  --region $REGION \
+  --query 'VpcEndpoints[?State!=`deleted`].VpcEndpointId' --output text)
+
+aws ec2 delete-vpc-endpoints --vpc-endpoint-ids $ENDPOINTS --region $REGION
+
+# Retry stack deletion
+aws cloudformation delete-stack --stack-name $STACK_NAME --region $REGION
+```
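The `describe-vpc-endpoints` filter in the commands above matches every non-deleted endpoint in the VPC, not only GuardDuty-managed ones, and `delete-vpc-endpoints` will fail if the list comes back empty. A minimal sketch of a guard for that case follows; the helper name `has_endpoint_ids` is a hypothetical addition, and the assumption is that the AWS CLI may print either nothing or `None` for text output when a query matches no resources.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeed only if the whitespace-separated list contains
# at least one entry and every entry looks like a VPC endpoint ID (vpce-...).
has_endpoint_ids() {
  local ids="$1" id count=0
  for id in $ids; do            # unquoted on purpose: split on spaces/tabs
    case "$id" in
      vpce-*) count=$((count + 1)) ;;
      *) return 1 ;;            # e.g. "None" or any unexpected token
    esac
  done
  [ "$count" -gt 0 ]
}

# Possible usage with the variables from the commands above:
#   if has_endpoint_ids "$ENDPOINTS"; then
#     echo "About to delete: $ENDPOINTS"   # review before deleting
#     aws ec2 delete-vpc-endpoints --vpc-endpoint-ids $ENDPOINTS --region "$REGION"
#   else
#     echo "No VPC endpoints found in $VPC_ID; nothing to delete"
#   fi
```

Printing the list before deleting gives a chance to spot endpoints that belong to something other than GuardDuty.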
+
+> **Note:** This does not disable GuardDuty — it only removes the endpoint ENIs that were blocking deletion. GuardDuty will recreate them in any new subnets automatically.
+
+> **Prevention:** Deploying with `CreateVPCEndpoints=false` (default) avoids creating a PrivateSubnet, which is the only subnet GuardDuty consistently attaches to in this template. The CloudFormation template has been updated to skip PrivateSubnet creation when VPC endpoints are disabled.
+
+### `deploy.sh` fails: ECR repo is empty after `--skip-build`
+
+**Symptom:** AgentCore runtime creation fails with "specified image identifier does not exist."
+
+**Cause:** `--skip-build` skips the Docker build, but if this is the first deploy of a new stack, the ECR repo will be empty.
+
+**Fix:** Run without `--skip-build` on first deploy. The script builds on the gateway EC2 via SSM — no local Docker needed.
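One way to avoid this failure is to check whether the ECR repository already holds an image before passing `--skip-build`. The sketch below is a hedged illustration: `deploy_flags_for_image_count` and the `ECR_REPO` variable are hypothetical names, and `deploy.sh` itself is not known to perform this check.

```bash
#!/usr/bin/env bash
# Hypothetical pre-flight helper: given the number of images already in the
# ECR repository, print the deploy.sh flag that is safe to use.
deploy_flags_for_image_count() {
  local count="$1"
  case "$count" in
    ''|*[!0-9]*) echo "not a number: $count" >&2; return 2 ;;
  esac
  if [ "$count" -eq 0 ]; then
    echo ""               # empty repo: a full build is required
  else
    echo "--skip-build"   # an image exists, so the build can be skipped
  fi
}

# Possible usage (ECR_REPO is an assumed variable holding the repository name):
#   COUNT=$(aws ecr list-images --repository-name "$ECR_REPO" \
#     --region "$REGION" --query 'length(imageIds)' --output text)
#   bash deploy.sh $(deploy_flags_for_image_count "$COUNT")
```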
+
+### AgentCore returns HTTP 500 on every message
+
+**Cause:** Almost always a wrong `openclaw` npm package version inside the container.
+
+**Check:**
+```bash
+aws logs tail /aws/bedrock-agentcore/runtimes/<runtime-id>-DEFAULT --follow
+# Look for: "openclaw returned empty output"
+```
+
+**Fix:** Rebuild the Docker image. Both `agent-container/Dockerfile` and `exec-agent/Dockerfile` must install `openclaw@2026.3.24` exactly — do not upgrade.
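Before rebuilding, the pin can be checked in both Dockerfiles with a small script. This is a sketch under assumptions: `has_openclaw_pin` is a hypothetical helper, and it only looks for the literal `openclaw@<version>` string, since the exact install command used in the Dockerfiles is not shown here.

```bash
#!/usr/bin/env bash
# Hypothetical check: succeed only if the given Dockerfile mentions
# openclaw pinned to exactly the expected version (default 2026.3.24).
has_openclaw_pin() {
  local dockerfile="$1" version="${2:-2026.3.24}"
  # Escape the dots so they match literally, and require that the version is
  # not followed by another digit or dot (so 2026.3.240 does not match).
  grep -Eq "openclaw@${version//./\\.}([^0-9.]|\$)" "$dockerfile"
}

# Possible usage from the repository root:
#   for f in agent-container/Dockerfile exec-agent/Dockerfile; do
#     has_openclaw_pin "$f" || echo "WARNING: $f does not pin openclaw@2026.3.24"
#   done
```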
+
 ---
 
 Built by [wjiad@aws](mailto:wjiad@amazon.com) · [aws-samples](https://github.com/aws-samples) · Contributions welcome
