This project is provided as sample code for educational and reference purposes. It is NOT intended for production use without additional security hardening. See the "Production Hardening Recommendations" section below.
If you discover a security vulnerability in this project, please report it by emailing aws-security@amazon.com. Do not report security vulnerabilities through public GitHub issues.
- Amazon VPC — network isolation with public, private, and isolated subnets
- Amazon EC2 — runs Flask reverse proxy in private subnet
- Elastic Load Balancing (ALB) — internet-facing entry point
- Amazon SageMaker AI — ML domain, MLflow managed app, model registry
- Amazon S3 — MLflow artifact storage and helper script distribution
- AWS IAM — role-based access with STS AssumeRole for temporary credentials
- AWS Lambda — VPC cleanup automation during stack deletion
- Amazon CloudWatch — logging for Lambda and application monitoring
- AWS Systems Manager — EC2 instance access via Session Manager
- AWS KMS — VPC endpoint for encryption operations
- Amazon ECR — VPC endpoints for container image access
- AWS STS — temporary credential issuance for SigV4 signing
To deploy this solution, you need:
- An AWS account with permissions to create VPC, EC2, ALB, SageMaker Domain, S3, IAM roles, Lambda, and CloudWatch resources
- AWS CLI v2.34.5+ (for `create-mlflow-app` commands)
- AWS CDK v2 installed and bootstrapped
- Node.js 18.x+, Python 3.13+
| Item | Category | Rationale |
|---|---|---|
| ALB uses HTTP (port 80) without TLS | Security Debt | Sample pattern does not assume a custom domain or ACM certificate. See Production Hardening below. |
| `sagemaker-mlflow:AccessUI` uses `Resource: "*"` | Security Debt | Action does not support resource-level permissions per AWS documentation. |
| `sagemaker:ListMlflowApps` uses `Resource: "*"` | Security Debt | List operations require wildcard resource. |
| Cleanup Lambda uses `Resource: "*"` for EFS/EC2/SG actions | Security Debt | Must discover and delete resources at runtime; scoped by VPC ID in code. |
| S3 access logging not configured | Security Debt | Recommended for production audit trails. |
| CloudWatch Logs and EBS use default encryption | Security Debt | Customer-managed KMS keys recommended for production. |
| CORS wildcard on proxy responses | By Design | Required for iframe embedding; access controlled at network layer. |
Before using this code in a production environment, implement the following changes:
The ALB is configured with HTTP (port 80) only. For production:
- Request or import an SSL/TLS certificate in AWS Certificate Manager (ACM) for your domain.
- Add an HTTPS listener (port 443) on the ALB with the ACM certificate.
- Redirect HTTP (port 80) to HTTPS (port 443).
- Remove the HTTP-only listener or configure it solely as a redirect.
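The listener changes above can be sketched as plain request payloads for the Elastic Load Balancing v2 API. This is a minimal illustration, not the project's CDK code: the function names and the ARNs passed in are hypothetical, and the dictionaries are shaped for boto3's `create_listener` and `modify_listener` calls.

```python
def https_listener_kwargs(load_balancer_arn, certificate_arn, target_group_arn):
    """Arguments for elbv2 create_listener: HTTPS on 443 forwarding to the proxy."""
    return {
        "LoadBalancerArn": load_balancer_arn,
        "Protocol": "HTTPS",
        "Port": 443,
        "Certificates": [{"CertificateArn": certificate_arn}],
        "DefaultActions": [
            {"Type": "forward", "TargetGroupArn": target_group_arn}
        ],
    }


def http_redirect_action():
    """Default action that turns the port 80 listener into a permanent redirect."""
    return {
        "Type": "redirect",
        "RedirectConfig": {
            "Protocol": "HTTPS",
            "Port": "443",  # the API expects the port as a string here
            "StatusCode": "HTTP_301",
        },
    }


# Assumed usage (requires boto3 and AWS credentials, not run here):
#   elbv2 = boto3.client("elbv2")
#   elbv2.create_listener(**https_listener_kwargs(lb_arn, cert_arn, tg_arn))
#   elbv2.modify_listener(ListenerArn=http_listener_arn,
#                         DefaultActions=[http_redirect_action()])
```

Because the stack already creates the port 80 listener, modifying its default action is likely simpler than deleting and recreating it.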
Integrate ALB authentication with Amazon Cognito or your OIDC-compatible identity provider to require user authentication before accessing the MLflow UI. Configure an ALB authentication action rule on the HTTPS listener.
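As a rough sketch of the authentication rule, the default actions on the HTTPS listener could be built as follows. The function name and all argument values are hypothetical; the dictionary shape follows the `authenticate-cognito` action type of the Elastic Load Balancing v2 API, and an OIDC provider would use `authenticate-oidc` with a different config block.

```python
def cognito_auth_actions(user_pool_arn, user_pool_client_id,
                         user_pool_domain, target_group_arn):
    """Ordered listener actions: authenticate with Cognito, then forward."""
    return [
        {
            "Type": "authenticate-cognito",
            "Order": 1,
            "AuthenticateCognitoConfig": {
                "UserPoolArn": user_pool_arn,
                "UserPoolClientId": user_pool_client_id,
                "UserPoolDomain": user_pool_domain,
                # Redirect unauthenticated users to the sign-in page.
                "OnUnauthenticatedRequest": "authenticate",
            },
        },
        {
            # Runs only after authentication succeeds.
            "Type": "forward",
            "Order": 2,
            "TargetGroupArn": target_group_arn,
        },
    ]
```

ALB authentication actions are only supported on HTTPS listeners, so this change depends on the TLS work described earlier.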
Upgrade both S3 buckets to use AWS KMS customer-managed keys (CMK) instead of S3-managed encryption for enhanced key management and audit capabilities.
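One way to express this change outside of CDK is the request payload for S3's `put_bucket_encryption` operation. The sketch below is illustrative only: the function name, bucket name, and key ARN are placeholders, not values from this project.

```python
def kms_encryption_kwargs(bucket_name, kms_key_arn):
    """Arguments for s3 put_bucket_encryption using a customer-managed key."""
    return {
        "Bucket": bucket_name,
        "ServerSideEncryptionConfiguration": {
            "Rules": [
                {
                    "ApplyServerSideEncryptionByDefault": {
                        "SSEAlgorithm": "aws:kms",
                        "KMSMasterKeyID": kms_key_arn,
                    },
                    # S3 Bucket Keys reduce the number of KMS requests.
                    "BucketKeyEnabled": True,
                }
            ]
        },
    }


# Assumed usage (requires boto3 and AWS credentials, not run here):
#   boto3.client("s3").put_bucket_encryption(
#       **kms_encryption_kwargs("my-artifacts-bucket", key_arn))
```

Default encryption applies to new objects only; existing objects keep the encryption they were written with.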
Enable server access logging on both the MLflow artifacts bucket and the helper scripts bucket. Direct logs to a dedicated audit bucket with appropriate lifecycle policies.
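A minimal sketch of the logging configuration, shaped for S3's `put_bucket_logging` operation. The function name, the bucket names, and the per-bucket prefix scheme are assumptions made for illustration.

```python
def access_logging_kwargs(source_bucket, audit_bucket):
    """Arguments for s3 put_bucket_logging that send logs to an audit bucket."""
    return {
        "Bucket": source_bucket,
        "BucketLoggingStatus": {
            "LoggingEnabled": {
                "TargetBucket": audit_bucket,
                # One prefix per source bucket keeps the two log streams apart.
                "TargetPrefix": f"s3-access/{source_bucket}/",
            }
        },
    }


# Assumed usage for both buckets (requires boto3, not run here):
#   s3 = boto3.client("s3")
#   for name in (artifacts_bucket, scripts_bucket):
#       s3.put_bucket_logging(**access_logging_kwargs(name, audit_bucket))
```

The audit bucket also needs a bucket policy that allows the S3 logging service principal to write to it.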
Configure CloudWatch Log Groups with KMS customer-managed keys for log encryption at rest.
Enable EBS encryption by default in the account or specify encrypted volumes for the EC2 instance using a customer-managed KMS key.
Enable VPC Flow Logs for network traffic monitoring and anomaly detection. Send logs to CloudWatch Logs or S3 for analysis.
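For illustration, a flow log that delivers to S3 could be requested with a payload like the one below, shaped for the EC2 `create_flow_logs` operation. The function name, VPC ID, and destination ARN are hypothetical; delivery to CloudWatch Logs uses `LogDestinationType="cloud-watch-logs"` together with a log group and an IAM role instead.

```python
def vpc_flow_log_kwargs(vpc_id, s3_destination_arn):
    """Arguments for ec2 create_flow_logs capturing all traffic for one VPC."""
    return {
        "ResourceIds": [vpc_id],
        "ResourceType": "VPC",
        "TrafficType": "ALL",  # accepted and rejected traffic
        "LogDestinationType": "s3",
        "LogDestination": s3_destination_arn,
    }


# Assumed usage (requires boto3 and AWS credentials, not run here):
#   boto3.client("ec2").create_flow_logs(
#       **vpc_flow_log_kwargs(vpc_id, "arn:aws:s3:::my-audit-bucket/vpc-flow/"))
```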
Enable ALB access logging to S3 for request auditing and troubleshooting.
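ALB access logging is controlled by load balancer attributes. The sketch below builds the payload for the Elastic Load Balancing v2 `modify_load_balancer_attributes` operation; the function name, bucket name, and default prefix are placeholders.

```python
def alb_access_log_kwargs(load_balancer_arn, log_bucket, prefix="alb"):
    """Arguments for elbv2 modify_load_balancer_attributes enabling access logs."""
    return {
        "LoadBalancerArn": load_balancer_arn,
        "Attributes": [
            # Attribute values are always strings in this API.
            {"Key": "access_logs.s3.enabled", "Value": "true"},
            {"Key": "access_logs.s3.bucket", "Value": log_bucket},
            {"Key": "access_logs.s3.prefix", "Value": prefix},
        ],
    }
```

The target bucket must grant write access to Elastic Load Balancing for the region before the attribute change succeeds.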
Consider adding AWS WAF in front of the ALB for additional request filtering, rate limiting, and protection against common web exploits.
Replace `sagemaker-mlflow:*` with the specific actions the Flask proxy needs (e.g., `sagemaker-mlflow:GetExperiment`, `sagemaker-mlflow:SearchRuns`, `sagemaker-mlflow:AccessUI`).
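A scoped policy along these lines might look like the sketch below. It is an assumption-laden example: the function name and the resource ARN argument are hypothetical, the action list is only the three examples named above, and whether each action accepts a resource ARN should be checked against the Service Authorization Reference. `sagemaker-mlflow:AccessUI` is kept on `Resource: "*"` because it does not support resource-level permissions.

```python
import json


def proxy_policy(scoped_actions, resource_arn):
    """Build an IAM policy document without service-wide wildcard actions."""
    for action in scoped_actions:
        if action.endswith("*"):
            raise ValueError(f"wildcard action not allowed: {action}")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "MlflowScopedActions",
                "Effect": "Allow",
                "Action": sorted(scoped_actions),
                "Resource": resource_arn,  # hypothetical ARN supplied by caller
            },
            {
                "Sid": "MlflowAccessUi",
                "Effect": "Allow",
                "Action": "sagemaker-mlflow:AccessUI",
                "Resource": "*",
            },
        ],
    }


policy = proxy_policy(
    ["sagemaker-mlflow:SearchRuns", "sagemaker-mlflow:GetExperiment"],
    "arn:aws:sagemaker:REGION:ACCOUNT_ID:EXAMPLE-RESOURCE",  # placeholder
)
policy_json = json.dumps(policy, indent=2)
```

Rejecting wildcard actions in the builder makes it harder for `sagemaker-mlflow:*` to creep back in as the proxy grows.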
If adding API keys or database credentials in future iterations, use AWS Secrets Manager or SSM Parameter Store instead of environment variables.
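A small sketch of reading a JSON secret at startup rather than from an environment variable. The helper name and the secret ID are hypothetical; the client is passed in so the function works with a real `boto3.client("secretsmanager")` or with a stub in tests, and it assumes the secret is stored as a JSON string rather than binary.

```python
import json


def load_secret(client, secret_id):
    """Fetch a secret via get_secret_value and parse its JSON payload."""
    response = client.get_secret_value(SecretId=secret_id)
    # Binary secrets use "SecretBinary" instead; this sketch handles text only.
    return json.loads(response["SecretString"])


class StubSecretsClient:
    """Stand-in for the Secrets Manager client, for local experimentation."""

    def get_secret_value(self, SecretId):
        return {"SecretString": json.dumps({"api_key": "example", "id": SecretId})}


secret = load_secret(StubSecretsClient(), "example/proxy/credentials")
```

In the Flask proxy this would typically run once at startup, with the EC2 instance role granted `secretsmanager:GetSecretValue` on that one secret.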
To remove all resources deployed by this project:
- Run `bash cleanup.sh`. This destroys all CDK stacks and deletes the MLflow App.
- The MLflow artifacts S3 bucket has a RETAIN policy. Manually empty and delete it from the S3 console if no longer needed.
- Check for any remaining CloudWatch Log Groups created by the cleanup Lambda.
| Dependency | Version | Notes |
|---|---|---|
| Flask | 3.1.3 | Web framework for reverse proxy |
| boto3 | 1.39.3 | AWS SDK for Python — STS, SigV4 |
| requests | 2.32.4 | HTTP client for upstream MLflow calls |
| gunicorn | 23.0.0 | WSGI server for production Flask |
| react | 18.2.0 | Frontend UI framework |
| react-dom | 18.2.0 | React DOM rendering |
| react-scripts | 5.0.1 | Create React App build tooling |
| aws-cdk-lib | 2.243.0 | AWS CDK infrastructure library |
| cdk-nag | 2.27.0 | CDK security and best practices checks |