This project is a custom React web application that embeds the Amazon SageMaker managed MLflow tracking UI in an iframe. A Flask reverse proxy running on Amazon EC2 signs every request to the SageMaker MLflow endpoint with AWS Signature Version 4 (SigV4), giving transparent access to both the MLflow UI and the REST APIs.
The entire infrastructure is provisioned using AWS CDK in TypeScript, deploying four stacks: networking (VPC), SageMaker domain, MLflow resources (IAM role and S3 bucket), and the Flask application with an Application Load Balancer (ALB). The serverless MLflow App is created via the AWS CLI as part of an automated deployment script.
- AWS account with sufficient IAM permissions
- AWS CLI v2.34.5 or later (required for `create-mlflow-app` commands)
- AWS CDK v2 installed and bootstrapped (`cdk bootstrap`)
- Node.js 18.x or later
- Python 3.13 or later (locally, for deployment script JSON parsing)
├── bin/app.ts # CDK app entry point
├── lib/
│ ├── networking-stack.ts # VPC, subnets, NAT gateway, VPC endpoints
│ ├── sagemaker-domain-stack.ts # SageMaker domain and execution role
│ ├── managed-mlflow-stack.ts # MLflow IAM role and S3 artifacts bucket
│ └── flask-app-stack.ts # EC2, ALB, IAM roles, S3 helper upload
├── helpers/
│ ├── app/
│ │ ├── main.py # Flask reverse proxy with SigV4 signing
│ │ ├── aws_utils.py # SigV4 signing utilities
│ │ └── requirements.txt # Python dependencies
│ ├── frontend/
│ │ ├── src/App.js # React app with MLflow iframe
│ │ ├── build/ # Pre-built React static files
│ │ └── package.json # React dependencies
│ ├── install_python13.sh # Python 3.13 installer for EC2
│ ├── setup_mlflow_proxy_app.sh # Flask app setup script for EC2
│ └── mlflowproxy.service # systemd service definition
├── deploy.sh # Automated deployment script
├── cleanup.sh # Automated teardown script
npm install

cd helpers/frontend
npm install
npm run build
cd ../..

export CDK_DEFAULT_ACCOUNT=<your-aws-account-id>
export CDK_DEFAULT_REGION=<your-aws-region>

If you want to deploy to a region other than us-east-1, also set these to override any default region in ~/.aws/config:

export AWS_DEFAULT_REGION=<your-aws-region>
export AWS_REGION=<your-aws-region>

Note: If you previously deployed to a different region, delete the cached context file before redeploying:

rm cdk.context.json

npx cdk bootstrap aws://<your-aws-account-id>/<your-aws-region>

bash deploy.sh

This single script:
- Deploys the networking, SageMaker domain, and MLflow resources stacks via CDK
- Creates the serverless MLflow App via `aws sagemaker create-mlflow-app`
- Deploys the Flask App stack with the MLflow App ARN passed as CDK context
Note the ALB URL from the output.
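The hand-off between the last two steps, reading the App ARN from the CLI's JSON output and passing it to CDK as context, could look roughly like the sketch below. It is illustrative and not taken from deploy.sh: the `Arn` response key, the `mlflowAppArn` context key, the `FlaskAppStack` stack name, and the sample ARN are all assumptions.

```python
import json


def extract_app_arn(cli_output: str, key: str = "Arn") -> str:
    """Pull the MLflow App ARN out of the JSON printed by the AWS CLI.

    The "Arn" key is an assumption; check the actual CLI response.
    """
    payload = json.loads(cli_output)
    arn = payload.get(key)
    if not isinstance(arn, str) or not arn.startswith("arn:"):
        raise ValueError(f"no ARN found under key {key!r}")
    return arn


def cdk_deploy_args(arn: str, stack: str = "FlaskAppStack",
                    context_key: str = "mlflowAppArn") -> list[str]:
    """Build a `cdk deploy` command line that passes the ARN as context.

    `-c key=value` is the CDK CLI context flag; the stack name and
    context key are hypothetical placeholders.
    """
    return ["npx", "cdk", "deploy", stack, "-c", f"{context_key}={arn}"]


# Example with a made-up CLI response.
sample = '{"Arn": "arn:aws:sagemaker:us-east-1:111122223333:mlflow-app/example"}'
print(" ".join(cdk_deploy_args(extract_app_arn(sample))))
```

Failing loudly when the key is missing keeps a malformed CLI response from silently deploying the Flask stack with an empty ARN.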
Connect to the EC2 instance via AWS Systems Manager Session Manager, then run:
sudo bash /root/install_python13.sh
sudo bash /root/setup_mlflow_proxy_app.sh

Open the ALB URL in your browser. You should be redirected to /app, showing the React dashboard with the MLflow UI embedded in an iframe.
Test the health endpoint:
curl http://<ALB-URL>/health

Create an experiment through the proxy:
curl -X POST http://<ALB-URL>/api/2.0/mlflow/experiments/create \
-H "Content-Type: application/json" \
-d '{"name": "my-first-experiment"}'

bash cleanup.sh

This destroys all CDK stacks in the correct order and deletes the serverless MLflow App. The MLflow artifacts S3 bucket has a RETAIN policy and must be manually deleted if no longer needed.
- SigV4 signing: The Flask proxy signs requests with service name `sagemaker` and includes the `x-sm-mlflow-app-arn` header
- MLflow endpoint: `https://mlflow.sagemaker.<region>.app.aws`
- IAM: Least-privilege roles; the EC2 instance role assumes a dedicated `FlaskMlflowRole` for MLflow API access
- Iframe embedding: The proxy strips `X-Frame-Options` and gzip-related headers from upstream responses
- VPC endpoints: Included for SageMaker API, STS, S3, CloudWatch, ECR, and KMS
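To make the SigV4 item concrete, here is a minimal standard-library sketch of how a request could be signed with service name `sagemaker`. It is not the code in aws_utils.py (which may well rely on botocore instead). The function name `sigv4_headers`, the credentials, the sample ARN, and the URL path are placeholders, and whether the real proxy includes `x-sm-mlflow-app-arn` among the signed headers is an assumption.

```python
import datetime
import hashlib
import hmac
from urllib.parse import parse_qsl, quote, urlsplit


def _hmac(key: bytes, msg: str) -> bytes:
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()


def sigv4_headers(method, url, body, headers, access_key, secret_key,
                  region, service="sagemaker", session_token=None, now=None):
    """Return `headers` plus Host, X-Amz-Date and Authorization (body is bytes)."""
    now = now or datetime.datetime.now(datetime.timezone.utc)  # must be UTC
    amz_date = now.strftime("%Y%m%dT%H%M%SZ")
    date_stamp = now.strftime("%Y%m%d")
    parts = urlsplit(url)

    out = dict(headers)
    out["Host"] = parts.netloc
    out["X-Amz-Date"] = amz_date
    if session_token:  # temporary credentials, e.g. from an assumed role
        out["X-Amz-Security-Token"] = session_token

    # Canonical headers: lowercase names, collapsed whitespace, sorted by name.
    canon = {k.lower(): " ".join(v.split()) for k, v in out.items()}
    signed_headers = ";".join(sorted(canon))
    canonical_headers = "".join(f"{k}:{canon[k]}\n" for k in sorted(canon))

    # Canonical query string: URL-encoded pairs, sorted.
    pairs = sorted((quote(k, safe="-_.~"), quote(v, safe="-_.~"))
                   for k, v in parse_qsl(parts.query, keep_blank_values=True))
    canonical_query = "&".join(f"{k}={v}" for k, v in pairs)

    canonical_request = "\n".join([
        method.upper(),
        quote(parts.path or "/", safe="/-_.~"),
        canonical_query,
        canonical_headers,
        signed_headers,
        hashlib.sha256(body).hexdigest(),
    ])

    # The service name enters both the credential scope and the signing key,
    # so a wrong service name yields a signature the endpoint rejects.
    scope = f"{date_stamp}/{region}/{service}/aws4_request"
    string_to_sign = "\n".join([
        "AWS4-HMAC-SHA256", amz_date, scope,
        hashlib.sha256(canonical_request.encode("utf-8")).hexdigest(),
    ])
    key = ("AWS4" + secret_key).encode("utf-8")
    for part in (date_stamp, region, service, "aws4_request"):
        key = _hmac(key, part)
    signature = hmac.new(key, string_to_sign.encode("utf-8"),
                         hashlib.sha256).hexdigest()

    out["Authorization"] = (
        f"AWS4-HMAC-SHA256 Credential={access_key}/{scope}, "
        f"SignedHeaders={signed_headers}, Signature={signature}"
    )
    return out


# Placeholder values only; none of these are real credentials or ARNs.
signed = sigv4_headers(
    "POST",
    "https://mlflow.sagemaker.us-east-1.app.aws/api/2.0/mlflow/experiments/create",
    b'{"name": "my-first-experiment"}',
    {"Content-Type": "application/json",
     "x-sm-mlflow-app-arn": "arn:aws:sagemaker:us-east-1:111122223333:mlflow-app/example"},
    access_key="AKIDEXAMPLE", secret_key="not-a-real-secret", region="us-east-1",
)
print(signed["Authorization"])
```

In practice a maintained signer such as botocore's is usually preferable to hand-rolled signing; the sketch is mainly useful for seeing where the service name and the extra header enter the signature.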
| Issue | Solution |
|---|---|
| deploy.sh fails at MLflow App creation | Ensure AWS CLI v2.34.5+ is installed (aws --version) |
| ALB returns 502 Bad Gateway | Check Flask service: systemctl status mlflowproxy on the EC2 instance |
| MLflow UI shows blank page | Verify gzip headers are stripped in main.py |
| MLflow REST API returns 403 | Check SigV4 service name is sagemaker in aws_utils.py |
| Stack deletion fails | Check CloudWatch logs for the VPC cleanup Lambda function |
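For the blank-page row, one plausible cause is a proxy whose HTTP client has already decompressed the upstream body while the original `Content-Encoding: gzip` header is still forwarded, so the browser tries to decompress plain content. The sketch below shows the kind of response-header filtering involved. The exact set of headers that main.py drops is not shown here, so `DROPPED_RESPONSE_HEADERS` and `filter_response_headers` are assumed names and an assumed list.

```python
# Response headers the proxy does not forward to the browser (assumed set).
DROPPED_RESPONSE_HEADERS = frozenset({
    "x-frame-options",    # would block rendering inside the iframe
    "content-encoding",   # body may already be decompressed by the HTTP client
    "content-length",     # no longer matches a decompressed body
    "transfer-encoding",  # hop-by-hop header, set again by the proxy's server
})


def filter_response_headers(upstream_headers):
    """Return the (name, value) pairs to forward, preserving their order."""
    return [(name, value) for name, value in upstream_headers
            if name.lower() not in DROPPED_RESPONSE_HEADERS]


# Example upstream response headers (made up for illustration).
upstream = [
    ("Content-Type", "text/html"),
    ("Content-Encoding", "gzip"),
    ("X-Frame-Options", "DENY"),
]
print(filter_response_headers(upstream))
```

Comparing names in lowercase matters because HTTP header names are case-insensitive and upstream casing can vary.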
- The ALB is deployed with HTTP (port 80). For production, add HTTPS with an SSL/TLS certificate.
- No authentication is configured on the ALB. For production, integrate with your SSO provider using ALB authentication with Amazon Cognito or OIDC.
- The EC2 instance runs in a private subnet with no direct internet access.
This project is licensed under the MIT-0 License. See the LICENSE file for details.
This project was developed and maintained by:
- Manish Garg
- Ashish Bhatt
- Ram Yennapusa
