A production-ready API gateway that compresses LLM prompts and enforces token policies before execution.
LLM prompts are getting longer and more expensive. This gateway helps by:
- 💰 Cost Reduction - Smart compression reduces token usage
- 🛡️ Policy Enforcement - Token limits before API calls
- ⚡ Fast Processing - Efficient compression with LLMLingua-2
- Intelligent prompt compression
- Configurable token limits
- REST API with FastAPI
- Docker support
- Comprehensive tests
- Clear documentation
# Clone the repo
git clone https://github.com/kelpejol/prompt-compression-gateway.git
cd prompt-compression-gateway
# Install dependencies
pip install -r requirements.txt
# Run the server
uvicorn gateway.main:app --reload
# Or start with Docker Compose
docker-compose up -d
# Send a compression request
curl -X POST http://localhost:8000/compress \
-H "Content-Type: application/json" \
-d '{
"prompt": "You are an AI assistant helping with code review.",
"max_tokens": 512,
"compression_ratio": 0.5
}'
Compress a prompt with policy enforcement.
Request:
{
"prompt": "string",
"max_tokens": 2048,
"compression_ratio": 0.5
}
Response:
{
"original_tokens": 150,
"compressed_tokens": 75,
"compressed_prompt": "compressed text here"
}
Interactive Docs: Visit http://localhost:8000/docs after starting the server.
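For callers working from Python rather than curl, the same request might be issued as shown here. This is a minimal sketch using only the standard library. The helper names `build_compress_request`, `compress`, and `token_savings` are illustrative and not part of the gateway, and the base URL assumes the server is running locally on port 8000.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # assumption: local server on the default port


def build_compress_request(prompt, max_tokens=2048, compression_ratio=0.5,
                           base_url=BASE_URL):
    """Build a POST /compress request carrying the documented JSON body."""
    body = json.dumps({
        "prompt": prompt,
        "max_tokens": max_tokens,
        "compression_ratio": compression_ratio,
    }).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/compress",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def compress(prompt, **kwargs):
    """Send the request and return the decoded JSON response.

    Requires a running gateway; raises urllib.error.URLError otherwise.
    """
    with urllib.request.urlopen(build_compress_request(prompt, **kwargs)) as resp:
        return json.load(resp)


def token_savings(response):
    """Fraction of tokens removed, computed from the documented response fields."""
    original = response["original_tokens"]
    if original == 0:
        return 0.0
    return 1 - response["compressed_tokens"] / original
```

With the server running, `token_savings(compress("some long prompt", max_tokens=512))` would report how much of the prompt was removed.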
Client Request
↓
Token Policy Check
↓
LLMLingua Compression
↓
Compressed Output
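The flow above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the gateway's actual code: whitespace splitting stands in for a real tokenizer, `naive_compress` is a placeholder for LLMLingua, and the policy check is assumed to reject prompts that exceed `max_tokens` before any compression runs. All function and class names here are hypothetical.

```python
class TokenLimitExceeded(Exception):
    """Raised when a prompt violates the token policy."""


def count_tokens(text):
    # Assumption: whitespace splitting stands in for a real tokenizer.
    return len(text.split())


def naive_compress(text, ratio):
    # Placeholder for LLMLingua: keeps the leading `ratio` share of words.
    # A real compressor selects tokens by importance, not by position.
    words = text.split()
    keep = max(1, int(len(words) * ratio)) if words else 0
    return " ".join(words[:keep])


def run_pipeline(prompt, max_tokens=2048, compression_ratio=0.5,
                 compressor=naive_compress):
    original = count_tokens(prompt)
    # Step 1: token policy check (assumed to reject oversized prompts).
    if original > max_tokens:
        raise TokenLimitExceeded(
            f"{original} tokens exceeds limit of {max_tokens}"
        )
    # Step 2: compression with the pluggable compressor.
    compressed = compressor(prompt, compression_ratio)
    # Step 3: output in the documented response shape.
    return {
        "original_tokens": original,
        "compressed_tokens": count_tokens(compressed),
        "compressed_prompt": compressed,
    }
```

Passing the compressor as a parameter keeps the policy check testable without loading a model.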
# Run tests
pytest
# With coverage
pytest --cov=gateway
# Specific test file
pytest tests/test_api.py
# Install dev dependencies
pip install -r requirements-dev.txt
# Format code
black gateway/ tests/
# Run linter
ruff check gateway/
# Build and run the Docker image
docker build -t prompt-gateway .
docker run -p 8000:8000 prompt-gateway
# Configuration via environment variables
HOST=0.0.0.0
PORT=8000
MAX_TOKENS_DEFAULT=2048
COMPRESSION_RATIO_DEFAULT=0.5
See .env.example for all options.
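To illustrate how these variables could be consumed, here is a small standard-library sketch. It assumes the four values listed are the fallbacks used when a variable is unset, and that a compression ratio outside (0, 1] counts as a configuration error. The `Settings` class and `load_settings` function are hypothetical names, and the gateway itself may load configuration differently.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    host: str
    port: int
    max_tokens_default: int
    compression_ratio_default: float


def load_settings(env=None):
    """Read the listed variables, falling back to the listed values."""
    env = os.environ if env is None else env
    settings = Settings(
        host=env.get("HOST", "0.0.0.0"),
        port=int(env.get("PORT", "8000")),
        max_tokens_default=int(env.get("MAX_TOKENS_DEFAULT", "2048")),
        compression_ratio_default=float(
            env.get("COMPRESSION_RATIO_DEFAULT", "0.5")
        ),
    )
    # Assumption: ratios outside (0, 1] are rejected at startup.
    if not 0 < settings.compression_ratio_default <= 1:
        raise ValueError("COMPRESSION_RATIO_DEFAULT must be in (0, 1]")
    return settings
```

Accepting a plain mapping in place of `os.environ` makes the loader easy to exercise in tests.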
Contributions are welcome! Please check out CONTRIBUTING.md for guidelines.
- Fork the repo
- Create a feature branch (git checkout -b feature/amazing)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing)
- Open a Pull Request
MIT License - see LICENSE file for details.