
Add MLflow AI Gateway LLM integration example#21239

Open
PattaraS wants to merge 4 commits into run-llama:main from PattaraS:add-mlflow-gateway-integration

Conversation

@PattaraS

@PattaraS PattaraS commented Mar 31, 2026

Description

Adds MLflow AI Gateway integration documentation — a Jupyter notebook showing how to use MLflow AI Gateway as an LLM backend in LlamaIndex via OpenAILike, plus screenshots of the gateway UI and an entry in the integrations list.
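Because the gateway exposes an OpenAI-compatible API, the wire format behind `OpenAILike` is an ordinary chat-completions request. A minimal standard-library sketch of building (not sending) such a request — the host, port, URL path, and endpoint name `chat` here are illustrative assumptions, not values from this PR:

```python
import json
import urllib.request

# Assumed local MLflow server address and gateway endpoint name;
# adjust both to match your own deployment.
GATEWAY_BASE = "http://localhost:5000"
ENDPOINT = "chat"


def build_chat_request(prompt: str) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-style chat-completions request."""
    payload = {
        "model": ENDPOINT,  # the gateway endpoint name plays the "model" role
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{GATEWAY_BASE}/v1/chat/completions",  # assumed OpenAI-compatible path
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_chat_request("Hello from LlamaIndex")
```

Pointing `OpenAILike` (or the OpenAI SDK) at the same base URL produces an equivalent request under the hood.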

New Package?

  • Yes
  • No

Version Bump?

  • Yes
  • No (docs only)

Type of Change

  • This change requires a documentation update

How Has This Been Tested?

  • I added new unit tests to cover this change
  • I believe this change is already covered by existing unit tests

Notebook code cells were executed against a running MLflow server with a gateway endpoint configured.

Suggested Checklist:

  • I have performed a self-review of my own code
  • I have made corresponding changes to the documentation
  • I have added Google Colab support for the newly added notebooks.
  • My changes generate no new warnings

AI Disclosure

This pull request was AI-assisted by Claude. All content was reviewed and validated by a human contributor. Screenshots were captured manually from a running MLflow server.

@review-notebook-app

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.



@dosubot dosubot Bot added the size:L This PR changes 100-499 lines, ignoring generated files. label Mar 31, 2026
@PattaraS PattaraS marked this pull request as draft March 31, 2026 15:53
Comment thread docs/examples/llm/mlflow_gateway.ipynb Outdated
"- **Traffic splitting** — route a percentage of requests to different models (A/B testing)\n",
"- **OpenAI-compatible API** — works with any OpenAI SDK client\n",
"\n",
"The gateway is **database-backed** and configured through the MLflow UI — no YAML files required. It runs as part of `mlflow server` (MLflow ≥ 3.0)."

I'd remove this line. Users don't need to know "— no YAML files required"

"metadata": {},
"outputs": [],
"source": [
"%pip install llama-index-llms-openai-like"

Does this work in Jupyter notebooks?

Author


Yes, it works.

Comment thread docs/examples/llm/mlflow_gateway.ipynb Outdated
"\n",
"Before making LLM requests, you need to create a **gateway endpoint** — a named route that maps to a specific LLM provider and model. You can do this via the UI, the Python client, or the REST API.\n",
"\n",
"### Option 1: MLflow UI\n",

Can we add documentation link and add a screenshot?

Author


Done.

Comment thread docs/examples/llm/mlflow_gateway.ipynb Outdated
"6. Enter your provider API key — it is stored encrypted on the server\n",
"7. Click **Save**\n",
"\n",
"### Option 2: Python Client\n",

Do we think users will use option 2 and 3?

Author


Nope, removed.

Comment thread docs/examples/llm/mlflow_gateway.ipynb Outdated
"\n",
"Route a percentage of requests to different models for A/B testing. For example, 90% to `gpt-4o-mini` and 10% to `gpt-4o`. Configure via **AI Gateway → Edit Endpoint → Routing Strategy**.\n",
"\n",
"### Budget Tracking\n",

Let's add a screenshot?

Author


Done.
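The 90/10 split discussed in this thread is enforced inside the gateway itself, but the weighting idea is easy to illustrate client-side. A sketch using weighted random choice — only the two model names come from the notebook cell; everything else is illustrative:

```python
import random

# Illustrative only: real routing happens inside the MLflow gateway.
# Weights mirror the 90/10 A/B-testing example from the notebook.
ROUTES = {"gpt-4o-mini": 0.9, "gpt-4o": 0.1}


def pick_route(rng: random.Random) -> str:
    """Choose a model according to the configured traffic weights."""
    models = list(ROUTES)
    weights = list(ROUTES.values())
    return rng.choices(models, weights=weights, k=1)[0]


rng = random.Random(0)  # seeded for reproducibility
picks = [pick_route(rng) for _ in range(1000)]
share_mini = picks.count("gpt-4o-mini") / len(picks)
```

Over many requests the observed share converges on the configured 90% weight.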

Comment thread docs/examples/llm/mlflow_gateway.ipynb Outdated
"\n",
"Set token or cost budgets per endpoint or per user. When the budget is exhausted, the gateway returns an error. Configure via **AI Gateway → Budgets**.\n",
"\n",
"### Usage Tracing\n",

Let's also add screenshots, I think it's interesting to see what information is logged in the tracing and how cost is tracked when users use Llama index agents.

Author


Done.
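Per the quoted cell, the gateway returns an error once a budget is exhausted; a client might surface that as a typed exception. A sketch — the 429 status code is an assumption for illustration, not confirmed by this PR; check the MLflow docs for the actual code:

```python
class BudgetExhaustedError(RuntimeError):
    """Raised when the gateway reports the endpoint's budget is spent."""


def raise_for_budget(status: int, body: str) -> None:
    # 429 is assumed here; the gateway may use a different status code.
    if status == 429:
        raise BudgetExhaustedError(f"budget exhausted: {body}")


# A successful status passes through untouched.
raise_for_budget(200, "ok")

# A budget rejection is converted into the typed exception.
budget_hit = False
try:
    raise_for_budget(429, "token budget exceeded")
except BudgetExhaustedError:
    budget_hit = True
```

Typed errors let agent code distinguish a spent budget from transient provider failures and back off accordingly.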

Adds a Jupyter notebook demonstrating how to use MLflow AI Gateway as an
LLM backend in LlamaIndex via the OpenAI-compatible endpoint. Includes
setup instructions (UI, Python client, curl), chat/streaming/complete
examples using OpenAILike, and an mlflow.deployments client alternative.
Also adds MLflow Gateway to the available LLM integrations list.

- Remove "no YAML files required" from intro
- Add documentation link to MLflow AI Gateway docs
- Simplify endpoint creation section to UI-only (remove Python/curl options)
- Add TODO screenshot placeholders for Create Endpoint UI, Budgets UI,
  and Traces UI with LlamaIndex agent traces

The mlflow.deployments.get_deploy_client API is deprecated. Replace the
"Alternative" section with the current recommended approaches: OpenAI SDK
with base_url pointing to the gateway, and plain HTTP requests using the
MLflow Invocations API.

Adds four screenshots to the MLflow Gateway integration notebook:

- Create Endpoint UI form
- Budget Policy creation dialog
- Usage dashboard (requests, latency, errors)
- Usage trace list and trace detail view
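Whichever replacement client the commits settle on (OpenAI SDK or plain HTTP), the response body follows the OpenAI chat-completions schema, so extracting the reply and token usage looks the same. A standard-library sketch against a canned response — the field names follow the OpenAI schema, but the content values are made up for illustration:

```python
import json

# A canned OpenAI-compatible response body, shaped like what the
# gateway would return; the values are fabricated for this sketch.
raw = json.dumps({
    "choices": [{"message": {"role": "assistant", "content": "Hi there!"}}],
    "usage": {"prompt_tokens": 5, "completion_tokens": 3, "total_tokens": 8},
})


def extract_reply(body: str) -> tuple[str, int]:
    """Pull the assistant text and total token count out of a response."""
    data = json.loads(body)
    return (
        data["choices"][0]["message"]["content"],
        data["usage"]["total_tokens"],
    )


reply, total_tokens = extract_reply(raw)
```

The `usage` block is also what the gateway's budget tracking and usage dashboard aggregate per request.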
@PattaraS PattaraS force-pushed the add-mlflow-gateway-integration branch from 28ae010 to 8a4ba3f Compare April 21, 2026 04:47
@dosubot dosubot Bot added size:M This PR changes 30-99 lines, ignoring generated files. and removed size:L This PR changes 100-499 lines, ignoring generated files. labels Apr 21, 2026
@PattaraS
Author

PattaraS commented Apr 21, 2026

Hi @logan-markewich, friendly ping on this PR. We're from the MLflow team and working on adding MLflow AI Gateway integration guides across major agent frameworks. MLflow AI Gateway (MLflow ≥ 3.0) is an open-source LLM proxy with built-in secrets management, fallback/retry, traffic splitting, and usage tracing — all configured through a UI. Since it exposes an OpenAI-compatible API, it works with LlamaIndex's OpenAILike out of the box. This PR adds a notebook demonstrating that, along with screenshots of the gateway UI.
Happy to make any further changes if needed. Thanks!
