refactor: mount data blueprint via WSGI and adopt Pydantic in engine blueprint by koladefaj · Pull Request #1179 · inclusionAI/AReaL

koladefaj · 2026-04-13T22:36:57Z

Description

This PR refactors the experimental inference service and the core infrastructure blueprints to eliminate code duplication and standardize request validation.

Key Changes:

WSGI Mounting: Mounted the legacy Flask data_bp blueprint into the FastAPI data proxy app using WSGIMiddleware. This removes ~60 lines of duplicate endpoint logic in areal/experimental/inference_service/data_proxy/app.py.
Pydantic Migration (Engine BP): Extended the Pydantic pattern to areal/infra/rpc/guard/engine_blueprint.py. Added SetEnvRequest, CreateEngineRequest, and CallEngineRequest models to replace manual JSON parsing and dictionary lookups, improving type safety and error handling.
Consistency: Updated comments in the data proxy to clarify the new mounted architecture and the source of truth for RTensor storage logic.

Related Issue

Fixes #1106 (Follow-up to #1154)

Type of Change

Checklist

I have read the Contributing Guide
Pre-commit hooks pass (pre-commit run --all-files)
Relevant tests pass; new tests added for new functionality
Documentation updated (if applicable; built with ./docs/build_all.sh)
Branch is up to date with main
Self-reviewed via /review-pr command
This PR was created by a coding agent via /create-pr
This PR is a breaking change

Breaking Change Details (if applicable): N/A

Additional Context

Note on Documentation: Because the /data/ endpoints are now handled by the WSGI middleware layer, they will no longer appear in the auto-generated FastAPI Swagger UI (/docs). This trade-off was chosen to ensure a single source of truth for infrastructure logic.

gemini-code-assist

Code Review

This pull request refactors the InferenceDataProxy by mounting a centralized Flask blueprint for RTensor data storage endpoints, effectively removing duplicate logic. Additionally, it introduces Pydantic models in engine_blueprint.py to standardize request validation for the engine API. A critical typo was found in the set_env endpoint where an incomplete attribute name is used, which will cause a runtime error.

areal/infra/rpc/guard/engine_blueprint.py

koladefaj · 2026-04-13T22:51:22Z

Hi @garrett4wade, I've mounted the data_bp into the app using WSGIMiddleware as suggested. The PR is ready for your review now.

garrett4wade · 2026-04-14T03:18:50Z

Review Findings

1. Empty-string validation regressed on engine RPC inputs

File: areal/infra/rpc/guard/engine_blueprint.py (Pydantic model definitions)
Severity: Medium

The old code used if not engine:, if not engine_name:, and if not method_name: to reject empty strings with clear 400 errors. The new Pydantic models use plain str, which accepts empty strings:

class CreateEngineRequest(BaseModel):
    engine: str              # accepts ""
    engine_name: str         # accepts ""

class CallEngineRequest(BaseModel):
    method: str              # accepts ""
    engine_name: str         # accepts ""

Consequences:

engine="" → reaches import_from_string("") → confusing ValueError/ImportError instead of a clean 400
engine_name="" → creates engine keyed by "" in the _engines dict
method="" → getattr(engine, "") → unpredictable behavior, 500 instead of clean 400
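The regression can be reproduced in isolation. The sketch below contrasts a truthiness check in the style of the old handler with a plain `str` field; `manual_check` and `DummyEngine` are hypothetical helpers written for this illustration, not code from the repository.

```python
from typing import Any, Optional

from pydantic import BaseModel


class CreateEngineRequest(BaseModel):
    engine: str        # plain str: any string, including "", validates
    engine_name: str


def manual_check(payload: dict[str, Any]) -> Optional[str]:
    """Truthiness-based validation, in the style of the old handler."""
    if not payload.get("engine"):
        return "Missing 'engine' field"
    if not payload.get("engine_name"):
        return "Missing 'engine_name' field"
    return None


payload = {"engine": "", "engine_name": ""}

manual_error = manual_check(payload)      # rejected, because "" is falsy
parsed = CreateEngineRequest(**payload)   # accepted, because "" is still a str


class DummyEngine:
    """Hypothetical engine object with no methods."""


# On a plain object, looking up an empty attribute name raises AttributeError,
# which an unguarded handler would surface as a 500 rather than a 400.
try:
    getattr(DummyEngine(), "")
    empty_method_outcome = "found"
except AttributeError:
    empty_method_outcome = "AttributeError"
```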

Suggested fix — enforce non-empty strings:

from typing import Annotated, Any

from pydantic import BaseModel, StringConstraints

NonEmptyStr = Annotated[str, StringConstraints(min_length=1)]

class CreateEngineRequest(BaseModel):
    engine: NonEmptyStr
    engine_name: NonEmptyStr
    init_args: list[Any] = []
    init_kwargs: dict[str, Any] = {}

class CallEngineRequest(BaseModel):
    method: NonEmptyStr
    engine_name: NonEmptyStr
    args: list[Any] = []
    kwargs: dict[str, Any] = {}
    rpc_meta: dict[str, Any] | None = None

2. No unit tests for Pydantic model validation edge cases

Severity: Medium

The PR adds 3 Pydantic models (SetEnvRequest, CreateEngineRequest, CallEngineRequest) replacing ~40 lines of manual validation logic, but no tests were added for the engine blueprint validation. The test changes in the PR only cover the data proxy mount change (monkeypatch target update), not the engine blueprint Pydantic migration.

Suggested additions — cover at minimum:

# `client` is assumed to be a pytest fixture that provides a Flask test client.
def test_create_engine_empty_string(client):
    resp = client.post("/create_engine", json={"engine": "", "engine_name": "test"})
    assert resp.status_code == 400

def test_create_engine_missing_fields(client):
    resp = client.post("/create_engine", json={})
    assert resp.status_code == 400

def test_call_engine_missing_method(client):
    resp = client.post("/call", json={"engine_name": "actor/0"})
    assert resp.status_code == 400

def test_set_env_invalid_json(client):
    resp = client.post("/set_env", data="not json", content_type="application/json")
    assert resp.status_code == 400

koladefaj · 2026-04-14T05:12:40Z

I've addressed the review findings:

Implemented NonEmptyStr constraints for engine RPC models as suggested.
Added a dedicated test suite in tests/infra/rpc/test_engine_validation.py to cover validation edge cases.

Ready for review @garrett4wade

refactor: mount data blueprint via ASGI and implement engine Pydantic models

ca8e59a

gemini-code-assist bot reviewed Apr 13, 2026

areal/infra/rpc/guard/engine_blueprint.py (outdated, resolved)

fix: correct typo in set_env attribute name

8ececd0

koladefaj marked this pull request as ready for review April 13, 2026 22:48

garrett4wade added the reviewed label Apr 14, 2026

Merge branch 'main' into feat/mount-data-blueprint

7a3247d

koladefaj requested review from garrett4wade, nuzant and rchardx as code owners April 14, 2026 04:41

test: add RPC engine validation tests and apply NonEmptyStr constraints

5db3520

koladefaj changed the title from "refactor: mount data blueprint via ASGI and adopt Pydantic in engine blueprint" to "refactor: mount data blueprint via WSGI and adopt Pydantic in engine blueprint" Apr 14, 2026

koladefaj wants to merge 4 commits into inclusionAI:main from koladefaj:feat/mount-data-blueprint

2 participants