
fix(azure-openai): use deployment name in AzureOpenAIResponses structured predict#21530

Open
joshuakilts wants to merge 3 commits into run-llama:main from joshuakilts:fix/azure-openai-responses-structured-predict-engine

Conversation

@joshuakilts
Contributor

Summary

AzureOpenAIResponses.structured_predict (and its async variant) are inherited from OpenAIResponses, which passes model=self.model (e.g. "gpt-4o") to responses.parse. Azure routes requests by deployment name rather than model family name, so this call fails with a 404 DeploymentNotFound error.

Changes

openai/responses.py

  • Adds a _responses_model property (returns self.model) as a hook point for subclasses

azure_openai/responses.py

  • Overrides _responses_model to return self.engine (the Azure deployment name)
  • No method duplication needed — the parent's structured_predict / astructured_predict pick up the correct value through the property

This follows the same pattern already used by _get_model_kwargs in AzureOpenAIResponses.
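The hook-point pattern described above can be sketched as follows. Class and attribute names (`_responses_model`, `engine`) mirror the PR description, but the bodies are illustrative stand-ins, not the actual llama-index implementation:

```python
# Minimal sketch of the _responses_model hook-point pattern.
# Names mirror the PR; bodies are hypothetical.

class OpenAIResponses:
    def __init__(self, model: str):
        self.model = model

    @property
    def _responses_model(self) -> str:
        # Hook point: subclasses may route requests by a different identifier.
        return self.model

    def structured_predict(self) -> str:
        # In llama-index this value would be passed as model= to
        # responses.parse(...); here we just return it to show the routing.
        return self._responses_model


class AzureOpenAIResponses(OpenAIResponses):
    def __init__(self, model: str, engine: str):
        super().__init__(model)
        self.engine = engine

    @property
    def _responses_model(self) -> str:
        # Azure routes by deployment name, not model family name.
        return self.engine


print(OpenAIResponses("gpt-4o").structured_predict())
print(AzureOpenAIResponses("gpt-4o", engine="my-deployment").structured_predict())
```

Because only the property is overridden, the parent's structured_predict and astructured_predict need no duplication; they pick up the deployment name automatically.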

Test plan

  • pytest llama-index-integrations/llms/llama-index-llms-azure-openai/tests/test_azure_openai.py::test_structured_predict_uses_engine_not_model
  • pytest llama-index-integrations/llms/llama-index-llms-azure-openai/tests/test_azure_openai.py::test_astructured_predict_uses_engine_not_model
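The tests named above presumably assert that the Azure subclass resolves to the deployment name. A hedged, self-contained sketch of that shape (FakeAzureLLM and the field names are illustrative, not the real test fixtures):

```python
# Hypothetical shape of the "uses engine, not model" assertion,
# using a minimal stand-in class rather than the real AzureOpenAIResponses.

class FakeAzureLLM:
    def __init__(self, model: str, engine: str):
        self.model = model    # model family name, e.g. "gpt-4o"
        self.engine = engine  # Azure deployment name

    @property
    def _responses_model(self) -> str:
        return self.engine  # Azure override: deployment name wins


def test_structured_predict_uses_engine_not_model():
    llm = FakeAzureLLM(model="gpt-4o", engine="my-gpt4o-deployment")
    assert llm._responses_model == "my-gpt4o-deployment"
    assert llm._responses_model != llm.model


test_structured_predict_uses_engine_not_model()
```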

🤖 Generated with Claude Code

joshuakilts and others added 2 commits April 30, 2026 12:16
… predict

OpenAIResponses.structured_predict and astructured_predict call
responses.parse(model=self.model), where self.model is the model family
name (e.g. "gpt-4o"). Azure routes API requests by deployment name, so
this causes a 404 DeploymentNotFound error that is silently swallowed by
the instrumentation dispatcher, returning an empty structured response.

Override both methods on AzureOpenAIResponses to pass model=self.engine
(the deployment name) instead. This mirrors the existing pattern in
_get_model_kwargs, which already applies the same correction for other
API paths.

Adds two unit tests asserting the deployment name is used.
…sponses_model property

Instead of duplicating structured_predict and astructured_predict in
AzureOpenAIResponses, add a _responses_model hook to OpenAIResponses
(returns self.model) and override it in AzureOpenAIResponses (returns
self.engine). The parent methods call self._responses_model, so Azure
gets the right deployment name without copying any logic. Removes stale
imports added for the duplicate overrides.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@joshuakilts joshuakilts marked this pull request as ready for review May 1, 2026 04:51
@dosubot dosubot Bot added the size:S This PR changes 10-29 lines, ignoring generated files. label May 1, 2026
