
fix(azure-openai): use deployment name in AzureOpenAIResponses structured predict#21530

Open
joshuakilts wants to merge 3 commits into run-llama:main from joshuakilts:fix/azure-openai-responses-structured-predict-engine

Conversation

@joshuakilts
Contributor

Summary

AzureOpenAIResponses.structured_predict (and its async variant) are inherited from OpenAIResponses, which passes model=self.model (e.g. "gpt-4o") to responses.parse. Azure routes requests by deployment name rather than model family name, so this call fails with a 404 DeploymentNotFound error.

Changes

openai/responses.py

  • Adds a _responses_model property (returns self.model) as a hook point for subclasses

azure_openai/responses.py

  • Overrides _responses_model to return self.engine (the Azure deployment name)
  • No method duplication needed — the parent's structured_predict / astructured_predict pick up the correct value through the property

This follows the same pattern already used by _get_model_kwargs in AzureOpenAIResponses.
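The hook-point pattern described above can be sketched as follows. Class and attribute names (`_responses_model`, `engine`) mirror the PR description, but the bodies are illustrative stand-ins, not the actual llama-index implementation:

```python
# Minimal sketch of the _responses_model hook-point pattern.
# Names mirror the PR; bodies are hypothetical.

class OpenAIResponses:
    def __init__(self, model: str):
        self.model = model

    @property
    def _responses_model(self) -> str:
        # Hook point: subclasses may route requests by a different identifier.
        return self.model

    def structured_predict(self) -> str:
        # In llama-index this value would be passed as model= to
        # responses.parse(...); here we just return it to show the routing.
        return self._responses_model


class AzureOpenAIResponses(OpenAIResponses):
    def __init__(self, model: str, engine: str):
        super().__init__(model)
        self.engine = engine

    @property
    def _responses_model(self) -> str:
        # Azure routes by deployment name, not model family name.
        return self.engine


print(OpenAIResponses("gpt-4o").structured_predict())
print(AzureOpenAIResponses("gpt-4o", engine="my-deployment").structured_predict())
```

Because only the property is overridden, the parent's structured_predict and astructured_predict need no duplication; they pick up the deployment name automatically.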

Test plan

  • pytest llama-index-integrations/llms/llama-index-llms-azure-openai/tests/test_azure_openai.py::test_structured_predict_uses_engine_not_model
  • pytest llama-index-integrations/llms/llama-index-llms-azure-openai/tests/test_azure_openai.py::test_astructured_predict_uses_engine_not_model
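The tests named above presumably assert that the Azure subclass resolves to the deployment name. A hedged, self-contained sketch of that shape (FakeAzureLLM and the field names are illustrative, not the real test fixtures):

```python
# Hypothetical shape of the "uses engine, not model" assertion,
# using a minimal stand-in class rather than the real AzureOpenAIResponses.

class FakeAzureLLM:
    def __init__(self, model: str, engine: str):
        self.model = model    # model family name, e.g. "gpt-4o"
        self.engine = engine  # Azure deployment name

    @property
    def _responses_model(self) -> str:
        return self.engine  # Azure override: deployment name wins


def test_structured_predict_uses_engine_not_model():
    llm = FakeAzureLLM(model="gpt-4o", engine="my-gpt4o-deployment")
    assert llm._responses_model == "my-gpt4o-deployment"
    assert llm._responses_model != llm.model


test_structured_predict_uses_engine_not_model()
```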

🤖 Generated with Claude Code

joshuakilts and others added 2 commits April 30, 2026 12:16
… predict

OpenAIResponses.structured_predict and astructured_predict call
responses.parse(model=self.model), where self.model is the model family
name (e.g. "gpt-4o"). Azure routes API requests by deployment name, so
this causes a 404 DeploymentNotFound error that is silently swallowed by
the instrumentation dispatcher, returning an empty structured response.

Override both methods on AzureOpenAIResponses to pass model=self.engine
(the deployment name) instead. This mirrors the existing pattern in
_get_model_kwargs, which already applies the same correction for other
API paths.

Adds two unit tests asserting the deployment name is used.
…sponses_model property

Instead of duplicating structured_predict and astructured_predict in
AzureOpenAIResponses, add a _responses_model hook to OpenAIResponses
(returns self.model) and override it in AzureOpenAIResponses (returns
self.engine). The parent methods call self._responses_model, so Azure
gets the right deployment name without copying any logic. Removes stale
imports added for the duplicate overrides.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@joshuakilts joshuakilts marked this pull request as ready for review May 1, 2026 04:51
@dosubot dosubot Bot added the size:S This PR changes 10-29 lines, ignoring generated files. label May 1, 2026
