feat: add auto-recovery logic for chat completion clients by Ricardo-M-L · Pull Request #7567 · microsoft/autogen

Ricardo-M-L · 2026-04-10T15:48:18Z

Summary

Implements configurable auto-recovery logic for chat completion clients, addressing the need described in #3632.

- RetryableChatCompletionClient: A wrapper that adds retry logic with exponential backoff and jitter to any ChatCompletionClient
- RetryConfig: Dataclass for configuring max retries, base/max delay, exponential base, jitter, retryable status codes, and error types
- Error classification: Distinguishes transient errors (429 rate limit, 500/502/503/504 server errors, 424 failed dependency, timeouts, connection errors) from permanent errors (400, 401, 403, 404, auth errors), which are not retried
- Retry-After header support: Treats a server-provided Retry-After value as the minimum delay
- Jitter: Randomizes the delay to prevent thundering herd problems
- Logging: All retry attempts are logged via the trace logger
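
For readers who want to see how the delay calculation and status-code classification listed above fit together, here is a minimal sketch. It is not the code from this PR: `SketchRetryConfig`, `is_retryable_status`, and `compute_delay` are hypothetical names, the default values are assumptions, and "full jitter" (a uniform draw between 0 and the capped delay) is just one common jitter strategy picked for illustration. The sketch also assumes a Retry-After value is allowed to exceed `max_delay`.

```python
import random
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class SketchRetryConfig:
    """Hypothetical stand-in for the PR's RetryConfig; defaults are assumptions."""

    max_retries: int = 3
    base_delay: float = 1.0
    max_delay: float = 60.0
    exponential_base: float = 2.0
    jitter: bool = True
    # Transient status codes named in the PR summary.
    retryable_status_codes: FrozenSet[int] = frozenset({424, 429, 500, 502, 503, 504})


def is_retryable_status(status_code: int, config: SketchRetryConfig) -> bool:
    """Return True for status codes treated as transient."""
    return status_code in config.retryable_status_codes


def compute_delay(
    attempt: int,
    config: SketchRetryConfig,
    retry_after: Optional[float] = None,
    rng: Optional[random.Random] = None,
) -> float:
    """Delay in seconds before retry number `attempt` (0-based)."""
    # Exponential growth, capped at max_delay.
    delay = min(config.base_delay * config.exponential_base**attempt, config.max_delay)
    if config.jitter:
        # Full jitter: pick uniformly in [0, delay] so clients do not retry in lockstep.
        source = rng if rng is not None else random
        delay = source.uniform(0.0, delay)
    if retry_after is not None:
        # Assumption: the server hint is a floor and is not capped by max_delay.
        delay = max(delay, retry_after)
    return delay
```

With jitter disabled the sequence is deterministic, which is convenient for unit tests of the backoff calculation.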

Usage

from autogen_ext.models.openai import OpenAIChatCompletionClient
from autogen_ext.models.retry import RetryableChatCompletionClient, RetryConfig

openai_client = OpenAIChatCompletionClient(model="gpt-4o")
retryable_client = RetryableChatCompletionClient(
    client=openai_client,
    retry_config=RetryConfig(max_retries=5, base_delay=1.0, max_delay=30.0),
)

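# Use it exactly like any other ChatCompletionClient. The call must run inside
# an async function, and `messages` is the usual sequence of LLM messages.
result = await retryable_client.create(messages)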
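
Conceptually, wrapping a client this way routes each call through a retry loop. The following is a rough, self-contained sketch of such a loop and does not reproduce the PR's implementation: `TransientError`, `PermanentError`, and `call_with_retries` are hypothetical names, the backoff omits jitter for brevity, and the injectable `sleep` parameter is an assumption added so the loop can be exercised without real waiting.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for errors worth retrying (e.g. 429, 503, timeouts)."""


class PermanentError(Exception):
    """Hypothetical marker for errors a retry cannot fix (e.g. 400, 401)."""


async def call_with_retries(
    operation: Callable[[], Awaitable[T]],
    max_retries: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
) -> T:
    """Await `operation`, retrying on TransientError up to `max_retries` times."""
    attempt = 0
    while True:
        try:
            return await operation()
        except TransientError:
            # Out of retries: surface the last transient error to the caller.
            if attempt >= max_retries:
                raise
            # Exponential backoff (base 2), capped at max_delay.
            await sleep(min(base_delay * 2**attempt, max_delay))
            attempt += 1
        # Any other exception, including PermanentError, propagates immediately.
```

With `max_retries=0` the operation is attempted exactly once, which matches the "zero retries" configuration mentioned in the test plan.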

Files changed

python/packages/autogen-ext/src/autogen_ext/models/retry/__init__.py — Module exports
python/packages/autogen-ext/src/autogen_ext/models/retry/_retry_chat_completion_client.py — Core implementation (359 lines)
python/packages/autogen-ext/tests/models/test_retry_chat_completion_client.py — 25 test cases (581 lines)

Fixes #3632

Test plan

25 unit tests covering:
- Successful calls with no retries
- Retry on rate limit (429), server errors (500/502/503), timeout, connection errors, and 424 failed dependency
- No retry on permanent errors (400, 401, 403, 404, auth errors)
- Max retries exhaustion
- Retry-After header handling
- Exponential backoff calculation with/without jitter
- Stream retry behavior
- Delegation of all ChatCompletionClient methods (close, usage, tokens, model_info)
- Zero retries configuration
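
As an illustration of what the Retry-After case in the list above involves: the header can carry either a number of seconds or an HTTP date. The sketch below shows one way to turn either form into a delay in seconds. `parse_retry_after` is a hypothetical helper and is not taken from this PR; the `now` parameter is an assumption added so the function can be tested against a fixed clock.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def parse_retry_after(value: Optional[str], now: Optional[datetime] = None) -> Optional[float]:
    """Convert a Retry-After header value to seconds, or None if unusable."""
    if value is None:
        return None
    text = value.strip()
    # Form 1: a non-negative integer number of seconds.
    if text.isascii() and text.isdigit():
        return float(text)
    # Form 2: an HTTP date such as "Wed, 21 Oct 2015 07:28:00 GMT".
    try:
        target = parsedate_to_datetime(text)
    except (TypeError, ValueError):
        return None
    if target.tzinfo is None:
        # Treat a date without zone information as UTC.
        target = target.replace(tzinfo=timezone.utc)
    current = now if now is not None else datetime.now(timezone.utc)
    # A date in the past means no additional wait is required.
    return max(0.0, (target - current).total_seconds())
```

Returning `None` for a missing or malformed header lets the caller fall back to the computed backoff delay.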

🤖 Generated with Claude Code

…t errors

Implements a wrapper client that adds configurable retry logic with exponential backoff and jitter to any ChatCompletionClient. Supports rate limit (429), server errors (500/502/503/504), timeout, and connection errors while correctly classifying permanent errors (auth, bad request) as non-retryable.

Fixes microsoft#3632

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Ricardo-M-L · 2026-04-16T07:52:04Z

Closing — this feature should have been discussed in an issue first before submitting a large PR. Apologies for the noise.

Ricardo-M-L closed this Apr 16, 2026

Ricardo-M-L wants to merge 1 commit into microsoft:main from Ricardo-M-L:feat/auto-recovery-logic
