Skip to content

Fix _mentioned_agents() to ignore reasoning tags in speaker selection#7219

Open
veeceey wants to merge 1 commit intomicrosoft:mainfrom
veeceey:fix/issue-6891-thinking-tag-agent-selection
Open

Fix _mentioned_agents() to ignore reasoning tags in speaker selection#7219
veeceey wants to merge 1 commit intomicrosoft:mainfrom
veeceey:fix/issue-6891-thinking-tag-agent-selection

Conversation

@veeceey
Copy link
Copy Markdown

@veeceey veeceey commented Feb 8, 2026

Summary

Fixes #6891

When LLMs use reasoning tags like <thinking>, <reflection>, <planning>, etc., agent mentions inside these blocks should not affect speaker selection. This PR filters out common reasoning blocks before counting agent mentions.

Problem

The issue occurred because models like Qwen would mention agents in their internal reasoning (e.g., "<thinking>Maybe AgentA or AgentB could help</thinking>I suggest AgentC"), and the selector would incorrectly count those mentions, leading to wrong speaker selection.

Solution

  • Added _strip_reasoning_blocks() method to remove common reasoning tags:
    • thinking, thought, reflection, reasoning
    • analysis, internal, scratch, planning
  • Updated _mentioned_agents() to use cleaned content before counting
  • Added comprehensive test case

Testing

  • Manual unit tests confirm correct behavior:
    • Reasoning blocks are stripped before counting
    • Multiple tag types are handled
    • Case-insensitive matching
    • Tags with attributes are supported
  • Added test case test_selector_group_chat_ignores_thinking_tags

Example

Before:

<thinking>Maybe agent1 or agent2 could help</thinking>I suggest agent3

Would count: agent1=1, agent2=1, agent3=1 (incorrect)

After:
Would count: agent3=1 only (correct)

@veeceey
Copy link
Copy Markdown
Author

veeceey commented Feb 8, 2026

All checks passing, DCO signed, ready for merge

@veeceey
Copy link
Copy Markdown
Author

veeceey commented Feb 13, 2026

Hey @ekzhu, I noticed your comments on #6891 about preferring structured output / JSON mode and the family: "r1" approach for handling reasoning tags. Those are definitely the better long-term solutions.

That said, this PR is meant to handle the case where users haven't configured those options -- when they're just using a model that happens to emit thinking tags (like Qwen with default settings), the speaker selector still picks up those mentions and makes bad choices. This is basically a fallback for the "unstructured" path.

Totally understand if you'd prefer to just point users toward structured output / family: "r1" instead of adding this. Happy to close this if that's the direction you want to go. Just wanted to offer it as a safety net for the common case.

@veeceey
Copy link
Copy Markdown
Author

veeceey commented Feb 19, 2026

Friendly ping @ekzhu - just checking if you had a chance to see my earlier comment about the approach here. Totally understand if you'd prefer to point users toward structured output / family: "r1" instead -- happy to close this if that's the preferred direction, or make any changes you'd like. Thanks for your time!

@ekzhu
Copy link
Copy Markdown
Contributor

ekzhu commented Feb 19, 2026

@veeceey I don't work at Microsoft anymore.

Cc @victordibia

@veeceey
Copy link
Copy Markdown
Author

veeceey commented Feb 20, 2026

Thanks for the heads up @ekzhu! @victordibia would appreciate a look at this whenever you get a chance.

Fixes microsoft#6891

When LLMs use reasoning tags like <thinking>, <reflection>, <planning>,
etc., agent mentions inside these blocks should not affect speaker
selection. This change filters out common reasoning blocks before
counting agent mentions.

The issue occurred because models like Qwen would mention agents in
their internal reasoning (e.g., "<thinking>Maybe AgentA or AgentB could
help</thinking>I suggest AgentC"), and the selector would count those
mentions, leading to incorrect speaker selection.

Changes:
- Add _strip_reasoning_blocks() method to remove common reasoning tags
- Update _mentioned_agents() to use cleaned content
- Add test case for thinking tag handling

Signed-off-by: Varun Chawla <varun_6april@hotmail.com>
@veeceey veeceey force-pushed the fix/issue-6891-thinking-tag-agent-selection branch from d64b1e7 to d1c7830 Compare March 12, 2026 01:25
@veeceey
Copy link
Copy Markdown
Author

veeceey commented Mar 12, 2026

Rebased on latest main. @victordibia any thoughts on this one? happy to adjust the approach if you'd prefer a different direction.

@veeceey
Copy link
Copy Markdown
Author

veeceey commented Apr 9, 2026

hey @victordibia, just a friendly bump on this - any thoughts on the approach?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

_mentioned_agents() in the SelectorGroupChatManager doesn't work with the <thinking> tag

2 participants