[WIP] feat: support Speculative Decoding by Sglang Eagle algo #1176
TaoZex wants to merge 24 commits into inclusionAI:main
Code Review
This pull request introduces support for speculative decoding using EAGLE and Multi-Token Prediction (MTP) online training. Key changes include the addition of MTP and speculative decoding configuration fields across various API and engine components, implementation of MTP loss collection and gradient isolation in the Megatron engine, and the inclusion of speculative decoding statistics in model responses. Additionally, the PR provides comprehensive documentation, example configurations, and end-to-end tests. Feedback focuses on improving configuration consistency in SGLangConfig, removing redundant no-op string replacements in the MTP layer conversion utility, and simplifying logic in the RLVR workflow by removing a redundant conditional check.
        "help": "Attention mode for speculative decoding. E.g., 'full', 'sparse'."
    },
)
enable_multi_layer_eagle: bool = False
For consistency with other configuration fields in this dataclass, enable_multi_layer_eagle should be defined using field(). This also provides an opportunity to add a help string in the metadata for better documentation and discoverability through CLI help messages.
enable_multi_layer_eagle: bool = field(
    default=False,
    metadata={"help": "Enable multi-layer EAGLE for speculative decoding."},
)

hf_remainder = hf_remainder.replace("enorm.weight", "enorm.weight")
hf_remainder = hf_remainder.replace("hnorm.weight", "hnorm.weight")
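
Both `replace` calls above substitute a substring with itself, so they cannot change `hf_remainder`. A minimal sketch of why they are no-ops, using a hypothetical weight name chosen only for illustration (it is not taken from the PR):

```python
# Hypothetical weight-name remainder, used only for illustration.
hf_remainder = "mtp.layers.0.enorm.weight"

# Replacing a substring with an identical string yields an equal string,
# whether or not the substring is present.
after = hf_remainder.replace("enorm.weight", "enorm.weight")
after = after.replace("hnorm.weight", "hnorm.weight")

unchanged = after == hf_remainder
print(unchanged)
```

Because the result always equals the input, deleting both lines should not alter the conversion output.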
accept_rate = (
    resp.spec_accept_token_num / resp.spec_draft_token_num
    if resp.spec_draft_token_num > 0
    else 0.0
)
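
The accept-rate expression above guards against division by zero when no draft tokens were produced. The same logic can be sketched as a standalone helper; the function name and parameter names here are assumptions for illustration, not part of the PR:

```python
def compute_accept_rate(accepted: int, drafted: int) -> float:
    """Return the fraction of drafted tokens that were accepted.

    Falls back to 0.0 when nothing was drafted, mirroring the guarded
    expression in the snippet above.
    """
    return accepted / drafted if drafted > 0 else 0.0


print(compute_accept_rate(3, 4))
print(compute_accept_rate(0, 0))
```

Keeping the guard in one place avoids repeating the conditional wherever the statistic is reported.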