Does swift not support training the gemma-4-26B-A4B MoE model? #9127

@miaomi1994

Checklist

  • I have searched existing issues, and this is a new question or discussion topic.

Question Description

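Does swift not support training the gemma-4-26B-A4B MoE model? No matter what value I set router_aux_loss_coef to, the loss does not change. That suggests the aux loss is 0, which would effectively mean MoE training is not supported, because load balancing across experts cannot be guaranteed.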
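
For reference, here is a minimal pure-Python sketch of the kind of check described above. It assumes a Switch-Transformer-style load-balancing loss with top-1 routing, which may differ from what swift or the gemma-4 model code actually computes. The function names and the router logits are hypothetical and exist only for illustration. The point is that such a term is strictly positive whenever it is computed, so a total loss that does not react to router_aux_loss_coef suggests the term is not being added to (or not reported in) the loss, rather than being added with value 0.

```python
import math

def softmax(logits):
    """Numerically stable softmax over one token's router logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def load_balancing_loss(router_logits):
    """Switch-style auxiliary loss for top-1 routing (illustrative only).

    router_logits: list of per-token lists, one logit per expert.
    Returns num_experts * sum_i f_i * P_i.
    """
    num_tokens = len(router_logits)
    num_experts = len(router_logits[0])
    probs = [softmax(row) for row in router_logits]
    # f[i]: fraction of tokens whose top-1 expert is i
    f = [0.0] * num_experts
    for p in probs:
        f[p.index(max(p))] += 1.0 / num_tokens
    # P[i]: mean router probability assigned to expert i
    P = [sum(p[i] for p in probs) / num_tokens for i in range(num_experts)]
    return num_experts * sum(fi * Pi for fi, Pi in zip(f, P))

def total_loss(lm_loss, router_logits, router_aux_loss_coef):
    """Language-model loss plus the weighted auxiliary term."""
    return lm_loss + router_aux_loss_coef * load_balancing_loss(router_logits)

# Hypothetical logits: 4 tokens, 2 experts, every token prefers expert 0.
logits = [[2.0, 0.0], [1.5, 0.0], [3.0, 0.0], [1.0, 0.0]]
aux = load_balancing_loss(logits)
loss_a = total_loss(1.0, logits, 0.0)
loss_b = total_loss(1.0, logits, 0.1)
# When the aux term is actually added, the two totals differ by 0.1 * aux.
print(aux, loss_a, loss_b)
```

In some Hugging Face transformers MoE models (Mixtral, for example) the auxiliary loss is only computed when `output_router_logits` is enabled in the config. Whether the same applies to this model under swift is not confirmed here.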

Labels: question (Further information is requested)