Selective Sinkhorn Routing for Improved Sparse Mixture of Experts
Published in Approved for Filing U.S. Patent, Qualcomm, 2025
Duc-Anh Nguyen*, Huu Binh Ta*, Nhuan Le Duc, Tan Minh Nguyen, Toan Tran
This work introduces Selective Sinkhorn Routing (SSR), a novel routing mechanism for sparse Mixture-of-Experts (SMoE) models. By formulating token-to-expert assignment as an optimal transport problem with balancing constraints, SSR derives gating assignments directly from a transport map, eliminating the need for auxiliary balancing losses or additional trainable noise. The method promotes balanced expert utilization while preserving routing flexibility, resulting in faster training, improved accuracy, and greater robustness across language modeling and image-classification tasks. SSR thereby introduces a new family of balancing strategies for efficient SMoE training.
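To make the idea of reading gates off a transport map concrete, here is a minimal NumPy sketch of generic Sinkhorn-based routing. It is not the authors' SSR implementation: the selective component of SSR is not described on this page and is not modeled here. The function names (`sinkhorn_plan`, `route_top_k`), the uniform marginals, the entropic regularization strength `epsilon`, the iteration count, and the top-k renormalization are all assumptions chosen for illustration.

```python
import numpy as np


def sinkhorn_plan(scores, epsilon=0.1, n_iters=50):
    """Entropy-regularized transport plan between tokens (rows) and experts (columns).

    Assumption: both marginals are uniform, so every expert is asked to
    receive the same total mass. This is a generic Sinkhorn iteration,
    not the SSR algorithm itself.
    """
    n_tokens, n_experts = scores.shape
    # Gibbs kernel; subtracting the global max keeps exp() numerically stable.
    kernel = np.exp((scores - scores.max()) / epsilon)
    row_marginal = np.full(n_tokens, 1.0 / n_tokens)
    col_marginal = np.full(n_experts, 1.0 / n_experts)
    u = np.ones(n_tokens)
    v = np.ones(n_experts)
    for _ in range(n_iters):
        # Alternately rescale rows and columns toward the target marginals.
        u = row_marginal / (kernel @ v)
        v = col_marginal / (kernel.T @ u)
    return u[:, None] * kernel * v[None, :]


def route_top_k(plan, k=2):
    """Pick the k experts with the largest plan mass per token (hypothetical rule)."""
    top_experts = np.argsort(-plan, axis=1)[:, :k]
    weights = np.take_along_axis(plan, top_experts, axis=1)
    # Renormalize so each token's gate weights sum to one.
    weights = weights / weights.sum(axis=1, keepdims=True)
    return top_experts, weights


# Toy router logits: 64 tokens, 4 experts (random data for illustration only).
rng = np.random.default_rng(0)
scores = rng.normal(size=(64, 4))
plan = sinkhorn_plan(scores)
experts, gates = route_top_k(plan, k=2)
```

In this sketch the balancing comes from the column-marginal constraint on the plan rather than from an auxiliary loss term, which is the general property the abstract attributes to transport-based routing.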
Recommended citation: Duc-Anh Nguyen*, Huu Binh Ta*, Nhuan Le Duc, Tan Minh Nguyen, Toan Tran. (2025). "Selective Sinkhorn Routing for Improved Sparse Mixture of Experts." U.S. Patent Application (Filed), Qualcomm. 2025.
