Sparse Mixture-of-Experts Transformers with Dynamic Routing for Efficient Large Language Model Inference