Source:Moonshot AI

Moonshot AI Releases New Technology Report on “Muon for LLM Training” and Launches “Moonlight” Model

On February 24, Moonshot AI released a new technology report titled “Muon for Scalable LLM Training” and announced the launch of “Moonlight,” a mixture-of-experts (MoE) model trained on Muon. The model features 3 billion and 16 billion parameters, utilizing 5.7 trillion tokens. It achieves better performance with fewer floating point operations (FLOPs), thus pushing the Pareto efficiency frontier further.

图片2

Related News

No Content Available