2mo ago

LongCat-Flash-Chat

meituan-longcat/LongCat-Flash-Chat · Hugging Face

We introduce LongCat-Flash, a powerful and efficient language model with 560 billion total parameters, featuring an innovative Mixture-of-Experts (MoE) architecture. The model incorporates a dynamic computation mechanism that activates between 18.6B and 31.3B parameters (about 27B on average) based on contextual demands, optimizing both computational efficiency and performance. To improve training and inference efficiency, we employ a shortcut-connected architecture that widens the computation-communication overlap window, achieving cost-effective inference at over 100 tokens per second (TPS).
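To put the activation numbers in perspective, here is a rough back-of-the-envelope sketch that turns the figures quoted above into fractions of the total parameter count. The constant and function names (`TOTAL_B`, `ACTIVE_B`, `active_fraction`) are made up for illustration; only the four numbers come from the abstract, and nothing here reflects how the model itself routes computation.

```python
# Figures quoted in the abstract, in billions of parameters.
TOTAL_B = 560.0
ACTIVE_B = {"min": 18.6, "avg": 27.0, "max": 31.3}


def active_fraction(active_b: float, total_b: float = TOTAL_B) -> float:
    """Return the share of total parameters that are activated."""
    return active_b / total_b


if __name__ == "__main__":
    for label, active in ACTIVE_B.items():
        share = active_fraction(active)
        print(f"{label}: {active:.1f}B of {TOTAL_B:.0f}B = {share:.1%}")
```

By this arithmetic, roughly 3% to 6% of the parameters are active at a time, which is where the efficiency claim comes from.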

MIT license
https://huggingface.co/meituan-longcat/LongCat-Flash-Chat
https://github.com/meituan-longcat/LongCat-Flash-Chat

Meituan is China's largest food delivery company.
