LongCat-Flash-Chat

We introduce LongCat-Flash, a powerful and efficient language model with 560 billion total parameters, featuring an innovative Mixture-of-Experts (MoE) architecture. The model incorporates a dynamic computation mechanism that activates between 18.6B and 31.3B parameters per token (about 27B on average) depending on contextual demands, optimizing both computational efficiency and performance. To improve training and inference efficiency, we employ a shortcut-connected architecture that expands the computation-communication overlap window, achieving cost-effective inference at over 100 tokens per second (TPS).
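
To put the activation figures in proportion, here is a minimal Python sketch that works out what share of the 560B total is active per token. It uses only the numbers quoted above. The constant and function names are illustrative and do not come from the model's code.

```python
# Figures quoted in the model description, in billions of parameters.
TOTAL_B = 560.0
ACTIVE_MIN_B = 18.6
ACTIVE_MAX_B = 31.3
ACTIVE_AVG_B = 27.0


def active_fraction(active_b: float, total_b: float = TOTAL_B) -> float:
    """Return the share of total parameters activated for one token."""
    return active_b / total_b


if __name__ == "__main__":
    for label, value in [
        ("min", ACTIVE_MIN_B),
        ("avg", ACTIVE_AVG_B),
        ("max", ACTIVE_MAX_B),
    ]:
        share = active_fraction(value)
        print(f"{label}: {value}B active = {share:.1%} of {TOTAL_B:.0f}B")
```

On these figures, roughly 3% to 6% of the parameters take part in any single token, which is where the efficiency of the MoE design comes from.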
- MIT license
- https://huggingface.co/meituan-longcat/LongCat-Flash-Chat
- https://github.com/meituan-longcat/LongCat-Flash-Chat
Meituan is China's largest food delivery company.