Alibaba Unveils Advanced Qwen3-Next AI Models on NVIDIA Platform
Luisa Crawford Sep 16, 2025 12:12
Alibaba introduces Qwen3-Next models with a hybrid MoE architecture, enhancing AI efficiency and performance on NVIDIA's advanced platform.

Alibaba has launched two new open-source AI models, Qwen3-Next 80B-A3B-Thinking and Qwen3-Next 80B-A3B-Instruct, showcasing a hybrid Mixture of Experts (MoE) architecture. These models promise improved accuracy and accelerated processing, particularly when deployed on NVIDIA's cutting-edge platform, according to NVIDIA.
Enhanced Efficiency and Performance
The Qwen3-Next models are engineered for handling extensive text sequences with high efficiency. Each model comprises 80 billion parameters, but thanks to the MoE architecture, only 3 billion are activated per token. This design allows the models to function with the power of a large-scale model while maintaining the efficiency of a smaller one. The architecture includes 512 routed experts and one shared expert, with ten experts being activated per token.
These models are optimized for long context lengths, capable of processing over 260,000 tokens in input. They leverage NVIDIA's Blackwell 5th-generation NVLink, which offers 1.8 TB/s of direct GPU-to-GPU bandwidth, crucial for minimizing latency and enhancing token throughput during complex processing tasks.
Innovative Architectural Features
The models incorporate 48 layers, with every fourth layer utilizing Global Query Attention (GQA) while others employ linear attention. This innovative combination allows the models to assign importance effectively to each token in an input sequence. The use of Gated Delta Networks, developed by NVIDIA research and MIT, further enhances the models' ability to process long sequences efficiently, ensuring minimal drift and improved focus.
Deployment and Accessibility
For deployment, NVIDIA has collaborated with open-source frameworks such as SGLang and vLLM. This collaboration facilitates the model's deployment across various platforms, offering flexibility and access to developers. The models are available on NVIDIA's NIM microservice endpoints, allowing enterprise developers to experiment with these advanced AI models.
Commitment to Open Source AI
Alibaba and NVIDIA's initiative with the Qwen3-Next models exemplifies a commitment to advancing AI through open-source contributions. This approach fosters a collaborative environment where researchers and developers can explore and innovate, driving future advancements in AI technology.
The Qwen3-Next models are available for testing on platforms like Open Router and can be downloaded from repositories such as Hugging Face, providing ample opportunity for further experimentation and development in the AI community.
Image source: Shutterstock