List of AI News about Gated DeltaNet
| Time | Details |
|---|---|
|
2025-09-22 22:32 |
Alibaba Releases Qwen3-Next-80B-A3B: Advanced 80B-Parameter Mixture-of-Experts AI Model for Long-Context Inference
According to DeepLearning.AI, Alibaba has launched Qwen3-Next-80B-A3B, an 80-billion-parameter mixture-of-experts AI model available in Base, Instruct, and Thinking variants under an open-weights Apache 2.0 license. Designed for faster long-context inference, the model replaces standard attention layers with Gated DeltaNet and gated attention mechanisms, enhancing efficiency in processing extended context windows. Trained on a 15 trillion token subset of the Qwen3 dataset and fine-tuned with GSPO, Qwen3-Next-80B-A3B enables multi-token prediction and accommodates input lengths up to 262,144 tokens, offering significant improvements for enterprise-level generative AI, document analysis, and large-scale conversational applications. (Source: DeepLearning.AI Twitter, 2025-09-22) |