Together AI Launches DeepSeek-V3.1: A Versatile Hybrid Model
Terrill Dicki Aug 25, 2025 23:56
Together AI introduces DeepSeek-V3.1, a hybrid model offering fast responses and deep reasoning modes, ensuring efficiency and reliability for various applications.

Together AI has unveiled DeepSeek-V3.1, an advanced hybrid model designed to cater to both fast response requirements and complex reasoning tasks. The model, now available for deployment on Together AI's platform, is particularly noted for its dual-mode functionality, allowing users to select between non-thinking and thinking modes to optimize performance based on task complexity.
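The mode selection described above could be wired into an application with a simple routing layer. The sketch below is illustrative only: the keyword heuristic and function names are hypothetical, not part of Together AI's API.

```python
# Sketch: route a prompt to thinking or non-thinking mode based on a
# naive task-complexity heuristic. The hint list below is a hypothetical
# example; a real router would use richer signals.
COMPLEX_HINTS = ("debug", "migrate", "design", "prove", "multi-step")

def select_mode(prompt: str) -> str:
    """Return 'thinking' for complex tasks, 'non-thinking' otherwise."""
    lowered = prompt.lower()
    return "thinking" if any(h in lowered for h in COMPLEX_HINTS) else "non-thinking"

print(select_mode("Generate an API endpoint for user signup"))      # non-thinking
print(select_mode("Debug a race condition in a distributed lock"))  # thinking
```

A production router might instead classify requests with a lightweight model or let end users toggle the mode explicitly.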
Features and Capabilities
DeepSeek-V3.1 is crafted to provide enhanced efficiency and reliability, according to Together AI. It supports serverless deployment with a 99.9% SLA, ensuring robust performance across a variety of use cases. The model's thinking mode offers comparable quality to its predecessor, DeepSeek-R1, but with a significant improvement in speed, making it suitable for production environments.
The model's long-context extension was trained on a substantial dataset, with 630 billion tokens for the 32K-context phase and 209 billion tokens for the 128K-context phase, enhancing its capability to handle extended conversations and large codebases. This ensures that the model is well-equipped for tasks that require detailed analysis and multi-step reasoning.
Real-World Applications
DeepSeek-V3.1 excels in various applications, including code and search agent tasks. In non-thinking mode, it efficiently handles routine tasks such as API endpoint generation and simple queries. In contrast, the thinking mode is ideal for complex problem-solving, such as debugging distributed systems and designing zero-downtime database migrations.
For document processing, the model offers non-thinking capabilities for entity extraction and basic parsing, while thinking mode supports comprehensive analysis of compliance workflows and regulatory cross-referencing.
Performance Metrics
Benchmark tests reveal the model's strengths in both modes. On the MMLU-Redux benchmark, for instance, thinking mode scored 93.7%, 1.9 percentage points above non-thinking mode. Similarly, thinking mode improved the GPQA-Diamond score by 5.2 percentage points. These metrics underscore how the optional reasoning mode lifts performance on harder tasks.
Deployment and Integration
DeepSeek-V3.1 is available through Together AI's serverless API and dedicated endpoints. The model comprises 671 billion total parameters and is released under an MIT license, permitting broad commercial use. The infrastructure is designed for reliability, featuring North American data centers and SOC 2 compliance.
Developers can swiftly integrate the model into their applications using the provided Python SDK, enabling seamless incorporation of DeepSeek-V3.1's capabilities into existing systems. Together AI's infrastructure supports large mixture-of-experts models, ensuring both thinking and non-thinking modes operate efficiently under production workloads.
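As a rough illustration of such an integration, the sketch below constructs an OpenAI-compatible chat-completions request payload. The model identifier, endpoint URL, and payload shape are assumptions based on Together AI's documented conventions; verify them against the current API reference before use.

```python
# Sketch: build a chat-completions request body for Together AI's
# serverless API. The model id "deepseek-ai/DeepSeek-V3.1" and the
# payload fields are assumptions to confirm against current docs.
import json

def build_request(prompt: str, max_tokens: int = 512) -> str:
    """Serialize a minimal chat-completions payload as a JSON string."""
    payload = {
        "model": "deepseek-ai/DeepSeek-V3.1",
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    return json.dumps(payload)

body = build_request("Summarize this compliance workflow.")
print(body)
```

In practice this body would be POSTed (with an API-key `Authorization` header) to Together AI's chat-completions endpoint, or the equivalent call made through the official Python SDK.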
With the launch of DeepSeek-V3.1, Together AI aims to provide a versatile solution for businesses seeking to enhance their AI-driven applications with both rapid response and deep analytical capabilities.
Image source: Shutterstock