Together.AI Unveils Enhanced Batch Inference API with Expanded Capabilities - Blockchain.News


Tony Kim Sep 16, 2025 07:00

Together.AI has upgraded its Batch Inference API, offering a streamlined UI, universal model support, and a 3000x rate limit increase to 30 billion tokens, enhancing large-scale data processing.


Together.AI has announced significant upgrades to its Batch Inference API, aiming to simplify and accelerate the processing of large-scale AI workloads. The enhancements include an improved user interface, expanded model support, and a substantial increase in rate limits, according to Together.AI.

Streamlined User Interface

The new UI lets users create and monitor batch jobs directly from the dashboard, without writing API calls, making batch processing accessible to less code-oriented workflows.

Universal Model Support

The upgraded Batch Inference API now supports all serverless models as well as private deployments. With this universal access, any model a user can run in real time can also run as a batch workload, increasing flexibility and scalability.
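Batch APIs of this kind typically accept a JSONL file in which each line is one self-contained request. The sketch below shows how such an input file might be assembled; the model name, field layout, and `custom_id` convention are illustrative assumptions, not confirmed details of Together.AI's format.

```python
import json

def build_batch_lines(prompts, model="meta-llama/Llama-3-8b-chat-hf"):
    """Serialize one JSONL request line per prompt.

    The field names below are assumptions modeled on common
    batch-inference formats, not Together.AI's documented schema.
    """
    lines = []
    for i, prompt in enumerate(prompts):
        request = {
            "custom_id": f"req-{i}",  # lets you match outputs back to inputs
            "body": {
                "model": model,  # any serverless or private-deployment model
                "messages": [{"role": "user", "content": prompt}],
            },
        }
        lines.append(json.dumps(request))
    return "\n".join(lines)

batch_file = build_batch_lines(
    ["Summarize this article.", "Translate to French: hello"]
)
print(batch_file.count("\n") + 1)  # number of requests in the file
```

Because every model is addressable this way, switching a batch workload to a different model is a one-argument change rather than a pipeline rewrite.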

Massive Scale Enhancement

One of the most noteworthy improvements is the increase in rate limits from 10 million to 30 billion enqueued tokens per model per user. This 3000x increase allows massive datasets to be processed without queueing bottlenecks, enabling faster and more efficient data handling.
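To put the new ceiling in perspective, here is a minimal sketch that checks a workload against the old and new limits. The limit values come from the announcement; the rough 4-characters-per-token estimate is a common heuristic, not an exact tokenizer.

```python
# Enqueued-token limits from the announcement.
OLD_LIMIT = 10_000_000        # 10 million tokens per model per user
NEW_LIMIT = 30_000_000_000    # 30 billion tokens per model per user

def approx_tokens(texts):
    # Rough heuristic: ~4 characters per token (assumption, not a tokenizer).
    return sum(len(t) // 4 for t in texts)

def fits(texts, limit):
    return approx_tokens(texts) <= limit

# A hypothetical workload: 500k documents of ~100 tokens each (~50M tokens).
docs = ["x" * 400] * 500_000

print(fits(docs, OLD_LIMIT))   # over the old 10M ceiling
print(fits(docs, NEW_LIMIT))   # comfortably under the new 30B ceiling
print(NEW_LIMIT // OLD_LIMIT)  # the factor of increase
```

A 50-million-token job that previously had to be split across multiple submissions now fits in a single batch with room to spare.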

Cost Efficiency

The Batch Inference API now operates at half the cost of real-time APIs for most serverless models. This halved pricing makes batch processing the more economical choice for high-throughput workloads that do not need real-time responses, putting large-scale inference within reach of more budgets.
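The arithmetic of the discount is straightforward. In the sketch below, the 50% batch discount is taken from the announcement, while the real-time per-token price is a placeholder assumption, not a published rate.

```python
# Placeholder real-time price (assumption, not a published Together.AI rate).
REALTIME_PRICE_PER_M_TOKENS = 0.60  # dollars per million tokens
BATCH_DISCOUNT = 0.5                # batch runs at half the real-time cost

def job_cost(total_tokens, batch=False):
    """Estimated cost of a job at real-time or batch pricing."""
    price = REALTIME_PRICE_PER_M_TOKENS * (BATCH_DISCOUNT if batch else 1.0)
    return total_tokens / 1_000_000 * price

tokens = 2_000_000_000  # a hypothetical 2B-token offline workload
print(job_cost(tokens))              # real-time cost
print(job_cost(tokens, batch=True))  # batch cost: exactly half
```

For workloads that can tolerate deferred results, the same job simply costs half as much.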

Real-World Application

Volodymyr Kuleshov, Co-Founder of Inception Labs, highlighted the API's impact, stating it allows for the processing of large requests without bottlenecks, enabling faster experimentation. Inception Labs, among other teams, leverages the API for research and production workloads, demonstrating its broad applicability.

Ideal Use Cases

The Batch Inference API is particularly suited for scenarios that require high throughput without real-time constraints. This includes large-scale text analysis, fraud detection, synthetic data generation, embedding generation, content moderation, model evaluation, and customer support automation.

Future Prospects

The enhancements to the Batch Inference API mark a major step forward in the accessibility and efficiency of large-scale AI processing. With these updates, Together.AI positions its API as a leading option for organizations looking to scale their AI experiments and applications efficiently.

Image source: Shutterstock