MACHINE LEARNING
Together AI's CDLM Achieves 14.5x Faster AI Inference Without Quality Loss
Consistency Diffusion Language Models solve two critical bottlenecks in AI inference, delivering up to 14.5x latency improvements while maintaining accuracy on coding and math tasks.
Anthropic Study Reveals AI Agents Run 45 Minutes Autonomously as Trust Builds
New Anthropic research shows Claude Code autonomy nearly doubled in 3 months, with experienced users granting more independence while maintaining oversight.
Monday.com Achieves 8.7x Faster AI Agent Testing with LangSmith Integration
Monday Service reveals an eval-driven development framework that cut AI agent testing from 162 seconds to 18 seconds using LangSmith and parallel processing.
Anthropic Upgrades Claude AI Web Search Tools With 11% Accuracy Boost
Claude's new dynamic filtering feature cuts input tokens by 24% while improving search accuracy. Opus 4.6 hits 61.6% on BrowseComp benchmark.
Anthropic Ships Claude Sonnet 4.6 With 1M Token Context Window
Anthropic releases Claude Sonnet 4.6, delivering Opus-level AI performance at $3/$15 per million tokens with major computer use and coding improvements.
Together AI Achieves 40% Faster LLM Inference With Cache-Aware Architecture
Together AI's new CPD system separates warm and cold inference workloads, delivering 35-40% higher throughput for long-context AI applications on NVIDIA B200 GPUs.
Google DeepMind Unveils Gemini Deep Think for Scientific Research
Google's Gemini Deep Think AI solved 18 research problems across math, physics, and computer science, including a decade-old conjecture that stumped experts.
NVIDIA Releases Open Source Tools for License-Safe AI Model Training
NVIDIA's NeMo Data Designer enables developers to build synthetic data pipelines for AI distillation without licensing headaches or massive datasets.
Together AI Drops Largest Open Dataset for Training Coding Agents
TogetherCoder-Preview releases 161K verified coding trajectories achieving 59.4% on SWE-Bench, giving developers unprecedented training data for AI agents.
Open-Source AI Judges Beat GPT-5.2 at 15x Lower Cost Using DPO Fine-Tuning
Together AI demonstrates that fine-tuned open-source LLMs can outperform GPT-5.2 as evaluation judges using just 5,400 preference pairs, slashing costs dramatically.
NVIDIA Megatron Core Gets Dynamic-CP Update With 48% Training Speedups
NVIDIA releases Dynamic Context Parallelism for Megatron Core, achieving up to 1.48x faster LLM training and 35% gains in industrial deployments.
Together AI Launches DSGym Framework for Training Data Science AI Agents
Together AI's DSGym framework benchmarks LLM agents on 90+ bioinformatics tasks and 92 Kaggle competitions. Its 4B-parameter model matches larger rivals.
Anthropic Shares Multi-Agent AI Framework for Developers
Anthropic reveals when multi-agent systems outperform single AI agents, citing 3-10x token costs and three specific use cases worth the overhead.
AI Inference Costs Drop 40% With New GPU Optimization Tactics
Together AI reveals production-tested techniques that cut inference latency by 50-100ms while reducing per-token costs by up to 5x through quantization and smart decoding.
GPU Waste Crisis Hits AI Production as Utilization Drops Below 50%
New analysis reveals production AI workloads achieve under 50% GPU utilization, with CPU-centric architectures blamed for billions in wasted compute resources.