Search Results for "machine learning"
Together AI Achieves 40% Faster LLM Inference With Cache-Aware Architecture
Together AI's new CPD system separates warm and cold inference workloads, delivering 35-40% higher throughput for long-context AI applications on NVIDIA B200 GPUs.
Anthropic Ships Claude Sonnet 4.6 With 1M Token Context Window
Anthropic releases Claude Sonnet 4.6, delivering Opus-level AI performance at $3/$15 per million tokens with major computer use and coding improvements.
Anthropic Upgrades Claude AI Web Search Tools With 11% Accuracy Boost
Claude's new dynamic filtering feature cuts input tokens by 24% while improving search accuracy. Opus 4.6 hits 61.6% on BrowseComp benchmark.
Monday.com Achieves 8.7x Faster AI Agent Testing with LangSmith Integration
Monday Service reveals eval-driven development framework that cut AI agent testing from 162 seconds to 18 seconds using LangSmith and parallel processing.
Anthropic Study Reveals AI Agents Run 45 Minutes Autonomously as Trust Builds
New Anthropic research shows Claude Code autonomy nearly doubled in 3 months, with experienced users granting more independence while maintaining oversight.
Together AI's CDLM Achieves 14.5x Faster AI Inference Without Quality Loss
Consistency Diffusion Language Models solve two critical bottlenecks in AI inference, delivering up to 14.5x latency improvements while maintaining accuracy on coding and math tasks.
Anthropic Study Reveals Users Skip Critical Checks on AI-Generated Code
New research from $380B-valued Anthropic shows users are 5.2% less likely to verify AI outputs when creating artifacts, raising questions about automation risks.
NVIDIA NVFP4 Training Delivers 1.59x Speed Boost Without Accuracy Loss
NVIDIA's NVFP4 4-bit training format achieves 59% faster AI model training than BF16 while matching accuracy on Llama 3 8B benchmarks, per new research.
NVIDIA Run:ai Delivers 2x GPU Utilization Gains for AI Inference Workloads
NVIDIA benchmarks show Run:ai platform doubles GPU utilization while cutting latency 61x for enterprise AI deployments running NIM inference microservices.
Anthropic Brings Software Testing Rigor to AI Agent Skills
Claude's skill-creator update adds evals, benchmarks, and A/B testing for non-engineers building AI agent skills. Here's what it means for the ecosystem.
OpenAI Abandons SWE-bench Verified After Finding 59% of Failed Tests Were Flawed
OpenAI reveals major contamination issues in SWE-bench Verified benchmark, showing frontier AI models memorized solutions and tests rejected correct code.
Google Unleashes Gemini 3.1 Pro and AI Music Tools in February Blitz
Google's February 2026 AI rollout includes Gemini 3.1 Pro with 2x reasoning gains, Lyria 3 music generation, and Nano Banana 2 image tools for developers.
