AI INFRASTRUCTURE
OpenAI's GPT-5.5 Powers Codex on NVIDIA Systems, Transforming Workflows
OpenAI's GPT-5.5, running on NVIDIA infrastructure, is revolutionizing workflows across industries with groundbreaking performance gains and enterprise adoption.
How Multi-Tenant GPU Clusters Optimize AI Workloads
Learn how multi-tenant GPU clusters combine efficiency and isolation for AI-native teams, solving capacity challenges without idle resources.
GitHub Pauses Copilot Signups as AI Agents Overwhelm Infrastructure
GitHub halts new Copilot Pro and Pro+ subscriptions, tightens usage limits, and removes Opus models from cheaper tiers as agentic AI workloads strain capacity.
NVIDIA Dynamo Gets Agentic AI Overhaul With 97% Cache Hit Rates
NVIDIA unveils major Dynamo updates targeting AI coding agents, achieving up to 97% KV cache hit rates and 4x latency improvements for enterprise deployments.
NVIDIA Claims 35x Cost Reduction in AI Token Generation With Blackwell
NVIDIA's Blackwell architecture delivers $0.12 per million tokens versus $4.20 on Hopper, reshaping AI infrastructure economics for enterprise deployments.
Eigen Labs Launches Project Darkbloom to Turn Idle Macs Into AI Compute Network
New research initiative from Eigen Labs aims to route AI inference through underused Apple Silicon machines, claiming 50% cost reduction versus major providers.
NVIDIA NVbandwidth Tool Gets Multi-Node Support for AI Infrastructure Testing
NVIDIA's NVbandwidth benchmarking tool now supports multi-node GPU clusters, enabling developers to measure bandwidth across NVLink connections at 397+ GB/s.
MiniMax M2.7 Brings 230B-Parameter AI Model to NVIDIA Infrastructure
MiniMax releases M2.7, a 230B-parameter mixture-of-experts model optimized for NVIDIA GPUs with up to 2.7x throughput gains on Blackwell hardware.
NVIDIA Open-Sources Slinky to Run Slurm GPU Workloads on Kubernetes
NVIDIA's Slinky project enables running Slurm clusters on Kubernetes, already deployed on 8,000+ GPU systems for large-scale AI training infrastructure.
Notion Slashes AI Embedding Costs 80% After Ditching Spark for Ray
Notion migrated from Spark on EMR to Ray, cutting embedding costs 80% and improving query latency 10x. Uber and Salesforce shared similar AI infrastructure wins.
NVIDIA Unveils Mission Control Software for Blackwell AI Supercomputers
NVIDIA's Mission Control bridges rack-scale GPU hardware with AI workload schedulers, enabling topology-aware job placement on GB200 and GB300 NVL72 systems.
Harvey AI Unveils Spectre Cloud Agent Platform for Enterprise Development
Legal AI startup Harvey reveals internal cloud agent platform Spectre, signaling infrastructure approach that could reshape enterprise AI deployment across industries.
Riot Platforms Sells $289M in Bitcoin as Mining Output Drops 4% in Q1
RIOT sold 3,778 BTC at $76,626 average while Q1 production fell to 1,473 coins. Hash rate jumped 26% but treasury shrinks 18% as miners pivot toward AI.
Ray 2.55 Adds Fault Tolerance for Large-Scale AI Model Deployments
Anyscale's Ray Serve LLM update enables DP group fault tolerance for vLLM WideEP deployments, reducing downtime risk for distributed AI inference systems.
Together AI Kernels Team Achieves 3.6x Performance Gains on NVIDIA Hardware
Together AI's kernel research team delivers major GPU optimization breakthroughs, cutting inference latency from 281ms to 77ms for enterprise AI deployments.