NVIDIA
NVIDIA FastGen Cuts AI Video Generation Time by 100x With Open Source Library
NVIDIA releases FastGen, an open-source library that accelerates diffusion models up to 100x. 14B-parameter video models now train in 16 hours on 64 H100 GPUs.
NVIDIA TensorRT for RTX Brings Self-Optimizing AI to Consumer GPUs
NVIDIA's TensorRT for RTX introduces adaptive inference that automatically optimizes AI workloads at runtime, delivering 1.32x performance gains on RTX 5090.
NVIDIA Earth-2 CorrDiff Model Achieves 11x Climate Resolution Boost
NVIDIA's Earth-2 platform now downscales coarse climate projections to reveal hurricanes and typhoons invisible in raw data. S&P Global is already testing it for risk analytics.
FlashAttention-4 Hits 1,605 TFLOPS on NVIDIA Blackwell GPUs
NVIDIA's FlashAttention-4 achieves 71% hardware efficiency on Blackwell chips, delivering 3.6x speedup over FA2 for AI training workloads.
NVIDIA Achieves 10x AI Image Generation Speedup on Blackwell Data Center GPUs
NVIDIA's new NVFP4 optimizations deliver 10.2x faster FLUX.2 inference on Blackwell B200 GPUs versus H200, with near-linear multi-GPU scaling.
NVIDIA DRIVE AV Powers Mercedes-Benz CLA to Top Euro NCAP Safety Rating
Mercedes-Benz CLA earns Euro NCAP's Best Performer of 2025 award using NVIDIA DRIVE AV software, marking a shift toward AI-driven safety standards in vehicles.
NVIDIA GeForce NOW Adds Flight Controls as Cloud Gaming Expands
NVIDIA rolls out flight stick support for GeForce NOW, adds four new games, and teases Delta Force arrival. Here's what it means for the $4.3T tech giant.
NVIDIA Pushes Local AI Art Generation With RTX-Optimized ComfyUI Workflows
NVIDIA releases comprehensive guide for running FLUX.2 and LTX-2 visual AI models locally on RTX GPUs, eliminating cloud costs and token fees for creators.
NVIDIA CUDA 13.1 Drops CUB Boilerplate with New Single-Call API
NVIDIA simplifies GPU development with CUB single-call API in CUDA 13.1, eliminating repetitive two-phase memory allocation code without performance loss.
GPU Waste Crisis Hits AI Production as Utilization Drops Below 50%
New analysis reveals production AI workloads achieve under 50% GPU utilization, with CPU-centric architectures blamed for billions in wasted compute resources.
NVIDIA Unveils AI Agent Training Method Using Synthetic Data and GRPO
NVIDIA's new approach combines synthetic data generation with reinforcement learning to train CLI agents on a single GPU, cutting training time from months to days.
NVIDIA cuTile Python Guide Shows 90% cuBLAS Performance for Matrix Ops
NVIDIA releases detailed cuTile Python tutorial for Blackwell GPUs, demonstrating matrix multiplication achieving over 90% of cuBLAS performance with simplified code.
NVIDIA DLSS 4.5 Launches With 5x More Compute Power for Gaming AI
NVIDIA unveils DLSS 4.5 at CES 2026 with second-gen transformer model, 6x frame generation, and neural shading upgrades for RTX GPUs.
NVIDIA cuOpt Solver Cracks Four Previously Unsolved Optimization Problems
NVIDIA's GPU-accelerated cuOpt engine discovers new solutions for four MIPLIB benchmark problems, outperforming CPU solvers with 22% lower objective gaps.
Multi-Node GPU Training Guide Reveals 72B Model Scaling Secrets
Together.ai details how to train 72B-parameter models across 128 GPUs, achieving 45-50% utilization with proper network tuning and fault tolerance.