LLM Agents Help Win Kaggle Competition with 600K Lines of Code
Generative AI agents produced 600,000 lines of code and ran 850 experiments to secure first place in a Kaggle competition. Here's how they did it.
NVIDIA Megatron Boosts LLM Training With Muon Optimizer
NVIDIA integrates the Muon optimizer, along with other advanced optimizers, into Megatron for large-scale LLM training, with throughput near parity with AdamW.
NVIDIA Jetson Memory Tricks Let Edge Devices Run 10B Parameter AI Models
NVIDIA reveals optimization techniques that reclaim up to 12GB of memory on Jetson devices, enabling multi-billion parameter LLMs to run on edge hardware.
NVIDIA Blackwell Smashes Finance AI Benchmark With 3.2x Speed Gains
NVIDIA's GB200 NVL72 sets new STAC-AI record for LLM inference in financial trading, delivering up to 3.2x performance over Hopper architecture.
Open-Source AI Judges Beat GPT-5.2 at 15x Lower Cost Using DPO Fine-Tuning
Together AI demonstrates fine-tuned open-source LLMs can outperform GPT-5.2 as evaluation judges using just 5,400 preference pairs, slashing costs dramatically.
NVIDIA's Breakthrough in LLM Memory: Test-Time Training for Enhanced Context Learning
NVIDIA introduces a novel approach to LLM memory using Test-Time Training (TTT-E2E), offering efficient long-context processing with reduced latency and loss.
AutoJudge Revolutionizes LLM Inference with Enhanced Token Processing
AutoJudge introduces a novel method that accelerates large language model inference through smarter token handling, cutting human annotation needs and boosting speed with minimal accuracy loss.
Unsloth Simplifies LLM Training on NVIDIA Blackwell GPUs
Unsloth's open-source framework enables efficient LLM training on NVIDIA Blackwell GPUs, democratizing AI development with faster throughput and reduced VRAM usage.
ATLAS: Revolutionizing LLM Inference with Adaptive Learning
Together AI introduces ATLAS, a system that speeds up LLM inference by adapting to workloads, achieving 500 TPS on DeepSeek-V3.1.
Enhancing LLM Inference with NVIDIA Run:ai and Dynamo Integration
NVIDIA's Run:ai v2.23 integrates with Dynamo to address large language model inference challenges, offering gang scheduling and topology-aware placement for efficient, scalable deployments.
NVIDIA's Run:ai Model Streamer Enhances LLM Inference Speed
NVIDIA introduces the Run:ai Model Streamer, significantly reducing cold start latency for large language models in GPU environments, enhancing user experience and scalability.
Enhancing LLM Inference with CPU-GPU Memory Sharing
NVIDIA introduces a unified memory architecture to optimize large language model inference, addressing memory constraints and improving performance.
NVIDIA's ProRL v2 Advances LLM Reinforcement Learning with Extended Training
NVIDIA unveils ProRL v2, a major step forward in reinforcement learning for large language models, improving performance through extended training schedules and new algorithms.
Together AI Introduces Flexible Benchmarking for LLMs
Together AI unveils Together Evaluations, a framework for benchmarking large language models using open-source models as judges, offering customizable insights into model performance.
NVIDIA's NeMo Framework Enables Weekend Training of Reasoning-Capable LLMs
NVIDIA introduces an efficient method to train reasoning-capable language models over a weekend using the NeMo framework, leveraging the Llama Nemotron dataset and LoRA adapters.