LLM News

LLM

Anyscale Launches LLM Post-Training Tool to Simplify Fine-Tuning

Anyscale unveils a post-training skill for large language models, streamlining methodology selection, GPU planning, and configuration generation.

by Tony Kim
May 16, 2026

Ray Serve Introduces Scalable Multi-Agent AI Architecture

Ray Serve leverages MCP and A2A protocols for scalable AI agents, solving production bottlenecks in LLM and multi-agent deployments.

by Luisa Crawford
May 08, 2026

LLM Agents Help Win Kaggle Competition with 600K Lines of Code

Generative AI agents produced 600,000 lines of code and ran 850 experiments to secure first place in a Kaggle competition. Here's how they did it.

by Iris Coleman
Apr 24, 2026

NVIDIA Megatron Boosts LLM Training With Muon Optimizer

NVIDIA integrates Muon and advanced optimizers into Megatron to enhance large-scale LLM training with near-parity throughput to AdamW.

by Zach Anderson
Apr 23, 2026

NVIDIA Jetson Memory Tricks Let Edge Devices Run 10B Parameter AI Models

NVIDIA reveals optimization techniques that reclaim up to 12GB of memory on Jetson devices, enabling multi-billion parameter LLMs to run on edge hardware.

by Rongchai Wang
Apr 21, 2026

NVIDIA Blackwell Smashes Finance AI Benchmark With 3.2x Speed Gains

NVIDIA's GB200 NVL72 sets new STAC-AI record for LLM inference in financial trading, delivering up to 3.2x performance over Hopper architecture.

by Iris Coleman
Mar 06, 2026

Open-Source AI Judges Beat GPT-5.2 at 15x Lower Cost Using DPO Fine-Tuning

Together AI demonstrates fine-tuned open-source LLMs can outperform GPT-5.2 as evaluation judges using just 5,400 preference pairs, slashing costs dramatically.

by Luisa Crawford
Feb 03, 2026

NVIDIA's Breakthrough in LLM Memory: Test-Time Training for Enhanced Context Learning

NVIDIA introduces a novel approach to LLM memory using Test-Time Training (TTT-E2E), offering efficient long-context processing with reduced latency and loss, paving the way for future AI advancements.

by Alvin Lang
Jan 10, 2026

AutoJudge Revolutionizes LLM Inference with Enhanced Token Processing

AutoJudge introduces a method that accelerates large language model inference by optimizing token processing, reducing the need for human annotation while keeping accuracy loss minimal.

by Caroline Bishop
Dec 05, 2025

Unsloth Simplifies LLM Training on NVIDIA Blackwell GPUs

Unsloth's open-source framework enables efficient LLM training on NVIDIA Blackwell GPUs, democratizing AI development with faster throughput and reduced VRAM usage.

by Iris Coleman
Oct 24, 2025

ATLAS: Revolutionizing LLM Inference with Adaptive Learning

Together AI introduces ATLAS, a system that speeds up LLM inference by adapting to workloads, achieving 500 TPS on DeepSeek-V3.1.

by Rongchai Wang
Oct 10, 2025

Enhancing LLM Inference with NVIDIA Run:ai and Dynamo Integration

NVIDIA's Run:ai v2.23 integrates with Dynamo to address large language model inference challenges, offering gang scheduling and topology-aware placement for efficient, scalable deployments.

by Lawrence Jengar
Sep 29, 2025

NVIDIA's Run:ai Model Streamer Enhances LLM Inference Speed

NVIDIA introduces the Run:ai Model Streamer, significantly reducing cold start latency for large language models in GPU environments, enhancing user experience and scalability.

by Ted Hisokawa
Sep 17, 2025

Enhancing LLM Inference with CPU-GPU Memory Sharing

NVIDIA introduces a unified memory architecture to optimize large language model inference, addressing memory constraints and improving performance.

by Felix Pinkston
Sep 06, 2025

NVIDIA's ProRL v2 Advances LLM Reinforcement Learning with Extended Training

NVIDIA unveils ProRL v2, a significant leap in reinforcement learning for large language models (LLMs), enhancing performance through extended training and innovative algorithms.

by Zach Anderson
Aug 14, 2025