LLM
NVIDIA Unveils Llama 3.1-Nemotron-70B-Reward to Enhance AI Alignment with Human Preferences
NVIDIA introduces Llama 3.1-Nemotron-70B-Reward, a leading reward model that improves AI alignment with human preferences using RLHF, topping the RewardBench leaderboard.
NVIDIA and Outerbounds Revolutionize LLM-Powered Production Systems
NVIDIA and Outerbounds collaborate to streamline the development and deployment of LLM-powered production systems with advanced microservices and MLOps platforms.
Ollama Enables Local Running of Llama 3.2 on AMD GPUs
Ollama makes it easier to run Meta's Llama 3.2 model locally on AMD GPUs, offering support for both Linux and Windows systems.
LangGraph.js v0.2 Enhances JavaScript Agents with Cloud and Studio Support
LangChain releases LangGraph.js v0.2 with new features for building and deploying JavaScript agents, including support for LangGraph Cloud and LangGraph Studio.
TEAL Introduces Training-Free Activation Sparsity to Boost LLM Efficiency
TEAL offers a training-free approach to activation sparsity, significantly improving the efficiency of large language models (LLMs) with minimal degradation in model quality.
AMD Radeon PRO GPUs and ROCm Software Expand LLM Inference Capabilities
AMD's Radeon PRO GPUs and ROCm software enable small enterprises to leverage advanced AI tools, including Meta's Llama models, for various business applications.
NVIDIA's Blackwell Platform Breaks New Records in MLPerf Inference v4.1
NVIDIA's Blackwell architecture sets new benchmarks in MLPerf Inference v4.1, showcasing significant performance improvements in LLM inference.
MIT Research Unveils AI's Potential in Safeguarding Critical Infrastructure
MIT's new study reveals how large language models (LLMs) can efficiently detect anomalies in critical infrastructure systems, offering a plug-and-play solution.
Understanding Decoding Strategies in Large Language Models (LLMs)
Explore how Large Language Models (LLMs) choose the next word using decoding strategies. Learn about different methods like greedy search, beam search, and more.
Strategies to Optimize Large Language Model (LLM) Inference Performance
NVIDIA experts share strategies to optimize large language model (LLM) inference performance, focusing on hardware sizing, resource optimization, and deployment methods.
NVIDIA Unveils Pruning and Distillation Techniques for Efficient LLMs
NVIDIA introduces structured pruning and distillation methods to create efficient language models, significantly reducing resource demands while maintaining performance.
LangSmith Enhances LLM Apps with Dynamic Few-Shot Examples
LangSmith introduces dynamic few-shot example selectors, which improve LLM app performance by choosing relevant examples based on user input.
Character.AI Enters Agreement with Google, Announces Leadership Changes
Character.AI announces a strategic agreement with Google and key leadership changes to accelerate the development of personalized AI products.
NVIDIA Introduces Efficient Fine-Tuning with NeMo Curator for Custom LLM Datasets
NVIDIA's NeMo Curator offers a streamlined method for curating custom datasets to fine-tune large language models (LLMs), enhancing machine learning workflows.
LangSmith Introduces Flexible Dataset Schemas for Efficient Data Curation
LangSmith now offers flexible dataset schemas, enabling efficient and iterative data curation for LLM applications, as announced on the LangChain blog.