NVIDIA Unveils New Language Models for RTX AI PCs
Terrill Dicki Dec 17, 2024 18:31
NVIDIA introduces small language models to enhance digital human responses, enabling improved interaction with agents, assistants, and avatars on RTX AI PCs.
NVIDIA has announced a new series of small language models (SLMs) aimed at enhancing the capabilities of digital humans. The models are part of NVIDIA ACE, a suite of technologies designed to bring agents, assistants, and avatars to life on RTX AI PCs.
Introducing Multi-Modal Capabilities
The new models include NVIDIA Nemovision-4B-Instruct, a multi-modal SLM that lets digital humans interpret visual imagery and give contextually relevant responses. Built with the latest NVIDIA VILA and NeMo frameworks, the models are optimized to run across a wide range of NVIDIA RTX GPUs while maintaining the accuracy developers need.
Large-Context Language Models
NVIDIA's new large-context SLMs are designed to manage extensive data inputs, facilitating the understanding of complex prompts. The Mistral-NeMo-Minitron-128k-Instruct family, available in 8B, 4B, and 2B parameter versions, balances speed, memory usage, and accuracy on NVIDIA RTX AI PCs. These models can process significant data volumes in a single pass, enhancing accuracy by reducing the need for data segmentation.
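To make the single-pass point concrete, here is a minimal sketch of how a larger context window reduces the number of segments a long input must be split into. It is illustrative only: it assumes the "128k" in the model name refers to a context window of roughly 128,000 tokens, uses a rough four-characters-per-token heuristic instead of a real tokenizer, and the function names and the reserved output budget are hypothetical, not part of any NVIDIA API.

```python
import math

# Assumption: "128k" in the model name means a ~128,000-token context window.
CONTEXT_WINDOW_TOKENS = 128_000
# Assumption: rough heuristic of 4 characters per token, not a real tokenizer.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Roughly estimate the token count of a text from its length."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def plan_passes(text: str,
                window: int = CONTEXT_WINDOW_TOKENS,
                reserved_for_output: int = 1_000) -> list[str]:
    """Split text into the segments needed to fit a given context window.

    A single-element result means the whole input fits in one pass.
    """
    budget = window - reserved_for_output  # tokens left for the input
    if budget <= 0:
        raise ValueError("window must exceed reserved_for_output")
    chars_per_pass = budget * CHARS_PER_TOKEN
    segments = [text[i:i + chars_per_pass]
                for i in range(0, len(text), chars_per_pass)]
    return segments or [""]


if __name__ == "__main__":
    document = "x" * 200_000  # stand-in for a long prompt
    print(estimate_tokens(document))                  # estimated input tokens
    print(len(plan_passes(document)))                 # passes with the large window
    print(len(plan_passes(document, window=8_000)))   # passes with a small window
```

Under these assumptions, an input that fits in one pass with the large window has to be cut into several segments for a much smaller one, and each cut is a place where context can be lost.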
Enhancements in Audio2Face-3D NIM
NVIDIA has also updated its Audio2Face-3D NIM microservice to improve the realism of facial animation, which is crucial for authentic digital human interactions. The microservice now supports real-time lip-sync and facial animation, and it ships as a single downloadable, optimized container with expanded customization options.
Streamlining Deployment on RTX AI PCs
Deploying digital humans on RTX AI PCs requires efficient orchestration of animation, intelligence, and speech AI models. NVIDIA is introducing new SDK plugins and samples to facilitate on-device workflows, including NVIDIA Riva Automatic Speech Recognition and an Unreal Engine 5 sample application powered by Audio2Face-3D. These tools are part of the NVIDIA In-Game Inference SDK, currently available in beta, which simplifies AI integration by managing model and dependency downloads and enabling hybrid AI operations.
Interested developers can access these tools through the NVIDIA Developer platform.