Search Results for "large language models"
Deploying Trillion-Parameter AI Models: NVIDIA's Solutions and Strategies
Explore NVIDIA's strategies for deploying trillion-parameter AI models, including parallelism techniques and the Blackwell architecture.
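The summary only names the techniques, but the core of one of them, tensor parallelism, is easy to illustrate: a layer's weight matrix is split across devices, each device computes a partial result, and an interconnect carries the gather. A minimal NumPy sketch of the idea, illustrative only and not NVIDIA's implementation:

```python
import numpy as np

# Toy tensor parallelism: split a layer's weight matrix column-wise
# across two "devices", run partial matmuls, then gather the outputs.
rng = np.random.default_rng(0)
x = rng.normal(size=(1, 16))          # one activation row
W = rng.normal(size=(16, 8))          # the full weight matrix
W0, W1 = np.split(W, 2, axis=1)       # each device holds half the columns

y0 = x @ W0                           # device 0's partial output
y1 = x @ W1                           # device 1's partial output
y = np.concatenate([y0, y1], axis=1)  # the gather step (interconnect traffic)
assert np.allclose(y, x @ W)          # identical to the single-device result
```

In practice the same split is applied to every large matmul in the model, and the gather/all-reduce traffic between partial results is exactly what fast interconnects are built to carry.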
NVIDIA NVLink and NVSwitch Enhance Large Language Model Inference
NVIDIA's NVLink and NVSwitch technologies boost large language model inference, enabling faster and more efficient multi-GPU processing.
Enhancing Large Language Models with NVIDIA Triton and TensorRT-LLM on Kubernetes
Explore NVIDIA's methodology for optimizing large language models with Triton and TensorRT-LLM, and for deploying and scaling them efficiently in a Kubernetes environment.
NVIDIA Enhances TensorRT-LLM with KV Cache Optimization Features
NVIDIA introduces new KV cache optimizations in TensorRT-LLM, enhancing performance and efficiency for large language models on GPUs by managing memory and computational resources.
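The mechanism behind any KV cache is generic: during autoregressive decoding, the keys and values of already-processed tokens are stored so each new token attends against cached tensors instead of recomputing the full prefix. A minimal NumPy sketch of that idea (all names here are illustrative; this is not TensorRT-LLM's API):

```python
import numpy as np

def attention(q, K, V):
    # Scaled dot-product attention for a single query vector.
    scores = q @ K.T / np.sqrt(q.shape[-1])
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ V

class KVCache:
    """Toy per-sequence cache: append each step's key/value instead of
    recomputing the whole prefix on every decode step."""
    def __init__(self):
        self.K, self.V = [], []

    def step(self, q, k, v):
        self.K.append(k)
        self.V.append(v)
        return attention(q, np.stack(self.K), np.stack(self.V))

# Usage: one decode step per token; per-step cost grows with cached
# length, but the prefix's key/value tensors are never recomputed.
d = 8
cache = KVCache()
rng = np.random.default_rng(0)
for _ in range(5):
    q, k, v = rng.normal(size=(3, d))
    out = cache.step(q, k, v)
print(out.shape)  # (8,)
```

TensorRT-LLM's actual optimizations (features such as paged and reusable caches) build on this same idea; the sketch only shows why caching removes redundant prefix computation.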
NVIDIA's AI Sales Assistant: Insights and Innovations
Explore the development and key learnings from NVIDIA's AI sales assistant, leveraging large language models and retrieval-augmented generation to streamline sales workflows.
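The blurb mentions retrieval-augmented generation without detail, but the pattern itself is simple: embed the query, rank a document store by similarity, and prepend the top hits to the prompt. A minimal sketch in which embed and the document strings are illustrative stand-ins, not NVIDIA's pipeline:

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    # Hypothetical stand-in for a real embedding model: a fixed random
    # projection of character counts, just to keep the sketch runnable.
    counts = np.zeros(128)
    for ch in text.lower():
        counts[ord(ch) % 128] += 1
    proj = np.random.default_rng(42).normal(size=(128, 32))
    return counts @ proj

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    # Rank documents by cosine similarity to the query embedding.
    qv = embed(query)
    def cos(d):
        dv = embed(d)
        return dv @ qv / (np.linalg.norm(dv) * np.linalg.norm(qv) + 1e-9)
    return sorted(docs, key=cos, reverse=True)[:k]

def build_prompt(query: str, docs: list[str]) -> str:
    # Prepend the retrieved context so the model answers from it.
    context = "\n".join(f"- {d}" for d in retrieve(query, docs))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

docs = ["GPU pricing sheet ...", "Sales playbook for enterprise ...",
        "Holiday schedule ..."]
print(build_prompt("What is the enterprise sales process?", docs))
# A real assistant would pass this prompt to an LLM endpoint.
```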
NVIDIA Launches Secure AI General Availability with Enhanced Protection for Large Language Models
NVIDIA announces the general availability of its Secure AI solution, focusing on protecting large language models with enhanced security features.
NVIDIA Enhances Local LLM Experience on RTX PCs with New Tools and Updates
NVIDIA introduces optimizations for running large language models locally on RTX PCs with tools like Ollama and LM Studio, enhancing AI applications' performance and privacy.
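As a concrete example of the local workflow these tools enable, Ollama serves pulled models behind a local REST endpoint on port 11434. A minimal sketch, assuming Ollama is installed and a model has already been pulled (the model name below is illustrative):

```python
import json
import urllib.request

# Ollama's local server listens on port 11434 by default; this assumes
# a model has been pulled, e.g. `ollama pull llama3`.
payload = json.dumps({
    "model": "llama3",    # illustrative model name
    "prompt": "Summarize KV caching in one sentence.",
    "stream": False,      # return a single JSON object instead of a stream
}).encode()

req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=payload,
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["response"])
```

Since everything runs on localhost, prompts and outputs never leave the machine, which is the privacy benefit the article refers to.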
Optimizing Large Language Models with NVIDIA's TensorRT: Pruning and Distillation Explained
Explore how NVIDIA's TensorRT Model Optimizer utilizes pruning and distillation to enhance large language models, making them more efficient and cost-effective.
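The two techniques named above can be illustrated generically (this is a toy sketch, not the TensorRT Model Optimizer API): magnitude pruning zeroes the smallest weights, and distillation trains a smaller student to match a teacher's temperature-softened output distribution.

```python
import numpy as np

def magnitude_prune(W: np.ndarray, sparsity: float) -> np.ndarray:
    # Zero out the smallest-magnitude weights. Real optimizers typically
    # prune structured units (blocks, channels, heads), not single weights.
    k = int(W.size * sparsity)
    threshold = np.sort(np.abs(W), axis=None)[k]
    return np.where(np.abs(W) < threshold, 0.0, W)

def softmax(z, T=1.0):
    z = z / T
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, T=2.0):
    # Cross-entropy of the student against the teacher's temperature-
    # softened distribution: the core objective of knowledge distillation.
    p_teacher = softmax(teacher_logits, T)
    log_p_student = np.log(softmax(student_logits, T) + 1e-12)
    return -(p_teacher * log_p_student).sum(axis=-1).mean()

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 4))
print(magnitude_prune(W, sparsity=0.5))
print(distillation_loss(rng.normal(size=(8, 10)), rng.normal(size=(8, 10))))
```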
TOFU: How AI Can Forget Your Privacy Data
TOFU, an AI benchmark for machine unlearning, aims to make AI systems forget specific, unwanted data while retaining overall knowledge.
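One common conceptual baseline for machine unlearning, and a way to picture what the blurb describes, is to keep taking descent steps on the data to retain while taking ascent steps on the data to forget. A toy logistic-regression sketch of that idea (not TOFU's actual protocol; the hyperparameters here are arbitrary):

```python
import numpy as np

def grad_logistic(w, X, y):
    # Gradient of the mean logistic loss with respect to weights w.
    p = 1.0 / (1.0 + np.exp(-(X @ w)))
    return X.T @ (p - y) / len(y)

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 5)); y = (X[:, 0] > 0).astype(float)
X_forget, y_forget = X[:10], y[:10]     # examples to be "forgotten"
X_retain, y_retain = X[10:], y[10:]     # examples to keep performing on

w = np.zeros(5)
for _ in range(200):                    # initial training on everything
    w -= 0.5 * grad_logistic(w, X, y)
for _ in range(50):                     # unlearning phase
    w -= 0.5 * grad_logistic(w, X_retain, y_retain)  # preserve utility
    w += 0.1 * grad_logistic(w, X_forget, y_forget)  # unlearn targets
print(w)
```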
Former Twitter CEO Parag Agrawal's AI Startup Raises $30 Million
Ex-Twitter CEO Parag Agrawal's new AI startup secures $30 million in funding, focusing on software for large language model developers. Backed by prominent investors, the venture reflects Agrawal's shift from social media to AI innovation.
Enhancing AI's Operational Efficiency: Breakthroughs from Microsoft Research and Peking University
Researchers from Microsoft Research and Peking University have developed new methods that improve LLMs' ability to follow complex instructions and to generate high-quality graphic designs, marking notable advances in AI operational efficiency.
Stanford's WikiChat Addresses Hallucinations Problem and Surpasses GPT-4 in Accuracy
Stanford's WikiChat improves AI chatbot accuracy by grounding responses in Wikipedia, addressing the problem of hallucinations and significantly outperforming GPT-4 in benchmark tests.