LARGE LANGUAGE MODELS
Anthropic Drops Long-Context Premium as Claude 4.6 Models Hit 1M Tokens
Claude Opus 4.6 and Sonnet 4.6 now offer full 1M token context windows at standard API pricing, eliminating the long-context premium entirely.
Anthropic Discovers 'Assistant Axis' to Prevent AI Jailbreaks and Persona Drift
Anthropic researchers map neural 'persona space' in LLMs, finding a key axis that controls AI character stability and blocks harmful behavior patterns.
Optimizing Large Language Models with NVIDIA's TensorRT: Pruning and Distillation Explained
Explore how NVIDIA's TensorRT Model Optimizer utilizes pruning and distillation to enhance large language models, making them more efficient and cost-effective.
NVIDIA Enhances Local LLM Experience on RTX PCs with New Tools and Updates
NVIDIA introduces optimizations for running large language models locally on RTX PCs with tools like Ollama and LM Studio, enhancing AI applications' performance and privacy.
NVIDIA Launches Secure AI General Availability with Enhanced Protection for Large Language Models
NVIDIA announces the general availability of its Secure AI solution, focusing on protecting large language models with enhanced security features.
NVIDIA's AI Sales Assistant: Insights and Innovations
Explore the development and key learnings from NVIDIA's AI sales assistant, leveraging large language models and retrieval-augmented generation to streamline sales workflows.
NVIDIA Enhances TensorRT-LLM with KV Cache Optimization Features
NVIDIA introduces new KV cache optimizations in TensorRT-LLM, enhancing performance and efficiency for large language models on GPUs by managing memory and computational resources.
Enhancing Large Language Models with NVIDIA Triton and TensorRT-LLM on Kubernetes
Explore NVIDIA's methodology for optimizing large language models using Triton and TensorRT-LLM, while deploying and scaling these models efficiently in a Kubernetes environment.
NVIDIA NVLink and NVSwitch Enhance Large Language Model Inference
NVIDIA's NVLink and NVSwitch technologies boost large language model inference, enabling faster and more efficient multi-GPU processing.
Deploying Trillion Parameter AI Models: NVIDIA's Solutions and Strategies
Explore NVIDIA's strategies for deploying trillion-parameter AI models, including parallelism techniques and the Blackwell architecture.
Enhancing AI's Operational Efficiency: Breakthroughs from Microsoft Research and Peking University
Researchers from Microsoft Research and Peking University have developed new methods that improve LLMs' ability to follow complex instructions and generate high-quality graphic designs, marking notable advances in AI operational efficiency.
How Jailbreak Attacks Compromise ChatGPT and AI Models' Security
Recent studies reveal the vulnerability of large language models like GPT-4 to jailbreak attacks. Defense strategies such as self-reminders are being developed to mitigate these risks, underscoring the need for stronger AI security and ethical safeguards.
TOFU: How AI Can Forget Your Privacy Data
TOFU, a benchmark for machine unlearning, tackles the challenge of making AI systems forget specific, unwanted data while retaining their overall knowledge.
Navigating the Resource Efficiency of Large Language Models: A Comprehensive Survey
A survey examines the resource efficiency of Large Language Models (LLMs) like OpenAI's ChatGPT, addressing their high computational demands and proposing optimization strategies.
How LLMs Are Reshaping Agent-Based Modeling and Simulation
LLMs are reshaping agent-based modeling, enhancing simulations in social, economic, and cyber domains with advanced AI integration.