GPU
RAPIDS Introduces GPU Polars Streaming and Unified GNN API Enhancements
NVIDIA's RAPIDS suite version 25.06 unveils new features including GPU Polars streaming, a unified GNN API, and zero-code ML speedups, enhancing Python data science capabilities.
Efficient AI Pipelines: NVIDIA's NeMo Retriever Extraction on a Single GPU
NVIDIA's NeMo Retriever offers a streamlined solution for multimodal document extraction using a single GPU, enhancing AI pipelines' efficiency and reducing operational costs.
NVIDIA Enhances Multi-GPU Communication with NCCL 2.26 Release
NVIDIA's NCCL 2.26 introduces performance enhancements, improved monitoring, and quality of service features, optimizing multi-GPU and multinode communications for AI and HPC applications.
Aethir and Bitfinex Host Insightful AMA on Decentralized GPU Infrastructure
Aethir and Bitfinex held an AMA session exploring decentralized GPU infrastructure, its impact on AI and gaming, and future plans involving the $ATH token.
Aethir's Decentralized Infrastructure Gains Spotlight in Bitfinex AMA
Aethir's decentralized infrastructure for GPU computing was discussed in a recent AMA hosted by Bitfinex and BitFreedomGus, highlighting its impact on AI and gaming sectors.
Enhancing Molecular Dynamics with NVIDIA's Multi-Process Service
NVIDIA's Multi-Process Service optimizes GPU usage in molecular dynamics simulations, boosting throughput by running concurrent processes on a single GPU.
Kaggle Competition Winner Reveals Stacking Strategy with cuML
Kaggle Grandmaster Chris Deotte shares insights on winning the April 2025 Kaggle competition using stacking with cuML, leveraging GPU acceleration for fast and efficient modeling.
Harnessing AI's Potential with Decentralized Compute Networks
Explore how decentralized compute networks address the rising demand for AI applications, offering scalable solutions through consumer-grade GPUs. Learn about real-world use cases and industry partnerships.
NVIDIA's cuEmbed Boosts GPU Performance for Embedding Lookups
NVIDIA unveils cuEmbed, a CUDA library that significantly enhances embedding lookups on GPUs, promising improved performance for recommendation systems and other applications.
Enhancing Polars GPU Parquet Reader Performance with Chunked Reading and UVM
Explore how Polars GPU Parquet Reader boosts performance using chunked reading and Unified Virtual Memory, enhancing data processing capabilities for large datasets.
DeepSeek-R1 Enhances GPU Kernel Generation with Inference Time Scaling
NVIDIA pairs the DeepSeek-R1 model with inference-time scaling to improve GPU kernel generation, spending additional compute during inference to produce better-optimized kernels for AI models.
NVIDIA Unveils Enhanced Features in NCCL 2.23 for Improved GPU Communication
NVIDIA's NCCL 2.23 release introduces a new scaling algorithm, accelerated initialization, and a profiler plugin API, optimizing inter-GPU and multinode communication for AI and HPC applications.
Injective (INJ) and Aethir Transform GPU Compute Resources with Tokenization
Injective (INJ) and Aethir collaborate to tokenize GPU compute resources, enhancing access and efficiency in AI and blockchain sectors through a novel tradeable token system.
Warp 1.5.0 Introduces Tile-Based Programming for Enhanced GPU Efficiency
Warp 1.5.0 launches tile-based programming in Python, leveraging cuBLASDx and cuFFTDx for efficient GPU operations, significantly improving performance in scientific computing and simulation.
NVIDIA's RAPIDS cuDF Enhances pandas Through Unified Virtual Memory
NVIDIA's RAPIDS cuDF uses Unified Virtual Memory to boost pandas performance by up to 50x, offering seamless integration with existing workflows and GPU acceleration.