NVIDIA and DDN Collaborate to Enhance AI Infrastructure with BlueField-3 DPU Integration
Lawrence Jengar Jul 24, 2024 05:28
NVIDIA and DDN Storage's integration of BlueField-3 DPUs promises to revolutionize AI infrastructure with improved efficiency, scalability, and security.
 
                                
As AI becomes integral to organizational innovation and competitive advantage, the need for efficient and scalable infrastructure is more critical than ever. A partnership between NVIDIA and DDN Storage is setting new standards in this area. By integrating NVIDIA BlueField DPUs into DDN EXAScaler and DDN Infinia, DDN Storage is transforming data-centric workloads, according to the NVIDIA Technical Blog.
An Integrated DPU Storage Solution
DDN Infinia, a software-defined data platform, harnesses the power of BlueField-3 DPUs to manage data-centric workloads effectively, especially in accelerated computing and generative AI. The integration enhances multi-tenancy, amplifies operational efficiency, and bolsters data protection. This makes it an ideal solution for organizations keen on using AI and cloud technologies to drive innovation and operational agility.
Their solution involves several key components:
- Offloading data processing
- Accelerating storage performance
- Improving efficiency
- Supporting multi-tenancy
- Enhancing security
- Enhancing scaling
Offloading Data Processing
BlueField DPUs relieve the CPU by taking over data processing tasks, which frees up compute resources and boosts overall system performance. Offloading storage and security tasks to the DPU lets the CPU spend its cycles on application work, reducing latency and speeding up data processing.
Accelerating Storage Performance
DDN’s storage solutions, empowered by BlueField DPUs, enhance storage performance for AI workloads. Using the advanced data processing capabilities of BlueField DPUs, these solutions achieve higher throughput and improved system responsiveness to accelerate AI applications.
NVIDIA GPUDirect Storage (GDS) facilitates a direct data path between GPU platforms and storage, minimizing system memory traffic, which in turn enhances bandwidth and reduces CPU load to optimize AI workflows.
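The effect of a direct data path on system memory traffic can be illustrated with a toy accounting model. This is a minimal sketch, not the GPUDirect Storage API: the names `TransferStats`, `bounce_buffer_read`, and `direct_read` are hypothetical, and the model assumes the conventional path stages each payload once in a host-RAM bounce buffer with a single CPU staging copy.

```python
from dataclasses import dataclass


@dataclass
class TransferStats:
    """Byte accounting for one storage-to-GPU read (toy model, not measured)."""
    host_memory_traffic_bytes: int  # bytes written to plus bytes read from host RAM
    cpu_staging_copies: int         # assumed number of CPU-driven staging copies
    gpu_bytes: int                  # bytes that end up in GPU memory


def bounce_buffer_read(size_bytes: int) -> TransferStats:
    """Conventional path: storage -> host RAM bounce buffer -> GPU memory."""
    # Assumption: the payload is written into host RAM once and read back once.
    return TransferStats(
        host_memory_traffic_bytes=2 * size_bytes,
        cpu_staging_copies=1,
        gpu_bytes=size_bytes,
    )


def direct_read(size_bytes: int) -> TransferStats:
    """Direct path: data moves from storage into GPU memory without touching host RAM."""
    return TransferStats(
        host_memory_traffic_bytes=0,
        cpu_staging_copies=0,
        gpu_bytes=size_bytes,
    )


if __name__ == "__main__":
    one_gib = 1 << 30
    print("bounce buffer:", bounce_buffer_read(one_gib))
    print("direct path:  ", direct_read(one_gib))
```

In this model both paths deliver the same number of bytes to the GPU; only the host-memory traffic and CPU involvement differ, which is the benefit the article attributes to GDS.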
Improving Efficiency
Traditional storage systems perform various tasks such as flash management, RAID, access control, and encryption on general-purpose x86 CPUs. However, running these tasks on general-purpose CPUs is becoming inefficient as network speeds and security demands escalate.
Integrating BlueField DPUs into storage servers and into the hosts that access them significantly enhances storage efficiency by offloading and accelerating tasks like the NVMe-oF storage protocol, thus freeing up CPU cycles for other applications.
Supporting Multi-Tenancy
The DDN Infinia storage platform employs containerization, enabling different storage functions to run in separate containers. This architecture facilitates scalability and optimizes the entire data path by offloading tasks to DPUs, reducing latency.
Multi-tenant deployment consolidates multiple namespaces within a single file system, improving capacity utilization, reducing hardware costs, and streamlining deployment and management.
The hardware-based isolation and resource allocation capabilities of the BlueField DPUs enable the secure sharing of infrastructure resources among multiple users and applications, improving resource utilization and operational efficiency.
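The idea of several isolated namespaces drawing on one shared capacity pool can be sketched in a few lines. This is an illustrative model only, not DDN Infinia's implementation: `SharedFileSystem` and its methods are hypothetical names, and isolation here is enforced in software rather than by DPU hardware.

```python
from typing import Dict


class SharedFileSystem:
    """Toy model: one capacity pool shared by several tenant namespaces."""

    def __init__(self, capacity_bytes: int) -> None:
        self.capacity_bytes = capacity_bytes
        # tenant name -> (path -> file contents); each tenant sees only its own map
        self._namespaces: Dict[str, Dict[str, bytes]] = {}

    def create_namespace(self, tenant: str) -> None:
        """Register a tenant; an existing namespace is left untouched."""
        self._namespaces.setdefault(tenant, {})

    def used_bytes(self) -> int:
        """Total bytes stored across all tenants in the shared pool."""
        return sum(len(data) for ns in self._namespaces.values() for data in ns.values())

    def write(self, tenant: str, path: str, data: bytes) -> None:
        ns = self._namespace(tenant)
        replaced = len(ns.get(path, b""))
        # Capacity is checked against the whole pool, not a per-tenant silo.
        if self.used_bytes() - replaced + len(data) > self.capacity_bytes:
            raise OSError("shared pool is full")
        ns[path] = data

    def read(self, tenant: str, path: str) -> bytes:
        # A tenant can only resolve paths inside its own namespace.
        return self._namespace(tenant)[path]

    def _namespace(self, tenant: str) -> Dict[str, bytes]:
        if tenant not in self._namespaces:
            raise PermissionError(f"unknown tenant: {tenant}")
        return self._namespaces[tenant]


if __name__ == "__main__":
    fs = SharedFileSystem(capacity_bytes=1024)
    fs.create_namespace("team-a")
    fs.create_namespace("team-b")
    fs.write("team-a", "/model.bin", b"weights")
    fs.write("team-b", "/model.bin", b"other")
    print(fs.read("team-a", "/model.bin"), fs.read("team-b", "/model.bin"))
    print("pool bytes used:", fs.used_bytes())
```

Both tenants can use the same path without colliding, and unused capacity from one tenant remains available to the other, which is the utilization benefit the consolidated design aims for.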
 Figure 1. DDN Infinia isolates user data securely
Enhancing Security
The dedicated processing resources and memory of BlueField DPUs provide a secure environment, preventing unauthorized access and protecting against potential attacks. Hardware-accelerated encryption ensures that data stored in the storage system is encrypted at rest, safeguarding sensitive information.
The access control mechanisms of the BlueField DPU enable administrators to define and enforce fine-grained access policies, ensuring that only authorized users or applications can access and modify the data. Secure boot capabilities verify the integrity of firmware and software components during the boot process, preventing tampering or unauthorized modifications.
Offloading security-related tasks from the host CPU reduces the attack surface and frees up CPU resources for other critical tasks.
With these combined security features, BlueField DPUs provide a robust and secure storage solution that protects AI workloads and data along the path from the DPU to the CPU. The combined technology stack keeps data protected, addressing concerns around data security and integrity in AI-driven environments. Organizations gain stronger protection against cyberthreats and unauthorized access, which improves overall data security and compliance.
Enhancing Scaling
DDN Infinia is a fully containerized platform, structured around a set of orchestrated microservices that deliver the entire storage service. Using BlueField DPUs, DDN has developed a completely new architecture that supports a full cloud-native stack. This innovative use of BlueField DPUs enables the storage platform to extend across the network.
Specifically, DDN Infinia’s Amazon S3 object services are containerized and can operate independently of the Infinia storage system by using resources from NVIDIA DPUs in NVIDIA DGX client systems. The design shift completely revamps how data flows through the storage system. Instead of crossing the network, Amazon S3 object calls are made locally to services running on BlueField. Conventional storage relies on commands sent over a network (RESTful calls), which can be slow.
With BlueField, those calls are replaced with RDMA calls from the DPU to the storage system. This offloads the storage tasks from the main system and uses a more efficient data path, significantly reducing delays and boosting bandwidth for AI acceleration. This reconfiguration of the storage architecture transforms the Amazon S3 object datapath, significantly enhancing performance and scalability.
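The difference between the two data paths can be made concrete with a small model in which the client-facing S3 interface stays the same and only the route to storage changes. This is a sketch under stated assumptions, not DDN's or NVIDIA's code: `StorageSystem`, `DataPath`, `S3Client`, and the hop labels are hypothetical, and no real REST or RDMA traffic is generated.

```python
from dataclasses import dataclass
from typing import Dict, List


class StorageSystem:
    """Stand-in for the storage back end: an in-memory object map."""

    def __init__(self) -> None:
        self.objects: Dict[str, bytes] = {}


@dataclass
class DataPath:
    """A named route from the client to storage, described as a list of hops."""
    name: str
    hops: List[str]
    storage: StorageSystem


def network_rest_path(storage: StorageSystem) -> DataPath:
    # Conventional layout: the S3 service runs on the storage system.
    return DataPath(
        name="conventional",
        hops=["REST call over the network", "S3 service on the storage system"],
        storage=storage,
    )


def dpu_local_path(storage: StorageSystem) -> DataPath:
    # Layout described in the article: the S3 service runs on the client-side DPU.
    return DataPath(
        name="dpu-offloaded",
        hops=["local call to S3 service on the DPU", "RDMA transfer to the storage system"],
        storage=storage,
    )


class S3Client:
    """Minimal S3-like client that records which hops each request takes."""

    def __init__(self, path: DataPath) -> None:
        self.path = path
        self.trace: List[str] = []

    def put_object(self, key: str, data: bytes) -> None:
        self.trace.extend(self.path.hops)
        self.path.storage.objects[key] = data

    def get_object(self, key: str) -> bytes:
        self.trace.extend(self.path.hops)
        return self.path.storage.objects[key]


if __name__ == "__main__":
    storage = StorageSystem()
    for make_path in (network_rest_path, dpu_local_path):
        client = S3Client(make_path(storage))
        client.put_object("dataset/shard-0", b"training data")
        print(client.path.name, "->", client.trace)
```

The application code calling `put_object` and `get_object` is identical in both cases; the design choice the article describes lies entirely in where the S3 service runs and how it reaches the storage system.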
Summary
The collaboration between DDN and NVIDIA is poised to significantly advance AI applications within data center infrastructures, setting the stage for more efficient and secure AI-driven workflows. By using the combined strengths of advanced data processing and storage solutions, organizations can expect enhanced efficiency, scalability, and security in AI initiatives.
Image source: Shutterstock