Liquid-cooled servers gain a new storage option: the cold-plate SSD.
Data centers face growing demand to support denser, hotter chips, particularly for artificial intelligence (AI) workloads. To meet this challenge, Solidigm has developed cold-plate SSDs.
Cold plates, a common feature in liquid cooling setups for CPUs and memory, are now finding their way into SSDs, significantly enhancing heat management in AI-focused data centers. By providing more efficient and direct cooling compared to air cooling, cold plates facilitate better thermal transfer away from high-performance SSDs, helping maintain stable operating temperatures under sustained, write-intensive AI workloads.
Solidigm's E1.S D7-PS1010 SSD, for instance, allows for hotter SSDs that run faster and offer higher capacity than convection/air-cooled solutions. This SSD is a hot-swappable, liquid-flow-through (LFT) solution with a spring-loaded cold plate that meets the 9.5-mm E1.S standard and is designed to plug into a liquid-cooled chassis (Fig. 2).
In a cold-plate system, heat moves from the device through the plates to the liquid coolant in the chassis, and from there to the external cooling system. Immersion-cooling systems, also known as fluid flow through (FFT), surround the device with liquid coolant but require specially designed devices and containment.
Thermal interface materials with high thermal conductivity (like specialized silicone thermal pads) bridge the gap between the SSD and the cold plate to maximize heat transfer. These setups outperform passive heatsinks and small fans in confined enclosures, where limited airflow cannot sufficiently cool stacked or densely packed NVMe SSDs.
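To see why the interface material matters, a rough one-dimensional conduction estimate helps. The sketch below uses the standard slab relations R = t / (k * A) and dT = P * R. The contact area, gap thickness, heat load, and conductivity figures are illustrative assumptions, not Solidigm specifications, and the air-gap case ignores convection and radiation.

```python
def pad_thermal_resistance(thickness_m: float,
                           conductivity_w_mk: float,
                           area_m2: float) -> float:
    """Conduction resistance of a flat slab, R = t / (k * A), in K/W."""
    if thickness_m <= 0 or conductivity_w_mk <= 0 or area_m2 <= 0:
        raise ValueError("thickness, conductivity, and area must be positive")
    return thickness_m / (conductivity_w_mk * area_m2)


def temperature_rise(power_w: float, resistance_k_per_w: float) -> float:
    """Steady-state temperature drop across the interface, dT = P * R."""
    return power_w * resistance_k_per_w


# Illustrative values only (assumptions, not vendor data).
AREA_M2 = 0.030 * 0.100   # assumed 30 mm x 100 mm contact patch
THICKNESS_M = 0.001       # assumed 1 mm gap between SSD and cold plate
POWER_W = 20.0            # assumed SSD heat load

# Approximate conductivities in W/(m*K): still air vs. a generic thermal pad.
for label, k in (("air gap", 0.026), ("thermal pad", 6.0)):
    r = pad_thermal_resistance(THICKNESS_M, k, AREA_M2)
    dt = temperature_rise(POWER_W, r)
    print(f"{label}: R = {r:.3f} K/W, dT = {dt:.1f} K")
```

Under these assumptions the pad lowers the interface resistance by roughly the ratio of the two conductivities, which is why even a thin unfilled gap can dominate the thermal path.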
Moreover, cold-plate liquid-cooling systems often integrate sensor monitoring for real-time thermal management, which is crucial in AI data centers for dynamic workload optimization and reliability assurance.
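As a minimal sketch of what such monitoring logic might look like, the following maps raw drive temperature readings to a coarse action. It assumes readings arrive as millidegree-Celsius strings, the format Linux hwmon `temp*_input` files use. The thresholds, action names, and the `ThermalPolicy` class are hypothetical and would come from the drive's and chassis's actual limits in practice.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class ThermalPolicy:
    warn_c: float = 70.0      # hypothetical warning threshold
    critical_c: float = 80.0  # hypothetical critical threshold


def millidegrees_to_celsius(raw: str) -> float:
    """Convert a millidegree reading such as '45850' to 45.85 degrees C."""
    return int(raw.strip()) / 1000.0


def classify(temp_c: float, policy: ThermalPolicy = ThermalPolicy()) -> str:
    """Map one temperature to a coarse action label."""
    if temp_c >= policy.critical_c:
        return "throttle"
    if temp_c >= policy.warn_c:
        return "increase_coolant_flow"
    return "ok"


def worst_action(raw_readings: Iterable[str],
                 policy: ThermalPolicy = ThermalPolicy()) -> str:
    """Decide on the action required by the hottest drive in a chassis.

    Raises ValueError if no readings are supplied.
    """
    hottest = max(millidegrees_to_celsius(r) for r in raw_readings)
    return classify(hottest, policy)


# Example: three drives, one of them above the warning threshold.
print(worst_action(["45850", "71200", "52000"]))
```

Keeping the classification separate from the sensor read makes the policy easy to test without hardware; a real deployment would feed it from whatever telemetry the chassis exposes.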
The adoption of LFT solutions parallels the move to liquid cooling for CPUs, GPUs, AI accelerators, SmartNICs, and DPUs, as chips become larger and hotter due to approaches like chiplets. The D7-PS1010 SSD, for example, is populated with 176-layer TLC 3D NAND and employs a x4 PCI Express (PCIe) Gen 5 interface. It offers capacities in excess of 15 TB and is also available in a 15-mm U.2 form factor.
The D7-PS1010 SSD family boasts a sequential read bandwidth of 14,500 MB/s and a write bandwidth of 4,100 MB/s. It maintains a read latency of 60 μs and a write latency of 7 to 8 μs. The mean time between failures (MTBF) for the D7-PS1010 SSD is 2.5 million hours.
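To put these figures in perspective, the short calculation below estimates how long a full-drive sequential pass would take and what the MTBF implies as an annualized failure rate. It assumes a 15 TB drive (the article says only "in excess of 15 TB"), decimal units, sustained peak bandwidth, and a constant failure rate with 24x7 operation. These are simplifications, not vendor claims.

```python
import math

HOURS_PER_YEAR = 8760  # 24 x 365, assumes continuous operation


def seconds_to_transfer(capacity_tb: float, bandwidth_mb_s: float) -> float:
    """Time to move capacity_tb (decimal TB) at bandwidth_mb_s (decimal MB/s)."""
    return (capacity_tb * 1e12) / (bandwidth_mb_s * 1e6)


def annualized_failure_rate(mtbf_hours: float) -> float:
    """AFR under a constant-failure-rate (exponential) model."""
    return 1.0 - math.exp(-HOURS_PER_YEAR / mtbf_hours)


CAPACITY_TB = 15.0        # assumption: article states "in excess of 15 TB"
READ_MB_S = 14_500        # sequential read bandwidth from the article
WRITE_MB_S = 4_100        # write bandwidth from the article
MTBF_HOURS = 2_500_000    # MTBF from the article

read_min = seconds_to_transfer(CAPACITY_TB, READ_MB_S) / 60
write_min = seconds_to_transfer(CAPACITY_TB, WRITE_MB_S) / 60
afr_pct = annualized_failure_rate(MTBF_HOURS) * 100

print(f"Full sequential read:  {read_min:.1f} min")
print(f"Full sequential write: {write_min:.1f} min")
print(f"Annualized failure rate: {afr_pct:.2f} %")
```

With these inputs a full read takes on the order of a quarter of an hour and a full write about an hour, and the AFR comes out to roughly a third of a percent.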
Cold-plate SSDs can be used in both data centers and rugged embedded systems. Notably, most liquid-cooled systems employ conduction cooling using a cold plate that is tied to a liquid-cooling system, often referred to as direct-to-chip cooling. The liquid doesn't come in direct contact with the boards or modules in a cold-plate system.
Adding cold plates to SSDs gives AI-focused data centers more efficient and direct cooling than air. The approach reduces SSD temperatures, mitigates thermal throttling, lowers fan and air-conditioning power consumption, and sustains high-performance AI workloads more effectively than traditional air-based cooling.
(Fig. 1) Solidigm's E1.S D7-PS1010 SSD.
(Fig. 2) Solidigm's D7-PS1010 SSD in a liquid-cooled chassis.
- Solidigm's E1.S D7-PS1010 SSD, a hot-swappable LFT solution with a spring-loaded cold plate, enables hotter SSDs that run faster and offer higher capacity, and it suits both data centers and rugged embedded systems.
- The adoption of liquid-flow-through (LFT) solutions like the D7-PS1010 SSD tracks the trend toward larger, hotter chips driven by approaches like chiplets, as demand for cloud computing and AI workloads increases.
- Cold-plate liquid cooling in AI data centers improves workload optimization, reliability, and sustained performance while managing the heat output of increasingly dense, hot chips and lowering the power spent on cooling.