Last month, PSC received an NVIDIA DGX-2, which NVIDIA bills as the world’s most powerful deep learning system. According to NVIDIA, the DGX-2 is the first 2-petaFLOPS system, combining 16 fully interconnected GPUs for 10X the deep learning performance.
These photos show the installation of the DGX-2, from delivery to completion.
The Arrival
Paola Buitrago, director of PSC’s Artificial Intelligence & Big Data Group, had the honor of wheeling in PSC’s new DGX-2, delivered to the machine room on “need date here.”
Bridges has been a popular platform for AI research, partly because it offers a mix of traditional CPUs and graphics processing units (GPUs), which are much more powerful than CPUs for some computational tasks, particularly in AI. The new Bridges-AI addition to the system, of which the DGX-2 is an important component, will raise that capability to the next level.
The Unveiling
Interim Director Nick Nystrom and Paola Buitrago are excited to remove the packaging from the new resource.
The Staging
Clint Perrone, senior systems specialist, and Nick Nystrom carefully take the machine components to the staging area.
…
Staging complete. On to the installation!
The Installation Begins
…
…
The DGX-2 is installed in its Bridges cabinet.
With a projected processing speed of 2 petaFLOPS, the unit will increase Bridges’ overall speed considerably. The DGX-2 at PSC is the first of its kind available for open research: a new generation of GPU machine specially designed to accelerate AI research.
Let’s Talk
[Nick, I think this would be a good place to talk about the interconnects and the role they play in the DGX-2’s performance, but I’m not sure exactly what the brag points would be. I know that the DGX-2 has its own interconnect that’s different from Omni-Path, but not much more than that. What could we say here?]
…
With its power and interconnect installed, the DGX-2 runs through its initial paces.
The relative ease of installation is paralleled by ease of use; the DGX-2 will offer popular languages and frameworks such as Python, Jupyter, R, MATLAB, Java, Spark and Hadoop. In addition to these popular tools, Bridges can also run virtual machines and containers, support gateways and even be used interactively.
Testing, Testing 1-2-3
Now powered up, the unit has its firmware upgraded and its diagnostics run.
Ready for Users
The complete Bridges-AI, with the DGX-2 installed, is ready to go.
The system’s early user period for testing its capabilities has started. Groups are applying deep learning on the system to critical problems: mining fMRI data for clues about the likely clinical course in emphysema; identifying the subtle relations between genes that govern cancer; identifying patterns in traffic data that can be used to prevent gridlock; and investigating whether the new capabilities of the Volta GPUs in Bridges-AI can allow new AI programming that removes some of the field’s current limitations.