Marc’s Best GPU Servers of SC14

This afternoon I spent a bit of time walking around the SC14 show floor, and here is a list of my favorite NVIDIA GPU-powered servers. While very unofficial, I did follow a few guidelines. First, the NVIDIA partner had to have the server with NVIDIA GPUs displayed on the show floor. Second, there had to be someone in the booth who offered to talk to me intelligently about their GPU-powered solutions. Based on those simple guidelines, here are my favorites. It’s great to see so many new GPU-powered solutions out on the show floor. If I missed one of your favorites, let me know. I’ll be out on the show floor again tomorrow and happy to take a look and listen.

Best Water-Cooled GPU Solution

HP’s Apollo 8000 also wins extra bonus points as the tallest GPU server. This beast is not for the casual user. Not only does it pack 72 server nodes with 144 GPUs into a single rack, it also manages to include all the Mellanox InfiniBand leaf switches you need for a full fat-tree topology. Besides the efficiency of HP’s unique liquid cooling solution, the Apollo 8000 also saves power with its 480V power supply and HVDC internal power distribution. While some of the other solutions may physically fit more than 144 GPUs in a rack, this is likely the densest GPU solution you can actually operate, especially when you consider that it integrates all the InfiniBand leaf switches. Downside? Only two GPUs per node are offered.
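As a quick sanity check on that density figure, the arithmetic can be sketched in a few lines of Python. The function and variable names here are made up for illustration; the node and GPU counts are the ones quoted above.

```python
def gpus_per_rack(nodes_per_rack: int, gpus_per_node: int) -> int:
    """Total GPUs in a rack, assuming every node is fully populated."""
    return nodes_per_rack * gpus_per_node


# Figures quoted above for the Apollo 8000: 72 nodes, 2 GPUs per node.
apollo_8000_gpus = gpus_per_rack(nodes_per_rack=72, gpus_per_node=2)
print(apollo_8000_gpus)  # 144
```

The same helper can be pointed at any other rack configuration, as long as you plug in node counts you can actually power and cool.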

Best 8-GPU Solution Proven in Top500

Cray’s CS-Storm hits the other end of the GPUs-per-node range, supporting 8 GPUs in a compact 2RU form factor. With so many new GPU-powered servers now available, many of the systems out on the show floor have yet to be proven out in large Top500 configurations. Not so the CS-Storm, which managed to be the only new server to break into the Top 10 of the Top500. While the CS-Storm is a standalone rack-mount server, it really is intended to be sold in complete rack configurations. Cray integrates not only the power and the optional rear-door water cooling likely to be required by most full-rack configurations, but also does one of the best jobs of integrating an entire software stack, including OS and management tools. Downside? The CS-Storm requires a non-standard width rack. Penalty points for only being displayed behind a plastic cover. SC14 is the last great hardware show on the planet, and we want to leave fingerprints on your servers.

Most Improved 4-GPU Solution

While Dell ships a lot of NVIDIA GPUs, they haven’t historically had category-leading products. Well, that all changed on Monday with the new C4130. Moving away from earlier multi-node GPU designs and their complications, the C4130 is a new single-node, 1RU, 4-GPU server which is even “EDR-ready” for the new Mellanox 100G InfiniBand, thanks to careful PCI slot layout. Dell also figured out how to support all 4 GPUs with a single x86 CPU, so customers whose applications don’t need the extra serial performance can skip paying for the extra CPU. Especially with the new NVIDIA K80 GPU module sporting 2 Kepler GK210 GPU chips in each module (8 GPUs total across 4 modules), the C4130 promises to quickly become a workhorse GPU solution.

Best Non-x86 GPU Server

The IBM booth was happily displaying this unnamed future OpenPower-based server. Supporting two NVIDIA K80 GPUs in 2RU, with up to 1TB of RAM, this promises to be an interesting server for customers wanting to get started with Power + GPUs today, before the next-generation NVLink-connected Pascal + Power8+ systems start shipping.

Best 8-way PCI Design

Cirrascale has an interesting 8-way design that allows up to 8 NVIDIA GPUs to be configured on a single PCIe root complex, which is optimal for some applications with heavy GPU peer-to-peer communications. Most other 8-way designs split the GPUs between the separate PCIe root complexes of the two host CPUs. The same server also supports a more traditional split PCI design. This isn’t the densest solution at 5RU, but as denser solutions typically require water cooling, the 5RU design isn’t likely to be an issue and in fact makes air cooling a lot easier than it is in some of the denser designs.

Best Dense 8-way Standard Rack Mount Server

Penguin wins this one by managing to fit 8 K80s into a standard-width 2RU rack-mount server. More than a quarter rack of these and you had better start shopping for water-cooled rear doors. But if you are looking for a super-dense 8-way server, you should take a look at Penguin.


About Marc Hamilton

Marc Hamilton – Vice President, Solutions Architecture and Engineering, NVIDIA. At NVIDIA, the Visual Computing Company, Marc leads the worldwide Solutions Architecture and Engineering team, responsible for working with NVIDIA’s customers and partners to deliver the world’s best end to end solutions for professional visualization and design, high performance computing, and big data analytics. Prior to NVIDIA, Marc worked in the Hyperscale Business Unit within HP’s Enterprise Group where he led the HPC team for the Americas region. Marc spent 16 years at Sun Microsystems in HPC and other sales and marketing executive management roles. Marc also worked at TRW developing HPC applications for the US aerospace and defense industry. He has published a number of technical articles and is the author of the book, “Software Development, Building Reliable Systems”. Marc holds a BS degree in Math and Computer Science from UCLA, an MS degree in Electrical Engineering from USC, and is a graduate of the UCLA Executive Management program.
