One of this week’s big GPU Technology Conference announcements was NVLink. If you missed the news, NVIDIA’s Ian Buck does a great job explaining the highlights to insideHPC in the short video below.
Thinking back on my days working for server companies, NVLink is going to open up tremendous opportunities for innovation in the server space. Some of the basic NVLink configurations possible are illustrated below.
The vast majority of data center servers today, when you lift up the covers, share the same basic design built around two CPUs. Long before server vendors added a GPU to any server, the PCI bus was used for all sorts of add-on cards, including network adapters, storage controllers, and hundreds of different types of relatively low-speed interfaces. A recent search for “pci card” on Amazon returned over 39,000 results. But I doubt the original creators of PCI ever envisioned connecting something as powerful as a modern GPU via PCI. Server vendors, processor vendors, and GPU vendors have gone to great lengths to keep increasing PCI performance, but with the backward compatibility that must be preserved, PCI simply has not kept up.
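To put that gap in rough numbers, here is a quick back-of-the-envelope sketch. The PCIe figures follow from the Gen 3 specification (16 lanes at 8 GT/s with 128b/130b encoding); the NVLink range is the 80–200 GB/s figure quoted in the announcement, so treat the ratios as approximate:

```python
# Back-of-the-envelope: PCIe 3.0 x16 bandwidth vs. the announced NVLink range.
# PCIe numbers derive from the Gen 3 spec; the NVLink range (80-200 GB/s)
# is the figure quoted at GTC 2014, not a measured result.

LANES = 16
GT_PER_S = 8.0            # PCIe 3.0 raw signaling rate per lane (GT/s)
ENCODING = 128.0 / 130.0  # 128b/130b line-encoding efficiency

# Per-direction PCIe 3.0 x16 bandwidth (1 GT/s carries ~1 Gb/s of symbols)
pcie_gbps = LANES * GT_PER_S * ENCODING  # gigabits per second
pcie_gbytes = pcie_gbps / 8.0            # ~15.75 GB/s

nvlink_low, nvlink_high = 80.0, 200.0    # announced GB/s range

print(f"PCIe 3.0 x16: {pcie_gbytes:.2f} GB/s per direction")
print(f"NVLink:       {nvlink_low:.0f}-{nvlink_high:.0f} GB/s "
      f"({nvlink_low / pcie_gbytes:.1f}x to {nvlink_high / pcie_gbytes:.1f}x PCIe)")
```

That works out to roughly 5x to 12x the bandwidth of a PCIe 3.0 x16 slot, which is why NVLink matters so much for GPU-dense designs.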
On Tuesday, in GTC session 4145, Chevron’s Thor Johnsen spoke about how they are using servers with 16 Kepler GPUs for high-frequency elastic seismic modeling. That application clearly has very different requirements than one running on a server with just one or two GPUs. NVLink frees server vendors to design servers whose balance of GPU and CPU performance much more closely matches the needs of a specific market. Rather than starting with two CPUs and then adding GPUs as needed, I expect server vendors to drive performance innovation by designing in exactly as many GPUs and CPUs as different classes of applications require.
NVLink is also expected to drive performance innovations across an ever-broadening ecosystem of ARM and OpenPOWER processor vendors. Design cycles for modern processors can take several years or more, much longer than the design cycle for the servers that will ultimately use those processors. Avoiding the PCI bottleneck by combining a GPU and a CPU into a single processor chip has the disadvantage of fixing the CPU-to-GPU ratio at design time. By supporting a broad and flexible range of CPU-to-GPU ratios, NVLink opens up many more possible performance innovations than a solution with a fixed ratio.
Combined with the new 3D memory announced for our next-generation Pascal GPU, along with a strong roadmap of new CUDA features, NVLink promises to drive performance innovations in a new generation of servers addressing the insatiable computing demands of HPC, big data, and machine learning problems. Innovation is certainly alive and well at the GPU Technology Conference this week.