When FLOPS Become Free

The Top500 list has for years focused on peak FLOPS (floating point operations per second), and companies like HP take great pride in pointing out our 159 systems in the Top500. So what happens when FLOPS become free? That was one of many intriguing topics discussed by speakers such as Nvidia’s Chief Scientist (and Stanford University Professor) Bill Dally at this week’s Salishan Conference on High Speed Computing.

Bill’s talk this morning focused on what he called the new challenge in HPC systems design: memory locality. Memory locality is directly related to power consumption, because the energy cost of a memory access grows with the distance the data has to travel. Accessing nearby memory on chip, say in a processor core’s L1 cache, takes an average of 2 pJ (picojoules). Accessing memory on the far side of a chip, say in the L2 cache of a different core, can take up to 150 pJ, while accessing off-chip memory such as DRAM can take up to 2 nJ (nanojoules), or 1000x as much energy as a local on-chip access.
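To make those numbers concrete, here is a minimal Python sketch that totals the energy of a batch of memory accesses using the per-access figures quoted above. The two access mixes are made up purely for illustration, and the 150 pJ and 2 nJ values are the "up to" figures from the talk, so treat the results as rough upper bounds rather than measurements.

```python
# Per-access energy in picojoules, taken from the figures quoted in the talk.
# The remote L2 and DRAM values are the "up to" (worst-case) numbers.
ENERGY_PJ = {
    "local_l1": 2.0,
    "remote_l2": 150.0,
    "dram": 2000.0,  # 2 nJ
}


def access_energy_pj(counts):
    """Total energy in pJ for a dict of {memory level: number of accesses}."""
    return sum(ENERGY_PJ[level] * n for level, n in counts.items())


def relative_cost(level, baseline="local_l1"):
    """How many times more energy one access at `level` costs than at `baseline`."""
    return ENERGY_PJ[level] / ENERGY_PJ[baseline]


# Hypothetical access mixes (assumptions, not data from the talk):
# 1,000 accesses each, one with good locality and one with poor locality.
mostly_local = {"local_l1": 990, "remote_l2": 9, "dram": 1}
mostly_remote = {"local_l1": 500, "remote_l2": 400, "dram": 100}

if __name__ == "__main__":
    print(f"DRAM vs local L1: {relative_cost('dram'):.0f}x")
    print(f"good locality: {access_energy_pj(mostly_local):,.0f} pJ")
    print(f"poor locality: {access_energy_pj(mostly_remote):,.0f} pJ")
```

Even in this toy model, the same number of accesses costs far more energy when a larger share of them leave the local cache, which is the point of the talk: where the data lives matters more than how many operations you perform on it.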

Bill went on to estimate that many HPC systems today use 1-2 nJ/FLOP. Reaching exascale by the end of the decade will require improvements in both system design and software design to bring that down to 20 pJ/FLOP. Simple Moore’s law extrapolation and anticipated integrated circuit advances to 10 nm process technology will not by themselves yield usable exascale systems in this decade. Speaker after speaker echoed the theme that software, as well as the underlying algorithm design, will need to change to focus more on work done per unit of data movement, i.e. on data locality, than on FLOPS.
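The arithmetic behind those figures is simple: sustained power is energy per operation times operations per second. The sketch below, which assumes an exascale machine sustains 10^18 FLOPS and ignores every overhead outside the quoted per-FLOP energy, shows why 1-2 nJ/FLOP is not workable at that scale.

```python
EXAFLOPS = 1e18  # floating point operations per second at exascale


def system_power_watts(joules_per_flop, flops_per_second=EXAFLOPS):
    """Sustained power draw: energy per operation times operations per second."""
    return joules_per_flop * flops_per_second


# Energy-per-FLOP figures quoted in the talk.
today_low = system_power_watts(1e-9)    # 1 nJ/FLOP
today_high = system_power_watts(2e-9)   # 2 nJ/FLOP
target = system_power_watts(20e-12)     # 20 pJ/FLOP

if __name__ == "__main__":
    for label, watts in [
        ("1 nJ/FLOP", today_low),
        ("2 nJ/FLOP", today_high),
        ("20 pJ/FLOP", target),
    ]:
        print(f"{label:>10}: {watts / 1e6:,.0f} MW")
    print(f"required improvement: {today_low / target:.0f}x to {today_high / target:.0f}x")
```

At today’s efficiency, an exaFLOPS system would draw on the order of 1 to 2 gigawatts, while the 20 pJ/FLOP target corresponds to about 20 megawatts. That is a 50x to 100x improvement in energy per operation.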

So there you have it: FLOPS become free and memory locality becomes king. Today you can still expect to pay more for a server with a 2.5 GHz processor than for one with the same processor running at 2.0 GHz. But who knows, maybe by the end of the decade we will price servers based on new memory locality metrics rather than on FLOPS. For the software developer, the message is clear: consider FLOPS to be free and focus on optimizing for memory locality.

About Marc Hamilton

Marc Hamilton – Vice President, Solutions Architecture and Engineering, NVIDIA. At NVIDIA, the Visual Computing Company, Marc leads the worldwide Solutions Architecture and Engineering team, responsible for working with NVIDIA’s customers and partners to deliver the world’s best end to end solutions for professional visualization and design, high performance computing, and big data analytics. Prior to NVIDIA, Marc worked in the Hyperscale Business Unit within HP’s Enterprise Group where he led the HPC team for the Americas region. Marc spent 16 years at Sun Microsystems in HPC and other sales and marketing executive management roles. Marc also worked at TRW developing HPC applications for the US aerospace and defense industry. He has published a number of technical articles and is the author of the book, “Software Development, Building Reliable Systems”. Marc holds a BS degree in Math and Computer Science from UCLA, an MS degree in Electrical Engineering from USC, and is a graduate of the UCLA Executive Management program.