Last week I had the opportunity to meet with a number of HP engineers, several of our HPC partners, and a few startups to discuss some of the design challenges of future HPC systems. Achieving exascale performance by the end of the decade will require many new technologies, such as those being researched by the HP Labs Intelligent Infrastructure project. But even by the middle of the decade, HPC systems are likely to look quite different from today's. Here are a few of the potential changes.
Power and cooling continue to gain importance in HPC, as well as in other hyperscale environments such as the mega data centers being built by today's social networking, cloud, and search companies. Computer systems engineers, typically with a background in electrical engineering or computer science, now find themselves thinking about the basics of plumbing and advanced thermodynamics. While vendors have demonstrated many different advanced cooling systems over the last several years, few if any can be economically deployed today at the scale of a 10,000-server HPC system, much less a 100,000-server mega data center. Promising techniques in the industry today include cooling servers with room-temperature water, rather than the typical chilled water, as well as heat reuse. Systems such as the CLUMEQ supercomputer demonstrated the potential for heat reuse several years ago; the challenge now is to do this at the rack level with industry-standard components.
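The scale argument is easy to sketch with back-of-envelope arithmetic. The 400 W per-server figure below is an illustrative assumption, not a measured number, but it shows why cooling becomes a plumbing problem at these sizes: essentially every watt a server draws becomes heat that has to be removed.

```python
# Back-of-envelope heat load for the system sizes mentioned above.
# WATTS_PER_SERVER is an assumed average draw for illustration only;
# real figures vary widely with hardware and workload.
WATTS_PER_SERVER = 400

for servers in (10_000, 100_000):
    # Nearly all electrical power ends up as heat the facility must reject.
    megawatts = servers * WATTS_PER_SERVER / 1e6
    print(f"{servers:>7,} servers -> ~{megawatts:.0f} MW of heat to remove")
```

At the assumed draw, the 10,000-server system rejects on the order of 4 MW of heat and the 100,000-server data center around 40 MW, which is why room-temperature water cooling and heat reuse are worth the engineering effort.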
Faster storage. Sure, I can scale a Lustre parallel file system to dozens of petabytes, and more and more HPC centers are looking at distributed technologies like Hadoop to solve the "big data" challenge, but what other fundamental technology changes are likely to impact HPC storage over the decade? Near term, many startups are working to build higher-performance solutions out of SSD/flash technology. The short-term (1-2 year) advances here are not going to come from revolutionary flash technologies; flash roadmaps are well understood and seeing evolutionary improvements. However, the looming mainstream introduction of PCIe Gen3 server interconnects and new PCIe Gen3 flash controllers offers interesting possibilities. Taking advantage of the storage bandwidth possible with PCIe Gen3 will require rethinking the software interface. Strip away legacy storage protocols (FC, SCSI, SAS, etc.), and perhaps even the file system, and you have real possibilities. Longer term, today's flash technology will give way to fundamentally new memory technologies, such as HP's memristor, which promise not only new levels of performance but significantly lower power usage.
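A rough sketch of why Gen3 changes the picture: PCIe Gen3 raises the per-lane transfer rate from Gen2's 5 GT/s to 8 GT/s and swaps 8b/10b line encoding for the far more efficient 128b/130b, nearly doubling usable bandwidth per lane before protocol overhead.

```python
# Per-lane PCIe bandwidth: raw transfer rate times encoding efficiency.
# Protocol/packet overhead is ignored; these are upper bounds.

def lane_gbps(transfer_rate_gt, encoded_bits, payload_bits):
    """Usable Gb/s per lane after line encoding."""
    return transfer_rate_gt * payload_bits / encoded_bits

gen2 = lane_gbps(5.0, 10, 8)     # Gen2: 5 GT/s, 8b/10b encoding
gen3 = lane_gbps(8.0, 130, 128)  # Gen3: 8 GT/s, 128b/130b encoding

print(f"Gen2: {gen2:.2f} Gb/s per lane (~{gen2 / 8:.2f} GB/s)")
print(f"Gen3: {gen3:.2f} Gb/s per lane (~{gen3 / 8:.2f} GB/s)")
print(f"x8 Gen3 slot: ~{gen3:.1f} GB/s before protocol overhead")
```

An x8 Gen3 slot tops out near 8 GB/s in each direction; at that rate, the per-I/O cost of traversing a legacy FC/SCSI software stack, rather than the wire, becomes the bottleneck.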
Faster networking doesn’t just mean moving from 10G to 40G Ethernet or from QDR to FDR InfiniBand; it means addressing the management and scalability of today’s HPC networks. It is going to be an exciting decade for networking, as the proliferation of “merchant silicon” for high-speed networking from the likes of Intel/Fulcrum, Mellanox, Broadcom, and others enables new startups to take on the industry giants, just as x86/Linux took on proprietary server vendors a decade ago. Of course, with a level playing field for the hardware COGS of a switch, my technology bet is on the players that bring differentiated software to the table. As a networking switch becomes little more than an x86 or other industry-standard processor paired with a commodity networking chip, the boundaries between servers and switches will become increasingly fuzzy. Is that your server acting like a switch, or your switch running apps?
Add to the above list alternative, more power-efficient processors and their programming models, and HPC system designers have plenty to keep them busy for many years to come. A few recurring themes, though, echoed throughout the week. First, it’s hard to bet against open industry standards in the long run. Second, scale matters: the winning technologies will be the ones that can be used throughout the industry, not one-off specialized systems used by only one or two customers. Finally, there is plenty of room for innovation, whether at the world’s largest technology companies or at 25-person startups. One thing, however, hasn’t changed: from the days of the first Cray-1 supercomputer to today, the HPC industry has been an exciting one to be in as it pushes the leading edge of technology.