Nearly four years ago, the US announced the world’s first petascale system on the Top500 list. At the time, predictions were that the first exascale system would debut on the Top500 list around 2018. So how are we doing now that we are about 40% of the way to 2018?
This week, many of the key researchers behind exascale computing in the United States, along with colleagues from around the world, gathered in Oregon for the invitation-only Salishan Conference on High-Speed Computing. A quick check of the latest Top500 list shows a total of 10 petascale systems, only five of which are from the US, with the fastest US system now at position #3. Of course, when the next Top500 list is published this June, the US hopes to have a few new systems toward the top of the list, although none are expected to be much faster than perhaps 2% of an exascale (20 petaflops). Not to make any predictions, but exascale by 2018 is starting to look like quite a stretch.
So what did the combined wisdom of the Salishan gathering have to say about exascale? In many presentations, the software challenges of exascale topped the list. Not only will applications need to be updated to take advantage of future exascale systems, but everything from the basic operating system to system management tools will need to change to scale 500x. It is a good time to be a computer scientist, as witnessed by a resurgence in undergraduate admissions in many university computer science departments.
After software, interconnects for exascale were another hot topic, and that was before news broke mid-day of Intel’s latest shopping spree. Adding Cray’s interconnect assets to the InfiniBand portfolio Intel acquired from QLogic earlier this year certainly makes Intel a world powerhouse in HPC networking technology. Intel no doubt was getting tired of hearing Mellanox talk about how great Intel’s Sandy Bridge CPU was for their business.
While not unanimously accepted by everyone in the room, the DOE direction was certainly that, in order to be affordable, exascale systems need to be built out of COTS (commercial off-the-shelf) technologies and cannot be one-off proprietary white elephants. That certainly resonates with HP’s strategy of building everything from our sub-$100K GPU Starter Kit to Top-5 supercomputers out of industry-standard processors, accelerators, networking, storage, and open source software. Cray’s sale of their interconnect assets to Intel, one could argue, acknowledges the same. I think it is a smart move on Cray’s part.
So when will HP build an exascale system, and what will it look like? Certainly HP Labs is working on many interesting technologies, from memristor non-volatile storage to advanced low-power photonics, that are likely to make their way into exascale systems. Taking a look at HP’s Project Moonshot will give you some other ideas, because building an exascale system out of 100,000 or more servers will require a lot more than simply racking and stacking those servers as one commonly does today. In Project Moonshot, HP is starting to rethink how we build servers for hyperscale environments, be they exascale systems or massive web data centers. Getting 100,000 servers to work together requires you to design the networking, storage, power, cooling, and management in conjunction with the server, not as an afterthought.
But no matter when we reach the exascale milestone, the coming years promise to be an exciting time for computer architecture, at both the hardware and software levels. Perhaps the best advice of the day was what I overheard in the lunch buffet line, where a very well known HPC center director was sharing his thoughts, “I don’t want my software developers to worry about what processors are going to look like, I want them developing parallel algorithms they express in OpenACC and then I will go beat up the compiler writers to make better OpenACC compilers for whatever processor we are deploying that year.” I bet Steve Scott from Nvidia wishes he had been standing next to me in line!