With the US Labor Day holiday once again behind us, most schools in the US are now back in session or close to it. I spend a lot of time interviewing recent or soon-to-be college graduates for HPC sales, marketing, and engineering jobs, so I thought I would help start off the new school year with some tips for students who have an interest in high performance computing.
Learn Hadoop (HDFS, MapReduce, Pig, Hive, and other related open source software). This may seem like a strange first item for an HPC list, but at the hardware level Hadoop clusters are very similar to HPC clusters. While five years ago few HPC sites knew how to spell Hadoop, today more and more computationally intensive HPC clusters are interacting with data-intensive Hadoop clusters. Fast networking like InfiniBand and fast storage like SSD/flash are now just as likely to be considered for a state-of-the-art Hadoop cluster as for an HPC cluster.
Speaking of being unknown five years ago, take a class in CUDA. CUDA is only five years old, so there is a pretty good chance that your professor is keeping up with technology and not repeating a decade-old lesson plan, and that is always a good start. Even if you never write a CUDA application after the class is over, you will have learned a lot of important concepts about parallel programming and expressing parallelism in your code. It is the same reason why taking a Java class is a great way to learn object-oriented programming. The world is only becoming more parallel and distributed.
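To make "expressing parallelism" a little more concrete, here is a minimal sketch in Python rather than CUDA. It rewrites a serial loop as a per-element function, which is roughly the mental shift a CUDA class teaches. The function names (`saxpy_element`, `saxpy_serial`, `saxpy_parallel`) and the thread pool are illustrative assumptions, not part of any CUDA API.

```python
from concurrent.futures import ThreadPoolExecutor


def saxpy_element(i, a, x, y):
    """Work for a single index i, loosely analogous to what one GPU thread computes."""
    return a * x[i] + y[i]


def saxpy_serial(a, x, y):
    """Serial version: one loop walks over every index in order."""
    out = []
    for i in range(len(x)):
        out.append(a * x[i] + y[i])
    return out


def saxpy_parallel(a, x, y, workers=4):
    """Parallel formulation: each index is an independent task handed to a pool.

    Assumption: a thread pool stands in for GPU threads here. In CPython this
    shows the structure of the computation, not a real speedup.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so results line up with the indices.
        return list(pool.map(lambda i: saxpy_element(i, a, x, y), range(len(x))))


x = [1.0, 2.0, 3.0, 4.0]
y = [10.0, 20.0, 30.0, 40.0]
print(saxpy_parallel(2.0, x, y) == saxpy_serial(2.0, x, y))
```

The point is that no element depends on any other, so the work can be spread across as many workers as the hardware offers.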
Learn about open source software (generically), as both a development model and a business model. To learn the former, pick an open source project that interests you, look at the bug list, pick a bug to fix, and give it a try. In reality, submitting a bug fix and getting it accepted into one of the larger, more established open source projects isn’t something a typical undergrad has the time to do, but there are literally thousands of open source projects out there and you can learn a lot just by trying. Learning about open source business models might be a little trickier, but I expect there are some good classes out there that cover parts of it if you look.
Rethink that physics curriculum (aka take more physics). For computer science and computer engineering majors, physics was often a dreaded requirement thought of as more appropriate for future hard-core electrical engineering chip designers. Guess what: architecting modern HPC solutions involves more physics than ever. From power delivery to cooling, core physics concepts are essential. And for the future HPC software pro, just as understanding the underlying architecture of an NVIDIA Kepler GPU or Intel Xeon Phi coprocessor is important, future exascale systems will require some levels of code to understand and interact with the power and cooling versus performance tradeoffs those systems will expose.
Take some cross-disciplinary classes. While cross-disciplinary curricula have been de rigueur at most schools for some time, industry has caught up too. HPC use in industry continues to grow faster than Moore’s Law as HPC technology becomes increasingly important to product quality, time to market, and overall competitiveness across almost every industry. Knowing a bit about financial services, life sciences, manufacturing, oil and gas, or media and animation will help you get an HPC job a lot more than just being an expert in the underlying HPC technologies.
Keep your head in the clouds. OK, cloud computing is perhaps one of the most overused terms today, but if your school doesn’t provide HPC resources, there are many academic and commercial providers of HPC you can gain access to. From getting a free trial account on HPCloud.com to a million CPU-hour grant on one of the NSF XSEDE resources, there are lots of ways to get hands-on experience with HPC clouds.
Think big. If you are a college senior, when you started four years ago there were only two petaflop systems in the world. Today there are 20, and untold more will be deployed before you graduate. All too many ideas that work great when implemented on a laptop or a small cluster fail to gain commercial adoption because they don’t scale. Think massively parallel at the processor/node level and massively distributed at the inter-node level. Four or six cores don’t stress the parallelism of an application; four or six hundred cores start to. The same is true at the inter-node level.
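One way to see why a handful of cores hides scaling problems is Amdahl's law, which bounds speedup by the fraction of a program that must run serially. The sketch below is only an illustration: the function name `amdahl_speedup` and the 95% parallel fraction are assumptions chosen for the example, not measurements of any real application.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Ideal speedup under Amdahl's law: 1 / (serial + parallel / cores)."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


# Assumption: 95% of the runtime parallelizes perfectly, 5% stays serial.
PARALLEL_FRACTION = 0.95

for cores in (4, 6, 400, 600):
    speedup = amdahl_speedup(PARALLEL_FRACTION, cores)
    efficiency = speedup / cores  # fraction of the ideal linear speedup achieved
    print(f"{cores:4d} cores: speedup {speedup:5.1f}x, efficiency {efficiency:.0%}")
```

Under this assumption the code looks healthy on 4 cores (about 3.5x) but tops out below 20x no matter how many hundreds of cores you add, which is exactly the kind of limit you only discover by testing at scale.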
So for all those students out there, from first-years to postgrads, have a great year, explore your passion for HPC in these and as many other ways as you can, and get ready for another exciting year of technology developments, including those perhaps starting today in your next homework assignment!