Getting Started With GPU Computing

With the launch of the M2090 last week, I challenged my team to come up with an affordable, easy-to-buy, easy-to-install GPU Starter Kit. HP customers like Georgia Tech have been benefiting from the performance and efficiency of GPU computing with their Keeneland supercomputer, powered by HP SL390 servers and Nvidia M2070 GPUs, for over six months. But I still see too many customers trying to assemble their own GPU systems from the 14 different HP ProLiant servers qualified with Nvidia GPUs or, worse, from other vendors' x86 systems that may not even be tested or qualified with Nvidia.

My guidelines were pretty simple:

  • Make it a turn-key system, something we can ship from the factory installed in a single rack with all the hardware and software needed to run CUDA-enabled applications when first powered on at a customer site
  • Offer at least 10 TF of peak performance for about $100K

    There are still a few folks that would rather have their GPU supercomputers delivered like this

    but that isn’t how Georgia Tech managed to get up and running in less than a week. So if the 7-rack Keeneland system was up and running in less than a week, the single rack GPU Starter Kit should be designed to be up and running in less than a day.

    Here is what the team came up with.

    We will be sharing more details about our new GPU Starter Kit next month at the HP Discover event in Las Vegas, at the HP-CAST meeting, and on the show floor at ISC11, but you can buy one today by contacting your HP sales rep and asking for a quote for the GPU Starter Kit.

  • About Marc Hamilton

    Marc Hamilton – Vice President, Solutions Architecture and Engineering, NVIDIA. At NVIDIA, the Visual Computing Company, Marc leads the worldwide Solutions Architecture and Engineering team, responsible for working with NVIDIA’s customers and partners to deliver the world’s best end to end solutions for professional visualization and design, high performance computing, and big data analytics. Prior to NVIDIA, Marc worked in the Hyperscale Business Unit within HP’s Enterprise Group where he led the HPC team for the Americas region. Marc spent 16 years at Sun Microsystems in HPC and other sales and marketing executive management roles. Marc also worked at TRW developing HPC applications for the US aerospace and defense industry. He has published a number of technical articles and is the author of the book, “Software Development, Building Reliable Systems”. Marc holds a BS degree in Math and Computer Science from UCLA, an MS degree in Electrical Engineering from USC, and is a graduate of the UCLA Executive Management program.

    6 Responses to Getting Started With GPU Computing

    1. Pingback: Firing Up HP’s GPU Starter Kit |

    2. Pingback: HP POD 240a (aka EcoPOD) | Marc Hamilton's Blog

    3. Steven Eliuk says:

      Any idea on cost of this system? One must remember it is still a cluster, any latency graphs on data share between nodes?

    4. Hi Steven,
      Sorry for taking so long to reply; your comment got buried in my email queue. We are selling the GPU Starter Kit for $99,098 (US).
      Latency between nodes is going to depend on the application and the protocol used. The Mellanox CX-2 QDR IB chip in the SL390 has latency as low as 1.2 µs.

    5. steven says:

      Thank you, Marc,

      However, latency between copies is a real problem, as 95 percent of algorithms cannot be broken down so that they are completely independent. Technologies such as InfiniBand from SGI have greatly reduced node-to-node communication to a constant, and this is incredibly important for ease of algorithm design and for facilitating the use of more than four GPUs. Nodes with one GPU mean each GPU has complete use of the PCI bus and does not need to compete for PCI throughput, and with the new virtual addressing scheme found in CUDA most of this is even easier, as the DMA requests are now automated.

      If you could provide more information concerning the exact node, GPU layout, it would make purchasing easier.

      Can we run some of our multi-GPU visualization and reconstruction algorithms on your systems to see how they fare?

      • Hi Steven. InfiniBand is an industry standard supported by many vendors, not just SGI, and all HP SL390s servers come with InfiniBand built in. On the SL390s, each GPU has a dedicated PCIe x16 connection to avoid PCI bus contention. I will email you a block diagram directly that shows the actual GPU-to-CPU connections. We would also be happy to run any sample code you have in one of our benchmark centers.
        Happy Holidays.
