Big Data Takes Control Back

This week HP launched the HP ProLiant SL4500 server, purpose-built for big data. It’s about time someone helped big data take control back from generic server and storage platforms. Walking the show floor at SC12, I saw just about every vendor touting how their storage platform, typically based on hardware architectures developed long before Google developed what turned into Hadoop, could be used for “big data”. Let’s take a look at how the SL4500 is different and will let big data take control back from your legacy storage devices.

Historically the storage world classified storage as direct attach (DAS), network attach (NAS), or storage area network (SAN). While “big data” encompasses a broad set of use cases, in general big data tries to move compute closer to the storage on a large scale: 100’s or 1000’s of storage nodes, if not more. While SAN still has its place in the enterprise, the cost and complexity of scaling a SAN to 100’s of compute nodes generally rules out SANs from consideration for big data applications. High performance NAS solutions are sometimes used for “big data”, although scaling NAS to 1000’s of compute nodes also runs out of runway quickly. That left early adopters of big data apps like Hadoop cobbling together hardware, either by finding an expandable x86 server they could stuff a lot of disk drives into or by stacking a low-end storage array on top of a server. With the announcement this week of the SL4500, big data fans can take back control of their data and let all those standalone x86 servers go back to jobs they are better suited for.

Big data today means a lot more than just Hadoop, so the SL4500 product family gives you a choice of different modular configurations optimized for different types of big data applications. While CPU is probably the least important configuration choice in the SL4500 family, I’ll start there since our naming starts to differentiate with the CPU type. For AMD fans, we have the ProLiant SL4545 G7 and for Intel fans we have the ProLiant SL4540 Gen8. Followers of HP product naming should recognize the familiar pattern: “5” at the end of the product number indicates an AMD CPU and “0” indicates Intel. If we introduce ARM-based servers in the future it will be good job security for marketing.

CPU choices aside, the rest of the modularity of the SL4500 is mostly the same on both AMD and Intel versions so I can just stick to the generic SL4500 naming and you can substitute either SL4540 or SL4545 below depending on your processor vendor affinity.

Let’s start with the aptly named 1-node version of the SL4500. The 1-node SL4500 packs up to 60 SAS, SATA, or SSD drives together with a 2-socket x86 server in a single 7.5-inch-high enclosure. That is up to 180 TB of storage today or 240 TB of storage in Q1 when 4 TB drives become available. While the classic big data use case for this server would be cloud-based object storage using OpenStack or your favorite cloud object storage software, nearly every visitor to our NDA suite at SC12 came up with additional use cases. After all, the SL4500, logically, is a regular x86 server with a whole bunch of disk drives. Install your favorite Linux distro and you have a super-dense NFS server, or load up Windows Server software and you have a great CIFS file server.

Of course, for some applications, 60 drives’ worth of data needs a little more than 2 sockets of compute, or wants higher availability than a single server can provide. Blink an eye and the SL4500 turns into the SL4500 2-node (kudos to marketing; do you catch a simplified naming trend attempting to take back control of product naming?). Now, with the same enclosure, you get two of your favorite AMD or Intel servers, each with 25 SAS, SATA, or SSD drives. I should add that, as in the 1-node, each server module in the SL4500 gets its own Smart Array storage controller supporting not only various RAID options but all sorts of other goodies (which vary slightly by Smart Array model). The 2-node SL4500 is ideal for running Microsoft Exchange or any other type of big data application that wants servers in pairs with lots of drives.

Saving the classic big data app Hadoop for last, yep, you guessed it, the SL4500 3-node is based on the same enclosure and three of your favorite AMD or Intel servers, each now with 15 SAS, SATA, or SSD drives. Server options remain the same as in the 1-node and 2-node versions. Some of the key options I have not yet mentioned are HP’s I/O accelerator, a PCI card filled with flash chips for when your app needs non-volatile storage even faster than an SSD, and your choice of 1G and 10G Ethernet as well as InfiniBand network interconnects.
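
For readers who want to compare the three configurations side by side, here is a minimal Python sketch of the raw capacity arithmetic. The drive counts per node (60, 25, and 15) come from the descriptions above. The 3 TB drive size is my inference from the 180 TB figure quoted for 60 drives, and the class and function names are made up purely for illustration. The numbers are raw capacity only, before any RAID or filesystem overhead.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChassisConfig:
    """One SL4500 enclosure layout (names here are illustrative only)."""
    name: str
    nodes: int
    drives_per_node: int

    @property
    def total_drives(self) -> int:
        # Drives in the whole enclosure, summed across server nodes.
        return self.nodes * self.drives_per_node

    def raw_capacity_tb(self, drive_tb: int) -> int:
        # Raw capacity only: ignores RAID, hot spares, and filesystem overhead.
        return self.total_drives * drive_tb


# Drive counts per node as described in the post.
CONFIGS = [
    ChassisConfig("1-node", nodes=1, drives_per_node=60),
    ChassisConfig("2-node", nodes=2, drives_per_node=25),
    ChassisConfig("3-node", nodes=3, drives_per_node=15),
]


def capacity_table(drive_tb: int) -> dict:
    """Map each configuration name to its raw enclosure capacity in TB."""
    return {c.name: c.raw_capacity_tb(drive_tb) for c in CONFIGS}


if __name__ == "__main__":
    # 3 TB is an assumed drive size (180 TB / 60 drives); 4 TB is the Q1 figure.
    for drive_tb in (3, 4):
        for name, tb in capacity_table(drive_tb).items():
            print(f"{name} with {drive_tb} TB drives: {tb} TB raw")
```

The trade-off this makes visible is that adding server nodes to the enclosure gives up some drive slots: total drives per enclosure drop from 60 to 50 to 45 as you go from one node to three, in exchange for more compute per drive.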

I’ve mentioned just a few of the big data use cases for the SL4500. There are of course 100’s of big data applications being deployed by customers today, and finally those apps can stand up and take control back from server form factors invented long before Google developed what turned into Hadoop. Let me know what you would like to run on the SL4500.


About Marc Hamilton

Marc Hamilton – Vice President, Solutions Architecture and Engineering, NVIDIA. At NVIDIA, the Visual Computing Company, Marc leads the worldwide Solutions Architecture and Engineering team, responsible for working with NVIDIA’s customers and partners to deliver the world’s best end to end solutions for professional visualization and design, high performance computing, and big data analytics. Prior to NVIDIA, Marc worked in the Hyperscale Business Unit within HP’s Enterprise Group where he led the HPC team for the Americas region. Marc spent 16 years at Sun Microsystems in HPC and other sales and marketing executive management roles. Marc also worked at TRW developing HPC applications for the US aerospace and defense industry. He has published a number of technical articles and is the author of the book, “Software Development, Building Reliable Systems”. Marc holds a BS degree in Math and Computer Science from UCLA, an MS degree in Electrical Engineering from USC, and is a graduate of the UCLA Executive Management program.

2 Responses to Big Data Takes Control Back

  1. Are you having flashbacks to working at Sun now?
    I recall a 4500 product with a whole lot of SATA disks in it being released by Sun a few years ago.
