Choosing the right memory configuration for your next server can have a significant impact on application performance. Long gone are the days when all you needed to do was specify memory size based on simple “GB per core” rules. Yet I am surprised how many customer RFPs still have little in the way of memory requirements besides capacity. The DIMM type, number of memory channels, DIMMs per channel (DPC), DIMM size, DIMM speed, and channel speed can all affect application performance. The difference is often 33% or more, even between two otherwise identical servers configured with exactly the same memory capacity. In this post I’ll cover some of the important things to remember when specifying server memory configurations.
For starters, you need to understand some basics about UDIMMs, also referred to as unbuffered or unregistered memory, and RDIMMs, also referred to as registered or buffered memory. To this alphabet soup you should also add the LRDIMM, or Load-Reduced DIMM.
Most modern servers support either three or four memory channels per CPU socket, and four channels per socket will soon become the dominant standard. This part of memory configuration is pretty simple: be sure you are specifying at least one DIMM per channel. On a four-channel socket, a server with four 4 GB DIMMs (one DPC on every channel) will almost always perform better than a server with two 8 GB DIMMs, which leaves two of the four channels empty. Bigger is not always better when it comes to memory DIMMs.
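To make the channel-population point concrete, here is a minimal Python sketch. It assumes a four-channel socket and a simple round-robin placement of DIMMs across channels; the function names `spread_dimms` and `summarize` are my own for illustration, and real servers follow the population order given in their documentation.

```python
def spread_dimms(num_dimms, channels):
    """Place DIMMs round-robin across channels; return DIMMs per channel."""
    layout = [0] * channels
    for i in range(num_dimms):
        layout[i % channels] += 1
    return layout


def summarize(num_dimms, dimm_gb, channels=4):
    """Capacity, populated channels, and highest DPC for one socket.

    channels=4 is an assumption; some servers have three channels per socket.
    """
    layout = spread_dimms(num_dimms, channels)
    return {
        "capacity_gb": num_dimms * dimm_gb,
        "populated_channels": sum(1 for n in layout if n > 0),
        "max_dpc": max(layout),
    }


# Same 16 GB capacity, very different channel usage.
print(summarize(4, 4))  # four 4 GB DIMMs: all four channels populated
print(summarize(2, 8))  # two 8 GB DIMMs: only two channels populated
```

Both configurations report the same capacity, but the second one uses only half of the available memory channels, which is why it tends to deliver less memory bandwidth.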
In addition to the number of memory channels, DPC is important. Most servers support from one to three DPC. A cut-rate server with only one DPC may seem like a good buy, until you need to expand your memory and find out there is no way to do so without throwing away your DIMMs and purchasing higher-capacity ones. But more DPC is not always better. Many servers clock down the memory to a lower speed when you add a second or a third DPC. Buying two state-of-the-art 1600 MHz DIMMs per channel does you little good if the server will clock down the memory to 1333 MHz. In that case you might as well purchase the less expensive 1333 MHz memory. So be sure to specify both the speed of the DIMM and the speed of the memory channel if purchasing more than one DPC.
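The down-clocking rule can be sketched in a few lines of Python. The table below is hypothetical and only illustrates the idea; the actual speed a given server runs at for each DPC count has to come from the vendor's documentation. The names `HYPOTHETICAL_MAX_CHANNEL_SPEED` and `effective_speed` are mine.

```python
# Hypothetical example only: the highest channel speed (in MHz, as labeled on
# the DIMM) that an imaginary server allows at each DIMMs-per-channel count.
HYPOTHETICAL_MAX_CHANNEL_SPEED = {1: 1600, 2: 1333, 3: 1066}


def effective_speed(dimm_speed, dpc, max_speed_by_dpc=HYPOTHETICAL_MAX_CHANNEL_SPEED):
    """The channel runs at the slower of the DIMM rating and the server limit."""
    return min(dimm_speed, max_speed_by_dpc[dpc])


# With two DPC on this imaginary server, 1600 MHz and 1333 MHz DIMMs
# end up running at the same speed.
print(effective_speed(1600, dpc=2))
print(effective_speed(1333, dpc=2))
```

On a server with limits like these, paying extra for 1600 MHz DIMMs at two DPC buys no additional speed, which is the situation described above.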
DIMM size is still fairly straightforward, at least until you consider the impact of the above. All other things being equal (DIMM type, DIMM speed, DPC, channel speed), a larger DIMM will give you more capacity. Memory prices fluctuate widely based on market conditions, and thanks to Moore’s law we continue to see regular density (although not speed) increases in memory DIMMs. Today, while 4 GB DIMMs are still sold in some servers, the lowest cost per GB is typically achieved with 8 GB DIMMs, with a small price penalty to move up to 16 GB DIMMs. Larger 32 GB DIMMs are still quite rare in general use because of their high cost, but of course this will change over time.
Recently introduced 1600 MHz DIMMs are generally the fastest available today. Compared to the slower 1333 MHz DIMMs, you will see a fairly linear decrease in latency and increase in throughput with 1600 MHz DIMMs. For any performance-sensitive application, you should avoid the slower 1066 MHz DIMMs or any server configuration that clocks down the memory bus to 1066 MHz. Again, be sure to ask not only about the maximum memory and channel speed, but also about the channel speed your server will operate at as configured. When adding a second or a third DPC, many servers will clock down the memory bus.
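For a rough sense of what the speed grades mean on paper, the Python sketch below computes theoretical peak bandwidth per channel. It assumes a 64-bit (8-byte) data path per channel, as on DDR3, and treats the labeled "MHz" figure as millions of transfers per second. These are theoretical ceilings, not measured results, and the function names are my own.

```python
BUS_BYTES = 8  # assumption: 64-bit data path per memory channel, as on DDR3


def peak_channel_bandwidth_gbs(speed_mhz):
    """Theoretical peak GB/s for one channel at the labeled DIMM speed."""
    return speed_mhz * BUS_BYTES / 1000


def speedup(slow_mhz, fast_mhz):
    """Ratio of theoretical peak bandwidth between two speed grades."""
    return peak_channel_bandwidth_gbs(fast_mhz) / peak_channel_bandwidth_gbs(slow_mhz)


for speed in (1066, 1333, 1600):
    print(f"{speed} MHz: {peak_channel_bandwidth_gbs(speed):.1f} GB/s per channel")

print(f"1600 vs 1333: {speedup(1333, 1600):.2f}x")
print(f"1600 vs 1066: {speedup(1066, 1600):.2f}x")
```

On paper, moving from 1333 MHz to 1600 MHz is worth about 20% more peak bandwidth per channel, and a bus clocked down to 1066 MHz gives up about a third of what 1600 MHz offers. Real applications will see less than these ceilings.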
In researching this article, I worked with HP’s HPC benchmarking lab to measure latency and throughput of various memory DIMMs and memory configurations using HP’s Cluster Platform 3000 SL6500 with Xeon E5 (Sandy Bridge-EP) 8C 2.60GHz CPUs and FDR InfiniBand, along with other HP servers. While this server has not yet been officially launched, some lucky customers like Purdue University already have similar systems up and running and on the Top500 list. We tested a variety of 4 GB, 8 GB, 16 GB, and 32 GB DIMMs, using UDIMMs, RDIMMs, and LRDIMMs, in a variety of configurations including 1 and 2 DPC. Some HP servers that we did not test, including the HP ProLiant DL360, also support 3 DPC. In general, the best combination of latency and throughput was achieved with 16 GB RDIMMs running at 1600 MHz with 2 DPC.
Of course, if all of this sounds a bit confusing, don’t worry. HP’s HPC Competency Center is standing by, ready to help you configure your next HPC solution and optimize the memory configuration as well as all other aspects of your system.