
Using LRDIMMs in High Performance Servers
LRDIMM (Load-Reduced Dual Inline Memory Module, that is, a DIMM with reduced load) is a type of memory module supported by server platforms since 2012. LRDIMMs look similar to registered DIMMs (RDIMMs) and fit the same memory slots, but they work on a different principle. With LRDIMMs, an ordinary server can be fitted with 512 GB, 1 TB or 1.5 TB of memory.


Memory buffer - the foundation of LRDIMM technology
Registered DIMMs (RDIMMs) are connected directly to the bus that leads to the processor's memory controllers. With these modules, the memory controller drives every DRAM chip attached to the module's control lines. The more chips a module carries (grouped into so-called ranks), the greater the electrical load on the controller. A rank is the set of DRAM chips connected to one chip-select line, and the number of ranks is a characteristic of a memory module. Two-rank and four-rank memory modules are shown below.

A two-rank module is effectively two logical modules soldered onto one printed circuit board that take turns using the same physical data channel. A four-rank module is the same idea with four logical modules.
RDIMM stands for registered DIMM. The name "registered" means that modules of this type carry a register that buffers the address and command signals.

In an LRDIMM, a special memory buffer chip on each module sits between the bus and the DRAM. When the controller works with LRDIMMs, it only has to send data and commands to this buffer, the iMB (Isolation Memory Buffer). Unlike in RDIMMs, the data signals are buffered as well as the control signals.

The buffer manages all DRAM read and write operations. Data and command/address signals all pass through it, so it acts as an intermediary between the host memory controller and the DRAM.

Adding DRAM chips (ranks) to registered DIMMs increases the electrical load of the memory modules. As the number of ranks per memory channel grows, memory performance, that is, its operating speed, drops. For RDIMMs it is best to install no more than two DIMMs per channel, because populating the third slot lowers the memory speed. A channel is the "path" between the memory modules and the controller over which read and write data travel.
LRDIMMs do not have these limitations because they use memory buffer chips. With LRDIMMs, the memory controller in the processor talks only to the memory buffer: commands and data are passed to the buffer, which handles all DRAM read and write operations.
Rank Multiplication
LRDIMMs significantly reduce the electrical load that DRAM chips place on the data bus, and they add a feature called rank multiplication: several physical DRAM ranks appear to the memory controller as a single logical rank of higher capacity. The following shows rank multiplication for three LRDIMMs per memory channel.

Rank multiplication can be disabled or set to 2:1 or 4:1, with up to 8 physical ranks per LRDIMM. For example, at 2:1 the controller sees a four-rank LRDIMM as a two-rank module and an eight-rank LRDIMM as a four-rank module, so the load presented by a multi-rank module is halved. As a result, a server can run LRDIMMs at higher speeds than RDIMMs.
Reducing the electrical load lets a system with LRDIMMs run at a higher speed (memory clock frequency) at the same capacity, or hold more RAM at the same speed as an RDIMM configuration.
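The rank multiplication settings described above can be written down as a small calculation. This is only an illustrative sketch: the function name `logical_ranks` and its validation rules are assumptions made for the example, not part of any vendor tool.

```python
def logical_ranks(physical_ranks: int, multiplication: int = 1) -> int:
    """Ranks the memory controller sees for one LRDIMM.

    multiplication: 1 (disabled), 2 (2:1) or 4 (4:1), as listed in the text.
    """
    if multiplication not in (1, 2, 4):
        raise ValueError("rank multiplication must be 1, 2 or 4")
    if not 1 <= physical_ranks <= 8:
        raise ValueError("an LRDIMM carries at most 8 physical ranks")
    if physical_ranks % multiplication:
        raise ValueError("physical ranks must divide evenly by the factor")
    return physical_ranks // multiplication


print(logical_ranks(4, 2))  # 2: a four-rank module is seen as two-rank
print(logical_ranks(8, 2))  # 4: an eight-rank module is seen as four-rank
```

With multiplication disabled the function simply returns the physical rank count, which is the RDIMM-like case.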

Thus, in practice, LRDIMMs can be used to increase memory speed, memory capacity, or both. They offer higher speeds at higher capacities for users whose requirements are not met by 16 GB dual-rank RDIMMs or 32 GB four-rank RDIMMs.
For example, a dual-processor server with twenty-four memory slots can be configured as follows:
- LRDIMMs: 32 GB x 24 = 768 GB at 1066 MHz, at either 1.5 V or 1.35 V.
- RDIMMs: 32 GB x 16 = 512 GB at 800 MHz and 1.5 V.
Another example: Intel Xeon E5 v3 processors contain a four-channel memory controller and support up to eight logical ranks per channel. That allows at most eight four-rank 32 GB modules per processor (two per channel), so the memory capacity of a dual-processor board cannot exceed 512 GB in this case. Single-rank or dual-rank modules can be installed up to three per channel, but they have less capacity.
If you use four-rank LRDIMMs, which the memory controller perceives as two-rank, you can install up to 12 modules of 32 GB per processor, for a total of 768 GB on a dual-processor board, running at a higher frequency. LRDIMMs of 64 GB and 128 GB are now available, which makes it possible to reach a very large amount of memory per server: up to 1.5-2 TB.
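The rank budget behind these two examples can be checked with a few lines of arithmetic. The sketch below assumes the figures given in the text (eight logical ranks per channel, three slots per channel, four channels per processor, two processors); the function name and parameters are hypothetical.

```python
def max_modules_and_capacity(module_gb, logical_ranks_per_module,
                             rank_limit_per_channel=8, slots_per_channel=3,
                             channels_per_cpu=4, cpus=2):
    """Return (module count, total GB) for a board filled with one module type."""
    # The controller's logical-rank limit caps how many modules fit on a channel.
    by_ranks = rank_limit_per_channel // logical_ranks_per_module
    per_channel = min(slots_per_channel, by_ranks)
    modules = per_channel * channels_per_cpu * cpus
    return modules, modules * module_gb


# Four-rank 32 GB RDIMM: the controller sees all four ranks.
print(max_modules_and_capacity(32, logical_ranks_per_module=4))  # (16, 512)
# Four-rank 32 GB LRDIMM with 2:1 rank multiplication: seen as two-rank.
print(max_modules_and_capacity(32, logical_ranks_per_module=2))  # (24, 768)
```

The RDIMM case is limited by ranks (two modules per channel), while the LRDIMM case is limited only by the number of slots.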
Note that you cannot mix LRDIMMs with RDIMMs in one system: it simply will not start.
LRDIMM Features
Besides raising RAM capacity and speed, the LRDIMM architecture has a number of other useful features. The iMB, the LRDIMM memory buffer, supports DRAM and LRDIMM test facilities, including transparent mode and MemBIST (Memory Built-In Self-Test); VREF (voltage reference) generation for the data bus (DQ) and the command/address bus (CA); command parity; built-in control similar to the 32882 register used on RDIMMs; an optional SMBus (System Management Bus) interface to the LRDIMM configuration and status registers; and an integrated temperature sensor.
Transparent mode: used to test the memory module. In this mode the module works as a simple buffer that passes signals and data through to the DRAM chips.
MemBIST: to initialize DRAM and test components, LRDIMM supports the MemBIST (Memory Built-In Self-Test) feature, which tests the DRAM fully. Testing runs at the operating frequency and is accessed either over the command/address bus or over SMBus.
VREF: LRDIMMs can take the reference voltages for data (VREFDQ) and command/address (VREFCA) from an external source or generate them internally in the memory buffer. When the memory buffer sets VREF, the host memory controller can adjust the voltage level through the buffer's configuration registers. Programmable voltage levels help memory module and system component vendors ensure reliable, robust operation of LRDIMM memory interfaces.
Parity check: to detect corrupted commands on the command/address bus, the memory buffer checks the parity of incoming commands. On an error it generates the ERROUT_n signal.
SMBus interface: the memory buffer supports out-of-band control over the SMBus. It lets you read and write the status registers.
Temperature sensor: it is integrated in the memory buffer and is updated 8 times per second. You can read it through the SMBus interface. The buffer's EVENT_n pin can be used to notify the memory controller of a high temperature.
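The parity check listed among the features above can be illustrated with a small model. This is a sketch under assumptions: it uses even parity over an arbitrary bit pattern, and the function names are hypothetical. Which signals are actually covered, and with which polarity, is defined by the buffer's specification, which this article does not quote.

```python
def parity_bit(bits: int) -> int:
    """Even-parity bit for an integer holding the command/address bits."""
    return bin(bits).count("1") % 2


def command_accepted(ca_bits: int, received_parity: int) -> bool:
    """True if parity matches; False models the case where ERROUT_n is raised."""
    return parity_bit(ca_bits) == received_parity


command = 0b1011_0010_1100        # hypothetical command/address pattern
sent_parity = parity_bit(command)

print(command_accepted(command, sent_parity))      # True: clean transfer
corrupted = command ^ (1 << 5)    # one bit flipped on the bus
print(command_accepted(corrupted, sent_parity))    # False: error is flagged
```

A single parity bit catches any odd number of flipped bits; an even number of flips would go unnoticed, which is why parity is a detection aid rather than a full integrity check.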
How to “overclock” LRDIMM?
The unbuffered data bus remains the weak link of an RDIMM memory system. A four-rank DDR3 RDIMM, for example, puts four electrical loads on the data bus. As a result, the maximum speed of four-rank DDR3 RDIMMs is 1066 MT/s (million transfers per second) with one DIMM per channel (1 DPC) and 800 MT/s with two DIMMs per channel (2 DPC). In an LRDIMM, the memory buffer sits on both the data bus and the command/address bus, which makes it possible to raise both the data transfer speed and the memory density.
The following diagram shows the data eye for four-rank RDIMMs in the "two DIMMs per channel" configuration. With 8 electrical loads on the data bus, signal integrity in the memory channel degrades seriously, which limits the frequency. With eight electrical loads at 1333 MT/s, the "data eye" on the bus shrinks to at most 212 ps wide at the ideal VREF point and is no more than 115 mV high. The "data window" is the period during which the controller can read data, and it gets shorter as the memory frequency rises.

The shrinking data window means that two four-rank RDIMMs per channel are not suitable for operation at 1333 MT/s. You have to compromise between memory capacity and speed.
The data window diagram below is for two four-rank LRDIMMs in the "two DIMMs per channel" configuration. The electrical load of the 8 physical DRAM ranks is replaced by the two electrical loads of the memory buffers, and signal integrity improves significantly. Under conditions similar to the previous illustration, the data window widens from 212 to 520 ps and its maximum height grows from 115 to 327 mV.

Better signal integrity means that LRDIMMs can run at 1333 MT/s and higher, even with several LRDIMMs per channel. You no longer have to choose between capacity and memory bandwidth.
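To put the two eye widths into perspective, they can be compared with the duration of a single transfer at 1333 MT/s. The short calculation below is only a worked example using the 212 ps and 520 ps figures quoted above; the helper name is an assumption.

```python
def unit_interval_ps(rate_mts: float) -> float:
    """Duration of one data transfer on a pin, in picoseconds."""
    # 1 MT/s is one million transfers per second, i.e. 1e6 ps per transfer.
    return 1e6 / rate_mts


ui = unit_interval_ps(1333)
for name, eye_ps in (("RDIMM", 212), ("LRDIMM", 520)):
    print(f"{name}: {eye_ps} ps eye = {eye_ps / ui:.0%} of a {ui:.0f} ps bit time")
# RDIMM: 212 ps eye = 28% of a 750 ps bit time
# LRDIMM: 520 ps eye = 69% of a 750 ps bit time
```

In other words, with eight loads on the bus less than a third of each bit time is usable, while with two buffer loads more than two thirds is.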
A little bit about the capacity of system memory
One of the main advantages of LRDIMMs is the ability to increase RAM capacity significantly without sacrificing memory performance. Because the DRAM is electrically isolated from the data bus, you can add ranks to each DIMM while maintaining signal integrity, and install more DIMMs on each memory channel. A common option is the 32 GB LRDIMM: a four-rank (4Rx4) module built from 4 Gb DDP (dual-die package) DRAM. Since each LRDIMM presents a single electrical load to the memory controller, you can also install more DIMMs per channel.
Take, for example, a dual-processor server with three DIMM slots per channel and four channels per CPU. With LRDIMMs, the RAM capacity can be two to three times that of an RDIMM configuration. The maximum RDIMM and LRDIMM capacities for different speeds and voltages are given below.

For example, for 1.5 V DDR3 memory at 800 MT/s in a system fully populated with RDIMMs, the RAM capacity can reach 384 GB using three 16GB 2Rx4 RDIMMs per channel. LRDIMMs double this capacity to 768 GB. The platform limit (usually 8 DRAM ranks per channel) is overcome by rank multiplication on the LRDIMMs, which in this case gives 12 physical ranks per channel.
At 1066 or 1333 MT/s, signal integrity limitations do not allow three DIMMs per channel in an RDIMM configuration. For 1.5 V DDR3 memory at 1066 or 1333 MT/s, the maximum RAM capacity with RDIMMs is 256 GB. LRDIMMs have no such restriction, and you can install three DIMMs per channel at 1066 MT/s (or 1333 MT/s). The total RAM capacity is then 768 GB, that is, three times more. For 1.35 V DDR3L memory at 1333 MT/s, the LRDIMM advantage is even greater.
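The capacity figures in this section follow from simple multiplication, sketched below for the dual-processor, four-channel server used in the example. One assumption is labeled in the code: the article does not say which modules make up the 256 GB RDIMM figure, so the sketch assumes two 16 GB RDIMMs per channel (one 32 GB RDIMM per channel gives the same total).

```python
def system_capacity_gb(module_gb, dimms_per_channel, channels_per_cpu=4, cpus=2):
    """Total RAM when every populated slot holds the same module."""
    return module_gb * dimms_per_channel * channels_per_cpu * cpus


rdimm_800 = system_capacity_gb(16, 3)    # 16GB 2Rx4 RDIMMs, 3 per channel
lrdimm = system_capacity_gb(32, 3)       # 32 GB LRDIMMs, 3 per channel
rdimm_fast = system_capacity_gb(16, 2)   # assumption: 2 x 16 GB per channel
ranks_per_channel = 4 * 3                # four-rank LRDIMMs, 3 per channel

print(rdimm_800, lrdimm, rdimm_fast)                 # 384 768 256
print(lrdimm / rdimm_800, lrdimm / rdimm_fast)       # 2.0 3.0
print(ranks_per_channel)                             # 12
```

The two ratios reproduce the "double" and "three times more" comparisons made in the text.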
And what about the power consumption of LRDIMM?
LRDIMM memory modules not only let you increase the memory capacity of the server, but do so with minimal loss of energy efficiency. Although an LRDIMM, with its memory buffer, consumes more than an RDIMM in the "one DIMM per channel" configuration, in high-density configurations with 2 and 3 DIMMs per channel the difference largely disappears.
The normalized power consumption of RDIMMs and LRDIMMs in configurations with one and two DIMMs per channel at different memory speeds is shown below. Since actual power consumption depends on the density and the DRAM technology used, relative power is shown for LRDIMMs and RDIMMs of the same DRAM generation. These are 32GB 4Rx4 modules. The power of the RDIMM at 800 MT/s is taken as the unit. The measurements used standard tests with 50% write and 50% read operations.

At 800 MT/s in the "one DIMM per channel" configuration, an LRDIMM consumes 17% more power than an RDIMM, but in the "two DIMMs per channel" configuration the difference is only 3%. At 1066 MT/s the difference with one DIMM per channel is 15%, and with two DIMMs per channel it is again small. At 1333 MT/s, LRDIMM power consumption in the "two DIMMs per channel" configuration is 28% lower than in the "one DIMM per channel" configuration.
Similar results for 100% reads are shown below. Since LRDIMMs are mainly used in systems with high memory density, their consumption in the "two DIMMs per channel" configuration is of more interest, and in that case there is practically no loss of energy efficiency.

Most Intel E5 platforms support two LRDIMMs per channel at 1333 MHz and 1.5 V, and three LRDIMMs per channel at 1066 MHz, which allows configurations with twelve LRDIMMs per processor. With four-rank RDIMMs, only 8 slots per processor can be used and the maximum speed is 800 MHz.
Do I need LRDIMMs?
How do you know whether you need LRDIMMs at all? Determine the memory transfer rate your server requires (see the vendor's documentation for performance figures). If you need more than 8 x 32 GB per processor, you need LRDIMMs; otherwise four-rank 32 GB RDIMMs running at 800 MHz will be enough. If 1066 MHz or 1333 MHz is needed, only LRDIMMs should be used.
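The rule of thumb above can be expressed as a tiny decision function. It is a sketch of this article's heuristic only, for the high-capacity 32 GB four-rank case it discusses; the function name and return strings are made up for the example, and the real limits for a given board come from the vendor's documentation.

```python
def recommend_module(capacity_per_cpu_gb: int, speed_mhz: int) -> str:
    """Apply the article's rule: capacity above 8 x 32 GB or speed above 800."""
    rdimm_limit_gb = 8 * 32   # eight four-rank 32 GB RDIMMs per processor
    if capacity_per_cpu_gb > rdimm_limit_gb or speed_mhz > 800:
        return "LRDIMM"
    return "four-rank 32 GB RDIMM at 800 MHz"


print(recommend_module(256, 800))    # four-rank 32 GB RDIMM at 800 MHz
print(recommend_module(384, 800))    # LRDIMM
print(recommend_module(256, 1333))   # LRDIMM
```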
The rank restrictions and maximum memory frequencies are shown below, using Supermicro X9 (LGA2011) and X10 (LGA2011-3) series dual-processor motherboards with Intel Xeon E5-2600 series processors of different generations as an example.
Supermicro X10 Series + E5-2600 v3 (Haswell)

Supermicro X10 Series dual-processor boards do not support unbuffered memory modules (UDIMMs). Clearly, reaching maximum RAM capacity and maximum speed requires DDR4 LRDIMM modules.

Hynix HMTA8GL7AHR4C-PBM2: RAM for the server, memory capacity: 64 GB, bandwidth: PC12800, type: DDR3 LRDIMM.

Kingston KVR16LL114/32: DDR3L memory module, 32 GB capacity, LRDIMM form factor, 240-pin, 1600 MHz, ECC support, CAS Latency (CL): 11. The average price of such a module is 28 thousand rubles.

Samsung DDR4 2133 Registered ECC LRDIMM 32 GB memory module. The average price is about 22 thousand rubles. This is a 288-pin LRDIMM module with a frequency of 2133 MHz, ECC support and CAS Latency (CL) 15.

Samsung 32GB 288-pin DDR4 SDRAM, DDR4 2133 (PC4 17000) server memory module, model M386A4G40DM0-CPB, CAS Latency 15.
In general, LRDIMM modules provide up to 35% higher memory bandwidth compared to standard RDIMM modules.
LRDIMMs give the greatest benefit in memory-intensive applications, cloud computing, and HPC (high-performance computing) tasks, where large amounts of data have to be loaded into RAM and processed. In a virtual environment, they make it possible to increase the "density" of virtual machines; in data centers, to increase energy efficiency and reduce TCO (Total Cost of Ownership).
Alternative? 128GB LRDIMM!
Technology does not stand still: Samsung has introduced new 128 GB LRDIMM memory modules. They use a chip packaging technology called TSV (Through Silicon Via), in which DRAM dies are stacked and connected vertically by electrodes passing through microscopic holes, as in 3D V-NAND.

The 128GB TSV DDR4 RDIMM is considered a true technological breakthrough. Its advantages are double the capacity of previous standard modules, high speed and efficiency. Thanks to the 20-nm process technology, 128GB TSV DDR4 memory consumes 50% less power than 64GB LRDIMMs. What remains is the question of price.
Practical benefit
In a server with 8 memory slots, 128 GB can be assembled from eight 16 GB DDR3 RDIMMs at 9000 rubles each, that is, 9000 rubles * 8 = 72000 rubles. With LRDIMMs, the same capacity takes two 64 GB modules at 30500r each, 61,000 rubles in total, which is cheaper than the traditional solution. Moreover, it now makes little sense to overpay for motherboards with 16 memory slots: 99% of servers can be built on 8-slot boards. That allows up to 512 GB of memory on a standard X9DRL.
Large 64 GB DDR4 LRDIMMs, however, cost 75000r apiece (64GB PC17000 LR M386A8K40BM1-CPB0Q SAMSUNG memory module at ELKO). With 32 GB DDR4 LRDIMMs at 21000r apiece, 128 GB comes to 84000r, which is slightly more expensive than regular registered memory.
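The price comparison above can be tabulated in a few lines. The per-module prices are the ones quoted in the text; the total for two 64 GB DDR4 LRDIMMs is not stated in the article and simply follows from the per-module price. The helper name and the dictionary layout are illustrative assumptions.

```python
# (module count, module size in GB, price per module in rubles)
options = {
    "DDR3 RDIMM 8 x 16 GB": (8, 16, 9000),
    "DDR3 LRDIMM 2 x 64 GB": (2, 64, 30500),
    "DDR4 LRDIMM 4 x 32 GB": (4, 32, 21000),
    "DDR4 LRDIMM 2 x 64 GB": (2, 64, 75000),
}


def summarize(count, size_gb, price):
    """Return total capacity, total cost and slots used for one option."""
    return {"gb": count * size_gb, "rubles": count * price, "slots": count}


for name, spec in options.items():
    s = summarize(*spec)
    print(f"{name}: {s['gb']} GB for {s['rubles']} rubles in {s['slots']} slots")
```

Besides the totals, the slot count shows the other side of the trade-off: the LRDIMM options leave most of an 8-slot board free for later expansion.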
All this allows us at HOSTKEY to lease large dedicated servers even more cheaply, reduce the price of virtual machines, and make private clusters more reliable for less money.
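A bit about HOSTKEY
Since 2008, we have been leasing dedicated and virtual servers and providing server hosting services in 4 data centers in Moscow, including two Tier III certified data centers. We specialize in large dedicated servers and in building private clouds and clusters on them for our clients.
For our readers, we have a hot offer: servers based on T-Platform supercomputers and Intel Xeon E5-2630v2 processors are available with a 15% discount until the end of December (or while stocks last) using the TMW5U0S8SE promo code.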
For example, for comparison:
- 2xE5-2630v2 (12x2.6 GHz) / 64 GB RAM / 1x1 TB SSD + 1x1 TB 7.2K HDD = 17000r per month, 14450r with the discount
- 2xE5-2630v2 (12x2.6 GHz) / 128 GB RAM / 1x2 TB SSD + 1x2 TB 7.2K HDD = 25700r per month, 21800r with the discount
- 2xE5-2630v2 (12x2.6 GHz) / 256 GB RAM / 2x2 TB Samsung SSD = 36500r per month, 31000r with the discount
- 2xE5-2630v2 (12x2.6 GHz) / 32 GB RAM / 2x600 GB SAS 10K = 13650r per month, 11600r with the discount
All prices include VAT, and almost any configuration is possible.
All servers are connected to a gigabit channel; the traffic limit is 10 TB without restrictions. Each dedicated server comes with remote access via IPMI, and VLANs at speeds of up to 10 Gbps can be set up.