fortyseven October 1, 2013 at 17:58

New Dedicated Server Configuration

Today we are introducing a new dedicated server configuration: Intel Xeon E3-1270v3, 32 GB RAM, 2x240 GB SSD. There is a lot behind these terse figures, so let's look at them in more detail.

The new configuration uses Intel's latest development, the Xeon E3 processor based on the Haswell microarchitecture. Haswell processors are manufactured on a 22-nanometer process with three-dimensional (Tri-Gate) transistors.

The first innovation worth naming is support for the AVX2 and FMA3 instruction sets. With FMA3 the processor performs a multiplication and an addition as a single instruction with a single rounding step. In theory, this leads to a significant performance gain in floating-point code. To use these instructions, existing code has to be updated or at least recompiled.
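The single rounding of a fused multiply-add can be modeled in plain Python, without Haswell hardware. This is only a sketch: it imitates the arithmetic result of a fused operation using exact rational numbers, it does not execute an FMA3 instruction, and the names `separate` and `fused` are chosen purely for illustration.

```python
from fractions import Fraction

a = 1.0 + 2.0**-30
b = 1.0 - 2.0**-30
c = -1.0

# Two separate operations: the product is rounded to a double,
# then the sum is rounded a second time.
separate = a * b + c

# Model of a fused operation (assumption: exact arithmetic stands in for
# the hardware): compute a*b + c exactly, then round to a double once.
fused = float(Fraction(a) * Fraction(b) + Fraction(c))

print(separate)  # the tiny term is lost when the product is rounded
print(fused)     # the single rounding preserves it
```

The two results differ because the exact product does not fit in a double; the fused form never rounds that intermediate value.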

Secondly, Haswell processors have higher bandwidth to the L1 and L2 caches, which can noticeably speed up data access and, therefore, application execution.

Thirdly, the new processors include hardware support for transactional memory. Many experts call this the most non-trivial extension of the x86 architecture in recent years, and it deserves a separate discussion.

Transactional memory

All programs have mutable memory regions in which their data is stored. If several threads of execution work with this data, access must be organized so that concurrent use causes no problems (for example, reading a memory region while another thread is writing to it, or two threads writing to it simultaneously).

Most multithreaded applications use lock-based synchronization to prevent these problems. Before any access to the data, a lock must be acquired on it. While one thread is modifying the data, other threads wait for the lock to be released. For several threads to really run in parallel, a separate lock is needed for each more or less independent part of the program's data. Putting this into practice, however, is very difficult.
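The lock-based approach described above can be sketched in a few lines of Python. The names `counter`, `counter_lock` and `add_many` are made up for this illustration; the point is only that every access to the shared data is wrapped in the same lock.

```python
import threading

counter = 0
counter_lock = threading.Lock()  # one lock guarding one piece of shared data


def add_many(n):
    """Increment the shared counter n times, taking the lock for each update."""
    global counter
    for _ in range(n):
        with counter_lock:  # other threads wait here while the lock is held
            counter += 1


threads = [threading.Thread(target=add_many, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)
```

With a single lock the result is correct, but the four threads effectively take turns; splitting the data among several locks is where the difficulty begins.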

An alternative to lock-based synchronization is transactional memory. It works as follows: a thread makes its changes to shared memory without regard to what other threads are doing, recording every read and write in a log. When the whole operation is complete, the thread checks whether other threads have changed the memory it accessed. If the transaction cannot be committed because of conflicting changes, it is aborted and executed again until it succeeds. The advantages of this approach are obvious: no thread has to wait to gain access to a resource, and different threads can simultaneously modify non-overlapping data structures that would otherwise be protected by the same lock.
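The log, validate and retry cycle described above can be modeled in software. The following is a minimal sketch, not a production transactional memory: `Cell`, `atomically`, `read` and `write` are hypothetical names, and for simplicity the final validate-and-publish step is serialized with a short lock, which real implementations avoid or handle differently.

```python
import threading


class Cell:
    """A shared variable; (version, value) is always replaced as one tuple."""

    def __init__(self, value):
        self.state = (0, value)


_commit_lock = threading.Lock()  # held only for the short commit step


def atomically(body):
    """Run body(read, write) as a transaction, retrying on conflict."""
    while True:
        reads = {}   # cell -> version seen at first read
        writes = {}  # cell -> value to publish at commit

        def read(cell):
            if cell in writes:           # see our own pending write
                return writes[cell]
            version, value = cell.state
            reads.setdefault(cell, version)
            return value

        def write(cell, value):
            writes[cell] = value         # buffered, invisible to other threads

        result = body(read, write)

        with _commit_lock:
            # Commit only if nothing we read has changed in the meantime.
            if all(cell.state[0] == seen for cell, seen in reads.items()):
                for cell, value in writes.items():
                    cell.state = (cell.state[0] + 1, value)
                return result
        # Conflict: drop both logs and run the transaction again.


a, b = Cell(1000), Cell(0)


def transfer(read, write):
    write(a, read(a) - 1)
    write(b, read(b) + 1)


def worker():
    for _ in range(100):
        atomically(transfer)


threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(a.state[1], b.state[1])
```

The body of the transaction runs without holding any lock, and a transaction that loses a race simply repeats; the total of the two cells is preserved however the threads interleave.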

Until recently, transactional memory could only be implemented in software. Software transactional memory is a very complex and time-consuming thing to build, and not every programmer can do it. The new extension of the x86 architecture solves many of these problems at the hardware level and is a definite step forward.

Haswell transactional memory support is implemented using the TSX (Transactional Synchronization Extensions) instruction set, which includes two mechanisms: HLE (Hardware Lock Elision) and RTM (Restricted Transactional Memory).

The HLE mechanism can improve the performance of multithreaded applications that use locks. It uses the XACQUIRE and XRELEASE prefixes. If the XACQUIRE prefix is placed before the instruction that acquires a lock, the lock is elided: the processor does not actually take it and begins executing the critical section speculatively. The XRELEASE prefix, placed before the instruction that releases the lock, ends the elided section and returns the processor to "normal" operation. Of course, executing a critical section without its lock is risky. The control logic watches for conflicts: the section of code that caused one is rolled back and executed again, this time with the lock actually taken.

The RTM mechanism uses the XBEGIN, XEND, and XABORT instructions. XBEGIN tells the processor to start executing a section of code that works with shared memory without locks. Conflicts are detected in hardware; when one occurs, the transaction is aborted and control is transferred to the fallback address specified in the XBEGIN instruction. The processor automatically returns to the state it was in when XBEGIN was executed. The XEND instruction marks the end of the transactional section. If the program itself detects an error, the XABORT instruction explicitly aborts the transaction.
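The begin, commit and abort control flow of RTM can be illustrated with a small software model. This is only a sketch of the idea in Python, not TSX itself: it executes no XBEGIN, XEND or XABORT instructions, it detects no hardware conflicts, and `run_transaction`, `xabort`, `withdraw` and `on_abort` are hypothetical names invented for the example.

```python
class TransactionAborted(Exception):
    """Models an abort; `code` plays the role of the XABORT status code."""

    def __init__(self, code):
        super().__init__(code)
        self.code = code


def xabort(code):
    """Model of an explicit abort requested by the program."""
    raise TransactionAborted(code)


def run_transaction(state, body, fallback):
    """Run body speculatively; on abort, call fallback with untouched state."""
    working = dict(state)        # speculative writes go to a private copy
    try:
        body(working)
    except TransactionAborted as abort:
        # `state` was never modified, so it is exactly as it was at the start;
        # calling the handler models the jump to the fallback address.
        return fallback(state, abort.code)
    state.clear()
    state.update(working)        # models the commit: publish all changes
    return "committed"


def withdraw(amount):
    def body(account):
        account["balance"] -= amount
        if account["balance"] < 0:
            xabort(1)            # the program itself detected an error
    return body


def on_abort(state, code):
    return f"aborted with code {code}"


account = {"balance": 50}
first = run_transaction(account, withdraw(30), on_abort)
second = run_transaction(account, withdraw(30), on_abort)

print(first, second, account)
```

The second withdrawal aborts after it has already changed its private copy, yet the visible balance stays at the value left by the first, committed transaction.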

TSX is already supported in GCC v4.8, the latest version of Microsoft Visual Studio 2012, the latest C++ compiler from Intel, and the glibc v2.18 library, which is widely used by Linux applications. TSX lets multithreaded programs scale well on multi-core processors without fine-tuning of locks. Often the programmer does not even need to modify the program code: it is enough to link against the appropriate library or recompile.

More possibilities

The new configuration is well suited to storage servers with disk-intensive workloads. Each server is equipped with two 240 GB solid-state drives (SSDs). Modern SSDs offer low access times and high read/write speeds, so they can be used to host large databases and to cache hot data for web storage.

The servers of the new configuration are equipped with 32 GB of RAM. This is enough to run a fairly large in-memory data store such as Redis, Memcached or Couchbase (they keep data directly in RAM; Redis and Couchbase can also save it to disk). Classical databases will gain performance as well, thanks to intensive caching of queries in memory.
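The idea of keeping data in RAM and saving its state to disk from time to time can be sketched as follows. This is a toy illustration, not how Redis or Couchbase implement persistence; the `SnapshotStore` class, its methods and the file name are hypothetical.

```python
import json
import os
import tempfile


class SnapshotStore:
    """A dictionary held in RAM that can be saved to and restored from a file."""

    def __init__(self, path):
        self.path = path
        self.data = {}
        if os.path.exists(path):         # restore the last saved state, if any
            with open(path) as f:
                self.data = json.load(f)

    def set(self, key, value):
        self.data[key] = value           # served from memory, no disk access

    def get(self, key, default=None):
        return self.data.get(key, default)

    def snapshot(self):
        """Write the whole state to a temporary file, then swap it into place."""
        tmp = self.path + ".tmp"
        with open(tmp, "w") as f:
            json.dump(self.data, f)
        os.replace(tmp, self.path)       # readers never see a half-written file


path = os.path.join(tempfile.mkdtemp(), "snapshot.json")

store = SnapshotStore(path)
store.set("page:/index", "<html>...</html>")
store.snapshot()                         # in a real service this would run periodically

restored = SnapshotStore(path)           # simulates a restart
print(restored.get("page:/index"))
```

All reads and writes touch only memory; the disk is involved only when a snapshot is taken or when the store starts up.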

I want it already!

New servers are already available for order in Moscow and St. Petersburg. The rental price is only 7,500 rubles per month.
Those who cannot comment on posts on Habr are welcome to comment on our blog.

PS

Thanks to the new graphics core, servers based on Haswell processors do an excellent job of transcoding video on the fly and can be used, for example, as hardware platforms for video broadcasting and hosting. In addition, the more efficient graphics subsystem lets the new processors raise the performance of virtual desktop (VDI) servers and the number of clients each server can handle.
However, the Intel Xeon E3-1270v3 processor used in the new configuration does not have an integrated graphics core. If you have tasks that could use the graphics core of a Haswell processor, we are ready to provide you with a platform based on the E3-1285v3 processor for a month. In return, we will ask you for a test report, which we will share with everyone on our blog. You can submit a request with a short test plan through our ticket system.
