Choosing a server: what to look for. A checklist
In my opinion, the choice of server gets too little attention ("because they are all the same"). Below I will try to explain why you should not neglect it and what you really need to pay attention to, and I will also cover features that make an administrator's life easier and save money. Everything described below is a personal opinion based on many years of experience.
Key points to consider when choosing a server
The main factor in the choice is the type and nature of the load. These determine the general configuration parameters: the number and characteristics of the CPUs, the amount of RAM, the parameters of the disk subsystem, and so on. Obviously, the configuration of a heavily loaded DBMS server will differ from that of a domain controller or a virtualization host. The usual starting points are the system requirements of the specific software for the expected load and your own experience in estimating the performance that software needs. As for specific tips: for a virtualization host, it is better to configure the server with the maximum amount of RAM the budget allows (it will run out soon anyway :)). For a DBMS server, take care of processor performance and of a disk subsystem that is very fast in terms of both IOPS and latency (if, of course, you plan to use local disks). A file storage server should be chosen with a large number of disk slots and a decent RAID controller.
Despite the standard practice of adding a certain margin to the specifications when purchasing a server, an unplanned increase in load often requires more resources than are available. In that case, having thought about future upgrades in advance lets you get by with far less pain. First of all, this concerns the amount of RAM (the number of free slots and how the memory channels are populated), the number of disk bays, and the PCIe expansion slots available for adding a network adapter, an HBA, an NVMe SSD, and so on. I also strongly recommend, for example, not buying a two-socket server with a single processor, because it is quite common that a second processor for an upgrade can no longer be bought (years later) anywhere other than eBay. Saving money at the start turns into overpaying later. Many customers also run into the fact that the two processors differ in revision and stepping, which leads to strange hangs, errors, and other trouble. This is usually solved by updating the BIOS / UEFI to the latest version, if there is one, of course. And while vendors of branded hardware try to keep updating firmware throughout the server's support cycle, with a self-assembled solution and near-no-name component manufacturers (first of all, motherboards) you may be left with nothing.
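To make the "free slots and channel population" check concrete, here is a minimal sketch. The `MemoryLayout` class and its fields are hypothetical names introduced for illustration, and the sketch assumes all installed DIMMs are the same size; real platforms have additional population rules that only the vendor's documentation can confirm.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MemoryLayout:
    """Hypothetical description of one CPU socket's memory population."""
    slots_per_channel: int        # DIMM slots available on each channel
    dimms_per_channel: List[int]  # DIMMs currently installed, per channel
    dimm_size_gb: int             # assumed identical for every installed DIMM

    def installed_gb(self) -> int:
        return sum(self.dimms_per_channel) * self.dimm_size_gb

    def free_slots(self) -> int:
        return sum(self.slots_per_channel - n for n in self.dimms_per_channel)

    def is_balanced(self) -> bool:
        # Every channel carries the same number of DIMMs.
        return len(set(self.dimms_per_channel)) == 1

    def max_gb_after_upgrade(self, new_dimm_size_gb: int) -> int:
        # Capacity reachable by filling free slots, keeping existing DIMMs.
        return self.installed_gb() + self.free_slots() * new_dimm_size_gb


if __name__ == "__main__":
    # Example numbers are made up: 6 channels, 2 slots each, one 32 GB DIMM per channel.
    layout = MemoryLayout(slots_per_channel=2,
                          dimms_per_channel=[1] * 6,
                          dimm_size_gb=32)
    print("installed GB:", layout.installed_gb())
    print("free slots:", layout.free_slots())
    print("balanced:", layout.is_balanced())
    print("max GB with 32 GB DIMMs:", layout.max_gb_after_upgrade(32))
```

A configuration that already fills every slot with small DIMMs shows zero free slots here, which means any RAM upgrade requires throwing away the installed modules.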
RAS (Reliability, Availability, Serviceability) is a term introduced by IBM that describes the dependability of the system as a whole, that is, how well it ensures the continuity of the work assigned to it. If you need sufficiently high RAS levels, you should look towards the serious server brands, since they pay a lot of attention to these features, unlike lower-segment brands or machines self-assembled from components.
Reliability
Reliability is the ability of the system to deal with failures on its own without affecting the final result. This characteristic covers a variety of technologies used in almost all components: typical ones such as detecting errors in processor instructions and notifying the operating system (for example, Intel's MCA) and error correction in RAM (ECC, scrubbing), as well as vendor-specific ones such as predictive failure analysis (PFA) at the level of the service processor.
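To show the principle behind error correction in RAM, here is a toy Hamming(7,4) code that detects and repairs any single flipped bit. This is only an illustration of the idea: server ECC memory uses wider codes over whole memory words and does the work in hardware, and the function names below are made up for this sketch.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits into a 7-bit codeword (positions 1..7)."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # parity over positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # parity over positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(codeword):
    """Return (data_bits, error_position); error_position is 0 if no error was seen."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    # The syndrome, read as a binary number, is the 1-based position of the bad bit.
    error_position = s1 + 2 * s2 + 4 * s3
    if error_position:
        c[error_position - 1] ^= 1  # flip the faulty bit back
    return [c[2], c[4], c[5], c[6]], error_position


if __name__ == "__main__":
    word = [1, 0, 1, 1]
    stored = hamming74_encode(word)
    stored[4] ^= 1  # simulate a single-bit memory error
    recovered, position = hamming74_decode(stored)
    print("recovered:", recovered, "corrected position:", position)
```

The point for server selection is that the application never sees the fault: the error is corrected on read and can be reported to the operating system, which is exactly what non-ECC memory cannot do.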
Availability
Availability determines how long the system is in working condition relative to the scheduled time. It is increased through high-quality components, redundancy of critical equipment (power supplies, fans, HBAs), and a general safety margin of the server for the specific operating conditions. A typical counterexample is desktop SSDs under server load: yes, they are about as fast, and yes, they are much cheaper, but once the DWPD rating (which is extremely low on desktop drives) is exceeded, the SSDs fail easily, and you are lucky if the administrator's actions and the circumstances result only in downtime rather than data loss.
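Both ideas in this paragraph reduce to simple arithmetic, sketched below. The sketch assumes the common convention that a DWPD rating (drive writes per day) applies over the drive's warranty period; the function names and all example figures are hypothetical and not taken from any datasheet.

```python
def availability(scheduled_hours: float, downtime_hours: float) -> float:
    """Fraction of the scheduled time during which the system was working."""
    if scheduled_hours <= 0:
        raise ValueError("scheduled_hours must be positive")
    return (scheduled_hours - downtime_hours) / scheduled_hours


def rated_endurance_tb(dwpd: float, capacity_tb: float, warranty_years: float) -> float:
    """Total writes (TB) the drive is rated for over its warranty period."""
    return dwpd * capacity_tb * 365 * warranty_years


def years_until_worn(dwpd: float, capacity_tb: float, warranty_years: float,
                     daily_writes_tb: float) -> float:
    """Years until the rated endurance is used up at the actual write volume."""
    if daily_writes_tb <= 0:
        raise ValueError("daily_writes_tb must be positive")
    endurance = rated_endurance_tb(dwpd, capacity_tb, warranty_years)
    return endurance / (daily_writes_tb * 365)


if __name__ == "__main__":
    # Made-up example: a year of scheduled operation with 8.76 hours of downtime.
    print("availability:", availability(8760, 8.76))
    # Made-up example: 1 TB desktop-class drive, 0.3 DWPD over 5 years,
    # placed under a server load that writes 2 TB per day.
    print("years until worn:", years_until_worn(0.3, 1.0, 5, 2.0))
```

Running the numbers for the real drive and the real write volume before purchase shows quickly whether a cheaper drive will outlive the server or wear out within its first year.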
Serviceability (simplicity and speed of service)
Serviceability makes it possible to raise availability when a failure does occur, through fast recovery. This is achieved with a large number of hot-swappable components, convenient rails that allow servicing without disruption, and various diagnostic tools, both available over the network through the service processor and located on the server chassis, which let you quickly identify the failed component. Some manufacturers add "Call Home" functionality that automatically reports a failure to technical support, thereby reducing recovery time. If the services running on the server are critical enough, it is worth paying serious attention to RAS.
Physical parameters matter as well. These include power (wattage and efficiency of the PSU), cooling (quality of the cooling system and the ability to work at elevated temperatures, including without loss of warranty), temperature sensors inside the chassis, and form factor (which also affects performance and cooling efficiency, and is relevant for high-density placement). If there are "hot" components (a CPU with high TDP, a GPU, etc.), there is no point in chasing a small form factor without an obvious need for high-density placement; it is better to choose something 2U or even larger.
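The power and cooling figures are linked by simple arithmetic, sketched below under stated assumptions: efficiency is treated as a single number (in reality it varies with load), all power drawn from the wall is assumed to end up as heat, and the function names and example values are hypothetical.

```python
WATT_TO_BTU_PER_HOUR = 3.412  # approximate conversion factor


def wall_power_w(dc_load_w: float, efficiency: float) -> float:
    """Power drawn from the outlet to deliver dc_load_w to the components."""
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in the range (0, 1]")
    return dc_load_w / efficiency


def heat_btu_per_hour(wall_w: float) -> float:
    """Heat the cooling system has to remove, assuming all input power becomes heat."""
    return wall_w * WATT_TO_BTU_PER_HOUR


def survives_psu_failure(dc_load_w: float, psu_rated_w: float, psu_count: int) -> bool:
    """True if the remaining PSUs can still carry the load after one of them fails."""
    return psu_count >= 2 and dc_load_w <= psu_rated_w * (psu_count - 1)


if __name__ == "__main__":
    # Made-up example: 600 W of components behind a 94 % efficient PSU.
    wall = wall_power_w(600, 0.94)
    print("wall power, W:", round(wall, 1))
    print("heat, BTU/h:", round(heat_btu_per_hour(wall)))
    print("1+1 x 800 W survives a PSU failure:", survives_psu_failure(600, 800, 2))
```

The same check explains why a pair of PSUs sized so that only both together cover the load gives no real redundancy.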
Having the server and its components on the HCL (hardware compatibility list) of the relevant software vendor helps you avoid unpleasant situations when deploying the software. In addition, a support request to a software vendor can turn into ping-pong between the hardware and software vendors, or even be rejected outright if the software runs on unsupported hardware. In general, it is much more pleasant to get a solution that works out of the box than to repack the hypervisor image in order to add a RAID controller driver to it (this example refers to the compatibility of ESXi with Adaptec controllers, which formally exists but requires some coaxing first).
Almost all servers are equipped with remote management controllers that provide an IPMI-compatible interface and / or a web console. Depending on the vendor, the controllers offer various functions, from mounting images over the network, automatic OS installation, and centralized firmware upgrades to full life-cycle management (LCM), which greatly simplifies and speeds up the commissioning of new servers and their further maintenance. How much attention this item deserves depends on the size of the server fleet and on how much you need convenient remote management. Honestly, I always include the optional licenses for additional management functionality in the configuration (except for LCM, unless there is an explicit need for it), since it is simply convenient and seriously cuts the time spent on maintenance.
Performance may look like a strange item at first glance: after all, servers from different vendors use the same processors, RAM, disks, and so on. However, if you measure the performance of servers from different manufacturers in identical configurations, you can get different results. This is explained first of all (but not only) by different settings and optimizations at the firmware level. To see how a server performs relative to competing offers, you can refer to server benchmarks (for example, VMmark from VMware).
Warranty and Service
Many vendors offer service packages that make it possible to quickly identify the cause of a hardware failure and fix it by replacing components. Packages differ in warranty and service terms, as well as in response and repair times. The availability of spare parts in service warehouses after a particular model is discontinued also varies. With a self-assembled server, you have to either keep spare parts yourself or rely on the supplier / assembler of the equipment for the availability of spare parts in stock and for their delivery times.
These are the main points you should pay attention to when choosing a server. I hope this will be useful to someone and will help you avoid common mistakes. If you have additional questions, write in the comments, or come to the upcoming seminar, "Fujitsu hardware overview and VDI environment setup". You can register and view the program at this link.
You can also subscribe to our channels (YouTube, VK, Telegram) so as not to miss new articles, courses, and seminars.