Therascale OCS Solution

    In the previous article about the Open Compute Project, we discussed Facebook's initiative to promote an open platform for building hyperscale data centers. The idea is simple: existing solutions are not a good fit for their operations, so you have to build something that is.

    In a blog post, Microsoft talks at length about the Azure cloud service built on its own infrastructure. That infrastructure, as it turns out, was also developed from scratch to fit Microsoft's vision.

    Publishing one's platform turned out to be contagious: Microsoft joined the initiative and shared its own vision of the optimal infrastructure, and that is what we will discuss in detail.





    Key Features
    1. Standard 19-inch width; the chassis is 12U high.
    2. Unified cooling and power system.
    3. Unified server management system.
    4. Maintenance never touches the cabling (unless the problem is in the switch itself).
    5. Standard PDUs are used.
    6. The power supplies share the form factor of our rack-mount server line.



    View of the complete rack

    Details about the architecture
    The chassis is a complex tray that accommodates baskets with universal backplanes for servers and disk shelves, a power distribution board, the power supplies, a remote management controller and a fan wall.

    Rear view
    Six large fans (140 x 140 x 38 mm) are combined into a single assembly on a hinged door and are controlled by the chassis controller. The door covers the I/O ports and the cable space, where dedicated channels are provided for neat routing and installation.

    Microsoft uses standard 19" racks 50U high, of which 48U are occupied by servers and another 2U is allocated for switches. Power supplies and PDUs are located at the side of the rack and do not take up rack units, unlike in OCP.



    A service module is built into each chassis (more on it later), along with six power supplies (exactly the same as in our RS130 G4 and RS230 G4 models) in a 5+1 configuration (a rough power-budget sketch follows below).

    Rear view of the basket
    A nice touch: regardless of the modules installed, the set of ports on the basket is always the same. Two SAS ports, two SFP+ ports and two RJ45 ports are always exposed; what gets connected to them on the other side is decided on site.
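    As a rough illustration of what the 5+1 scheme means for the power budget, here is a minimal sketch; the per-PSU rating is a hypothetical placeholder, while the 24 blades at 300 W each come from the cooling-efficiency section below.

```python
# Rough N+1 power-budget check for the chassis PSUs (5+1 scheme).
# PSU_RATING_W is an assumed placeholder, not a figure from the article.

PSU_COUNT = 6          # power supplies per chassis
PSU_RATING_W = 1600    # hypothetical rating of a single PSU, watts
BLADES = 24            # blades per chassis
BLADE_POWER_W = 300    # worst-case draw per blade, watts

usable_w = (PSU_COUNT - 1) * PSU_RATING_W  # one PSU is held in reserve (the "+1")
load_w = BLADES * BLADE_POWER_W

print(f"Usable PSU capacity: {usable_w} W, worst-case blade load: {load_w} W")
print("N+1 redundancy holds" if usable_w >= load_w else "Load exceeds N+1 capacity")
```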

    Compare the formats

    Parameter | Open Rack | Open CloudServer | 19" rack
    Inner width | 21" | 19" | 19"
    Rack height | 2100 mm (41 OU) | 48-52U, 12U per chassis | 42-48U
    OU / RU | OU (48 mm) | RU (44.45 mm) | RU (44.45 mm)
    Cooling | In the system (hot-swap / fixed) | Fan wall (6 fans) | In the system
    Fan size | 60 mm | 140 mm | Varies
    Power bus | 3 pairs | - | -
    Power zones / budget | 3, up to 36 kW | 4, up to 28 kW | -
    Power shelf | Front | Rear | -
    Power shelf size | 3 OU | 0U | 0U
    Integrated UPS | Yes | Yes (coming soon) | No (but nothing stops you from adding one)
    Rack management | Yes, RMC (but not fully fault-tolerant) | Yes, CM | -
    Blade identification (Sled ID) | No | Yes (in the chassis) | No
    Cables | Front | Rear | Front or rear


    Cooling system efficiency
    Total system consumption: 7200 W (24 blades x 300 W)

    Outside temperature (°C) | Fan speed (%) | Consumption per fan (W) | Total fan consumption (W) | Share of rack consumption (%)
    25 | 66 | 19.2 | 115.2 | 1.6
    35 | 93 | 40.2 | 241.2 | 3.35
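    A quick back-of-the-envelope check of these figures (reading the two consumption columns as per-fan and total for six fans):

```python
# Sanity check of the fan-power table: 6 fans per chassis, 24 blades at 300 W each.

BLADES = 24
BLADE_POWER_W = 300
FANS = 6

rack_load_w = BLADES * BLADE_POWER_W  # 7200 W total system consumption

for temp_c, per_fan_w in [(25, 19.2), (35, 40.2)]:
    total_fan_w = per_fan_w * FANS
    share_pct = total_fan_w / rack_load_w * 100
    print(f"{temp_c} C: fans draw {total_fan_w:.1f} W, "
          f"{share_pct:.2f}% of the {rack_load_w} W system load")
```

    The output reproduces the 1.6% and 3.35% figures from the table.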


    Chassis Manager management system
    The management server is inserted and removed hot, and the software side is built (unsurprisingly) on Windows Server 2012. The source code is publicly available and anyone can make their own changes (the toolkit is free to use).

    General idea

    The board itself

    Functionality:

    • Input/output (I/O):
      - 2 x 1GbE Ethernet (for network access or managing the ToR switches)
      - 4 x RS-232 (connected to the switches to control their boot)
      - Power management (one input, three outputs for connecting PDUs or other CMMs)
    • Windows Server 2012
    • Hot-swappable module
    • Hardware:
      - Integrated Intel Atom S1200 processor
      - 4 GB of ECC memory
      - 64 GB SSD
    • Optional TPM module
    • Integrated blade communication multiplexers
    • Blade power control is wired out separately (which makes it more reliable)



    Bus layout


    How it goes through the connectors

    The software side has not been forgotten either.

    Out-of-band part:





    • The CM provides a REST-like API and a CLI for scaling out the management system (a hedged request example follows this list)
    • Fan control and monitoring
    • PSU control and monitoring
    • Chassis power management: on/off/reset
    • Serial console redirection
    • Blade identification LED
    • ToR switch power control: on/off/reset
    • Power mapping
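    For illustration, here is a minimal sketch of what talking to such a REST-like API from Python could look like; the host, port and endpoint name are hypothetical placeholders, not the actual Chassis Manager routes.

```python
# Hedged sketch of calling a Chassis Manager-style REST API.
# CM_HOST and the GetChassisInfo endpoint are hypothetical placeholders.
import requests

CM_HOST = "https://cm.example.local:8000"  # assumed chassis manager address

def get_chassis_info(session: requests.Session) -> dict:
    """Query an assumed 'chassis info' endpoint and return the parsed JSON reply."""
    resp = session.get(f"{CM_HOST}/GetChassisInfo", timeout=10)
    resp.raise_for_status()
    return resp.json()

if __name__ == "__main__":
    s = requests.Session()
    s.verify = False            # lab-only shortcut; use real certificates in production
    s.auth = ("admin", "...")   # role-based credentials (AD-backed auth is mentioned below)
    print(get_chassis_info(s))
```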


    In-band part
    • Blade BMCs use standard IPMI over the KCS interface
    • On Windows: WMI/IPMI
    • On Linux: ipmitool (see the sketch after this list)
    • Identification, chassis power, event logging, temperature monitoring, power management, power capping, HDD and SSD monitoring.
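    On the Linux side this boils down to ordinary ipmitool calls; a minimal in-band sketch over the local KCS interface might look like this (the thin wrapper is purely illustrative):

```python
# In-band sketch: query the blade's BMC with ipmitool over the local KCS interface.
# The subcommands are standard ipmitool ones; the wrapper itself is illustrative.
import subprocess

def ipmi(*args: str) -> str:
    """Run an ipmitool subcommand and return its text output."""
    return subprocess.run(["ipmitool", *args],
                          check=True, capture_output=True, text=True).stdout

if __name__ == "__main__":
    print(ipmi("chassis", "power", "status"))  # chassis power state
    print(ipmi("sdr", "type", "Temperature"))  # temperature sensors
    print(ipmi("sel", "list"))                 # system event log
```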


    Facebook relies on the capabilities of Intel ME, while Microsoft takes advantage of the already familiar IPMI via a BMC.

    Blades
    Two types of blades have been developed: one carries a server, the other a 10-disk JBOD.

    Possible combinations
    You can, of course, combine them however you like: two servers, two JBODs, or one of each.

    Cables remain in place
    Cables connect to the backplane, so to replace a blade you simply pull out the basket (a significant difference from OCP). This approach cuts maintenance time and eliminates possible errors when reconnecting cables.

    Specification | OCS server
    CPU | 2x Intel Xeon E5-2400 v2 per node, up to 115 W
    Chipset | Intel C602
    Memory | 12 DDR3 800/1066/1333/1600 ECC UDIMM/RDIMM/LRDIMM slots per server
    Disks | 4x 3.5" SATA per server
    Expansion slots | 1x PCIe x16 Gen3, SAS mezzanine, 10G mezzanine
    Control | BMC-lite connected to the Chassis Manager via an I2C bus




    The PCIe slot is intended for a very specific purpose: PCIe flash cards.


    This is how the mezzanine boards are routed

    Specification | OCS JBOD
    Controllers | 1x SAS Interface Module (SIM)
    External ports | 2x 6 Gb/s mini-SAS ports
    Disks | 10x 3.5" SAS/SATA, hot-swapped as a whole basket
    Control | SCSI Enclosure Services (SES-2) (see the sketch below)
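    Since the shelf is managed via SES-2, polling it from Linux is a matter of reading its enclosure diagnostic pages; a minimal sketch (the device path is a placeholder, sg_ses comes from the sg3_utils package):

```python
# Minimal SES-2 polling sketch for the JBOD shelf.
# ENCLOSURE_DEV is an assumed placeholder for the shelf's SCSI generic device.
import subprocess

ENCLOSURE_DEV = "/dev/sg3"

def enclosure_status(dev: str = ENCLOSURE_DEV) -> str:
    """Read the SES Enclosure Status diagnostic page (page code 2) via sg_ses."""
    return subprocess.run(["sg_ses", "--page=2", dev],
                          check=True, capture_output=True, text=True).stdout

if __name__ == "__main__":
    print(enclosure_status())
```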



    Disk Shelf

    Network Infrastructure

    All network connections are 10G; gigabit is used only for management, and even then not everywhere. An important point: Microsoft is extremely active in popularizing Software Defined Networking (SDN), and its own services are built on software-defined networks.

    It is no accident that at the last Ethernet Summit their SDN network monitoring product, DEMon, took six out of six possible awards, and the company is listed among the platinum sponsors of OpenDaylight.

    Meanwhile, we will remind you that we announced a product for hybrid networks with SDN support quite a while ago: the Eos 410i switch, with a per-port cost below $100 :)

    To sum up





    The largest companies in the industry are betting on the Software Defined Datacenter, and Microsoft is no exception. Ethernet has developed to the point where 10G plus RDMA (RDMA over Converged Ethernet, RoCE) lets you do without a separate FC network with no loss in performance (the gains from RDMA were discussed here: habrahabr.ru/company/etegro/blog/213911). The capabilities of Windows Storage Spaces mean that hardware storage arrays are gradually being displaced by Shared DAS/SAS solutions (covered in detail here habrahabr.ru/company/etegro/blog/215333 and here habrahabr.ru/company/etegro/blog/220613).

    MS Benefits:
    Chassis-based design reduces cost and power consumption
    • Standard EIA 19" rack
    • Modular design simplifies deployment: side walls, 1U baskets, high-performance power supplies, large fans for efficient cooling, hot-swappable management board
    • Up to 24 servers per chassis, optional JBOD shelves
    • Optimized for contract manufacturing
    • Up to 40% cost savings and 15% better energy efficiency compared to traditional enterprise servers
    • Expected savings of 10,000 tons of metal per million servers (they are clearly exaggerating here: 10 kg per server is too much)

    All signal and power lines are transmitted through fixed connectors
    • Server and chassis separation simplifies deployment and recovery
    • Cable-free design reduces human error during maintenance
    • Fewer incidents caused by loose cables
    • 50% reduction in deployment and maintenance time

    Network and SAS cables in a single backplane
    • A passive PCB simplifies the design and reduces the risk of signal loss
    • Supports various network and cable types: 10/40G Ethernet, copper and optical cables
    • Cables are routed once per chassis, during assembly
    • Cables are not touched during operation and maintenance
    • Saves hundreds of kilometers of cable per million servers

    Secure and scalable management system
    • A management server in each chassis
    • Several levels of protection: TPM at boot, SSL transport for commands, AD-based role authentication
    • REST API and CLI for scaling the management system
    • 75% more convenient to maintain than regular servers


    Conclusion

    So, the next incarnation of blades has been invented. The key difference from the rest
    is the completely open platform specification and the use of standard network switches. In traditional blade systems from the well-known circle of vendors, the choice of networking is limited by the vendors' imagination; in an open solution you can use any existing product, from second-hand Juniper gear to SDN switches of your own design.

    And we, for our part, have added an Open CloudServer-based solution to our product line.

