Hardware Planning for Your OpenStack Cluster: Answers to Your Questions
Posted by Greg Elkinbard
My colleague Anne Friend and I recently presented a webinar on “How to Handle Hardware Planning for Your OpenStack Cloud.” During the webinar, we promised to answer the questions we did not have time to address live. This article is devoted to those answers.
You mentioned adding storage to a rack with an overloaded switch. Can you talk about how to configure this?
A typical rack-level switch does not have as much uplink bandwidth as downlink bandwidth. For example, a typical Trident+ switch has 48 10-gigabit downlink ports with a total throughput of 960 Gb/s, but only 4 × 40-gigabit uplink ports, or 320 Gb/s of uplink bandwidth, so the downlink exceeds the uplink by roughly a 3:1 ratio.
This means that you should limit the traffic that has to travel up the uplinks. There are two ways to do this. One is to keep a user's VMs within the L2 domain (segment) of a single edge switch, which reduces the traffic crossing the uplinks.
The second main source of traffic is Cinder traffic between the Cinder node and the compute node. Concentrating this traffic within one switch also offloads the uplink. For example, if you are using iSCSI-based Cinder storage, you can provision one or two storage nodes per rack and make sure the Cinder scheduler creates volumes on storage located in the same rack as the compute resources. Both of these placement filters are custom; they have to be written for the Nova and Cinder schedulers. This is not exactly a turnkey solution, but it is a simple change.
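Very roughly, the Cinder side of such a rack-affinity filter could look like the sketch below. This is only an illustration: the filter base class has moved between OpenStack releases, and the HOST_TO_RACK mapping and the rack_id scheduler hint are hypothetical names; a matching custom filter would also be needed on the Nova side.

```python
# Illustrative sketch of a rack-affinity filter for the Cinder scheduler.
# The base-class import path has varied between releases; HOST_TO_RACK and
# the "rack_id" scheduler hint are hypothetical, illustrative names.
from cinder.scheduler import filters

# Hypothetical mapping of Cinder backends to racks; in practice this would
# come from configuration or an inventory system.
HOST_TO_RACK = {
    "cinder-volume-1@lvm": "rack-a",
    "cinder-volume-2@lvm": "rack-b",
}


class SameRackFilter(filters.BaseHostFilter):
    """Pass only storage backends located in the requested rack."""

    def host_passes(self, host_state, filter_properties):
        hints = filter_properties.get("scheduler_hints") or {}
        wanted_rack = hints.get("rack_id")
        if not wanted_rack:
            # No rack constraint supplied, so do not filter anything out.
            return True
        return HOST_TO_RACK.get(host_state.host) == wanted_rack
```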
I am trying to understand how to apply some of the trade-offs you described in more concrete terms. Can you give a numerical example of the trade-offs when allocating vCPU/vRAM for two different cases?
There are too many use cases to go through them all, so let's walk through one real-world calculation (a short script summarizing the arithmetic follows the lists below).
CPU requirements
- 100 virtual machines
- 2 EC2 compute units per VM on average
- 16 EC2 compute units per VM maximum
- No oversubscription
This corresponds to:
- 200 GHz of CPU capacity (100 VMs × 2 GHz per VM)
- Maximum of 5 cores per VM (16 GHz / 2.4 GHz per core, adjusted by the 1.3 hyperthreading coefficient below)
Based on these calculations:
- Hyperthreading coefficient of 1.3
- 10-11 E5-2640 CPUs (200 GHz / 2.4 GHz per core / 1.3 hyperthreading / 6 cores per CPU)
- 5-6 dual-socket servers (11 CPUs / 2 sockets per server)
- ~17 VMs per server (100 VMs / 6 servers)
Memory requirements
- 100 virtual machines
- 4 GB per virtual machine
- 512 MB minimum, 32 GB maximum
This corresponds to:
- 400 GB of RAM in total (100 VMs × 4 GB per VM)
Based on these calculations:
- Four machines with 128 GB each would cover it (400 GB / 128 GB)
- Balancing against the CPU requirement, however, you need 6 machines to deliver the total CPU capacity
- So reduce per-server memory and use 6 machines with 64 GB or 96 GB each (6 × 64 GB is 384 GB, 6 × 96 GB is 576 GB)
- If you need a bit more memory headroom, go with the 96 GB machines
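Here is the quick sanity check of the arithmetic above, written as a minimal script; the clock speed, core count, and hyperthreading coefficient are the assumptions from the lists, not measured values:

```python
import math

# Assumptions taken from the example above.
num_vms = 100
avg_ghz_per_vm = 2.0        # ~2 EC2 compute units per VM on average
core_ghz = 2.4              # E5-2640 base clock
ht_factor = 1.3             # hyperthreading coefficient
cores_per_cpu = 6           # the E5-2640 is a 6-core part
sockets_per_server = 2
ram_per_vm_gb = 4

total_ghz = num_vms * avg_ghz_per_vm                          # 200 GHz
effective_cores = total_ghz / (core_ghz * ht_factor)          # ~64 cores
cpus_needed = math.ceil(effective_cores / cores_per_cpu)      # 11 CPUs
servers_needed = math.ceil(cpus_needed / sockets_per_server)  # 6 servers
vms_per_server = math.ceil(num_vms / servers_needed)          # 17 VMs

total_ram_gb = num_vms * ram_per_vm_gb                        # 400 GB
ram_per_server_gb = total_ram_gb / servers_needed             # ~67 GB -> use 96 GB machines

print(servers_needed, vms_per_server, ram_per_server_gb)
```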
When you say that VLAN is suitable for a small network, how small a network do you mean?
A small network is one with fewer than about 4,000 virtual networks, since the VLAN tag space only has 4,096 IDs. However, because Neutron allows each tenant to have multiple networks, you cannot assume that you can accommodate 4,000 tenants. Also remember that you will need some networks for static infrastructure, so be sure to reserve VLAN tags for those.
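As an illustration, with the Neutron ML2 VLAN type driver (older releases expose an equivalent option in the OVS plugin configuration) you can keep infrastructure tags outside the tenant-network range; the physical network name and the range below are made-up values:

```ini
# ml2_conf.ini (illustrative values)
[ml2_type_vlan]
# Tenant networks are allocated VLAN IDs 1000-2999; everything outside
# this range remains free for static infrastructure networks.
network_vlan_ranges = physnet1:1000:2999
```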
How does Fuel help automate network configuration?
Fuel can verify the network configuration to make sure that your nodes are wired correctly and that all of the required VLAN tags are allowed (trunked) on the switch.
Do you think it is better to use hardware from well-known manufacturers such as Dell, HP, etc., or can we achieve the same performance with hardware we assemble ourselves? Is the Open Compute Platform recommended?
The short answer: if your company is large enough to support its own hardware, or small enough not to worry about downtime during a hardware failure, you can use white-box or self-assembled servers. If you are a medium-sized company, we recommend using equipment from well-known manufacturers, since you get better service level agreements.
Open Compute is a promising platform, but it depends on broader hardware support, which should be coming soon.
Do you recommend different hardware specifications for nodes running different Nova services? For example, should a node running nova-api have more memory than a node running glance-api?
At Mirantis, we recommend consolidating all OpenStack infrastructure services into dedicated nodes called controllers. This type of architecture facilitates high availability.
What about ARM (or Atom) based microservers?
If you have a general-purpose cloud, you will find it hard to put a meaningful CPU load on ARM- or Atom-based microservers. Try running an MS SQL or Oracle server on ARM; you will not get far. If you have a specialized cloud that fits within the limits of these CPUs, then by all means use them. But a cloud does not run on CPU alone, and many ARM/Atom-based designs do not provide enough I/O bandwidth or disk capacity to make a good platform.
What about blade servers?
Leave blades for shaving. Use regular servers for the cloud. If you need higher density, use sled form-factor servers (Dell C-class, HP SL-class) instead of blades. A typical blade chassis does not have enough uplink bandwidth to serve cloud workloads well, and it lacks local storage space, which puts a double load on that same chassis bandwidth. On top of that, you pay a premium for these servers. One or two blade designs have begun to address at least the network bottleneck, but the other concerns remain.
Can we do live migration without shared storage?
You can perform live migration without shared storage. It just takes more time.
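For example, with KVM/libvirt this is done as a block migration, which copies the disks over the network along with the instance memory. A sketch with the classic nova CLI (the instance and host names are placeholders, and the flag spelling has varied between novaclient releases):

```bash
# Live-migrate an instance without shared storage (block migration).
nova live-migration --block-migrate my-instance target-compute-host
```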
For a small private cloud, do you recommend Fibre Channel for shared storage on the compute nodes, or a shared file system over 1 Gigabit Ethernet?
Neither. Use 10 Gigabit Ethernet with Ceph or another block store. You do not need the cost of a shared file system or Fibre Channel.
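To illustrate how little is involved, pointing Cinder at a Ceph RBD backend comes down to a few cinder.conf options; the pool and user names below are the commonly used defaults, not requirements:

```ini
# cinder.conf (illustrative values)
[DEFAULT]
volume_driver = cinder.volume.drivers.rbd.RBDDriver
rbd_pool = volumes
rbd_user = cinder
rbd_ceph_conf = /etc/ceph/ceph.conf
```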
Can you talk a little more about the 6.5x capacity requirement for Swift?
This is a separate question with a more detailed answer in the recorded webinar, but here is a simple calculation:
Assume a replication factor of 3.
Add two handoff devices (needed as spare capacity to handle failures).
Finally, remember that you will run into problems if you fill an XFS disk beyond 75% of its capacity, and you get this calculation:
(3 + 2) / 0.75 ≈ 6.7
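In other words, every terabyte of usable object storage needs roughly 6.7 TB of raw disk. A minimal sketch of that conversion, using the assumptions from the answer above (3 replicas, 2 handoff devices, 75% XFS fill limit):

```python
def raw_capacity_needed(usable_tb, replicas=3, handoffs=2, max_fill=0.75):
    """Raw Swift disk capacity needed for a given usable capacity."""
    overhead = (replicas + handoffs) / max_fill  # (3 + 2) / 0.75 ~= 6.7
    return usable_tb * overhead

print(raw_capacity_needed(100))  # ~667 TB of raw disk for 100 TB usable
```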
After deployment, what tools do you use, or have you used, to monitor CPU and hardware utilization?
At Mirantis, we have successfully used Nagios and Zabbix.
Can I deploy OpenStack on OCP (Open Compute Platform)?
Yes. Mirantis Fuel is generally independent of hardware architecture.
How do diskless hypervisors fit into the 'local vs. shared vs. object' storage equation? Can compute nodes be run as diskless iSCSI clients without breaking Cinder's ability to attach iSCSI targets, or does this hardware need a separate SAN solution?
Let's turn the question around a little and ask why you need this complexity. Mirantis Fuel already deploys the operating system for you, and having a few small local disks for the OS makes setup easier. We have tried this before, and some arrays have problems when multiple initiators from the same node (one for the OS, one for Cinder) try to reach the same target. It's not worth it.
Does Fuel support interface bonding?
Yes, but you need to use the command line interface, not the web interface.
Have you worked with Illumos-based hypervisors or anything using Illumos, or was it done only on Linux?
ZFS alone is not compelling enough to bother with fringe operating systems like Solaris derivatives. Yes, you can run Xen and KVM on them, with caveats and restrictions. If you are rich enough to support your own operating system development team, you can do it, but you will always be behind on functionality. I have built several OS development teams from scratch for various companies, and I can tell you that if this is your line of business, go ahead. Otherwise, it is better to stay on the beaten path: it will be easier than hacking your way through the jungle.
Original article in English