sysmetic September 29, 2014 at 10:56

How to Take Control of Virtual Machine Sprawl: 7 Useful Reports in Veeam Availability Suite


“Do you see the gopher?” “No.” “But it's there!”

Uncontrolled growth in the number of virtual machines (virtual machine sprawl) is much the same: up to a certain point we do not believe the problem exists in our infrastructure, yet it does; it simply has not shown itself at full scale.

So what exactly is the problem?

Uncontrolled growth in the number of virtual machines is typical of rapidly growing virtual infrastructures. It all starts with the perfectly reasonable wish of a particular department to run its specialized applications on dedicated servers. Thanks to virtualization, of course, it gets that opportunity. And now, dear readers (IT staff, developers, and testers), estimate at least roughly how many virtual machines you have spun up over, say, the past six months. And how many of them were forgotten and abandoned after a successful release?

It turns out that the deeper virtualization penetrates the work environment and the more production problems it helps solve, the more “virtual waste” accumulates there (as in almost any production process). This is what leads to VM sprawl.

The machines are virtual, but the money is real

These “imaginary” machines keep consuming resources: even when inactive they occupy space on the storage system, and some of them still take CPU time, while important applications may be short of exactly those resources. In the end the company may well have to spend a serious amount on additional storage. For a data center of 1000 virtual machines, the possible financial losses from VM sprawl are estimated at several tens of thousands of dollars per year (see the VMware article under Additional links).

In addition, “virtual garbage” is a threat to information security, since forgotten VMs can drop out of regular maintenance processes: patch installation, anti-virus database updates, group policy changes, and so on.

Is there any way to deal with VM sprawl?

The answer is yes. For example, Veeam Availability Suite, and in particular its Veeam ONE component, includes more than 80 reports on VMware and Hyper-V infrastructures, as well as on the Veeam backup infrastructure. Among them are reports that reveal the ~~hidden threat~~ signs of uncontrolled growth in the number of virtual machines and give recommendations for effective planning and allocation of resources. These are the reports discussed below.

“Everything disappeared somewhere, nothing was left ...”

To assess the current situation, we use the Veeam Availability Suite features for planning and forecasting the use of storage, memory, and CPU resources.

Let's look at the Capacity Planning dashboard. A shortage of disk space (or memory, or both) gives us something to think about: do we need to request additional storage right now, or can we still hold out for a while?
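As a rough illustration of what a capacity forecast does, the sketch below fits a straight line to daily used-space samples and estimates how many days are left before a datastore fills up. This is a simplified assumption made for illustration, not the algorithm Veeam ONE actually uses; the function name and the sample figures are hypothetical.

```python
from typing import Optional, Sequence


def days_until_full(used_gb: Sequence[float], capacity_gb: float) -> Optional[float]:
    """Estimate days until capacity is reached from daily usage samples.

    Fits a least-squares line to the samples (one per day) and extrapolates
    from the latest sample. Returns None if usage is flat or shrinking.
    """
    n = len(used_gb)
    if n < 2:
        raise ValueError("need at least two daily samples")
    mean_x = (n - 1) / 2
    mean_y = sum(used_gb) / n
    # Least-squares slope: growth in GB per day.
    num = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(used_gb))
    den = sum((x - mean_x) ** 2 for x in range(n))
    slope = num / den
    if slope <= 0:
        return None
    return max(0.0, (capacity_gb - used_gb[-1]) / slope)


# Hypothetical datastore: 1000 GB total, growing by 10 GB per day.
forecast = days_until_full([700, 710, 720, 730], capacity_gb=1000)
```

A real forecast would also have to deal with seasonality and sudden jumps; a linear trend is only the simplest possible model.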

"Calm, only calm!"

Let’s ~~not get depressed and~~ see how, instead of lobbying for additional funding, we can help the company save money by following 5 simple steps to rationalize the use of existing resources.

Step 1. Track down the “zombies”

We call them that for brevity: although they do not “devour brains,” they do ~~drive you out of your mind~~ consume disk space and other vital infrastructure resources. These are virtual machines that are not used at all, or hardly at all, and exist for no clear reason, unlike the useful and in-demand ones.

a) Run the Idle VMs report to get a list of such virtual machines, then decide what to do with each: power it off, reduce the resources allocated to it, or hand it over for other tasks. Before running the report, do not forget to set the parameters:
- the time period the data should cover
- the values to treat as resource usage thresholds (CPU, memory, disk, network)
- how much of the time (as a percentage of the selected interval) a machine must spend idle to appear in the report
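The three parameters above can be sketched in code to show how they interact. This is only an illustration under our own assumptions: the `Sample` structure, the threshold values, and the rule that a sample is idle only when every metric is below its threshold are hypothetical, not taken from the report itself.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Sample:
    cpu_pct: float    # CPU usage, percent
    mem_pct: float    # memory usage, percent
    disk_kbps: float  # disk throughput
    net_kbps: float   # network throughput


# Hypothetical thresholds: a sample is idle only if every metric is below its limit.
THRESHOLDS = Sample(cpu_pct=5.0, mem_pct=10.0, disk_kbps=20.0, net_kbps=1.0)


def is_idle_sample(s: Sample, t: Sample = THRESHOLDS) -> bool:
    return (s.cpu_pct < t.cpu_pct and s.mem_pct < t.mem_pct
            and s.disk_kbps < t.disk_kbps and s.net_kbps < t.net_kbps)


def idle_vms(samples_by_vm: Dict[str, List[Sample]],
             min_idle_pct: float = 90.0) -> List[str]:
    """Return names of VMs that were idle in at least min_idle_pct of samples."""
    result = []
    for name, samples in samples_by_vm.items():
        if not samples:
            continue  # no data for the selected period
        idle_count = sum(is_idle_sample(s) for s in samples)
        if 100.0 * idle_count / len(samples) >= min_idle_pct:
            result.append(name)
    return sorted(result)
```

The samples passed in are assumed to cover the chosen reporting period already, so the period parameter becomes a matter of which samples are collected.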

b) To find unused virtual machine templates, use the Idle Templates report, with the time of last template use as the report parameter.

The output is a list of “ownerless” objects with their size and location; these templates can be deleted or migrated to a datastore with more free space.

c) With the Inefficient Datastore Usage report we examine the “zombies” (barely living virtual machines): it shows at once where such machines are located, when they were last used, and how much space they occupy.

Step 2. Find redundant backups

What if the Capacity Planning for Backup Repository report shows that the backup repository is running out of space?

We recommend checking whether any machines are included in several backup jobs at once. To do this, run the VMs Backed Up by Multiple Jobs report and see which jobs back up such machines and where the backups are stored.
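The check itself is simple to picture: invert the job-to-VM mapping and keep the machines that belong to more than one job. The sketch below assumes the job contents are already available as a plain dictionary; the job and VM names are hypothetical, and nothing here calls a Veeam API.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def vms_in_multiple_jobs(jobs: Dict[str, Iterable[str]]) -> Dict[str, List[str]]:
    """Map each VM found in more than one job to the sorted list of those jobs."""
    jobs_by_vm = defaultdict(set)
    for job_name, vm_names in jobs.items():
        for vm in vm_names:
            jobs_by_vm[vm].add(job_name)
    return {vm: sorted(names)
            for vm, names in sorted(jobs_by_vm.items())
            if len(names) > 1}


# Hypothetical job definitions.
duplicates = vms_in_multiple_jobs({
    "Nightly-All": ["web-01", "db-01", "test-03"],
    "Weekly-DB": ["db-01"],
    "RnD-Project": ["test-03", "build-07"],
})
```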

Step 3. Clean out the “garbage”

Accumulating “garbage” is a side effect of the life of a virtual infrastructure, that is, of the many changes made to it every day. Temporary virtual machine files and configuration files can remain on the storage system even after their parent objects have been deleted, which wastes additional disk space.

The Garbage Files report helps here: it identifies which objects are no longer in use and where the corresponding “waste” files are located.
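Conceptually, finding such files amounts to comparing what lies on a datastore with what the registered virtual machines actually reference. The following sketch shows that comparison on plain data; the input format, paths, and function name are assumptions for illustration and say nothing about how the report is implemented.

```python
from typing import Dict, Iterable, List, Tuple


def find_garbage_files(datastore_files: Dict[str, int],
                       referenced: Iterable[str]) -> List[Tuple[str, int]]:
    """Return (path, size_bytes) for files no registered VM refers to.

    Results are sorted largest first, then by path, so the biggest
    space savings appear at the top.
    """
    in_use = set(referenced)
    orphans = [(path, size) for path, size in datastore_files.items()
               if path not in in_use]
    return sorted(orphans, key=lambda item: (-item[1], item[0]))
```

Before deleting anything from such a list, it is worth confirming by hand that a file really is unused, since an inventory snapshot can be stale.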

Step 4. Apply categorization

Another useful feature of Veeam Availability Suite is the ability to group infrastructure objects by business criteria. What is it, and how does it help against uncontrolled growth in the number of machines? It is very simple: for each virtual machine we record its “organizational data”: which department uses it, for which project, in what role, and so on. When needed, we select the relevant category in the view and perform a bulk operation.

Say the R&D department used a number of virtual machines for the U Temp project. Once the project is complete, we go through the list of those machines and delete the ones no longer needed.
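The idea of selecting machines by “organizational data” can be shown with a small sketch. The tag names (`department`, `project`) and the VM names are hypothetical; in the product this is done through the categorization view, not through code.

```python
from typing import Dict, List


def select_by_category(vm_tags: Dict[str, Dict[str, str]],
                       **criteria: str) -> List[str]:
    """Return names of VMs whose tags match every given key=value criterion."""
    return sorted(
        name for name, tags in vm_tags.items()
        if all(tags.get(key) == value for key, value in criteria.items())
    )


# Hypothetical inventory with business tags.
inventory = {
    "build-07": {"department": "R&D", "project": "U Temp"},
    "test-03": {"department": "R&D", "project": "U Temp"},
    "web-01": {"department": "Marketing", "project": "Site"},
    "lab-02": {"department": "R&D", "project": "Other"},
}
candidates = select_by_category(inventory, department="R&D", project="U Temp")
```

The resulting list is the set of candidates to review once the project is closed, not a list to delete blindly.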

Step 5. Find unnecessary snapshots

What do snapshots have to do with saving resources and redundant virtual machines? In fact, an excessive number of snapshots also hurts the infrastructure, so we recommend including this step in the fight against VM sprawl.

Consider a snapshot that “falls out” of a virtual machine's snapshot chain. This can happen when a host crashes, when snapshot consolidation fails, or when a backup does not complete correctly. Such an orphaned snapshot still occupies disk space.

By the way, VMware recommends keeping an eye on the length of a virtual machine's snapshot chain: it should contain no more than 3 snapshots, and no snapshot should be kept for more than 3 days.
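The two limits mentioned above (at most 3 snapshots per chain, each kept no longer than 3 days) are easy to express as a check. The sketch below assumes we already have the creation times of each machine's snapshots; the data structure and names are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Dict, List

MAX_CHAIN_LENGTH = 3                  # snapshots per VM
MAX_SNAPSHOT_AGE = timedelta(days=3)  # maximum time a snapshot is kept


def snapshot_violations(snapshots_by_vm: Dict[str, List[datetime]],
                        now: datetime) -> Dict[str, List[str]]:
    """Report VMs whose snapshot chain is too long or has snapshots that are too old."""
    report = {}
    for vm, created in snapshots_by_vm.items():
        problems = []
        if len(created) > MAX_CHAIN_LENGTH:
            problems.append(f"chain length {len(created)} > {MAX_CHAIN_LENGTH}")
        old = sum(1 for c in created if now - c > MAX_SNAPSHOT_AGE)
        if old:
            problems.append(f"{old} snapshot(s) older than {MAX_SNAPSHOT_AGE.days} days")
        if problems:
            report[vm] = problems
    return report
```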

Let's build our own Custom Infrastructure report, specifying the virtual machine and virtual disk as the object types and the snapshot-related properties as the columns of interest. To do this, in the Select Columns dialog, select Name, VMDK file, Virtual Disk: Label, Snapshot: File name, Snapshot: File size.

Then set a Custom Filter with the expression: VMDK file - Contains - 0000.

The output is a list of “ownerless” snapshots found in our VMware infrastructure.
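The filter works because VMware names snapshot delta disks with a numeric suffix, for example `web-01-000001.vmdk`, so a VMDK file name containing `0000` indicates that the machine is running on a snapshot disk. Applied to exported report rows, the same filter could be sketched as follows; the row format (a dictionary keyed by column name) is our assumption.

```python
from typing import Dict, List

Row = Dict[str, str]


def running_on_delta_disk(rows: List[Row], marker: str = "0000") -> List[Row]:
    """Keep report rows whose 'VMDK file' value contains the marker substring."""
    return [row for row in rows if marker in row.get("VMDK file", "")]


# Hypothetical rows using the column names selected in the report.
rows = [
    {"Name": "web-01", "VMDK file": "[ds1] web-01/web-01.vmdk"},
    {"Name": "db-01", "VMDK file": "[ds1] db-01/db-01-000002.vmdk"},
]
suspects = running_on_delta_disk(rows)
```

A plain substring match can also catch a base disk that merely has `0000` in its name, so the hits are candidates to verify rather than a final verdict.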

For more detailed instructions on generating such a report, see the Veeam Support Knowledge Base: KB1757: Using Veeam ONE Reporter to Detect Orphaned Snapshots in VMware.

It is also useful to monitor the age of snapshots, using the Active Snapshots report. It shows which snapshots are the largest and which are the oldest (you are unlikely to ever need to roll a virtual machine back to such an old state).

In conclusion: a useful tip

To make fighting VM sprawl easier, I advise creating a dedicated folder for the reports mentioned above in Veeam ONE Reporter, naming it, say, VM Sprawl Control, and putting the whole “magnificent seven” into it. The first time you generate each report, remember to set the required parameters and threshold values (where they exist). After that you can schedule automatic generation of the entire report folder with delivery by email (see the first figure at the beginning of this article).

Additional links:

Top 7 VMware Management Challenges (article)
Overview of all Veeam Availability Suite monitoring and reporting features
A rough estimate of the cost of VM sprawl: VMware's Controlling Virtual Machine Sprawl
Original post in English
