What 200,000 corrupted files tell us: backup strategies can be improved

This article is intended for IT managers and system administrators responsible for the development and implementation of backup and data protection strategies. The article discusses typical problems associated with data corruption, the disadvantages of traditional methods for solving these problems, and ways to improve existing strategies to further minimize losses due to failures.

The article is based on unique statistics collected from 200 thousand damaged files submitted for repair to the OfficeRecovery Online system.

The problem and the reasons for its occurrence


One of the most important tasks in planning and implementing an information infrastructure is keeping data safe. Corruption or loss of accumulated information can cause significant damage to the business. Data protection should therefore be diverse and multi-level, guarding against as many data loss scenarios as possible.

To assess the main methods of keeping data safe, let us first look at the main causes of data damage:
  1. Hardware failure. Data is lost because the physical media fails. With this kind of damage, arbitrary parts of files are replaced with meaningless data. In severe cases, the damage extends beyond individual files and affects the file system as a whole, which can make it hard even to find the files, not just to read them.
  2. Software crash. Data is lost after an error in the application processing it, for example while saving changes to a file. Typical causes are insufficient memory, an application bug, or an operating system malfunction. In this case the data in the file may be incomplete, but it can usually be recovered well.
  3. Human factor. For example, important data is lost because files are deleted by mistake. Modern tools for recovering deleted data from disk use specialized algorithms, but they do not always give the desired result, and some parts of the recovered files may be overwritten with arbitrary garbage from the disk.

The main way to deal with the consequences of these causes of data corruption is backup, and in large organizations, so-called Disaster Recovery Planning (hereinafter DR strategy and DR planning).

Backup and DR planning as a solution to the problem of corrupted data


It is interesting to note that while DR planning has been a hot topic in the West for many years, the term does not exist in the Russian IT vocabulary; at any rate, the corresponding Wikipedia article has no Russian counterpart.

The difference between a backup strategy and DR planning is that the latter is a comprehensive set of technologies and procedures that answers all the questions surrounding the restoration of business IT infrastructure after catastrophic events. Whereas the task of backup is to return a complete set of data to users, the task of DR planning is to return a functioning IT infrastructure to operation, which is often equivalent to returning the entire business to operation.

Backups and fault-tolerant storage (for example, RAID disk arrays) are thus only technical methods used in developing a DR strategy.

The ultimate goal of DR planning is to completely eliminate situations in which you have to deal with corrupted files and databases. Any solution provider in this area will confidently tell you that after a force majeure event your data will come back at the click of a few buttons.

Unfortunately, this is not entirely true. In all honesty, corrupted files still appear, even in organizations that have invested billions of dollars in data protection.

The main reasons for this are as follows:
  1. Improper use of backup systems and poor adherence to established practices. The likelihood that the data protection solution you deployed still works three days after implementation is 99%, and most likely even higher. But time passes, storage fills up, employees come and go. Two years later, it may turn out that the solution stopped working long ago and nobody noticed.
  2. The inevitable existence of zones not covered by backup systems. Does one of your employees have a habit of editing a confidential, business-critical document directly on a USB flash drive? One untimely removal of the drive during editing, and you have another damaged file.
  3. Backup frequency. The typical interval is 24 hours, but it can be 72 hours or 12 hours, depending on how many resources you can allocate to storing backups. The problem is that when force majeure strikes, you are guaranteed complete data only as of 24 hours (or 72 hours, or 12 hours) ago. Nobody promises to restore the data accumulated since the last backup, and that is the newest and often the most valuable data.
  4. The susceptibility of backup systems to the same failures that damage data on production servers. This is, of course, a shortcoming of DR planning, but it often happens that the backup RAID gets flooded even slightly earlier than the production server it protects, because the two sit side by side.
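
Items 1 and 3 in the list above lend themselves to a simple automated check. The following is a minimal sketch in Python, not part of any OfficeRecovery product: the function names, the one-hour grace period, and the dates are all hypothetical and chosen only for illustration. It shows the worst-case window of unprotected data for a given backup interval, and a test that flags a backup job that has silently stopped succeeding.

```python
from datetime import datetime, timedelta


def worst_case_loss_window(backup_interval: timedelta) -> timedelta:
    """Data written right after a backup stays unprotected until the next
    one, so the worst-case loss window equals the backup interval."""
    return backup_interval


def backup_is_stale(last_success: datetime, now: datetime,
                    backup_interval: timedelta,
                    grace: timedelta = timedelta(hours=1)) -> bool:
    """Return True when the last successful backup is older than the
    schedule allows (interval plus a hypothetical grace period)."""
    return now - last_success > backup_interval + grace


# Illustrative values only: a nightly schedule whose last success was days ago.
now = datetime(2012, 9, 1, 12, 0)
last_success = datetime(2012, 8, 29, 2, 0)
interval = timedelta(hours=24)

print(worst_case_loss_window(interval))
print(backup_is_stale(last_success, now, interval))
```

Running a check of this kind on a schedule, and alerting when it returns True, is one way to avoid discovering two years later that the backup solution stopped working long ago.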

What should you do if a file is damaged, the application that works with it refuses to open it, and the backup system cannot offer a copy containing the data you need? Is it possible to recover data from the damaged file itself? What is the probability of success? And what does this mean for improving DR planning in your organization?

OfficeRecovery Online: analysis of 200,000 corrupted file recoveries


In August 2011, OfficeRecovery launched a cloud service for online recovery of damaged files ( https://online.officerecovery.com/ru/ ). By September 2012, 200 thousand files had passed through the system, yielding statistics of considerable interest for DR planning in organizations looking for ways to make their business more resilient to technology-related force majeure.

It took about a month to process the collected data and identify typical causes of damage. Here are the recovery success rates for some popular file types:
  1. Corel WordPerfect files - 93.1% successfully recovered
  2. ZIP archives - 79.0%
  3. Microsoft Word Documents - 75.9%
  4. Microsoft Project Files - 66.2%
  5. Adobe Photoshop Images - 66.1%
  6. Microsoft Excel Spreadsheets - 63.2%
  7. Microsoft Access Databases - 55.4%
  8. Microsoft PowerPoint Presentations - 52.1%
  9. Graphic formats (pictures, photos) - 46.4%

Note: recovery is considered successful when at least part of the data can be retrieved from the file. Data loss is usually inevitable, but often even a small recovered fragment is of great value to customers.
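
For readers who want to compare the listed figures programmatically, here is a small Python sketch. The percentages are copied from the list above; the function names are hypothetical. Note one important assumption: the article does not give the number of files per type, so the unweighted mean computed here is only a rough summary of the listed formats and is not the overall success rate across all 200 thousand files.

```python
# Success rates (percent) copied from the list above.
RECOVERY_RATES = {
    "Corel WordPerfect files": 93.1,
    "ZIP archives": 79.0,
    "Microsoft Word documents": 75.9,
    "Microsoft Project files": 66.2,
    "Adobe Photoshop images": 66.1,
    "Microsoft Excel spreadsheets": 63.2,
    "Microsoft Access databases": 55.4,
    "Microsoft PowerPoint presentations": 52.1,
    "Graphic formats (pictures, photos)": 46.4,
}


def unweighted_mean(rates: dict) -> float:
    """Plain average of the listed rates; ignores how many files
    of each type were processed (those counts are not published)."""
    return sum(rates.values()) / len(rates)


def below_threshold(rates: dict, threshold: float) -> list:
    """Names of file types whose success rate is under the threshold."""
    return [name for name, rate in rates.items() if rate < threshold]


print(round(unweighted_mean(RECOVERY_RATES), 1))
print(below_threshold(RECOVERY_RATES, 50.0))
```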

Graphic formats are the hardest to recover. Image content is often stored in compressed form, and it is almost impossible to restore the part of the picture that follows the damaged area in acceptable quality. This mainly concerns the JPEG, TIFF and RAW formats.

The situation with office application formats is noticeably better. OfficeRecovery has been working with office application formats for over 14 years and has extensive experience in this area.

Microsoft Word files are considered easy to recover: even when the file is very seriously damaged, it is still possible to extract at least all the text stored in it, albeit without formatting. This is often the only way to help users.
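
The idea of salvaging plain text from a damaged file can be illustrated with a very rough Python sketch. This is not OfficeRecovery's method, which the article does not describe; it is a generic technique (similar in spirit to the Unix `strings` utility) that scans raw bytes for runs of printable ASCII characters. The function name and the minimum run length are assumptions, and the sketch ignores text stored in other encodings such as UTF-16, so real documents would need more work.

```python
import re
from typing import List


def salvage_text(data: bytes, min_len: int = 4) -> List[str]:
    """Return runs of at least `min_len` printable ASCII characters
    found anywhere in the raw bytes, in order of appearance."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.decode("ascii") for match in pattern.findall(data)]


# A made-up "damaged file": readable fragments surrounded by garbage bytes.
damaged = b"\x00\x01\xffQuarterly report\x00\x02\x03ab\x10Total: 42\xfe"

for fragment in salvage_text(damaged):
    print(fragment)
```

Short runs such as `ab` are dropped because they are more likely to be noise than text; the formatting and document structure are lost entirely, which matches the trade-off described above.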

The next easiest to recover is Microsoft Excel: if the internal structure of the file is seriously damaged and the set of sheets in the workbook cannot be read in full, it is still possible to extract the contents of all cells onto a single sheet.

On average, recovery was successful in more than half of the cases! In other words, OfficeRecovery Online returned to users full or partial content of 100 thousand files that were considered lost.

Recovering damaged files as part of a data security strategy


As this article shows, broken files a) are a common problem and b) can for the most part be “treated”, with varying degrees of success.

Conclusion: when developing your DR strategy, include from the start software products and procedures for recovering corrupted data resulting from failures in your IT infrastructure. Do not assume that introducing a backup system makes such a situation “impossible in principle”.

OfficeRecovery offers a range of products that complement traditional DR planning solutions with the ability to recover data that, for one reason or another, is beyond the scope of backup systems.

For “light” formats such as Word, Excel, PowerPoint, and dozens of others that form the basis of business document workflows, the OfficeRecovery Online service is well suited. With this service, any employee can repair a broken file without special skills, using only a browser. Demo results are available immediately after recovery, and results can even be obtained free of charge 2-4 weeks after the recovery date. For an additional fee, problem files can be analyzed and repaired by qualified specialists.

To recover large volumes of data (for example, damaged databases, virtual disk images, or Exchange mail databases), the main OfficeRecovery website (www.officerecovery.com) offers a set of traditional “offline” software products for recovering data from most common formats. These products are also recommended when online recovery is not possible for privacy reasons. Instead of uploading corrupted data to an online service, a client can buy the appropriate software product and restore the data without it leaving the company.
