24 Hour Data Saves the Night with Recent RAID Recovery

Business owners who provide important services can’t accept downtime, even if it means working through the night to rebuild a RAID array after a hard drive failure. Encartele, a Dallas-based Voice over IP provider for correctional and confinement facilities, relies on rock-solid technology to process hundreds of thousands of phone calls, keeping loved ones in touch across the miles.

When a RAID 5 array recently went bad, Encartele owner Scott Moreland faced a data recovery emergency at 10:30 at night. The company was performing system upgrades, so backups weren’t accessible.

“I couldn’t get the RAID re-built. I couldn’t access my critical data. I admit, I was freaking out a little bit,” Moreland said.

He searched the Yellow Pages for Dallas data recovery specialists. But nearly every data recovery company he called didn’t answer the phone.

“The few who answered quoted me astronomical rates for emergency data recovery at that time of night,” Moreland says.

Emergency RAID Recovery: 24 Hour Data Saves the Night

Finally, Moreland says, he dialed the right number: 1-866-598-DATA. The round-the-clock data recovery specialists at 24 Hour Data answered the phone and suggested Moreland drop off the drive.

“My immediate reaction was, ‘I like this guy!’” Moreland recalls.

Moreland liked the 24 Hour Data experts even better when they called him five hours later to report that they had recovered all the mission-critical data.

“I came back in the morning to pick up the re-built RAID array with all my data recovered and in place. It was that easy. Life was good,” Moreland says.

Living the Good Life with 24 Hour Data

Since then, Moreland says, his days of shopping for a data recovery service he can trust are over. “They have the most reasonable pricing I could find, and the service is top notch.”
He continues, “Data recovery is a highly specialized field, and we associate that with extremely high prices. But that doesn’t have to be the case.”

Describing 24 Hour Data’s RAID recovery rates as “fair” and “reasonable,” Moreland says he uses 24 Hour Data, and its partner firm, 24 Hour Computer, for all his high-level IT service. “I feel greater peace-of-mind knowing 24 Hour Data and 24 Hour Computer are there as a resource for my business. If anyone is looking for amazingly good service at a very fair price, I feel there’s no one better.”

24 Hour Data President Sean Wade says helping Dallas business owners like Moreland gives his job greater meaning. “Encartele provides an important communication service for correctional and confinement facilities. Inmates view Encartele’s phone and video calling services as their lifeline to loved ones. We’re proud to assist Encartele with RAID data recovery and repair services to help keep those communication lines open.”

Related Link: Best Data Recovery Company: 24 Hour Data

About 24 Hour Data

With more than 15 years of experience in the data recovery industry, 24 Hour Data has unmatched success rates in data recovery for all forms of storage media, including flash data recovery, SSD data recovery (solid-state drive data recovery), hard drive data recovery, Mac recovery and more. Looking for a data recovery service you can trust to recover your lost data? Call the data recovery experts at 24 Hour Data.


RAID Failures & Recovery

Correlated failures
The theory behind the error correction in RAID assumes that failures of drives are independent. Given this assumption, it is possible to calculate how often drives can fail and to arrange the array to make data loss arbitrarily improbable.

In practice, the drives are often the same age, with similar wear, and subject to the same environment. Since many drive failures are due to mechanical issues, which are more likely on older drives, this violates the independence assumption, and failures are in fact statistically correlated. The chance of a second failure before the first has been recovered is therefore not nearly as small as might be supposed, and data loss can occur at significant rates.
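As a rough sketch of the independence calculation described above, the Python snippet below estimates the chance that at least one surviving drive fails during a rebuild window. The function name, the 3% annual failure rate, and the 24-hour rebuild time are illustrative assumptions, not figures from any study; because real failures are correlated, a real array is likely to do worse than this estimate.

```python
import math

HOURS_PER_YEAR = 24 * 365


def p_second_failure(remaining_drives, annual_failure_rate, rebuild_hours):
    """Chance that at least one remaining drive fails during the rebuild.

    Assumes independent drives with a constant failure rate, which is
    exactly the assumption that correlated failures break.
    """
    # Convert the annual failure probability into a per-hour rate.
    rate_per_hour = -math.log(1.0 - annual_failure_rate) / HOURS_PER_YEAR
    # Probability that a single drive survives the whole rebuild window.
    p_one_survives = math.exp(-rate_per_hour * rebuild_hours)
    # All remaining drives must survive for the rebuild to finish safely.
    return 1.0 - p_one_survives ** remaining_drives


# Hypothetical array: 7 surviving drives, 3% annual failure rate, 24 h rebuild.
print(f"{p_second_failure(7, 0.03, 24):.5%}")
```

Raising the rate passed in for drives of the same age and batch is one simple way to see how quickly the independent-failure estimate becomes optimistic.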

A common misconception is that “server-grade” drives fail less frequently than consumer-grade drives. Two independent studies, one by Carnegie Mellon University and the other by Google, have shown that the “grade” of the drive does not relate to failure rates.

Atomicity
This is a little-understood and rarely mentioned failure mode for redundant storage systems that do not utilize transactional features. Database researcher Jim Gray wrote “Update in Place is a Poison Apple”[28] during the early days of relational database commercialization. However, this warning largely went unheeded and fell by the wayside upon the advent of RAID, which many software engineers mistook as solving all data storage integrity and reliability problems. Many software programs update a storage object “in-place”; that is, they write a new version of the object onto the same disk addresses as the old version of the object. While the software may also log some delta information elsewhere, it expects the storage to present “atomic write semantics,” meaning that the write of the data either occurred in its entirety or did not occur at all.

However, very few storage systems provide support for atomic writes, and even fewer specify their rate of failure in providing this semantic. Note that during the act of writing an object, a RAID storage device will usually be writing all redundant copies of the object in parallel, although overlapped or staggered writes are more common when a single RAID processor is responsible for multiple drives. Hence an error that occurs during the process of writing may leave the redundant copies in different states, and furthermore may leave the copies in neither the old nor the new state. The little-known failure mode is that delta logging relies on the original data being in either the old or the new state so as to enable backing out the logical change, yet few storage systems provide an atomic write semantic on a RAID disk.

While the battery-backed write cache may partially solve the problem, it is applicable only to a power failure scenario.

Since transactional support is not universally present in hardware RAID, many operating systems include it to protect against data loss during an interrupted write. Novell NetWare, starting with version 3.x, included a transaction tracking system. Microsoft introduced transaction tracking via the journaling feature in NTFS. ext4 has journaling with checksums; ext3 has journaling without checksums but offers an “append-only” option, and ext3cow adds copy-on-write. If the journal itself is corrupted, though, this can be problematic. The NetApp WAFL file system provides atomicity by never updating data in place, as does ZFS. An alternative to journaling is soft updates, which are used in some BSD-derived systems’ implementations of UFS.
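The idea of never updating data in place can be imitated at the application level. The sketch below is only an analogue, not how WAFL, ZFS, or any RAID controller works internally: it writes the new version to a temporary file and then swaps it in with a single rename, so a reader sees either the old file or the new one. The function name `atomic_write` is hypothetical, and the guarantee depends on the operating system and file system honouring `os.replace` and `os.fsync` as documented.

```python
import os
import tempfile


def atomic_write(path, data: bytes):
    """Replace the file at `path` without ever overwriting it in place."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write the new version next to the target, on the same file system.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # ask the OS to push the data to the device
        # Swap the new version in; the old file stays intact until this call.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temporary file if anything failed before the swap.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

If the process dies before `os.replace`, the original file is untouched, which is the property that delta logging depends on.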

Unrecoverable data can present as a sector read failure. Some RAID implementations protect against this failure mode by remapping the bad sector, using the redundant data to retrieve a good copy of the data, and rewriting that good data to the newly mapped replacement sector. The UBE (Unrecoverable Bit Error) rate is typically specified at 1 bit in 10^15 for enterprise-class disk drives (SCSI, FC, SAS), and 1 bit in 10^14 for desktop-class disk drives (IDE/ATA/PATA, SATA). Increasing disk capacities and large RAID 5 redundancy groups have made it increasingly difficult to rebuild a RAID group after a disk failure, because an unrecoverable sector is found on the remaining drives. Double protection schemes such as RAID 6 attempt to address this issue, but suffer from a very high write penalty.
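To see why larger disks make rebuilds riskier, the following Python sketch estimates the chance of hitting at least one unrecoverable bit while reading every surviving drive in full. It assumes each bit fails independently at the specified rate, which is a simplification, and the array size in the example (four surviving 4 TB drives) is hypothetical.

```python
import math


def p_rebuild_hits_ure(surviving_drives, drive_bytes, bit_error_rate):
    """Chance of at least one unrecoverable bit error during a full rebuild.

    Assumes independent bit errors at a constant rate (a simplification).
    """
    bits_read = surviving_drives * drive_bytes * 8
    # 1 - (1 - rate)^bits, computed with log1p/expm1 to keep precision
    # when the rate is tiny and the bit count is huge.
    return -math.expm1(bits_read * math.log1p(-bit_error_rate))


four_tb = 4 * 10**12  # hypothetical drive size in bytes
print("desktop class   :", p_rebuild_hits_ure(4, four_tb, 1e-14))
print("enterprise class:", p_rebuild_hits_ure(4, four_tb, 1e-15))
```

The estimate grows with both the number of surviving drives and their capacity, which matches the trend described for large RAID 5 groups.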

Write cache reliability
The disk system can acknowledge the write operation as soon as the data is in the cache, not waiting for the data to be physically written. This typically occurs in old, non-journaled systems such as FAT32, or if the Linux/Unix “writeback” option is chosen without any protections like the “soft updates” option (to promote I/O speed while trading away data reliability). A power outage or system hang such as a BSOD can mean a significant loss of any data queued in such a cache.

Often a battery protects the write cache, mostly solving the problem. If a write fails because of a power failure, the controller may complete the pending writes as soon as it is restarted. This solution still has potential failure cases: the battery may have worn out, the power may be off for too long, the disks could be moved to another controller, or the controller itself could fail. Some disk systems provide the capability of testing the battery periodically; however, this leaves the system without a fully charged battery for several hours.

An additional concern about write cache reliability exists, specifically regarding devices equipped with a write-back cache—a caching system which reports the data as written as soon as it is written to cache, as opposed to the non-volatile medium. The safer cache technique is write-through, which reports transactions as written when they are written to the non-volatile medium.
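The difference between the two cache policies can be shown with a toy model. The `ToyDisk` class below is purely illustrative and not based on any real controller: it acknowledges every write, but only the write-through variant has the data on the non-volatile medium at the moment of acknowledgement.

```python
class ToyDisk:
    """Toy model of a disk with a volatile cache and a non-volatile medium."""

    def __init__(self, write_through):
        self.write_through = write_through
        self.cache = {}   # volatile: lost on power failure
        self.medium = {}  # non-volatile: survives power failure

    def write(self, block, data):
        if self.write_through:
            self.medium[block] = data  # on the medium before we acknowledge
        else:
            self.cache[block] = data   # write-back: acknowledged from cache
        return True                    # the caller sees "written" either way

    def flush(self):
        """Move cached writes to the medium (what a healthy system does later)."""
        self.medium.update(self.cache)
        self.cache.clear()

    def power_loss(self):
        """Without a working battery, cached writes simply vanish."""
        self.cache.clear()


write_back = ToyDisk(write_through=False)
write_back.write(7, b"payroll")
write_back.power_loss()
print(7 in write_back.medium)  # the acknowledged write never reached the medium
```

A battery-backed cache corresponds to calling `flush()` after power returns instead of losing the cache contents.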

Equipment compatibility
The methods used to store data by various RAID controllers are not necessarily compatible, so it may not be possible to read a RAID array on different hardware. The exception is RAID 1, which is typically represented as plain identical copies of the original data on each disk. Consequently, a non-disk hardware failure may require identical hardware to recover the data, and an identical configuration has to be reassembled without triggering a rebuild and overwriting the data. Software RAID, however, such as that implemented in the Linux kernel, alleviates this concern: the setup is not hardware dependent, runs on ordinary disk controllers, and allows the array to be reassembled. Additionally, individual RAID 1 disks (software, and most hardware implementations) can be read like normal disks when removed from the array, so no RAID system is required to retrieve the data. Inexperienced data recovery firms typically have a difficult time recovering data from RAID drives, with the exception of RAID 1 drives with a conventional data structure.

Data recovery in the event of a failed array
With larger disk capacities the odds of a disk failure during rebuild are not negligible. In that event the difficulty of extracting data from a failed array must be considered. Only RAID 1 stores all data on each disk. Although it may depend on the controller, some RAID 1 disks can be read as a single conventional disk. This means a dropped RAID 1 disk, although damaged, can often be reasonably easily recovered using a software recovery program. If the damage is more severe, data can often be recovered by professional data recovery specialists. RAID 5 and other striped or distributed arrays present much more formidable obstacles to data recovery in the event the array fails.


RAID Recovery – Don’t Increase the Level of Difficulty

More and more enthusiast users encounter destroyed RAID arrays. Generally, data recovery from such a RAID array is possible, but keep in mind that the effort increases disproportionately. First of all, data has to be copied from each RAID drive onto a server, and the data set has to be put back together. The distribution of data in small blocks across two or more drives makes RAID 0 the worst possible type to recover. Increasing performance doesn’t necessarily do your data any good here! If a drive is completely defective, only small files that ended up entirely on one of the remaining RAID drives despite the stripe set (files no larger than the stripe size, for example 64 kB) can be recovered. RAID 5 offers parity data, which can be used for recovery as well.

RAID data configuration is almost always proprietary, since all RAID manufacturers set up the internals of their arrays in different ways. They do not disclose this information, so recovering from a RAID array failure requires years of experience. Where does one find the parity bits of a RAID 5, before or after the payload? Will the arrangement of data and parity stay the same or will it cycle? This knowledge is what you are paying for.
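To make the questions about parity placement concrete, here is a Python sketch of one rotating arrangement (a convention often called left-symmetric). It is only an illustration of how parity can cycle from stripe to stripe; actual controllers may use a different rotation, a different data order, or a fixed parity position, and the function name is hypothetical.

```python
def raid5_layout(disks, stripes):
    """Return one row per stripe, showing what each disk holds.

    Parity starts on the last disk and moves back one disk per stripe;
    data blocks start on the disk after the parity disk and wrap around.
    """
    rows = []
    block = 0
    for stripe in range(stripes):
        parity_disk = (disks - 1) - (stripe % disks)
        row = [None] * disks
        row[parity_disk] = "P"
        for i in range(disks - 1):
            row[(parity_disk + 1 + i) % disks] = f"D{block}"
            block += 1
        rows.append(row)
    return rows


for row in raid5_layout(disks=4, stripes=4):
    print("  ".join(f"{cell:>3}" for cell in row))
```

A recovery specialist effectively has to work out which such mapping a given controller used before the stripes can be reassembled in the right order.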

Instead of accessing drives at the controller level, the recovery works at the file system level (most likely NTFS), since logical drives provide the basis for working on a RAID image. This allows the recovery specialist to piece the bits and bytes back together using special software. Checking known data formats is an important way to verify that a recovery is complete. Take a JPEG file, for example: will you be able to recognize the picture after recovery? Or will you be able to open Word.exe, which is found on almost every office system? The selected file should be as large as possible, so that it was distributed across all drives and you can be sure that its recovery was successful.
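A first automated check on a recovered JPEG can be sketched in a few lines of Python. JPEG files begin with the bytes FF D8 FF and normally end with FF D9, so a file that fails this test was almost certainly reassembled incorrectly. The function name is hypothetical, and passing the test does not prove that the stripes in the middle of the file are in the right order; opening the image is still the real check.

```python
JPEG_START = b"\xff\xd8\xff"  # start-of-image marker plus the next marker prefix
JPEG_END = b"\xff\xd9"        # end-of-image marker


def looks_like_intact_jpeg(data: bytes) -> bool:
    """Cheap sanity check on the first and last bytes of a recovered file."""
    return data.startswith(JPEG_START) and data.endswith(JPEG_END)


# Hypothetical recovered file contents, shortened for illustration.
sample = JPEG_START + b"\xe0" + bytes(64) + JPEG_END
print(looks_like_intact_jpeg(sample))
```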

Two dead hard drives in a RAID 5 are more likely to be restored than two single platters, since RAID still provides parity data.


RAID 3 Data Recovery

This level uses byte-level striping with dedicated parity. In other words, data is striped across the array at the byte level, with one dedicated parity drive holding the redundancy information. The idea behind this level is that striping the data increases performance and dedicated parity takes care of redundancy. At least three hard drives are required: two for striping and one as the dedicated parity drive. Although the performance is good, the added parity does slow down writes, because the parity information has to be written to the parity drive whenever a write occurs. This increased computation calls for a hardware controller, so software implementations are not practical. RAID 3 is good for applications that deal with large files, since the stripe size is small. Because this level is so rare, we have not developed a dedicated recovery procedure for it. Recovery is still possible by finding the parity disk using the image compression technique, then removing it and treating the RAID as a stripe.
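Byte-level striping with a dedicated parity drive can be sketched in Python as follows. This is a minimal model under simplifying assumptions (two data drives by default, in-memory byte strings instead of disks, zero padding for uneven lengths), and all function names are hypothetical.

```python
from functools import reduce
from operator import xor


def raid3_write(data: bytes, data_drives: int = 2):
    """Stripe bytes round-robin across the data drives and compute parity."""
    padded = data + bytes((-len(data)) % data_drives)  # pad to a full stripe
    drives = [padded[i::data_drives] for i in range(data_drives)]
    parity = bytes(reduce(xor, column) for column in zip(*drives))
    return drives, parity


def raid3_read(drives, length):
    """Ignore the parity drive and read the data drives as a plain stripe."""
    out = bytearray()
    for column in zip(*drives):
        out.extend(column)
    return bytes(out[:length])


def raid3_rebuild(surviving_drives, parity):
    """Recreate one lost data drive from the survivors and the parity drive."""
    return bytes(reduce(xor, column) for column in zip(parity, *surviving_drives))


message = b"RAID 3 demo!"
drives, parity = raid3_write(message)
print(raid3_read(drives, len(message)))
```

`raid3_read` mirrors the recovery approach described above: once the parity drive has been identified and set aside, the remaining drives are treated as an ordinary stripe.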


RAID Introduction & Recovery

Overview of RAID
The heart of the RAID storage system is the controller card. This card is usually a SCSI hard disk controller card (however, IDE RAID controller cards are becoming quite common). The task of the controller card is to:
1. Manage Individual Hard Disk Drives
2. Provide a Logical Array Configuration
3. Perform Redundant or Fault Tolerant Operations

RAID History
RAID is an acronym for Redundant Array of Inexpensive Disks. The concept was conceived at the University of California, Berkeley, and IBM holds the patent on RAID level 5. The Berkeley researchers David A. Patterson, Garth Gibson, and Randy H. Katz worked to produce working prototypes of five levels of RAID storage systems. The result of this research has formed the basis of today’s complex RAID storage systems.

Individual Drives Management
The RAID controller will translate and communicate directly with the hard disk drives. Some controller cards have additional utilities to work with the disk drives specifically, such as a surface scan function and a drive format utility. In the case of SCSI based cards, these controllers will provide additional options to manage the drives.

Logical Array Configuration
The configuration of the logical array stripes the data across all of the physical drives. This provides balanced data throughput to all of the drives—instead of making one drive do all the work of reading and writing data, now all of them are working together and the data is streaming across all of the physical drives.

Redundant or Fault Tolerant Operations
The redundancy in a common RAID 5 configuration is the result of using a Boolean mathematical function called Exclusive OR (XOR). This is commonly referred to as Parity. The XOR function is a logical binary process; it’s best to think of Parity as a combination of the other drives’ data blocks. Every byte that gets written to one data block is calculated against the other data blocks, and the resultant Parity is written to the Parity block for that particular stripe. What makes this function so useful is that the math will always work regardless of which data block is missing. However, the limitation of RAID 5 is that only one block per stripe can be missing; the math will not work if two blocks are missing. In the working environment this means that only one drive can fail. The RAID 5 configuration will not provide proper redundancy if two or more drives fail.
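The XOR arithmetic is easy to try out. The Python sketch below builds the Parity block for one stripe of three made-up data blocks and then recovers a missing block from the survivors; the block contents and names are illustrative only.

```python
def xor_blocks(blocks):
    """XOR equally sized blocks together, byte by byte."""
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            result[i] ^= byte
    return bytes(result)


# One stripe with three data blocks (hypothetical contents).
data_blocks = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data_blocks)

# Simulate losing the drive that held the second block.
survivors = [data_blocks[0], data_blocks[2], parity]
recovered = xor_blocks(survivors)
print(recovered)
```

With two blocks missing, the survivors only yield the XOR of the two lost blocks, not either block itself, which is why a second drive failure defeats RAID 5.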

As previously mentioned, the controller card stripes the data and also performs the XOR function on it; the number of logical computations the controller does every second is staggering. Today’s RAID controllers are intricate pieces of hardware, including specially designed processors and SDRAM memory banks to provide performance and redundancy.

RAID Introduction
Storage systems preserve data that has been processed and data that is queued up to be processed, and they have become an integral part of the computer system. Storage systems have advanced just as other computer components have over the years. The RAID storage system was introduced over 15 years ago and has provided an excellent mass storage solution for enterprise systems. Let’s get a little more history about the RAID concept and how it works.

Common RAID Configurations —the pictures below graphically show how RAID Arrays are put together (this is handled by the RAID configuration.) Follow the letters to see how the data stripes jump between drives.

RAID Recovery
RAID storage systems are designed to deal with failure. While hardware failure is a strong reason why some RAIDs may fail, there can also be other failures that make the data inaccessible. If your client is having problems with their RAID Array, then Ontrack Data Recovery is your solution.

A RAID recovery evaluation is really the combination of two very important steps. The first is rebuilding the array, which has the potential to take the most time. This investment of time is required to determine the original configuration and get a quality recovery. The second step is to work on the logical file system. Today’s enterprise journaling file systems are highly complex; if the RAID array is out of order, there will be thousands of errors within the file system and files will be corrupted.

Some of the design goals of the RAID storage system were to provide performance improvements, storage reliability and recovery, and scalability. The redundancy concept employed in the RAID system is unique and provides a method to recover if one drive should fail within the system. In fact, today’s RAID controller cards have the ability to continue reading and writing data even if one drive is ‘off-line.’
