
Solving five common data protection dilemmas

Find out about the five most common data protection problems and get tips for backing up less data in a shorter amount of time.

According to independent author Jon William Toigo, storage managers face two towers of pain: protection and provisioning. While these two towers may not match J.R.R. Tolkien's vision of hobbits racing against time to battle evil before their entire world collapses, they do represent what you may think is an equally difficult mission -- how to protect your data given all of the options available today.

"We've created a house of cards," Toigo said earlier this month in reference to the current state of networked storage complexity. "We're just waiting for the next thing to come along and topple it." Toigo made these comments and more to a room full of storage managers and storage staffers in Chicago, Ill., at his recent Storage Management 2003 expert workshop, "Business continuance for networked storage."

During the session, Toigo recommended taking some of the pain out of data protection by first ensuring you have "self-describing" data. This requires data to be marked first according to its value as it migrates across a server and then organized. "Without it, data protection is a tactical Band-Aid," Toigo said.

He identified the five most common problems in data protection planning and offered some advice for addressing each issue.

Problem #1: Backing up large quantities of data in a short backup window
In a 24x7 environment, there is little time to schedule a backup, and yet you may still find yourself backing up everything but the kitchen sink when you do run one. The solution is to reduce the amount of data you actually need to back up, Toigo said.

For instance, three days after data is written to hard disk, its utilization level (the number of times users actually return to that data) drops by 30%, and 30 days after it has been written to disk, utilization is down by 90%, according to Toigo. "Why not just restore what you need and shorten the backup window?"
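As a rough illustration of backing up only data that is still in use, the sketch below filters a file inventory by how recently each file was accessed. The function name, the inventory layout and the 30-day cutoff are assumptions made for this example, not something Toigo prescribed.

```python
from datetime import datetime, timedelta


def select_active_files(inventory, now, max_idle_days=30):
    """Return the paths last accessed within max_idle_days of `now`.

    inventory: dict mapping path -> datetime of last access
               (a hypothetical layout chosen for this sketch).
    """
    cutoff = now - timedelta(days=max_idle_days)
    return sorted(
        path for path, last_access in inventory.items()
        if last_access >= cutoff
    )


# Example inventory with made-up file names and dates.
now = datetime(2003, 4, 30)
inventory = {
    "orders.db": datetime(2003, 4, 29),
    "q1_report.doc": datetime(2003, 4, 10),
    "old_scan.tif": datetime(2003, 1, 15),
}
active = select_active_files(inventory, now)
```

Files that fall outside the cutoff would not be discarded; they would simply be candidates for a less frequent backup or for archiving.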

There are several IP network options to get the job done in a shorter amount of time, said Toigo. These include:

  • Pre-staging data at a remote site (in which you would run a full-volume backup once, then incremental backups)
  • Using a third-party data mover for tape
  • Running disk-to-disk with tape emulation (which Toigo considers much faster than disk-to-tape)
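The first option in the list, one full backup followed by incrementals, can be sketched as a comparison between the catalog from the previous run and the current state of the volume. This is a minimal sketch: the catalog format (path mapped to a modification timestamp) and the function name are assumptions, and real backup products track changes in their own ways.

```python
def files_to_back_up(previous, current):
    """Return paths that are new or modified since the previous backup.

    previous, current: dict mapping path -> modification timestamp.
    An empty `previous` means no backup exists yet, so every file is
    selected, which amounts to a full backup.
    """
    return sorted(
        path for path, mtime in current.items()
        if previous.get(path) != mtime
    )


# First run: nothing has been backed up, so everything is copied.
full = files_to_back_up({}, {"a.txt": 100, "b.txt": 100})

# Later run: only the changed file and the new file are copied.
incremental = files_to_back_up(
    {"a.txt": 100, "b.txt": 100},
    {"a.txt": 100, "b.txt": 250, "c.txt": 300},
)
```

After the initial full copy, each run moves only the changed files across the network, which is what shortens the backup window.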

He also recommended several fabric-based options for increasing speed and reducing the amount of data to be backed up:

  • Shared tape across a fabric
  • Disk-to-disk on a fabric with commonality factoring (which reduces the size of data by removing objects that are copies prior to backup)
  • Real or virtual disk-to-disk across a fabric
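Commonality factoring, as described in the list above, removes duplicate objects before backup. One common way to approximate the idea is to key each object by a hash of its contents and store each distinct payload once. The sketch below assumes whole-object comparison with SHA-256; the function name and data layout are hypothetical, and commercial products may factor data at a finer granularity.

```python
import hashlib


def factor_common_objects(objects):
    """Store each distinct payload once and index names to payloads.

    objects: dict mapping object name -> bytes.
    Returns (store, index):
      store maps SHA-256 hex digest -> bytes (one copy per distinct payload)
      index maps object name -> digest, so every object can be rebuilt.
    """
    store, index = {}, {}
    for name, data in objects.items():
        digest = hashlib.sha256(data).hexdigest()
        store.setdefault(digest, data)  # keep only the first copy
        index[name] = digest
    return store, index


# Made-up example: two of the three objects are identical copies.
objects = {
    "memo.doc": b"quarterly results",
    "memo_copy.doc": b"quarterly results",
    "plan.doc": b"recovery plan",
}
store, index = factor_common_objects(objects)
```

Only the contents of `store` would need to travel to the backup target, while `index` preserves enough information to restore every original object by name.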

Problem #2: Short time-to-data requirements and the (in)efficacy of tape-based restore
With mission-critical data, you often need to restore it in four hours or less, and you need it restored in a usable format, Toigo said. Tape restore is of little help here because it is considered too slow.

Potential solutions include data pre-staging at a recovery site with minimal tape restore requirements, and remote caching or mirroring with failover. When discussing mirroring, however, Toigo offered the following caveat: "Mirroring is not a panacea," he said. "When you screw up data, it's garbage in and garbage out." In other words, mirroring will copy all of the data whether it's correct or wrong, important or not.

Toigo also suggested performing replication in the fabric by running the replication software on a switch. The theory is that it's better to maintain software in one location (the switch) than on every host, which requires multiple licenses.

Problem #3: Cost of replicating storage infrastructure at a hot site
It costs a lot of money to replicate a heterogeneous, diverse storage system someplace else. However, you can consolidate what's in a production environment into a recovery environment if you plan for it, Toigo said. One possibility is to rehost on networked JBODs and use vendor-agnostic data replication software, although he added, "Tape is still the best vehicle for on-the-fly data rehosting."

To address your data protection requirements directly, Toigo emphasized that users need to "beat up on vendors to support cross-platform rehosting." This, he said, is where the Enhanced Backup Solutions Initiative (EBSI) comes into play. This vendor-agnostic industry coalition is working to identify and certify data protection models that may mix and match hardware and software technologies from both mirroring and tape vendors. Doing your own product testing and reporting that information back to EBSI can help simplify future data protection solutions.

Problem #4: Recovery of dataset deltas
Dataset deltas can help you identify where data was during a disaster, where it goes after a disaster and how much of it made it to its intended destination. To deal with deltas and determine whether the copy and the primary are synchronous or asynchronous, you need to identify what is causing them and whether those causes can be realistically addressed, according to Toigo.
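One simple way to picture a dataset delta is as the difference between what the primary holds and what actually reached the copy. The sketch below is an illustration under assumed names and an in-memory data layout; it is not a description of any particular replication product.

```python
def dataset_delta(primary, copy):
    """Report how a copy differs from its primary.

    primary, copy: dict mapping object name -> bytes.
    Returns a dict with:
      "missing": names present on the primary but absent from the copy
      "stale":   names present on both whose contents differ
    """
    missing = sorted(name for name in primary if name not in copy)
    stale = sorted(
        name for name in primary
        if name in copy and primary[name] != copy[name]
    )
    return {"missing": missing, "stale": stale}


# Made-up example: one object never arrived, one arrived out of date.
primary = {"ledger": b"v3", "orders": b"v7", "audit": b"v1"}
copy = {"ledger": b"v3", "orders": b"v6"}
delta = dataset_delta(primary, copy)
```

An empty result on both lists would indicate the copy is in step with the primary; anything else points to the data that would be lost or outdated if the primary failed at that moment.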

Problem #5: Assurance of protective measures
Your confidence in a recovery solution needs to derive from testing. This includes testing all vendor claims and considering new solutions. "Don't be afraid to test solutions that leverage both established players and the newbies...many innovative solutions are coming from startup companies," Toigo said.

For more information

Tip: No one-size-fits-all data protection solution

Tip: Think before you invest in disk-to-disk backup

Tip: 10 data protection recommendations

This was last published in April 2003
