Challenges of performing disaster recovery across the WAN

Network architects will benefit from understanding that network backup and data replication have common characteristics that can make it difficult to perform disaster recovery across a WAN.

As companies become more reliant on business-critical information, it becomes increasingly imperative to protect resources against unexpected loss. Threats come in many shapes and sizes – from server maintenance to unexpected network outages to catastrophic natural disasters that can debilitate an entire building or region. As a result, enterprises are required to develop robust disaster recovery plans that ensure maximum availability with maximum flexibility.

Data backup and replication are essential to most enterprises' disaster recovery plans. However, because both typically involve transferring large amounts of information across significant geographic distances, limitations in WAN technology can make them difficult to implement effectively. As a result, WAN acceleration solutions can play an important role in disaster recovery.

Common characteristics
Network backup and data replication (synchronous and asynchronous) have common characteristics that can impair the effectiveness of these solutions when delivered across a WAN. These include:

  • Large volumes of data: When performing backup and replication, database files, control files, and other information must be transferred across the WAN. As a result, WAN links are forced to handle hundreds of megabytes (or even terabytes) of data when doing disaster recovery. To accommodate this volume of data, enterprises will typically deploy large WAN links between data centers (e.g., 45 Mbps or higher). Given the price of WAN bandwidth, this often becomes the most expensive component of a disaster recovery solution.

    In some instances, backups can be postponed until non-peak hours to ease this problem. In many cases, however, this is not an option because backups must be performed regularly (e.g., every hour) for compliance reasons. Furthermore, data replication is most effective if performed in real time, which rules out postponing it until off-hours.

    Historically, 10 Mbps of bandwidth was recommended for each MB of data copied per second. By that rule of thumb, a 45 Mbps T3 link could handle about 4.5 MB of data per second. This can become quite costly in environments where exceptionally large volumes of data are being transferred. Fortunately, this metric is changing dramatically as enterprises deploy new techniques for data reduction, as discussed below.

  • Sensitive to bandwidth and latency: When performing synchronous data replication, the primary server cannot continue to write until the secondary server finishes writing and sends an acknowledgment. As a result, the process is highly sensitive to WAN latency. Even asynchronous replication and network backup can suffer from high latency because transfers can time out across the WAN, leading to file transfer failures and subsequent synchronization problems.

    The Transmission Control Protocol (TCP) often requires significant tuning to run well on WAN links with high latency and/or low bandwidth. Consequently, many data replication solutions, such as Veritas Volume Replicator, default to the User Datagram Protocol (UDP) instead of TCP. In those instances where TCP is employed, specific acceleration techniques are often required to maximize the effectiveness of disaster recovery.

  • Repetitive information: A significant portion of information sent across the WAN for disaster recovery purposes is repetitive. As a result, many solutions transfer only data blocks that have changed since the previous backup/replication. Incremental changes can significantly reduce the amount of traffic traversing the WAN, which speeds up the backup/replication process.

    It is important to note, however, that backup/replication solutions examine large blocks of information to determine what has changed. They do not have the same level of granularity (and therefore WAN efficiency) as solutions that can detect repetition at the individual byte level. For instance, if a single byte is inserted into a file, every block from the insertion point onward shifts and therefore appears to have changed. Block-level data reduction cannot recognize that the shifted data is otherwise identical, so it cannot reduce the amount of data transferred across the WAN in this scenario. Byte-level solutions, on the other hand, can detect deltas down to a single byte, catching the slightest changes for maximum WAN efficiency.

    In addition, incremental backups require a full backup as a baseline. If the full backup is compromised or out of date, the incremental backups are useless. As a result, it is essential to perform full backups fairly regularly – once a week, for example. In the event of a disaster, it is the full data set that is often required to restore the main servers, so the WAN must be able to handle large volumes of data, above and beyond what is sent as incremental changes.
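
The difference between overwriting a byte and inserting one, as described in the third bullet above, can be illustrated with a minimal Python sketch. It is not how any particular backup product works: the 4 KB block size, the function names, and the sample data are all assumptions made for illustration, and real products may use different block sizes or comparison methods.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size, for illustration only


def block_digests(data: bytes, block_size: int = BLOCK_SIZE) -> list[str]:
    """Hash each fixed-size block of the data."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def changed_blocks(old: bytes, new: bytes, block_size: int = BLOCK_SIZE) -> int:
    """Count blocks in `new` that differ from the block at the same index in `old`."""
    old_d = block_digests(old, block_size)
    new_d = block_digests(new, block_size)
    changed = 0
    for i, digest in enumerate(new_d):
        # A block past the end of the old file also counts as changed.
        if i >= len(old_d) or old_d[i] != digest:
            changed += 1
    return changed


def make_file(num_blocks: int, block_size: int = BLOCK_SIZE) -> bytes:
    """Build deterministic, non-repeating sample data (hypothetical test file)."""
    out = bytearray()
    counter = 0
    while len(out) < num_blocks * block_size:
        out += hashlib.sha256(counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:num_blocks * block_size])


original = make_file(16)

# Overwrite one byte in place: the file length and block boundaries are unchanged.
flipped = bytes([original[100] ^ 0xFF])
overwritten = original[:100] + flipped + original[101:]

# Insert one byte: everything after offset 100 shifts by one position.
inserted = original[:100] + b"X" + original[100:]

overwrite_changed = changed_blocks(original, overwritten)
insert_changed = changed_blocks(original, inserted)

print(f"overwrite: {overwrite_changed} block(s) to send")
print(f"insert:    {insert_changed} block(s) to send")
```

With these assumptions, the in-place overwrite touches a single block, while the one-byte insertion makes every block appear changed, so a fixed-block comparison would resend the whole file.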

These characteristics can make it difficult to perform disaster recovery across a WAN. They can compromise disaster recovery plans by reducing the frequency of backups/replication or, in some instances, increase the cost of performing disaster recovery because IT resources are required to correct errors. Either way, enterprises can be exposed to a "vulnerability gap" if these characteristics are not handled properly.
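
To put rough numbers on the volume and latency characteristics described above, the following Python sketch works through the arithmetic. It uses the historical 10 Mbps per MB/s rule of thumb from the article; everything else is an assumption for illustration: the function names, the 1 TB data set, the 50 ms round-trip time, decimal units, and the simplification that a synchronous writer has only one write outstanding at a time and that each write costs one full round trip.

```python
def sustainable_mb_per_s(link_mbps: float, mbps_per_mb: float = 10.0) -> float:
    """Data rate (MB/s) a link supports under the 10 Mbps per MB/s rule of thumb."""
    return link_mbps / mbps_per_mb


def full_transfer_hours(data_gb: float, link_mbps: float) -> float:
    """Hours to move a full data set at the raw line rate (decimal GB, no overhead)."""
    bits = data_gb * 8 * 1000**3
    seconds = bits / (link_mbps * 1_000_000)
    return seconds / 3600


def max_sync_writes_per_s(rtt_ms: float) -> float:
    """Upper bound on serialized synchronous writes when each waits one round trip."""
    return 1000.0 / rtt_ms


t3_mbps = 45.0

# Rule-of-thumb replication rate for a T3 link.
print(f"T3 sustains about {sustainable_mb_per_s(t3_mbps):.1f} MB/s")

# Best-case time to ship an assumed 1 TB full backup over the same link.
print(f"1 TB full backup: about {full_transfer_hours(1000, t3_mbps):.1f} hours")

# Latency ceiling for synchronous replication at an assumed 50 ms round trip.
print(f"50 ms RTT: at most {max_sync_writes_per_s(50):.0f} serialized writes/s")
```

Under these assumptions, a full restore of 1 TB over a T3 takes roughly two days even at the raw line rate, and latency alone caps a strictly serialized synchronous writer at about 20 writes per second regardless of how much bandwidth is available.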

About the author: Craig Stouffer is vice president of Worldwide Marketing at Silver Peak Systems. He has 17 years of industry experience in marketing, product management and business development. He previously held positions with Juniper Networks, Redline Networks (acquired by Juniper), Optranet (acquired by Extreme Networks), and Tut Systems.

This was last published in June 2006
