When you administer a computer locally, you have any number of disaster recovery and protection possibilities available to you -- and you'd better make use of them, too, or Murphy will get you. The same goes for a remotely-hosted system -- in fact, if anything, it goes double, since it's all the harder to perform a recovery on a system that's not physically present.
In this piece I'll be discussing the two most basic ways you can recover from, and fend off, disasters that can overtake a remotely hosted system: backup and redundancy.
8 million ways to crash
So what can go wrong? A list of all the things that could go wrong with a remotely hosted system could easily turn into a catalog, but most of the problems that can emerge fall into a few basic categories: hardware failure, software failure, user error and malice.
Hardware failure includes things like hard drives and power supplies that go bad, while software failure would include fatal bugs that cause a system to become unresponsive. User error could mean you accidentally shutting down a system remotely (instead of rebooting it), or having someone at the hosting center turn off the power to the wrong cage by mistake. Malice includes viruses, worms, and deliberate sabotage -- things that with a little protection, most of us should never have to deal with.
Hardware failure and user error tend to be the two biggest problems in datacenters. Systems in a datacenter are kept in environmentally-controlled rooms, which prolongs their lifespan a great deal, but that doesn't prevent hardware from going bad.
Backup: Think globally, act remotely
Severe user errors typically entail data loss, so you need a robust remote backup strategy. Most hosting companies offer different levels of backup depending on the type of plan you purchase from them. The amount of data you have on the remote server, and how critical it is, should dictate your remote backup strategy.
The server hosting company I use (ThePlanet.Com) offers four varieties of backup options for a server hosted in their farm: none (the user's responsible for his own data), tape backup, NAS (from 20-200GB backed up through the network), and DiskSync, a disk-to-disk backup method (from 10-160GB backed up monthly).
My own scenario is fairly low-end. Since I don't keep very much crucial data on the remote server, I simply use a script to compress all the relevant data (about 50MB) into an FTP folder and download a copy once every couple of days. This puts more of the burden of backup on me, however, and what little data there is would not be at all easy to replace, so I have to be conscientious about it. If the server held dozens of gigabytes of corporate assets, it would be more than worth shelling out the extra $20-$100 a month to have the hosting company keep regular backups, and it wouldn't be worth trying to download a mirror of the whole thing every couple of days.
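As a rough illustration of the compress-into-an-FTP-folder approach, here is a minimal Python sketch. It is not the script I actually use; the function names, the "backup" file prefix and the idea of pruning old archives are all assumptions made for the example. It bundles a list of directories into one timestamped .tar.gz file and keeps only the newest few archives so the FTP folder doesn't fill up.

```python
import tarfile
import time
from pathlib import Path


def make_backup(source_dirs, dest_dir, prefix="backup"):
    """Compress every directory in source_dirs into one timestamped .tar.gz in dest_dir.

    Hypothetical helper: the names and layout are illustrative, not from any product.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    # A sortable timestamp means a plain name sort orders archives oldest to newest.
    stamp = time.strftime("%Y%m%d-%H%M%S")
    archive_path = dest / f"{prefix}-{stamp}.tar.gz"
    with tarfile.open(archive_path, "w:gz") as archive:
        for src in source_dirs:
            src = Path(src)
            # Store each tree under its own directory name rather than its full path.
            archive.add(src, arcname=src.name)
    return archive_path


def prune_old_backups(dest_dir, keep=5, prefix="backup"):
    """Delete all but the newest `keep` archives and return the deleted paths."""
    archives = sorted(Path(dest_dir).glob(f"{prefix}-*.tar.gz"))
    stale = archives[:-keep] if keep > 0 else archives
    for path in stale:
        path.unlink()
    return stale
```

You would schedule something like make_backup(["/path/to/site"], "/path/to/ftp/folder") on the server (the paths here are placeholders) and then pull down the newest archive from your local machine. The copy you download is the one that protects you; an archive that lives only on the remote server dies with it.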
Hardware redundancy: An ounce of prevention
Hardware failure is one of those facts of life that you can't ever completely get around: if a hard drive dies, you have no choice but to replace it. There are various levels of contingency that you can put into effect against such things, depending on your budget and how much is actually being protected. If you can't afford redundancy in your hardware, the best protection is to back up early and often.
Redundant auxiliary hardware. This option's generally only possible when you first buy a server to be hosted. The most common implementation is a mirrored RAID array (RAID 1), where each disk in the array has a mirror; if one disk fails, it can be removed and replaced without taking down the whole system. But this philosophy appears in other hardware, too: a server with redundant power supplies, for instance, can continue to run even if one of the PSUs fails, and the dead one can then be swapped out while the system continues to run normally. Hot-pluggable memory works the same way -- the system contains redundant DIMMs, and if one fails it can be removed and replaced without stopping the whole system. Keep in mind that your hosting company will almost certainly charge you a per-incident fee (and possibly the cost of replacement hardware as well) if they have to swap anything out.
Two-node failover / active-passive clustering. This is probably the most expensive solution, but it's also one of the most complete: installing a second server which is set up as a passive cluster node. If the active node fails for whatever reason, the passive node kicks in and takes over. Most hosting companies should be able to provision a two-node cluster, but it may mean extra fees on top of the cost of the second system itself, with its own OS and software licenses. You'll also be paying for a second server's hosting plan which will almost never get used, and you'll need an operating system that supports failover clustering -- but for "five nines"-style uptime it's tough to do better.
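To make the "passive node kicks in" idea concrete, here is a small Python sketch of the decision logic only. It is an illustration under assumptions, not how any particular clustering product works: real failover clustering also has to handle shared storage, address takeover and split-brain protection. The names is_alive and FailoverMonitor, the "primary"/"secondary" labels and the three-miss threshold are all hypothetical.

```python
import socket


def is_alive(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


class FailoverMonitor:
    """Tracks health checks of the active node and decides when to promote the passive one."""

    def __init__(self, max_misses=3):
        self.max_misses = max_misses  # consecutive failed checks tolerated before failover
        self.misses = 0
        self.active = "primary"

    def record(self, healthy):
        """Feed in one health-check result; return the node that should be serving."""
        if self.active == "secondary":
            # Already failed over. Failing back is left as a manual decision in this sketch.
            return self.active
        if healthy:
            self.misses = 0  # a single good check clears the count
        else:
            self.misses += 1
        if self.misses >= self.max_misses:
            self.active = "secondary"
        return self.active
```

A loop on the passive node could call monitor.record(is_alive(host, port)) every few seconds. Requiring several consecutive misses keeps one dropped packet from triggering a takeover.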
"Cold-swapping." This technique works best if you are renting several racks at once or even a whole cage. If you rent space for four servers, for instance, you can install two live servers and two backup servers which are nominally kept switched off. In the event one system fails, you can swap drives from one system to the next, or simply keep the drives with crucial data in a SAN or NAS that can be connected to either one as needed. Obviously the expense will be a great deal more than many of the above options, but this is one of the most flexible plans -- the cold-swap system can be configured at any time, without disrupting work, and can contain a radically different hardware setup from the live system if you need it.
About the author: Serdar Yegulalp is editor of the Windows Power Users Newsletter. Check it out for the latest advice and musings on the world of Windows network administrators. He is also the author of the book Windows Server Undocumented Solutions.