There is an irony about digital photos: they can last forever without degrading (and can even get better through improved imaging software), yet they are very volatile, just bits on storage media that can easily be lost or damaged.
A backup is a copy of data that is sufficiently independent of the original so that destructive events can’t affect both at the same time. A backup doesn’t prevent destruction of data; it only allows you to recover the data once the destruction has occurred.
A simple example of backup is copying files from a laptop to a CD (the act of copying), and putting the CD in your desk at home (independence of location). If the laptop is stolen (the destructive event), you still have the data that was copied to the CD.
To design and implement a backup plan, one has to consider the possible threats to data (e.g., theft, electrical surge, fire); the various ways to copy data (e.g., using the Mac Finder or Windows Explorer, using a backup utility, using CD/DVD-burning software); and ways of achieving independence (e.g., online storage or placing media in a safe-deposit box).
Unfortunately, most articles about backup focus on the copying and ignore data threats and independence. But without evaluating all the threats, there’s no way to be sure that the backup will allow you to recover from them, and insufficient independence means that both the data and the backup can be destroyed by the same event. An obvious example is a fire that destroys everything in an office.
Additionally, if backup isn’t convenient, and ideally automatic, it may not get done often enough to be effective. (It’s common after a data loss for someone to regret that their most recent backup is months old.) The restore has to be convenient, too, or else the damage will be compounded: an electrical surge is bad enough, but if your data is unavailable for a week while you restore it from an online service, you also lose a week of productivity.
Since backup is potentially expensive and time-consuming, you also have to consider the importance of your data. Backing up irreplaceable photographs is more important than backing up application preferences.
For photographers, the three places where your data (images, mostly) is threatened are: in the camera; in the field, during or just after a shoot; or back in the office. In all three places, only six types of threats can destroy data:
• User error. A user mistake that accidentally deletes or overwrites one or more files. Examples are reformatting a card by mistake, losing a card, and accidentally deleting a folder of images on your computer.
• Equipment failure. This includes any failure of hardware or software that results in data loss. We put the two together because it’s often difficult to tell whether the problem was caused by software or hardware, and because the effect on the data is usually the same. The most talked-about failure, a disk crash, is in this category, but so is an operating-system upgrade that causes a file system to be corrupted, or an application install that deletes data files. (Apple once accidentally released a version of iTunes that could delete all files on a hard drive.) Camera and card failures go in this category as well.
• Electrical surge. This is in its own category because it can affect every plugged-in device in a home or office, so it makes independence especially difficult. Copying data to an external drive won’t protect you from a surge if the drive is plugged in—but if it’s not plugged in you can’t access it. A good surge protector can prevent damage from some surges, but there’s no device that’s guaranteed to prevent them all.
• Disappearance. This includes theft and accidental loss (leaving a laptop in a taxi, for example). The good thing about theft and loss is that even the slightest amount of independence is effective: Thieves might take the computer on the desk but probably won’t notice the hard drive on a shelf under the desk, and someone who snatches your camera probably won’t take the card in your pocket.
• Office destruction. This covers anything that destroys the location containing the computers, including fire, explosion, structural collapse, collision, water damage, and vandalism. Everything in the office might be destroyed, including external drives and CDs/DVDs.
• Regional disaster. Anything that damages an entire neighborhood or city, such as radiation, flood, earthquake, tornado, and various acts of war or terrorism. Here even a copy in a bank safe-deposit box may not be safe.
Of course, not all threats affect all locations equally. In the field, a surge is seldom a problem because you’re not normally plugged in. In the camera, card failure is a serious threat because most cameras only record one copy of the image; once you’re back in the office and have made a couple of backups, the card doesn’t even matter anymore.
The perfect backup
The almost-perfect solution in the office is to back up your computer every hour to an ultra-reliable, redundant, online storage service such as Amazon’s S3. We say “almost” because the backup software you use might have defects. To make it perfect, you need two or more completely independent copying utilities and services.
Unfortunately, for many photographers there’s too much data for this to be practical. If a photographer comes back from a shoot with 20GB of photos (not unusual) and has a T1 line (1.544 megabits per second) operating at 100% efficiency (extremely unusual), it would take 29 hours to copy the photos to an online service. Every 1GB of image data modified (50 photos at 20MB each) would take an additional 1.5 hours or so to upload. That’s assuming that there is a T1 line, that it operates at 100% efficiency, that the line isn’t being used for anything else, and that the online services can receive and store the data that fast. At a more realistic upload speed, say 500 kilobits per second, it would take more than three days to upload the 20GB, by which time the photographer might have shot another 60GB. The backup would never finish!
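The arithmetic behind these figures is easy to rerun for your own connection. The following is a minimal sketch, assuming decimal gigabytes (1GB = 10^9 bytes) and a line that sustains its full rated speed; the function name upload_hours is ours, chosen for illustration.

```python
def upload_hours(gigabytes: float, megabits_per_second: float) -> float:
    """Hours needed to upload `gigabytes` at a sustained line rate.

    Assumes decimal units (1GB = 1e9 bytes, 1 megabit = 1e6 bits) and
    100% line efficiency, which real connections rarely achieve.
    """
    bits = gigabytes * 1e9 * 8
    seconds = bits / (megabits_per_second * 1e6)
    return seconds / 3600


T1_MBPS = 1.544  # T1 line rate quoted in the text

print(f"20GB over T1:       {upload_hours(20, T1_MBPS):.1f} hours")
print(f"1GB over T1:        {upload_hours(1, T1_MBPS):.1f} hours")
print(f"20GB at 500 kbit/s: {upload_hours(20, 0.5) / 24:.1f} days")
```

Substituting your own upload speed and typical shoot size shows quickly whether online backup can keep up with what you shoot.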
Oh, we forgot . . . the photographer needs two T1 lines, because we were going to use two independent services. Worse, when you’re in the field, there’s usually no Internet access and, even if there were, it would be much slower than T1.
So the minor problem with our perfect backup scheme is that it won’t work. We need to back up to hard drives and/or optical disks, which gets very complicated.
A backup plan therefore consists of a collection of overlapping imperfect solutions, as we will explain. And because digital photography technology evolves and your use of it changes, backup is a construction project.
We listed the six categories of threats that can destroy your data: user error, equipment failure, surge, disappearance, office destruction, and regional disaster. Before acquiring and installing hardware and software for backup, you’ll want to develop a plan for backup and restoring so you can ensure that you’re covered against all six.
Convenience vs. independence
A backup only works if it’s independent of the original data, and multiple backups are effective only if they’re independent of each other. Generally, the more convenient a backup method is, the less independence you get. So, you’ll need a combination of methods: One or two that are convenient but provide just enough independence to protect against the most common threats, and one or two that are inconvenient but provide complete independence.
For example, using a background backup utility such as Apple’s Time Machine is convenient, but since the backup drive has to be within WiFi range and plugged in to power, it doesn’t provide enough independence to protect against surge, office destruction, or regional disaster. Those threats are much less common than user error, equipment failure, and disappearance, so running Time Machine is a great idea. It’s just not the only idea.
For protection against surge, all you need to do is back up to an external disk (probably not with Time Machine) that you can unplug and, for good measure, put into a fireproof media safe. Store that drive in a neighbor’s house and you’ll protect against office destruction as well. Take it to your mother’s house 25 miles away and you’re protected against most regional disasters. Copy your irreplaceable files to online storage such as Amazon’s S3 and you’re even more completely protected.
In the field during a shoot, you have more important things to do (photography!) than to deal with backup, so it has to be even more convenient than it would be in the office. You have fewer choices in equipment, too. The last thing you want is for your concern about the six threats to interfere with your workflow. If it compromises your photography, it isn’t worth it. Of course, you have more options when you’re shooting landscapes or interiors than you do when you’re shooting weddings, sports, or breaking news.
The way we arrived at the combination of methods in the previous section was to list the six types of threats, list the available backup methods, and then pair them up to ensure that we were covered. The more backup methods available and the more you know about them, the more effectively you can come up with something you can live with. If your plan is too inconvenient, you’ll find you’re not using it, and then you won’t be protected.
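The pairing exercise can be written down as a small table and checked mechanically. This is only a sketch: the method names and the threats assigned to each one are our illustrative assumptions, loosely based on the examples above, and you should replace them with an honest assessment of your own setup.

```python
THREATS = [
    "user error", "equipment failure", "surge",
    "disappearance", "office destruction", "regional disaster",
]

# Threats each method is assumed to protect against (illustrative only).
METHODS = {
    "Time Machine, always connected": {
        "user error", "equipment failure", "disappearance"},
    "unplugged external drive": {
        "user error", "equipment failure", "surge"},
    "drive stored 25 miles away": {
        "equipment failure", "surge", "disappearance",
        "office destruction", "regional disaster"},
}


def coverage(threats, methods):
    """Map each threat to the list of methods that cover it."""
    return {t: [name for name, protects in methods.items() if t in protects]
            for t in threats}


def uncovered(threats, methods):
    """Return the threats that no method covers, in the original order."""
    covered = set().union(*methods.values())
    return [t for t in threats if t not in covered]


for threat, names in coverage(THREATS, METHODS).items():
    print(f"{threat:20s} {', '.join(names) or 'NOT COVERED'}")
print("Gaps:", uncovered(THREATS, METHODS))
```

Removing a method from the table and rerunning the check shows which threats you would be exposed to if you stopped using it.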
Hardly any operating system comes with sufficient backup software, so you’ll need to buy a third-party utility.
You’ll have to spend some money, mostly for software and external drives. The software should cost less than $100; a couple of 500GB drives for your most important data will cost less than $150 each. Amazon’s S3 service is really cheap, only a few dollars a month. So, for about $500, a little planning on your part, and a slight change to your work habits, you can get almost 100% protection from all six threats.
The worst mistake is to assume that some exotic piece of equipment is a complete solution. Recently, someone on a digital photography forum said that he had lost a day’s shoot because his portable external backup disk failed, so he solved the problem by replacing the drive with a portable drive-mirroring device. In fact, the drive that failed wasn’t a backup (independent copy) at all: as soon as he erased the card, it became the un-backed-up primary copy. His new portable mirror is a slight improvement that probably protects against an actual drive failure, but it doesn’t protect against failure of the drive controller or power supply, against physical damage or loss, against surge when it’s plugged in, or against user error. Had he followed our approach, he would have added a true backup instead of trying to make the single drive more reliable. Also, at today’s prices, it’s usually unnecessary to erase a card during a single day’s shooting.
Uwe’s strategy in the field is to assume he’ll be in a hotel at night on most trips. He carries enough memory cards to get through a day (at present, 40–50GB). In the evening, he copies the images to his travel Mac and then also backs up to two USB-powered disks. These two disks stay in his camera bags and are not left in the hotel.
It’s a safe bet that very few people who do a backup have ever tried a restore to see if the backup worked. It’s not hard to see why: Restoring a complete system is pretty disruptive, and if it doesn’t work, you’ve just wiped a perfectly good system. To test a restore, you have to put another hard drive in the system so you can safely overwrite it (or wait until you have a new computer). Marc has a Windows desktop with six drive bays with handy slide-out trays, so it’s very easy for him to pop in a new drive to test a restore while the primary drive is safely out of the computer. (This is not a common setup, however.)
Even without actually doing a complete restore, you should spot-check your backup to ensure that your files are really there. Backup software that won’t let you do this, such as Vista’s Complete PC Backup, should be avoided.
Your restore plan also should include a way to replace damaged hardware. If you live near computer stores and they’re open when you need them, you might be able to simply buy what you need when you need it. But if not, and time-to-restore is important, you need replacement equipment on hand and ready to go—if your replacement hard drive already has data on it, you won’t be able to use it without destroying that information.
After a restore, make sure you don’t start running without a backup. For example, suppose you keep a complete, bootable copy of your primary drive on an external drive. If the primary drive fails, you can boot from the external drive, which gets you up and running immediately, losing only a few hours of work. But if you run that way, you no longer have your backup, since the backup drive has become the primary and the old primary is dead. Instead, you should immediately clone the backup to a replacement primary drive or, if that’s not feasible, clone the backup to a second external drive.
You’ll usually back up to external hard drives, optical disks, or online storage. The first two are discussed in this article.
The main external-drive choices are a single drive, a network-attached drive, or a RAID drive-set (which could also be network-attached). Internal drives aren’t good choices because they’re not sufficiently independent of the computer being backed up. They’re gone if the computer is stolen, and they share the same drive controller, so a controller failure could destroy the data on all the internal drives.
External drives are available in sizes from about 100GB to 2TB, and some of them are entirely powered from the USB or FireWire cable, which makes connecting them and transporting them especially convenient. (Laptop and some desktop USB ports often don’t put out enough power for an external drive, but some drives come with a split cable that allows you to draw power from two USB ports.)
Nearly all computers have USB 2; those that have FireWire (IEEE 1394) usually have FireWire 400, although FireWire 800 is available on newer models. For external drives, FireWire 400 is noticeably faster than USB 2, and FireWire 800 is much faster than FireWire 400.
You’ll know if you have a computer or drive with FireWire 800, as opposed to FireWire 400, because the connectors are different (USB 1 and USB 2 use the same connector). You can connect FireWire 800 disks to FireWire 400 ports on the computer (of course you then only get FireWire 400 speed).
Several times faster than FireWire 800, eSATA (external Serial Advanced Technology Attachment) is just becoming available. It takes the bus commonly used inside the computer (SATA) and extends it with an external cable. Since most external drives are SATA, this bypasses the conversion to USB or FireWire and then back to SATA.
(Currently, eSATA cables don’t supply power.) With a $40 eSATA PC-Card, adding eSATA to a laptop is even easier than it is on a desktop or server.
Remember, though, that for backup, the most important property of an external drive is that you can easily separate it from the computer, not the speed of its connection.
If the drive is going to be running all the time, put it out of sight, such as on a shelf under your desk, or on a bookshelf with some books or a family photo in front of it. Figure that a thief won’t know your drive is even there and, even if he does, few will steal a $200 drive when there are computers, CDs, and jewelry to take instead, all much easier to fence.
RAID stands for Redundant Array of Inexpensive Disks. The advantage of RAID is that it’s more reliable than a single disk of the same size. The disadvantages are that it costs extra for the same amount of storage (at least one extra disk and some fancy electronics), and that the disks aren’t nearly independent enough because they share the same power source, controller, driver, and connecting cable.
There are various RAID arrangements, the two most popular of which are RAID-1 (mirroring) and RAID-5 (striping with parity). The idea is that when a disk fails it can be removed from the running system and replaced without the system going down or any data being lost. When you replace the defective disk, the RAID system automatically restores the data that was on it. Unfortunately, the restore often takes hours, during which time you are vulnerable to a total data loss if there’s a second failure.
So-called RAID-0, also called striping, isn’t really RAID at all because there’s no redundancy. The loss of either disk destroys all the data on both disks. RAID-0 is for performance, not for reliability, which is actually reduced.
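A rough calculation shows why striping reduces reliability while mirroring improves it. This is a simplified sketch: the per-disk failure probability is a made-up number, and the model assumes disks fail independently, which is optimistic for disks that share a power source and controller. It also ignores the rebuild window.

```python
def loss_single(p: float) -> float:
    """One disk: data is lost if that disk fails."""
    return p


def loss_raid0(p: float, disks: int = 2) -> float:
    """Striping: data is lost if ANY disk in the set fails."""
    return 1 - (1 - p) ** disks


def loss_raid1(p: float, disks: int = 2) -> float:
    """Mirroring: data is lost only if ALL disks in the set fail."""
    return p ** disks


p = 0.05  # hypothetical chance that a given disk fails during some period

print(f"single disk: {loss_single(p):.4f}")
print(f"RAID-0:      {loss_raid0(p):.4f}")
print(f"RAID-1:      {loss_raid1(p):.4f}")
```

Under these assumptions a two-disk stripe is nearly twice as likely to lose data as a single disk, while a two-disk mirror is far less likely to. Neither number says anything about the other threats, which is the point of the analysis that follows.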
To see why RAID doesn’t diminish the need for backup, we can do a quick threat analysis:
1. User error: No help; with RAID there’s still just one logical copy of the data.
2. Computer failure: RAID protects only against disk failure. That’s probably the most common hardware failure, but as RAID doesn’t protect against other hardware failures (such as the RAID controller itself and the rest of the computer), nor against software failures, the need for backup hasn’t changed.
3. Surge: No help, as all RAID hardware has to be plugged in.
4. Disappearance: No different than a single disk. An external RAID cabinet can be hidden below the desk, but so can a single-disk cabinet.
Anyone who thinks RAID provides sufficient protection is focused too narrowly on a single kind of failure, the disk itself, and is ignoring the other threats. Still, if you have the money, a RAID device is more reliable than a single disk. (Not RAID-0, however, which is less reliable.)
Optical disks are in theory more stable than hard drives, even unconnected ones, because they don’t rely on magnetic recording and have no moving parts. Disks you write on a computer use dyes, however, so they’re not indestructible.
However well you care for optical disks, they won’t last forever. You should plan to recopy them every few years, and then verify that the images on the copy are still good with a tool like Marc’s ImageVerifier (imageingester.com/ivinfo.php).
Backup software requirements
The following requirements must be met by any backup system:
1. There must be a way to determine what will be backed up. Systems that tell you they’re backing up “other files” without naming them are unacceptable. Systems with so many options that you can’t easily determine for sure what your settings are going to do are also problematic unless you’re willing to spend time studying the documentation (if any), run experiments, and verify (see #3, below) that you’re backing up what you think you are.
2. There must be a way to tell if a specific file was backed up, so you can spot-check the backup. Systems that keep the backup in a mysterious form (e.g., a giant, compressed file) don’t meet this requirement unless they also have a user-interface for showing you a list of files. Vista’s Complete PC Backup fails in this regard (more later).
3. There must be a way of verifying the integrity of the backup, short of doing a restore and running the system for a week or two to see if any hidden problems show up. A system that uses the ordinary file system meets this requirement because you can run a utility or script to compare the two folder hierarchies, file by file if you want. Systems that use their own format have to provide a separate verification option.
4. For a complete backup, there must be a way to restore individual files. (This may not be a requirement for some, but it is for us.)
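Requirement 3 mentions running a script to compare the original and backup folder hierarchies. One possible version is sketched below, using only the Python standard library; the function name compare_trees and the shape of its report are our own choices. It checks that every file in the original exists in the backup with identical contents, and it does not report extra files that exist only in the backup.

```python
import filecmp
import os


def compare_trees(original: str, backup: str) -> dict:
    """Compare every file under `original` with its counterpart in `backup`.

    Returns the relative paths that are missing from the backup, the paths
    whose contents differ, and a count of files that matched byte for byte.
    """
    report = {"missing": [], "different": [], "matched": 0}
    for folder, _subdirs, files in os.walk(original):
        rel_folder = os.path.relpath(folder, original)
        for name in files:
            rel_path = os.path.normpath(os.path.join(rel_folder, name))
            src = os.path.join(original, rel_path)
            dst = os.path.join(backup, rel_path)
            if not os.path.isfile(dst):
                report["missing"].append(rel_path)
            elif not filecmp.cmp(src, dst, shallow=False):
                # shallow=False forces a comparison of the file contents
                report["different"].append(rel_path)
            else:
                report["matched"] += 1
    return report
```

Calling compare_trees with the path of a photo folder and the path of its copy on the backup drive returns a report you can inspect. A full content comparison reads every byte of both trees, so expect it to take a while on a large image library.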
A much better choice is Super Duper. We used Super Duper to make both bootable disk images and partial backups for carrying offsite, with consistently excellent results. We run Super Duper every night to alternate external drives, so that if there’s a failure during one of the backups, we still have the backup from the previous night.
Marc also tried Retrospect for OS X, which interested him because it’s one of the few backup programs that can write CDs/DVDs, but it repeatedly hung trying to write a DVD.
If Uwe is not making 1:1 disk copies, he exclusively uses folder-tree mirroring software for backup. He’s used ChronoSync on the Mac for about two years because it’s easy to use, can be scheduled, allows containers that bundle multiple synch jobs, and is a great value. It also has exceptional and flexible handling for deleting files in the target folder (e.g., even keeping previous versions of a file—we use this for business data).
Note on Synchronizing: We always synchronize in one direction. Doing otherwise can be a risky business: if a folder on the backup is empty, the software will empty that folder on your primary drives as well. Always decide which folder tree is the master and synchronize from it to the other target folders. Also remember that a single backup copy is not enough.
If you do that and follow the other steps in this article, you should come as close as realistically possible to 100% protection.
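The one-direction rule from the note on synchronizing can be illustrated with a short script. This is a bare-bones sketch, not a description of how ChronoSync or any other product works: the function name mirror_one_way is hypothetical, and the script only copies new or changed files from the master to the target. It never writes to the master and never deletes anything from the target.

```python
import os
import shutil


def mirror_one_way(master: str, target: str) -> list:
    """Copy new or changed files from `master` into `target`.

    The master tree is only ever read. Files are treated as changed when
    the target copy is missing, has a different size, or is older.
    """
    copied = []
    for folder, _subdirs, files in os.walk(master):
        rel_folder = os.path.relpath(folder, master)
        target_folder = os.path.normpath(os.path.join(target, rel_folder))
        os.makedirs(target_folder, exist_ok=True)
        for name in files:
            src = os.path.join(folder, name)
            dst = os.path.join(target_folder, name)
            src_stat = os.stat(src)
            if (not os.path.exists(dst)
                    or os.stat(dst).st_size != src_stat.st_size
                    or os.stat(dst).st_mtime < src_stat.st_mtime):
                shutil.copy2(src, dst)  # copy2 also copies the timestamps
                copied.append(os.path.normpath(os.path.join(rel_folder, name)))
    return copied
```

Because nothing flows from the target back to the master, an empty or damaged folder on the backup drive cannot empty the corresponding folder on the primary drive.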