Storage and Preservation of Digital Images

By Mark Dubovoy

Something momentous has happened with the development of digital photography: For the first time since photography was invented, we have the capability to preserve original images without any deterioration for extremely long periods of time. Perhaps forever.

The negative in the shoebox

Most PHOTO Techniques readers know that storing negatives or transparencies in a shoebox is a bad idea. These boxes usually are acidic, do nothing to control temperature and humidity, and can lead to damaging physical pressure from having the originals on top of each other. All of these factors lead to premature decay and damage to precious originals. It is not unusual to find color negatives damaged beyond recognition after only a decade or so of storage in a shoebox.

I know many black-and-white film shooters who are complacent about preservation of their originals. They seem to believe that if they store their negatives in acid-free sleeves or envelopes and then place them in a metal cabinet, they will last forever. Nothing could be further from the truth. They will last longer than color materials, but they definitely will deteriorate under these conditions.

Good preservation of film originals requires placing the originals in very low-humidity, hermetically sealed containers that are stored in special cooling or freezing units that can maintain a very narrow temperature and humidity range. Double-hermetically sealed containers and humidity-controlled cooling units are highly recommended because the slightest leak can ruin an original.

Having properly sealed rooms, dehumidifiers, cooling units, and containers is a very expensive proposition. This is why there are dedicated facilities for this purpose, such as the one at the University of Arizona, which houses the originals of many of the best-known photographers of the 20th century. Even though originals are expected to last a long time under these storage conditions, they will still slowly deteriorate. Furthermore, chances are that some deterioration occurred before the work was sealed in the containers. Finally, trying to make new prints or scans from these originals can create problems. The process of warming up the work, taking it out of the sealed containers, dehumidifying it again, and resealing it properly is tedious, expensive, and time-consuming. Also, the original will suffer every time it is exposed to the bright light of an enlarger or a scanner. And last, but not least, there is always the possibility of accidental damage, such as a scratch, while handling the original.

Bottom line: A film original is a fragile object that starts to deteriorate immediately after it is processed and can be easily damaged when it is handled. One can slow the deterioration, but not stop it completely. (For more information on this and related topics, I recommend Henry Wilhelm’s book, The Permanence and Care of Color Photographs.)

Digital files

Enter the digital image, which offers the ability to store an original without any deterioration—essentially forever. The reason for this is that a digital file can be copied without any loss of quality. In other words, we can make duplicates that are bit-for-bit identical to the original. This is not possible with non-digital images.

The fact that we can copy original images with no loss of quality means that regardless of what storage media or what storage formats are developed in the future, all we have to do is copy our originals to the new devices; a simple and totally automatic process. Furthermore, for ultimate safety, we can keep multiple identical copies of our images in different locations and on different devices; so, in theory, we never lose them.

One of the topics that constantly comes up in my workshop discussions is how to store digital images properly, both in the field and after they are edited. What follows are some guidelines developed from personal experience. I realize they may be overkill for some and insufficient for others. Please take the following paragraphs as a set of guidelines, not as the ultimate gospel.

In the field

I am never comfortable having my images in only one place. I feel that the risk of an operator mistake or a device failure wiping out irreplaceable images is indefensible. Therefore, regardless of whether I am in the field or at my home base, I always copy my images as soon as possible.

In the field, I always copy my images onto three separate devices. I usually take a small laptop with me and copy my images from the memory cards to the laptop’s internal drive as well as to two external drives. For obvious reasons, I usually buy the ruggedized versions of these external drives. I do not erase the memory cards unless I need them for further shooting, and I do so only after opening the copied files first, to make sure that they were properly copied to all three devices.

This last statement is extremely important: Always open the files in each device to make sure that they were properly copied before you erase your memory cards!
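As a complement to opening the files, the copies can also be compared against the memory card by checksum. What follows is only a minimal sketch in Python, not a description of my own field workflow: the function names `sha256_of` and `copy_and_verify` and the folder layout are hypothetical, and it assumes that the card and the three devices are mounted as ordinary folders.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(card_dir: Path, destinations: list[Path]) -> list[str]:
    """Copy every file under card_dir to each destination folder.

    Returns a list of problems; an empty list means every copy has the
    same checksum as the file on the card.
    """
    problems = []
    for source in sorted(p for p in card_dir.rglob("*") if p.is_file()):
        expected = sha256_of(source)
        relative = source.relative_to(card_dir)
        for dest_root in destinations:
            target = dest_root / relative
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(source, target)  # copies contents and timestamps
            if sha256_of(target) != expected:
                problems.append(f"{target} does not match {source}")
    return problems
```

A matching checksum read back immediately after copying may be served from the operating system's cache rather than from the device itself, so treat a clean result as an extra check, not as a replacement for opening the files on each device.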

When traveling, I place each device in a different place. For example, I place one external drive in my jacket pocket, the laptop in its case, the second external drive in my carry-on bag, and the memory cards in my shirt pocket.

Home base

First of all, I recommend that you always store your Raw files as well as the edited files. The Raw files are your originals (the equivalent of the original negative or transparency). You never want to lose the Raw files. In fact, as Raw converters keep improving, many of us have been able to improve our images using newer Raw converters. Obviously, all edited files should be stored as well.

TIFF is the most universal format, and I recommend saving final edited images as TIFFs or PSD files. It is true that some Raw formats may become unreadable in the future. The only advice I have is to convert proprietary Raw files to Adobe DNG or some other future, more universal, Raw format. I still think you are way better off converting to DNG and having that data to use with future converters than losing your “true originals.” Adobe claims that nothing is lost converting to DNG (I have not tested it), and the converter is a free download.

A key question after returning to home base is whether to store images on CDs, DVDs, computer or external disk drives, magnetic tape, or other media. Memory cards are great non-volatile storage devices. Unfortunately, they currently are too expensive to use for storage. Although high-quality CDs and DVDs are available (eFilm, AMA) with claimed longevity of 300 and 100 years respectively, I am not a fan of CDs or DVDs. I can only fit two or three images onto a DVD, which is not very practical. The read-and-write times for a DVD are also extremely long, and the physical storage of many hundreds, perhaps thousands, of these disks, with a system that can find a specific image, becomes a serious problem. Besides these problems, I am very skeptical about longevity claims for media. I have been stung many times before by a manufacturer’s overly optimistic claims.

BluRay is not an attractive option for the same reasons explained above. The longevity of tape is quite short. Therefore, I’ve found the best practical choice is disk drives.

Disk drives

Disk drives are a great option. They are inexpensive, occupy a small amount of space, have huge capacity, and their prices continue to decline while their capacity and performance continue to improve. Furthermore, after more than 50 years of large numbers of users working with these devices, we have compelling evidence that the data written on these drives should last well more than 100 years.

I know the next question: How can you claim that disk drives are great when we have always been taught that they fail? We are always told that it is not a matter of “if” a drive will fail, but “when” it will fail. I have three answers to this question:

1. The current calculated mean time between failures (MTBF) for a typical drive under average working conditions in an office is around 100 years. I think it is safe to assume that a photographer can expect an MTBF of at least 25–50 years.

2. In a large percentage of cases, when a drive fails the information can be retrieved by an expert shop. This is expensive, but if the data is valuable it is certainly an option.

3. If your workflow in the field and your home-base system are properly configured, the failure of one drive, or even several drives, should not affect your data at all.

Number 3 is the key. With properly configured systems, one can prevent all losses. Since we can make identical copies of digital originals, as long as you have redundancy, a drive failure should not result in the loss of a single bit of information.
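The arithmetic behind this argument can be sketched in a few lines of Python. This is a rough illustration under two simplifying assumptions that are mine, not the drive manufacturers': each drive fails at a constant rate given by its MTBF, and drives fail independently of one another. The function names are hypothetical.

```python
import math


def annual_failure_probability(mtbf_years: float) -> float:
    """Chance that one drive fails within a year, assuming a constant
    failure rate of 1 / MTBF (an exponential lifetime model)."""
    return 1.0 - math.exp(-1.0 / mtbf_years)


def all_copies_lost_probability(mtbf_years: float, copies: int) -> float:
    """Chance that every one of `copies` independent drives fails in the
    same year, with no failed drive being replaced in the meantime."""
    return annual_failure_probability(mtbf_years) ** copies


if __name__ == "__main__":
    for mtbf in (100, 25):
        for copies in (1, 2, 3):
            risk = all_copies_lost_probability(mtbf, copies)
            print(f"MTBF {mtbf:>3} years, {copies} copies: {risk:.6%} per year")
```

Under these assumptions a single drive with a 100-year MTBF has about a 1% chance of failing in any given year, and one with a 25-year MTBF about 4%. Each additional independent copy multiplies the risk of losing everything by that same small factor, which is why redundancy matters far more than the reliability of any one drive. Real failures are not perfectly independent (a fire or a power surge can take out several drives at once), which is one more reason to keep copies in different locations.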

To RAID or not to RAID

In the past few years, I have observed many photographers, and I think that when it comes to backing up, I can divide them into four broad groups:

1. The careless bunch—They save their files on the computer’s main drive and use no backup.

2. The half-careless bunch—They save their files to the computer’s main drive and periodically back up “important” files to an external drive.

3. The overwhelmed bunch—This includes almost every professional photographer I know. Because of the large number of photographs they take, they save their files into an external drive. When that drive is full, or a job is finished, they disconnect the drive and put it in a closet. Even though they know that a drive needs to be connected to a computer and spun at least a couple of times a year (to prevent a disastrous encounter between the head and the disk media), they usually never do it.

4. Those who save their originals properly.

Members of the first three groups risk big problems or a catastrophe at some point. I would much rather be safe than sorry, and that is why I am a fan of RAID arrays. What is a RAID array? RAID is an acronym for Redundant Array of Inexpensive Disks. RAIDs were invented to replace much more expensive large computer drives. A RAID array can consist of as few as two disk drives and a controller, or as many drives as a controller (or a group of redundant controllers) can handle. A RAID array will usually appear to the computer as a single external disk drive, regardless of the number of drives, controllers, power supplies, or specific configuration.

Since a RAID array with only two or three drives is substandard in terms of functionality and data safety, all references from here on are for RAID arrays with at least four disk drives.

A typical RAID array will usually have a cabinet with slots where disk drives in special carriers that fit the slots can easily slide in and out. Adding or replacing drives is as simple as sliding unwanted drives out and sliding new ones into the slots.

The drives can be configured in many ways for storage efficiency, speed, and safety of the data. The most widely preferred configuration today is called RAID 5.

In a RAID 5 configuration, as the data comes in, it is not written to any single disk. Different parts of the data are written simultaneously to different disks, so read and write speeds are very fast. Along with the data, the system writes parity information that is distributed across the disks in such a way that even if one drive fails completely, all the data can be recovered intact. A single-drive failure becomes basically a minor inconvenience with no loss of data. RAID 5 protects against only one failed drive at a time, however: if a second drive fails before the first has been replaced and rebuilt, the data on the array is lost. Surviving two simultaneous failures requires a configuration with double parity, such as RAID 6.
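The recovery of a failed drive rests on parity. The Python sketch below is a toy illustration of the idea, not the firmware of any real controller: the function names are hypothetical, each "drive" is just a short block of bytes, and the parity block always sits at the end, whereas a real RAID 5 array rotates the parity among the drives.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equally sized byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_stripe(data_blocks):
    """Return the data blocks followed by one parity block."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def rebuild(stripe, lost_index):
    """Recompute the block at lost_index from all the surviving blocks."""
    survivors = [block for i, block in enumerate(stripe) if i != lost_index]
    return xor_blocks(survivors)


if __name__ == "__main__":
    # Three data "drives" plus one parity "drive".
    stripe = make_stripe([b"AAAA", b"BBBB", b"CCCC"])
    # Pretend the second drive died and rebuild its contents.
    assert rebuild(stripe, 1) == b"BBBB"
```

Because the parity block is the XOR of all the data blocks, XOR-ing the surviving blocks together yields whichever single block is missing. With one parity block per stripe, this sketch can rebuild exactly one missing block.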

In addition, disks in an array can be configured as “hot spares.” The hot spares are drives that stay in idle mode under normal circumstances. However, if the system detects any possibility of an active-drive failure, the offending drive is immediately and automatically replaced by a hot spare. The drives do not have to be physically touched; it all happens in software. In this way the system does not miss a beat when a drive fails. The operator is flagged about the bad disk, which can be taken out and replaced anytime without the need to reconfigure or even turn off the system.

The inclusion of one or more hot spares into a RAID system allows for tremendous resiliency and safety of the data. Even unlikely events, such as several disk drives failing one after another, result in only minor inconveniences without any loss of data and without any interruption in the work being performed, provided each rebuild onto a spare finishes before the next failure.

RAID arrays are used extensively by large corporations for critical applications such as payroll. They are also used extensively by professional organizations such as video editing studios, where their lifeline depends on data integrity and absolute reliability. If it works for them, it should work for me!

Some words of caution about RAID arrays: Like many other things in life, the acronym RAID has become a buzzword for marketing purposes. You need to be careful and understand exactly what you are purchasing, as marketing folks use the word RAID very loosely.

Not all RAID arrays are created equal. As unlikely as it is, a bad failure in a RAID controller, for example, could potentially corrupt the data as it is written to the drives. In order to also prevent these unlikely events, “industrial-grade” RAID arrays always come with at least two controllers (one is a hot spare) and two power supplies (again, one is a hot spare) for extra safety. These RAID arrays have two of


About the Author

Mark Dubovoy
Dr. Dubovoy is highly regarded as a technical expert in many aspects of photography and printing technology. He is a regular writer of technical articles for The Luminous Landscape and PHOTO Techniques magazine and is a lecturer at various workshops. His photographs are included in a number of private collections, as well as the permanent collections of the Museum of Contemporary Art in Mexico City, the San Francisco Museum of Modern Art, the Monterey Art Museum, the Berkeley Art Museum, and the Museum of Modern Art in Nanao, Japan. He is a partner and Board Member of The Luminous Landscape, Inc., and holds MS and Ph.D. degrees in Physics from the University of California, Berkeley.