By Carey Brown
Table of Contents
Media
Passé
CD / DVD
External USB Hard Drives
RAID
Software
System Backups
Data Backups
Image File Formats
Bit Rot
Miscellaneous
Fire Safes
Resources
Feedback
Glossary
I’ve seen so many forum postings on image file backup and
archiving schemes that I thought it might be a good time to jot down some
suggestions based on my experience with trying to maintain backups of (at this
point) over 9 years of digital images. Most of this applies to non-image files
as well. I’m currently set up on a PC running Vista but most of this should
apply to other operating systems.
Tape and removable hard disks (such as the Iomega Jaz
disks) had their day (and their problems), but have now been replaced by newer
technology.
- CDs (DVDs are also included in this category) are a cheap
way to back up files.
- CD drives with the ability to read CDs are ubiquitous.
Only the oldest of machines would be lacking a CD drive. Drives with the
ability to read DVDs have been standard fare on machines purchased within
the last four years. The drives on older machines may not have the ability
to write to CDs or DVDs though. External drives to read and write to CDs
and DVDs are readily available at low cost.
- Always burn more than one copy and keep one copy off site
in case of fire.
- They are only “mostly” reliable. There have been numerous
reports of “bit rot” after as little as 2 years of sitting on the shelf.
What makes “bit rot” particularly insidious is that there is no way to
know when one of the files on the disk has been ruined by “bit rot” until
it’s too late.
- They are write-once/read-many devices. This has the
advantage that you know the contents haven’t been modified since you
burned the disk. A problem that I ran into with this is that I spent some
time updating the keywords and description in my image files and had to
re-burn copies of my CDs, and then I either had to get rid of my previous
CDs or somehow keep track of which one contained the updated images.
- There are re-writable CDs but my early experiences with
them showed them to be unreliable. They (and the drives) may have improved
but my distrust lingers. I do use them for temporary work such as creating
a DVD slide show so that I can review it and make changes prior to the
final burn on to standard DVDs.
- Because of their “limited” capacity (funny, this used to
be considered a lot of capacity) CDs are not usually suitable for system
backups. For data backups you may find that you have to pre-group your
data into directories that will fit on a CD (or DVD for that matter). Note
that there are some backup programs with the ability to span multiple
disks.
- Keeping track of what files are on which disks is
problematic, especially when you may have different versions of the same
file on different disks.
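The pre-grouping step mentioned above can be roughed out in a few lines of Python. This is only a sketch under stated assumptions: the 700 MB capacity is a nominal CD figure, the function names `directory_size` and `group_for_discs` are invented for this illustration, and the grouping uses a simple first-fit-decreasing heuristic rather than anything a particular backup program is known to do. The returned list doubles as a basic record of which directories went on which disc.

```python
import os

# Nominal 700 MB CD capacity; an assumption, adjust for your own media.
CD_CAPACITY = 700 * 1024 * 1024


def directory_size(path):
    """Return the total size in bytes of all files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total


def group_for_discs(sizes, capacity=CD_CAPACITY):
    """Assign directories to discs using first-fit decreasing.

    sizes maps a directory name to its size in bytes. Returns a list of
    discs, each a list of directory names. Raises ValueError if a single
    directory is too large to fit on one disc.
    """
    discs = []  # each entry is [remaining_bytes, [directory names]]
    largest_first = sorted(sizes.items(), key=lambda item: item[1], reverse=True)
    for name, size in largest_first:
        if size > capacity:
            raise ValueError(f"{name} is larger than one disc")
        for disc in discs:
            if disc[0] >= size:
                disc[0] -= size
                disc[1].append(name)
                break
        else:
            # No existing disc has room, so start a new one.
            discs.append([capacity - size, [name]])
    return [names for _remaining, names in discs]
```

A typical use would be to call `directory_size` on each candidate directory, pass the resulting name-to-size mapping to `group_for_discs`, and print or save the result alongside the discs. Real discs lose some capacity to file system overhead, so leaving a margin below the nominal size is prudent.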
(This also applies to FireWire drives. See ‘Feedback’.)
- The price per gigabyte of external USB hard drives makes them ideal
choices for backing up.
- Make sure to have more than one and keep one off site in
case of fire. Hard drives are not so much prone to “bit rot”, but to
catastrophic failure, hence the need for multiple backups.
- All new USB drives use USB version 2.0. If you have an
older computer it probably only has USB 1.1. The drive will still work, but
transfers will be far slower: USB 1.1 tops out at 12 Mbit/s, against a
nominal 480 Mbit/s for USB 2.0.
- They are also available with large capacities. I have
drives that will hold my entire 9 year collection with room to spare. This
is ideal for when I want to use a program that will search entire
directory trees for images containing a specified keyword and/or
description.
- RAID drives configured for redundancy are a good choice
for your desktop computer but should not be considered a substitute for
backups.
- RAID does not give you the ability to have off site backups.
- RAID provides a sort of continuous backup. This is not my
ideal choice because it provides no safety net for an “oops”, such as
mistyping an edit on one of your files. For that type of file recovery you
want a backup system that runs on command (or schedule) as opposed to
continuously. Of course, it’s then up to you to remember to run the backup
program.
There are two basic categories of on-command backup
software: system, and data.
- System backup software usually comes bundled with your
operating system.
- It is designed to back up entire systems, including program and system
files that are typically inaccessible to other software.
- The file formats used are proprietary and the contents
apply only to the machine on which the backup is made or an exact hardware
duplicate. You may not, for instance, do a system backup on a Windows XP
machine and expect to restore it to a Windows Vista machine.
- System backup is a very important step and provides a way
to restore your machine if the hard drive needs to be replaced or
re-formatted. Usually your system files do not change as often as your
data so a system backup need not be run as often as data backups.
- Full system backups may take several hours depending on
the size of your hard drives and how full they are.
- System backup software often comes with the ability to do
incremental backups. Incremental backups copy only those files that have
changed since the last run of the system backup software; therefore
they are much quicker than a full system backup. A major issue to watch
out for when using incremental backups is that they don’t track files that
have been deleted. When you restore from a system backup and subsequent
incremental backups, you may find files that you had deleted magically
reappearing.
- Data backup software does not typically come bundled with
the operating system. It must be purchased or downloaded separately.
- Data backups do not include directories that contain
system or program files.
- The backed up files and directories should be copied in a
non-proprietary format. There are software products that use
proprietary formats, but I would avoid these because they may be useless if
you want to restore to a different machine. A non-proprietary approach will
copy directories as directories and files as files such that you can use
standard programs to list, view, and access them. One format that I
include in the non-proprietary category is the ZIP file. ZIP has become so
ubiquitous as to qualify for non-proprietary status.
- Some data backup software has the ability to do
incremental backups. Some include the ability to track deleted files so
that they don’t reappear when you restore them.
- Because data changes much more often than system files,
data backups should be done much more frequently.
- A quick-and-dirty form of data backup is to use Windows
Explorer to drag-and-drop data directories from your local hard drive to
your external media.
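The non-proprietary, incremental style of data backup described in the list above can be sketched in Python as follows. This is a minimal illustration, not a description of how any particular product works: the function name `mirror` and the `delete_removed` flag are invented here, files are copied as ordinary files and directories, and a file is treated as changed when its contents differ from the backup copy. With `delete_removed` left off, files deleted from the source stay in the backup, which is the "reappearing files" behavior noted earlier; turning it on mirrors deletions but also removes the safety net for an accidental delete.

```python
import filecmp
import os
import shutil


def mirror(source, backup, delete_removed=False):
    """Copy new or changed files from source to backup as plain files.

    Returns (copied, deleted): lists of paths relative to the two roots.
    """
    copied, deleted = [], []
    for root, _dirs, files in os.walk(source):
        rel_root = os.path.relpath(root, source)
        target_root = os.path.normpath(os.path.join(backup, rel_root))
        os.makedirs(target_root, exist_ok=True)
        for name in files:
            src = os.path.join(root, name)
            dst = os.path.join(target_root, name)
            # shallow=False compares file contents instead of trusting
            # size and modification time alone.
            if not os.path.exists(dst) or not filecmp.cmp(src, dst, shallow=False):
                shutil.copy2(src, dst)
                copied.append(os.path.normpath(os.path.join(rel_root, name)))
    if delete_removed:
        # Remove backup files whose source counterpart no longer exists.
        for root, _dirs, files in os.walk(backup):
            rel_root = os.path.relpath(root, backup)
            for name in files:
                rel = os.path.normpath(os.path.join(rel_root, name))
                if not os.path.exists(os.path.join(source, rel)):
                    os.remove(os.path.join(root, name))
                    deleted.append(rel)
    return copied, deleted
```

Comparing full contents is slow on a large image collection; a faster variant could compare size and modification time first, at the cost of missing silent corruption. Empty directories left behind after deletions are not cleaned up in this sketch.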
If you back up copies of your image files in either TIF
(TIFF) or JPG (JPEG) format, you can rest assured that your files will still be
readable for many years to come. For any other image file format, this may not
be the case. This is especially true for RAW formats and formats specific to
your editing programs, such as Photoshop and Paint Shop Pro. These are
proprietary and are subject to obsolescence.
When I shoot using the RAW format (which is most of the
time) I always run a batch process to generate JPG copies. I prefer JPG over
TIF because JPG file sizes are so much smaller than TIF file sizes. JPGs have gotten
a bad rap because they use a “lossy” compression technique. I have found that
the losses are imperceptible as long as you use the highest quality setting
that your editing software allows (that’s 12 for those of you with Photoshop).
I always keep the original RAW file around in case I want to start over with
the editing process; the RAW files contain more data. Likewise, when I create a
Photoshop file that is multi-layered, I always save a non-layered version in
JPG format.
Bit rot is the unintentional modification of a single bit of
data, usually as a result of passing time or environmental conditions.
None of the backup software that I’ve come across has support
for validating the integrity of the backup files after time has passed. Some
programs allow you to verify the copies at the time of backup, but that’s not
the same thing. Some devices store the data in chunks, each with a check value
derived from the data (such as a CRC, or Cyclic Redundancy Check, often paired
with an error-correcting code) that allows the detection and possible
correction of minor instances of bit rot. Unfortunately, the devices do not
warn you that bit rot has been detected, and as the damage increases they will
eventually return corrupted data, again without warning.
I would like to see a backup program that will store some
sort of check number for each file on the backup and allow you to go back, say,
6 months later and validate the backup files against the check number, and
tell you about any corruption it found. I usually have more than one backup
copy so if I’m made aware of the corruption I can retrieve the data from the
other copy. This would also let me know when it’s time to throw the device/media
away and not rely on it.
There are some utilities that will compute and verify files
against a check value (MD5 is one) but one that is integrated into the backup
software and is aware of incremental backups seems like an obvious and
necessary enhancement. If anyone knows of such a backup program, please let me
know.
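The check-number idea described above can be sketched with Python's standard library. This is a minimal illustration under stated assumptions: it swaps in SHA-256 where the text mentions MD5, the function names `file_digest`, `build_manifest`, and `verify` are invented for this example, and it is not aware of incremental backups. The manifest would be built right after a backup, stored somewhere other than the backup media (for example as a JSON file), and checked again months later.

```python
import hashlib
import os


def file_digest(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(backup_root):
    """Map each file's path (relative to backup_root) to its digest."""
    manifest = {}
    for root, _dirs, files in os.walk(backup_root):
        for name in files:
            full = os.path.join(root, name)
            manifest[os.path.relpath(full, backup_root)] = file_digest(full)
    return manifest


def verify(backup_root, manifest):
    """Return (corrupted, missing) relative paths, judged against manifest."""
    corrupted, missing = [], []
    for rel, expected in manifest.items():
        full = os.path.join(backup_root, rel)
        if not os.path.exists(full):
            missing.append(rel)
        elif file_digest(full) != expected:
            corrupted.append(rel)
    return corrupted, missing
```

Any path reported as corrupted can then be restored from a second backup copy, and a disk or drive that starts reporting corruption is a candidate for retirement. Note that a mismatch only shows that the file changed since the manifest was built; it cannot tell bit rot apart from a deliberate edit.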
Some people choose to use fire safes instead of keeping a
backup off site. Be aware that most fire safes are rated to protect only paper,
not computer media. If you intend to go this route, look for a safe that is
“media rated”. These are much harder to come by and you won’t find them in your
typical office supply store.
One data backup program that I liked is called “Karen’s
Replicator”. It is available for free at:
http://www.karenware.com/powertools/ptreplicator.asp
The Replicator has an option for mirroring file deletions as
well as additions and modifications. The Replicator only copies the files that
have been changed, making for a fairly quick backup.
David:
The writer mentions external USB hard drives but did not
mention Fire Wire drives. We have two external FW drives and they fly.
Depending on your current computer and future computers I would suggest a drive
that works with both FW 400 & FW 800. There are drives that have ports for
USB, FW 400 & FW 800.
Karl:
Since you don't mention Win or Mac, I'm assuming Winblows.
I've been using a tool called SyncBack for quite a while and love it. You can
grab a free version at
http://www.2brightsparks.com/freeware/freeware-hub.html
Each Profile (as they call it) can be set as a full or partial BU, you can set
a date range, include/exclude folders and files, use FTP, and lots more. You
can even add a Group profile that runs 2 or more other
profiles. A nice easy way to do various folders.
Proprietary – Unique to a specific manufacturer and
may also be unique to a specific product and possibly the version of the
product. The specifications are subject to change based on the whims of the
manufacturer. Proprietary data formats, typically, cannot be read by software
from other manufacturers.
Bit Rot – The unintentional modification of a single
bit of data, usually as a result of passing time or environmental conditions.
Copyright 2008
Carey Brown