Hands down, one of the most catastrophic events any company can face is the loss of data. No data, no work. Backups are one of those tedious chores we know we should do, but really, how many of us can say we back up properly? Did you know there are different types of backups? Should you run a backup or an image? What about full, differential, or incremental backups and images? If none of this makes sense, keep reading. This post will spell it out so you can make a more informed decision on how best to protect your data.
Broadly speaking, there are two primary methods for “backing up” your data: backups and images. Both accomplish the same thing, which is to create a second copy of your data in case the primary copy becomes inaccessible. Images, however, are a newer, more robust technology.
Backups are an older technology. They primarily focus on protecting user-created data such as spreadsheets, Word documents, PDFs, pictures, or any other files typically created or modified by a user. Images focus on protecting the system as a whole. An image takes a snapshot of your hard drive as it exists at that point in time. Images capture everything a backup would protect, as well as the operating system, installed applications, and yes, even any viruses on the drive: everything.
In my opinion, images are always better than a traditional backup. Let’s pretend your computer crashes. You’ve already replaced the hard drive or simply bought a new computer, and now you have to get your computer running again. What do you do? I’ll break down the steps and estimate the time each one takes, using both a traditional backup and an image.
Backup
- Format drive – 5 minutes
- Install the operating system & drivers – 1–2.5 hours
- Download and install OS updates – 2 hours
- Install your applications (and update them) – 2 hours
- Recover your data from a backup – 1–2 hours
Image
- Boot to the image recovery CD – 1 minute
- Reimage your hard drive – 3–4 hours
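If you want to sanity-check those numbers, here is a quick sketch in Python that adds up the estimates from the two lists above. The figures are only my rough estimates from this post, not measurements, and the names (`backup_steps`, `image_steps`, `total_hours`) are made up for this example.

```python
# Step estimates in hours as (low, high), copied from the two lists above.
# These are rough estimates, not measured values.
backup_steps = {
    "format drive": (5 / 60, 5 / 60),
    "install OS and drivers": (1.0, 2.5),
    "download and install OS updates": (2.0, 2.0),
    "install and update applications": (2.0, 2.0),
    "recover data from backup": (1.0, 2.0),
}
image_steps = {
    "boot to image recovery CD": (1 / 60, 1 / 60),
    "reimage hard drive": (3.0, 4.0),
}


def total_hours(steps):
    """Return the (low, high) total in hours for a dict of step estimates."""
    low = sum(lo for lo, _ in steps.values())
    high = sum(hi for _, hi in steps.values())
    return low, high


if __name__ == "__main__":
    for name, steps in (("Backup", backup_steps), ("Image", image_steps)):
        low, high = total_hours(steps)
        print(f"{name}: {low:.1f} to {high:.1f} hours")
```

With these estimates the backup route adds up to roughly 6 to 8.5 hours, and the image route to roughly 3 to 4 hours.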
As you can see, recovering from a backup is much more complex and takes all day. Your data is recovered and your machine is back in operation, but you spend a lot of time (and money) just getting functional. With an image, you cut your recovery time roughly in half (possibly more in many cases), and the process is much more streamlined. Recovering with an image is a lot less involved, which alone is a great reason to use imaging over backups.
Often you don’t have to replace an entire drive and its contents. Rather, you have one file that’s been accidentally deleted or corrupted. With both methodologies it’s fast and easy to simply open the backup or image, and pull out the one file you need.
Now let’s look at the differences between full, differential, and incremental backups and images. Many organizations have so much data to protect that running a complete backup or image, and then verifying it, would take many hours every day. So there are some options and strategies to help with this.
Side note: For ease, I’m going to use the word “archive” to mean both backup and image. The concepts apply to both.
Full Archive
A full archive is just that — it archives everything. This is the most complete method, but also takes the longest to perform and validate.
Typically, organizations run full archives over a weekend, so they can start each week with a fresh, complete archive of everything as it existed at the end of the prior week. A full archive might also be run just before and/or after a major change to the computer, such as a large number of OS updates.
In reality, only a small percentage of data changes from day to day. Most data on a computer doesn’t change, or changes only occasionally over a long period of time. Focusing your daily archives on only what has changed, and leaving the “stagnant” data to be captured by the full archive, lets you archive very efficiently.
Differential Archive
A differential archive captures only the data that has changed since the last full archive. If you run a full archive on Sunday, then Monday’s differential archive will capture only what changed on Monday. Tuesday’s differential captures what changed on Monday and Tuesday. Differential archives are much faster to run and require less storage space than full archives, but the longer you go from the last full archive, the larger each differential archive becomes.
When you recover from a serious system failure (take our timeline above), you first recover the full archive, then recover the most recent differential archive. You have two sets of data to deal with. This adds a layer of complexity and can be perceived as taking longer than recovering from a full archive alone, but the trade-off is the time you gain by making the faster differential and validating a smaller amount of data.
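To make that concrete, here is a small Python sketch of a week of differentials after a Sunday full archive. The file names and the daily changes are hypothetical, and the functions are just a toy model of the idea, not how any particular backup product works.

```python
# Hypothetical week: the files that changed on each day after Sunday's full archive.
changes_by_day = {
    "Mon": {"budget.xlsx"},
    "Tue": {"report.docx"},
    "Wed": {"budget.xlsx", "photo.jpg"},
}


def differential_contents(changes):
    """Map each day to the set of files its differential archive would hold."""
    changed_since_full = set()
    contents = {}
    for day, changed in changes.items():
        # A differential holds everything changed since the last FULL archive,
        # so the set keeps growing until the next full archive is run.
        changed_since_full |= changed
        contents[day] = set(changed_since_full)
    return contents


def restore_plan(days_archived):
    """Archives needed to recover: the full one plus the latest differential."""
    if not days_archived:
        return ["full"]
    return ["full", f"differential-{days_archived[-1]}"]


if __name__ == "__main__":
    for day, files in differential_contents(changes_by_day).items():
        print(day, sorted(files))
    print(restore_plan(list(changes_by_day)))
```

Notice that each day’s differential is at least as large as the one before it, while the restore plan never needs more than two archives.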
Incremental Archive
Incremental archives are similar to differentials, but they archive only what has changed since the last archive of any type. If you run a full archive on Sunday, then Monday’s incremental captures only what changed on Monday. Tuesday’s incremental archive captures only what changed since Monday’s, and so on. As you can see, incremental archives remain very small and fast. However, you must now keep track of every archive file since (and including) the last full archive in order to fully recover your data.
Incremental archives allow you to be more flexible in your protection strategy, save a lot of time and are easy to validate. However, the trade-off is the recovery process is a bit more complicated. But with a good system in place, this shouldn’t be too difficult a process.
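Here is a rough Python sketch of why every link in an incremental chain matters. The archives, file names, and version labels are all hypothetical, and `restore` is a toy stand-in for a real recovery tool.

```python
# Hypothetical archives: each maps a file name to the version it captured.
full_sun = {"budget.xlsx": "v1", "report.docx": "v1"}
inc_mon = {"budget.xlsx": "v2"}   # only what changed since Sunday's full
inc_tue = {"report.docx": "v2"}   # only what changed since Monday's incremental
inc_wed = {"budget.xlsx": "v3"}   # only what changed since Tuesday's incremental


def restore(chain):
    """Apply archives oldest first; later versions overwrite earlier ones."""
    state = {}
    for archive in chain:
        state.update(archive)
    return state


if __name__ == "__main__":
    # The full chain brings every file up to its latest archived version.
    print(restore([full_sun, inc_mon, inc_tue, inc_wed]))
    # Skip Tuesday's incremental and report.docx quietly stays at its old version.
    print(restore([full_sun, inc_mon, inc_wed]))
```

Losing one incremental doesn’t make the restore fail outright in this model. It just leaves you with stale files, which is arguably worse because you may not notice.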
When to use which
Knowing when to use a full, differential, or incremental archive can be tricky and requires a good analysis of your organization and its processes. It’s important to focus on the time it takes to run the archives and how quickly you can verify each archive was successful. Most archive systems should create a log file that shows the details. In addition, I recommend testing the archive by recovering a significant number of files and comparing them to the originals to ensure they are correct and can be recovered.
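One way to do that comparison is with checksums. Below is a minimal Python sketch, assuming you have restored a sample of files into a scratch folder that mirrors the layout of the originals. The function names and folder arguments are my own for this example. Keep in mind that any file edited after the archive was taken will legitimately show up as different.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_restored(original_dir, restored_dir):
    """Return (relative_path, problem) pairs for missing or differing files."""
    original_dir = Path(original_dir)
    restored_dir = Path(restored_dir)
    problems = []
    for original in sorted(original_dir.rglob("*")):
        if not original.is_file():
            continue
        relative = original.relative_to(original_dir)
        restored = restored_dir / relative
        if not restored.is_file():
            problems.append((relative.as_posix(), "missing"))
        elif sha256_of(original) != sha256_of(restored):
            problems.append((relative.as_posix(), "differs"))
    return problems
```

An empty list from `compare_restored` means every sampled file came back byte-for-byte identical.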
Most of my customers are able to simply run full archives every night. This is the simplest procedure, as we only have to worry about one set of data. However, some places need a bit more flexibility.
In general, when I need to create a more complex archiving system than just nightly full archives, I start with the following template in mind, and modify it as needed.
- Sunday evening – Full archive; replaces the last full archive.
- Daily (Mon – Sat) – Differential or incremental archive; each replaces the archive for the same day last week.
- Monthly – Full archive; replaces the prior month’s full archive.
- Yearly – Full archive; replaces all archives for the year. At the end of the year, we only have this one archive to deal with, save off site, etc.
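The template above can be sketched as a small Python function that decides which archive to run on a given date. The template doesn’t say exactly when the monthly and yearly archives run, so this sketch assumes the last day of the month and December 31. Those dates, the slot names, and the function `archive_for` are assumptions for illustration only.

```python
import calendar
from datetime import date

# Slot names for the daily archives; each slot is overwritten a week later.
DAY_SLOTS = ["mon", "tue", "wed", "thu", "fri", "sat", "sun"]


def archive_for(day, daily_kind="differential"):
    """Return (kind, slot) for the archive to run on the given date.

    Assumptions: yearly full on December 31, monthly full on the last
    day of each month, weekly full on Sunday, daily archive otherwise.
    """
    last_day_of_month = calendar.monthrange(day.year, day.month)[1]
    if day.month == 12 and day.day == 31:
        return ("full", "yearly")
    if day.day == last_day_of_month:
        return ("full", "monthly")
    if day.weekday() == 6:  # Monday is 0, Sunday is 6
        return ("full", "weekly")
    return (daily_kind, DAY_SLOTS[day.weekday()])


if __name__ == "__main__":
    for d in (date(2024, 3, 3), date(2024, 3, 4), date(2024, 2, 29), date(2024, 12, 31)):
        print(d.isoformat(), archive_for(d))
```

Pass `daily_kind="incremental"` if you prefer incrementals for the Monday through Saturday slots.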
I no longer recommend that anyone use traditional backups. I think images offer greater benefits for protecting your data, especially the time saved when recovering from a catastrophic failure.