Bandwidth: Backup Design Problem #2

Getting enough bandwidth is the second challenge of designing and maintaining a traditional backup system. (Tape is the first challenge, and it is solved by simply not using tape for operational backups.) With tape out of the picture, the problem becomes getting enough bandwidth to get the job done.


This is a major problem with any backup software product that does occasional full backups, which describes most of the products running in today's datacenters. Products that do full-file incremental backups have this problem too, although to a lesser degree. (A full-file incremental backup is one that backs up an entire file when even one byte of it has changed.)

This is such a problem that many people would agree that nothing tests a network like backups do. It is one of the main reasons people run backups at night.

This problem has been around for a long time.  I remember one time I was testing backups over the weekend, and accidentally set things up for backups to kick off at 10 AM the next day — which happened to be Monday. The network came to a screeching halt that day until we figured out what was happening and shut the backups off.

Backup system admins spend a lot of time scheduling their backups to even out this load. Some perform full backups only on the weekend, but this really limits the overall capacity of the system. I prefer to perform 1/7th of the full backups each night if I'm doing weekly full backups, or 1/28th of the full backups each night if I'm doing monthly full backups.
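To make that 1/7th split concrete, here is a minimal Python sketch of one way to assign each client to a night by hashing its name. The client names and the seven-night split are invented for illustration, and note that a scheme like this balances the number of clients per night rather than the amount of data, which is exactly why the rebalancing described next is still needed.

# Hypothetical sketch: spread weekly full backups across seven nights by
# hashing each client name to a night (0 = Monday ... 6 = Sunday).
import hashlib

def full_backup_night(client: str, nights: int = 7) -> int:
    """Deterministically assign a client to one night out of `nights`."""
    digest = hashlib.sha256(client.encode("utf-8")).hexdigest()
    return int(digest, 16) % nights

# Example clients (made up); each gets the same night every week.
for client in ["web01", "web02", "db01", "mail01", "file01"]:
    print(f"{client}: full backup on night {full_backup_night(client)}")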

While this increases your system capacity, it also requires constant adjustment to even the full backups out, as the size of systems changes over time. And once you’ve divided the full backups by 28 and spread them out across the month, you’ve created a barrier that you will hit at some point. What do you do when you’re doing as many full backups each night as you can? Buy more bandwidth, of course.

How has this not been fixed?


Luckily, this problem has been fixed. Products and services that have switched to block-level incremental-forever backups need significantly less bandwidth than those that have not. A typical block-level incremental uses less than one-tenth the bandwidth of a typical incremental backup, and less than one-hundredth the bandwidth of a typical full backup.
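To show what "block-level incremental" means in practice, here is a minimal sketch, not any particular product's implementation, that checksums fixed-size blocks and reports only the blocks that changed since the previous backup. The 4 MiB block size and the in-memory list of previous checksums are assumptions for illustration.

import hashlib

BLOCK_SIZE = 4 * 1024 * 1024  # assumed 4 MiB blocks

def block_checksums(path: str) -> list[str]:
    """Checksum every fixed-size block of a file."""
    sums = []
    with open(path, "rb") as f:
        while True:
            block = f.read(BLOCK_SIZE)
            if not block:
                break
            sums.append(hashlib.sha256(block).hexdigest())
    return sums

def changed_blocks(path: str, previous: list[str]) -> list[int]:
    """Return the indexes of blocks that differ from the previous backup."""
    current = block_checksums(path)
    return [i for i, checksum in enumerate(current)
            if i >= len(previous) or checksum != previous[i]]

Only the blocks whose indexes come back from changed_blocks() would need to cross the network; everything else is already on the backup server.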

Another design element of modern backup products and services is that they use global deduplication, which only backs up blocks that have changed and haven’t been seen on any other system. If a given file is present on multiple systems, it only needs to be backed up from one of them. This significantly lowers the amount of bandwidth needed to perform a backup.
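Global deduplication can be sketched the same way. The shared set of fingerprints and the send() callback below are stand-ins for however a real product tracks which blocks it has already seen across all protected systems.

import hashlib

global_index: set[str] = set()  # fingerprints of every block already stored anywhere

def backup_block(block: bytes, send) -> bool:
    """Transfer the block only if no system has backed it up before."""
    fingerprint = hashlib.sha256(block).hexdigest()
    if fingerprint in global_index:
        return False              # duplicate: record a reference, send nothing
    global_index.add(fingerprint)
    send(block)                   # only never-before-seen blocks cross the network
    return True

If the same operating-system files sit on a thousand servers, the blocks behind them are transferred once and simply referenced everywhere else.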

Making the impossible possible



Lowering the bandwidth requirement creates two previously unheard-of possibilities: Internet-based backups and round-the-clock backups. The network impact of globally deduplicated, block-level incremental backups is small enough that, in many environments, the data can be transferred over the Internet, and backups can often run throughout the day. And all of this can be done without the scheduling hassle described above.

The better a product identifies blocks that have changed, and the more granular and global its deduplication, the more possible these things become. One of the best ways to determine how efficiently a backup system uses bandwidth is to ask the vendor how much storage is needed to hold 90-180 days of backups. There is a direct relationship between that number and the amount of bandwidth you're going to need.
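As a back-of-envelope illustration of that relationship, every number below is invented: if a vendor quotes the deduplicated storage needed for 90 days of backups, the daily growth of that pool approximates the unique data that must cross the network each day.

first_full_tb = 10.0    # assumed size of the one-time initial full
retained_tb = 28.0      # assumed storage quoted for 90 days of backups
retention_days = 90

daily_unique_tb = (retained_tb - first_full_tb) / retention_days
daily_unique_gb = daily_unique_tb * 1000  # 0.2 TB/day -> 200 GB/day

# Sustained throughput needed for an 8-hour window vs. round-the-clock:
for window_hours in (8, 24):
    mbps = daily_unique_gb * 8000 / (window_hours * 3600)  # GB -> megabits
    print(f"{window_hours}h window: ~{daily_unique_gb:.0f} GB/day, ~{mbps:.0f} Mbit/s")

The smaller the storage number a vendor quotes for the same retention period, the less unique data is being created each day, and the less bandwidth the backups will consume.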

Written by W. Curtis Preston (@wcpreston), four-time O'Reilly author, and host of The Backup Wrap-up podcast. I am now the Technology Evangelist at Sullivan Strickler, which helps companies manage their legacy data.
