NDMP is a great way to get faster backups for your filers by connecting tape drives directly to them. But combining Synthetic Full Backups and NFS/CIFS access to your filers might be a lot better.
NDMP (the Network Data Management Protocol) is designed to allow you to properly back up your filers. The main feature of NDMP is that it allows you to connect your tape drives directly to your filer and back it up without having to send your backup data across the network. If you're using NDMP to back up across the LAN (AKA three-way NDMP), you're missing the main benefit of NDMP; across the LAN, backing up via NDMP isn't usually any faster than backing up via NFS or CIFS.
NDMP also allows the filer vendor to resolve the ACL/permissions issues surrounding the fact that filers support multiple operating systems. Without NDMP, you need to back up CIFS data via CIFS and NFS data via NFS. NDMP allows the filer vendor to write a single backup format that can back up both types of data.
NDMP also has limitations, though. The main one is that it does not support cross-filer recovery between vendors. You cannot back up a NetApp and restore it to a BlueArc filer, even though both support NDMP. This is because NDMP is a communication protocol only; it does not specify the backup format. Each filer vendor therefore creates its own format, which makes NDMP restores between filer vendors impossible.
NDMP backups (like any other traditional backup method) also place a load on the filer and can take a really long time. Finally, your backup product will probably charge you to use NDMP.
Synthetic backups, if your backup product supports them, are a great way to get a good backup of your filer without these limitations. Synthetic backups use the most recent full and incremental backups to create another full backup without having to transfer any data from the client; all the work is done on the backup server. Some products can also create synthetic differentials (AKA cumulative incremental or level 1) backups that contain all files that have changed since the last full backup.
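To make the merge idea concrete, here is a minimal sketch of what a synthetic full and a synthetic differential look like at the catalog level. It is not how any particular product does it. The catalog model (a dictionary of path to version, with None marking a deleted file) and the function names are assumptions made up for illustration; real products keep their own catalog formats and copy the actual file data between tapes or disk.

```python
from typing import Dict, List, Optional

# Hypothetical catalog model: path -> version label captured by that backup.
# A value of None in an incremental marks a file deleted since the prior backup.
Catalog = Dict[str, Optional[str]]


def synthesize_full(last_full: Catalog, incrementals: List[Catalog]) -> Catalog:
    """Merge the last full with later incrementals (oldest first) into a new full.

    No data is read from the filer; only existing backups are consulted.
    """
    merged = dict(last_full)
    for inc in incrementals:
        for path, version in inc.items():
            if version is None:
                merged.pop(path, None)   # file was deleted, drop it from the new full
            else:
                merged[path] = version   # newer version replaces the older one
    return merged


def synthesize_differential(incrementals: List[Catalog]) -> Catalog:
    """Collapse incrementals (oldest first) into one 'changed since last full' set."""
    diff: Catalog = {}
    for inc in incrementals:
        diff.update(inc)                 # later incrementals win
    return diff


full = {"/vol/home/a.txt": "v1", "/vol/home/b.txt": "v1", "/vol/home/c.txt": "v1"}
inc1 = {"/vol/home/a.txt": "v2", "/vol/home/d.txt": "v1"}
inc2 = {"/vol/home/a.txt": "v3", "/vol/home/b.txt": None}

new_full = synthesize_full(full, [inc1, inc2])
print(sorted(new_full.items()))
```

The point to notice is that both functions take only existing backups as input. Nothing in them touches the filer, which is why the filer sees no load while the synthetic full is built.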
First, the caveats.
- Your product must support this feature. NetBackup, NetWorker, & CommVault support this feature. TSM's backup sets are synthetic fulls, but they are designed to be used outside of TSM and can't be used for regular TSM restores.
- You will have to back up your data via the correct protocol. Unix-style data can be backed up via NFS, and Windows-style data can be backed up via CIFS.
- NDMP users will only see a benefit for typical user data. This won't help much when backing up database data. The idea is that you only have to perform an incremental backup; if that backup is the equivalent of a full backup, this idea won't help.
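If you want a rough feel for whether the last caveat applies to a given volume, something like the following sketch can estimate how much of a mounted tree has changed since a given time. It assumes incrementals are selected by modification time, which is a simplification; backup products may use other signals, so treat the number as a hint only. The function name and the example mount point are hypothetical.

```python
import os


def changed_fraction(root: str, since_epoch: float) -> float:
    """Return the fraction of bytes under root modified after since_epoch.

    A value near 1.0 suggests an incremental would be about as large as a full.
    """
    total = 0
    changed = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                st = os.stat(os.path.join(dirpath, name))
            except OSError:
                continue  # file vanished or is unreadable; skip it
            total += st.st_size
            if st.st_mtime > since_epoch:
                changed += st.st_size
    return changed / total if total else 0.0


# Example usage (hypothetical mount point):
# import time
# print(changed_fraction("/mnt/filer_home", time.time() - 24 * 3600))
```

Home directories usually come back with a small fraction. A volume of database files that are rewritten every day will come back close to 1.0, which is the case where this approach won't help.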
So, here's the idea.
- Use NFS or CIFS to mount your data to your backup server. (Please don't mount it to a client. That sends the data across the network twice – from the filer to the client and from the client to the server.)
- Use your backup server to create a full backup of that network drive.
- Time passes.
- Perform an incremental backup of the filer via the same NFS/CIFS path.
- Time passes.
- Perform another incremental backup of the filer via the same NFS/CIFS path.
- Time passes.
- When you think you need another full backup, perform a synthetic full backup. It will merge the latest full backup and the incrementals to create a new full backup that looks just like a "real" full backup. The difference is that all the data moves directly from one tape to another, without having to bother the filer.
This way, you get all the benefits of occasional full backups (mainly faster restores) without having to actually make them.
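The "faster restores" part is easy to see if you count how many backups a full restore has to read. Here is a small sketch under simple assumptions: the history is a chronological list of labels, and a restore needs the most recent full (real or synthetic) plus every incremental after it. The labels and the function name are made up for illustration.

```python
from typing import List

FULL_TYPES = ("full", "synthetic_full")


def restore_chain(history: List[str]) -> List[str]:
    """Return the backups needed to restore to the most recent point in time."""
    # Walk backwards to find the latest full of either kind.
    for i in range(len(history) - 1, -1, -1):
        if history[i] in FULL_TYPES:
            return history[i:]
    raise ValueError("no full backup in history")


without_synthetic = ["full", "incr", "incr", "incr", "incr"]
with_synthetic = ["full", "incr", "incr", "incr", "synthetic_full", "incr"]

print(len(restore_chain(without_synthetic)))  # 5 backups to read
print(len(restore_chain(with_synthetic)))     # 2 backups to read
```

In both histories the filer was read in full only once. The synthetic full shortens the restore chain without adding another full read of the filer.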
----- Signature and Disclaimer -----
Written by W. Curtis Preston (@wcpreston). For those of you unfamiliar with my work, I've specialized in backup & recovery since 1993. I've written the O'Reilly books on backup and have worked with a number of native and commercial tools. I am now Chief Technical Evangelist at Druva, the leading provider of cloud-based data protection and data management tools for endpoints, infrastructure, and cloud applications. These posts reflect my own opinion and are not necessarily the opinion of my employer.