Enterprise Backup and Beyond: State of the Art Backup Software

From Backup Central

  1. Redefine backup to include dedupe, CDP, near-CDP (new in this version)
  2. Common tasks for all backup software
    1. Prepare data to be backed up
    2. Read data to be backed up
    3. Send data across a network
    4. Accept data from network
    5. Store data on backup device
    6. Increase backup speed
    7. Decrease backup load (duration, effect on servers/networks)
    8. Record where backup is
    9. Create additional copies
    10. Protect the backup history
    11. Protect backup against theft in transit
    12. Send backup to another location
    13. Collocating data for potential restore
    14. Expire old backups
    15. Make old backup media available
    16. Migrate data between storage devices
    17. Request data for restore
    18. Prepare media
    19. Read data from backup device
    20. Transfer across network
    21. Write to disk
  3. Traditional backup software
    1. Prepare data to be backed up
      1. Typical filesystem
      2. Millions of files (image backups)
      3. User scripts
      4. Snapshots
      5. BCVs
      6. Applications
    2. Read data to be backed up
      1. First full backup
      2. File-level Incremental backups
      3. File-level Cumulative Incremental/Differential backups
      4. Block-level Incremental backups
      5. Applications
    3. Send data across a network
      1. WAN
      2. LAN
      3. iSCSI
      4. Fibre Channel
    4. Accept data from network
      1. WAN
      2. LAN
      3. iSCSI
      4. Fibre Channel
    5. Store data on backup device
      1. Backup formats
        1. The dump utility
        2. The tar, ditto, and cpio utilities
        3. Commercial backup formats
        4. Mainframe method (individual files & aggregates used by TSM)
      2. Traditional backup to tape
      3. Traditional backup to local filesystem
      4. Traditional backup to NAS filesystem
      5. Traditional backup to VTL
    6. Increase backup speed
      1. Multiplexing
      2. Multistreaming
      3. LAN-free data transfer
    7. Decrease backup load (duration, effect on servers/networks)
      1. Progressive incremental with no full
      2. Traditional incremental with virtual full backup
    8. Record where backup is
      1. Different kinds of databases
      2. Images, savesets and files
      3. Individual version tracking
    9. Create additional copies
      1. Why you should make copies
      2. Traditional device-to-device copy
        1. Cover disk to tape, tape to disk, small tapes to big tape, big tape to small tapes
      3. Backup mirroring (ITC)
        1. Cover mirroring to disk and tape
      4. Intelligent device
        1. Traditional tape out
        2. Third-party copy controlled by backup software (e.g. NBU NDMP Direct Tape, OpenStorage API)
    10. Protect the backup history
      1. Catalog backups
      2. Replication of catalog
    11. Send backup to another location
      1. Traditional tape and truck
        1. Encryption (overview, covered in detail elsewhere)
        2. Originals
        3. Copies
      2. IDT methods
        1. Usually not encrypted
        2. IDT cascading (no backup knowledge), usually eventually to tape
        3. IDT cascading (with backup control & knowledge), usually eventually to tape
    12. Collocating data for potential restore
      1. Grandfather/father/son with fulls/diffs or virtual fulls/diffs
      2. Progressive incremental
        1. Collocation (client, filespace, group, active collocation)
        2. Reclamation
    13. Expire old backups
      1. Expire image/saveset, all files in image expire
      2. Expire individual file entries
    14. Make old backup media available
      1. Fully expired tapes recycled into list of available tapes
      2. Special considerations when using VTLs (how space on tapes is reclaimed)
      3. Backup sent to filesystem deleted
      4. Reclamation of mostly expired tapes
    15. Migrate data between storage devices
      1. Disk to tape migration
      2. Tape to tape migration
    16. Request data for restore
      1. One request system
      2. Multi-request system (what ARCserve does)
      3. Application restore request
    17. Bring media onsite if necessary
      1. Contact offsite storage vendor
      2. Request move of tape to onsite
      3. Decide between tomorrow, rush, super rush
    18. Prepare media
      1. Restore from tape
        1. May need to recall (should've made copies)
        2. Put into tape library
        3. Software mounts, addresses & reads
      2. Restore from IDT
        1. If in onsite IDT
        2. If only in offsite IDT
      3. Restores vs backups (suspension, prioritization over)
    19. Read data from backup device(s)
      1. Legacy style of complete full restore followed by complete incremental restore
      2. Typical restore that only restores necessary files (like NBU)
      3. Simultaneous restore from multiple tapes
      4. Simultaneous restores from one tape
      5. Single restore from multiplexed image (how it is often gated by other components)
    20. Transfer across network
      1. From backup server
      2. From local server
    21. Write to disk
      1. File-level
      2. Block-level
  4. De-dupe backup software
    1. Overview
      1. Including delta diff & hashing
      2. What it's for (and not for)
      3. Multi-tier system (remote only, local recovery server)
      4. Software as service, migrate to own hardware
    2. Prepare data to be backed up
      1. Identify changed files via mtime, ctime, archive bit
      2. Identifying changed data in a database file
      3. Identify changed blocks in a changed file
        1. Delta differential method
        2. Hashing method
      4. Identify unique, changed blocks (hash dedupe)
    3. Read data to be backed up
      1. Read new, unique blocks
    4. Send data across a network
      1. All send via IP
    5. Accept data from network
      1. All receive via IP
    6. Store data on backup device
      1. De-dupe software requires raw disk
    7. Increase backup speed
      1. Main increase comes from reducing data transferred
      2. Biggest challenge is that it's very compute intensive
    8. Decrease backup load (duration, effect on servers/networks)
      1. Main decrease comes from reducing data transferred
      2. Biggest challenge is that it's very compute intensive
    9. Record where backup is
      1. Hash table for hash products
      2. Not sure how delta products store their data
    10. Create additional copies
      1. Done via replication into another unit
    11. Protect the backup history
      1. Really need to ask about how they protect the "catalog"
    12. Send backup to another location
      1. Replication
    13. Collocating data for potential restore
      1. Already collocated and ready for a restore
      2. Can collocate data with servers (local recovery server)
    14. Expire old backups
      1. Expiring a file reduces by one count the number of links to a given chunk
      2. When a given chunk is reduced to zero links, the chunk can be expired
      3. Need to explain how delta works.
    15. Make old backup media available
      1. Consists of deleting chunks with zero links & hash entries
      2. What about fragmentation?
    16. Migrate data between storage devices
      1. No need, although may move around for load balancing
    17. Request data for restore
      1. Similar to traditional backup: choose file/directory and time, or use app interface
    18. Bring media onsite if necessary
      1. Most backups should be onsite already
      2. Discuss local recovery server
    19. Prepare media
      1. Files are built dynamically from pointers to blocks
    20. Read data from backup device
      1. Data is built on the fly
    21. Transfer across network
      1. All current systems transfer via IP
    22. Write to disk
  5. CDP
    1. Overview - explain true CDP
    2. Prepare data to be backed up
      1. Since we will be transferring everything, not really necessary
      2. Some CDP products will allow you to create a time marker (backup mode etc)
    3. Read data to be backed up
      1. Initial full backup - may take a while
      2. Continuous incremental -- as a byte changes, it is transferred to CDP server
      3. Explain difference between software data tap and hardware appliance
    4. Send data across a network
      1. WAN/LAN
        1. Usually software data tap, data is sent via IP to CDP server
      2. iSCSI/Fibre Channel
        1. Appliance-based systems can send via FC
        2. Some CDP products that read from appliance can send data via SCSI
    5. Accept data from network
      1. Same as above
    6. Store data on backup device
      1. Constant journal, no standby disk
      2. Standby disk with journal regularly applied
      3. Talk about what happens when it gets behind ("marketing only" mode)
    7. Increase backup speed
      1. Can't get any faster -- it's continuous
    8. Decrease backup load (duration, effect on servers/networks)
      1. Actually runs all the time, but doesn't impact app so ok
    9. Record where backup is
      1. Journal must be continually updated with current state and all blocks that got us here
    10. Collocating data for potential restore
      1. All data is located in a single array
    11. Create additional copies
      1. Replication
    12. Protect the backup history
      1. Protecting the journal -- how?
    13. Send backup to another location
      1. Replication
    14. Expire old backups
      1. Expire old journal entries and associated blocks
    15. Make old backup media available
      1. As old blocks are cleared, they're available for new blocks
    16. Migrate data between storage devices
      1. With constant journal method, nothing
      2. With full standby disk, regular transfer of new blocks
    17. Request data for restore
      1. Pick a point in time and a LUN to restore
      2. May pick significant point in time as determined by app
    18. Bring media onsite if necessary
      1. Local recovery server
    19. Prepare media
      1. Create virtual LUN on fly - quick, but may be slower performance
      2. Roll standby LUN to appropriate point in time - takes longer, but better performance
      3. If you want, you can mount LUN in another place to check that it's cool
    20. Read data from backup device
      1. If using as standby, can point production app right at LUN
      2. If need to restore production LUN, can read only blocks that have changed since PIT
    21. Transfer across network
      1. If using as standby, blocks will transfer as requested
      2. Must transfer only those blocks that have changed
    22. Write to disk
      1. If using as standby, N/A
      2. If doing restore, can write only changed blocks (incremental restore)
    23. Special
      1. Explain how you can do both standby & restore, then switch
  6. Near-CDP
    1. Types of snapshots
      1. We're talking about virtual copies, not BCVs
      2. Different methods: copy on write, redirect on write, WAFL
      3. User accessible or not?
    2. Prepare data to be backed up
      1. If user data, no prep needed
      2. If app data, put app in backup mode or down
      3. Explain snapshot then replicate, replicate then snapshot
      4. Replicate then snapshot only good for user data
      5. Scheduled vs user script driven vs backup app driven
      6. Take snapshot
    3. Read data to be backed up
      1. Backup is now done, but must create another copy
      2. Going to read static images of blocks to send via network
    4. Send data across a network
      1. Typically IP/WAN/LAN
      2. Doesn't preclude transfer via FC/iSCSI
    5. Accept data from network
      1. Typically IP/WAN/LAN
      2. Doesn't preclude transfer via FC/iSCSI
    6. Store data on backup device
      1. On source system, must store blocks necessary to continue to present snapshot
      2. Explain how many blocks that may be
      3. Store blocks necessary to replicate all snapshots
    7. Increase backup speed
      1. Virtual backup instantaneous
      2. Simultaneous copies can increase replication speed
    8. Decrease backup load (duration, effect on servers/networks)
      1. Same
    9. Record where backup is
      1. Self-explanatory format
      2. Some can work with MS VSS API
      3. Some backup apps can record content of snapshot in catalog
    10. Collocate data for restore
      1. It's all already there
    11. Create additional copies
      1. Replication
    12. Protect the backup history
      1. Self-contained
      2. Backup history can actually be rebuilt
    13. Send backup to another location
      1. Replication
    14. Migrate data between storage devices
      1. No need
    15. Collocating data for potential restore
      1. Already there
    16. Expire old backups
      1. Snapshots expired = return of space
      2. Some snapshots allow you to set an expiry time at creation, then expire automatically
      3. Can usually manually expire images out of order
      4. Backup app controlled can be expired by backup app
    17. Make old backup media available
      1. As snapshots expire, blocks used by them returned to storage
    18. Request data for restore
      1. Traditional way, just go to the right snapshot directory
      2. Some can work with VSS "previous versions"
      3. If recorded in backup app, can do usual browse restore
    19. Bring media onsite if necessary
      1. Usually onsite already
    20. Prepare media
      1. No prep needed
    21. Read data from backup device
      1. Drag and drop files needed
      2. Previous versions VSS API will overwrite file with selected version
      3. Backup app will select previous version and tell snapshot software to restore it
    22. Transfer across network
      1. If restoring from replicated copy, may need to copy across network
      2. If restoring from same system, no need
    23. Write to disk
      1. Typically complete overwrite for file-level
      2. Some can do incremental restore if restoring entire volume
  7. Other concerns
    1. Protection of the backup index
      1. Most important backup you have
    2. BMR
      1. Explain what we're not covering (free stuff)
      2. Explain commercial ways of handling
        1. Cloning/Ghosting
        2. Special BMR backup of everything
        3. Regular backup accompanied by BMR info
    3. ROBO backup
      1. Talk about evils of typical, tell stories of security guard swapping tapes
      2. Time to get rid of tape
      3. Use dedupe software or hardware
        1. Use regular software & dedupe hardware
        2. Use dedupe software
      4. Software as a service
    4. Network-Mounted Filesystems (just borrow from other book)
      1. Backup via NFS
      2. Local agent
      3. NDMP
    5. Databases
      1. Agent vs scripts
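
Illustrative sketch for the de-dupe expiry items above (expiring a file reduces the link count on each of its chunks, and a chunk with zero links can be deleted along with its hash entry). This is a minimal sketch of the general reference-counting idea, not any product's implementation. The class name ChunkStore, its methods, the tiny fixed chunk size, and the use of SHA-256 are all assumptions made for illustration.

```python
import hashlib

CHUNK_SIZE = 4  # hypothetical, unrealistically small fixed chunk size


class ChunkStore:
    """Toy hash-based de-dupe store with per-chunk link counts."""

    def __init__(self):
        self.chunks = {}  # hash -> chunk bytes (the "hash table")
        self.links = {}   # hash -> number of references from backed-up files
        self.files = {}   # file name -> ordered list of chunk hashes

    def backup(self, name, data):
        """Store only new, unique chunks; record pointers for the file."""
        if name in self.files:
            self.expire(name)  # replace the previous version of this name
        hashes = []
        for i in range(0, len(data), CHUNK_SIZE):
            chunk = data[i:i + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()
            if digest not in self.chunks:
                self.chunks[digest] = chunk
            self.links[digest] = self.links.get(digest, 0) + 1
            hashes.append(digest)
        self.files[name] = hashes

    def restore(self, name):
        """Rebuild the file on the fly from its chunk pointers."""
        return b"".join(self.chunks[h] for h in self.files[name])

    def expire(self, name):
        """Expiring a file reduces the link count of each chunk it used."""
        for h in self.files.pop(name):
            self.links[h] -= 1

    def reclaim(self):
        """Delete chunks with zero links, plus their hash entries."""
        dead = [h for h, n in self.links.items() if n == 0]
        for h in dead:
            del self.chunks[h]
            del self.links[h]
        return len(dead)
```

For example, backing up b"AAAABBBBCCCC" and b"AAAABBBBDDDD" under two names stores four unique chunks; expiring the first file and calling reclaim() frees only the chunk that no other file references. The sketch ignores fragmentation and delta-based products, both of which the outline flags as open questions.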

