I was seeing a lot of conflicting information going around about NetWorker with regard to cloning and multiplexing. Even I wasn't sure anymore how it worked. Does cloning preserve multiplexing or not? Does it do one thing for small savesets and another for larger ones? Does doing it via the GUI or the command line change its behavior? Does automatic cloning (group level cloning) work any differently? The only official answers I could find from EMC actually muddied the waters, so I decided to do my own testing to prove it one way or another.
I did three sets of eight tests on NetWorker 7.3.Build.190 using file/saveset sizes of 60 MB, 150 MB, and 2.5 GB. I created four equal directories with 60 MB each, and then ran the eight tests. I then created four equal directories with 150 MB each, and then ran the eight tests. Then I created four equal directories with 2.5 GB in them, and ran the eight tests again. Then I did one final, "crazy" test where we created ten directories with 150 MB in each, then backed them up three times (for a total of 30 savesets) and did a special test just with that. More on that later…
SUMMARY: In NetWorker 7.3 on a Windows server, Multiplexing is ALWAYS preserved unless you force it to de-multiplex by giving it one saveset at a time. Other versions MAY have behaved differently, but this is how this one works.
ALL of the following preserved multiplexing in my tests, which included small (60 MB), medium (150 MB), and large (2.5 GB) savesets. (I did each of the eight tests below with all three saveset sizes.)
- Automatic cloning (AKA group cloning)
- Saveset cloning via the GUI
- Saveset cloning via nsrclone and a list (e.g. nsrclone -S 234234 234234 234234)
- Saveset cloning via nsrclone and a file containing a list (e.g. nsrclone -S -f <somefile>) This even preserved multiplexing when the list was completely out of order!
- Volume cloning via the GUI
- Volume cloning via nsrclone (e.g. nsrclone <volumename>)
The only things that didn't preserve multiplexing were:
- Selecting individual savesets from the GUI and cloning them
- Passing individual savesets to nsrclone (nsrclone -S 2323423; nsrclone -S 234234; nsrclone -S 23434)
NetWorker ALWAYS preserved multiplexing unless I was cloning savesets one at a time. The only thing that would happen with very small savesets is that the originals weren't getting multiplexed very well, because one backup would finish before the next would start. Having said that, the clone of those backups would be just as multiplexed as the original.
There are other things that can affect what NetWorker does, like:
- Using different sized volumes for originals and clones
- Appending to your originals, but sending the clones offsite as soon as they're done.
I verified this by running the command mminfo -q "volume=<barcode>" -r "mediafile,mediarec,ssid" against the original and clone volumes and comparing the lists. (Example at the end of this email.) Mediafile shows which tape file each saveset is in, and mediarec shows the record within that file where the saveset starts.
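If you want to automate that check rather than eyeball it, here is a minimal Python sketch. It assumes the mminfo report has been saved as whitespace-separated text with mediafile first, mediarec second, and ssid last. The function names are mine, not anything NetWorker provides, and the sample rows are taken from the listing at the end of this post. It treats "more than one saveset in the same tape file" as the sign of multiplexing, which is the same signal I use throughout.

```python
from collections import defaultdict


def parse_rows(text):
    """Parse report lines into (mediafile, mediarec, ssid) tuples.

    Assumes mediafile is the first field, mediarec the second, and
    ssid the last, so a volume column in between is tolerated.
    """
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # Skip blank lines and the header row (non-numeric first field).
        if len(parts) < 3 or not parts[0].isdigit():
            continue
        rows.append((int(parts[0]), int(parts[1]), parts[-1]))
    return rows


def group_by_file(rows):
    """Map each tape file number to its ssids, ordered by start record."""
    groups = defaultdict(list)
    for mediafile, mediarec, ssid in rows:
        groups[mediafile].append((mediarec, ssid))
    return {f: [s for _, s in sorted(members)] for f, members in groups.items()}


def multiplexed_files(rows):
    """Keep only the tape files that hold more than one saveset."""
    return {f: s for f, s in group_by_file(rows).items() if len(s) > 1}


SAMPLE = """\
file rec volume ssid
3 0 GH000203 4220026135
3 60 GH000203 4203248922
5 0 GH000203 4136140159
"""

if __name__ == "__main__":
    print(multiplexed_files(parse_rows(SAMPLE)))
```

Running the same function against the original and the clone volume gives two dictionaries that can be compared directly.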
What I always saw was that the cloned savesets were laid out just like the original, with minor variations due to the time required to get certain things started. (Sometimes the clone starts a few records before or after where the original started. This was just a timing delay, and doesn't indicate that it wasn't being multiplexed.)
THE ONE THAT COMPLETELY AMAZED ME was the following test. We created a client with ten fileset entries, each with 149 MB in them, so it would start ten jobs at the same time and multiplex them all to the same tape.
It didn't quite work the way we wanted, in that it only ran about five or six at a time. This was because it would be done with the first 149 MB file before it started backing up the sixth or seventh one. That's OK. The way the files got laid out on tape really shows how the clone is multiplexed in exactly the same way. We ran this backup three times, creating three sets of interleaved/multiplexed backups on a single tape, each of which also had a corresponding index/bootstrap backup afterwards.
We then created a list of ssids using mminfo -a (which excludes the index/bootstrap savesets), and sorted them numerically. This created a list of saveset ids in a completely different order than they were created originally. (The ssids below are listed in the order they were created. You can see sorting numerically would create an entirely different order.)
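To make the reordering concrete, here is a small Python sketch using the first five ssids from the listing at the end of this post, in the order they appear there. The helper names are hypothetical and it only illustrates the sort itself; it does not call NetWorker.

```python
# First five ssids from the listing, in the order they appear on tape.
CREATION_ORDER = [8944832, 4287134919, 4270357710, 4253580499, 4236803288]


def numeric_order(ssids):
    """Return the ssids sorted numerically, as in the list we built."""
    return sorted(ssids)


def positions_moved(original, reordered):
    """Count how many ssids sit at a different position after reordering."""
    return sum(1 for a, b in zip(original, reordered) if a != b)


if __name__ == "__main__":
    reordered = numeric_order(CREATION_ORDER)
    print(reordered)
    print(positions_moved(CREATION_ORDER, reordered))
```

In this sample the ssids mostly decrease as backups are created, so a numeric sort very nearly reverses the creation order.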
We then passed that list of ssids to nsrclone via the -S -f <filename> option. Believe it or not, it presorted them into the order that would best preserve multiplexing, and then cloned them. The resulting clone is laid out exactly the same as the original tape, excluding the tape files that contain indexes, because we didn't clone those. I totally expected this one to not multiplex, but it did.
To prove this, I ran the following commands:
First I created a list of all ssids on the original volume
# mminfo -q "volume=GH000203" -r "mediafile,mediarec,volume,ssid" >b1.txt
Then I created a list of all ssids on the clone volume
# mminfo -q "volume=GH000212" -r "mediafile,mediarec,volume,ssid" >b2.txt
I cheated a bit and edited b2.txt to put in the blank lines so the files would line up right. Then I ran the Unix paste command to put the two files side by side.
# paste b1.txt b2.txt
The results are shown below. The first four columns are from the original tape, and the next four columns are from the cloned tape. You can see that the cloned volume is just as multiplexed as the original volume, and laid out in exactly the same way. For example, you can see that saveset 4287134919 starts at record 170 on the source volume, and starts at record 169 on the clone volume, and savesets 8944832, 4287134919, 4270357710, 4253580499, & 4236803288 were all multiplexed together onto tape file 2 on the source volume, and NetWorker did the same thing with the destination volume. You can also see that since we skipped the index and bootstrap savesets, the file number did not increment on the cloned tape side, but did increment on the original side. This just demonstrates that the file column really does show which tape file the backups go to.
file   rec  volume    ssid          file   rec  volume    ssid
   2     0  GH000203  8944832          2     0  GH000212  8944832
   2   170  GH000203  4287134919       2   169  GH000212  4287134919
   2   573  GH000203  4270357710       2   570  GH000212  4270357710
   2  1149  GH000203  4253580499       2  1143  GH000212  4253580499
   2  2071  GH000203  4236803288       2  2062  GH000212  4236803288
   3     0  GH000203  4220026135       3     0  GH000212  4220026135
   3    60  GH000203  4203248922       3    59  GH000212  4203248922
   4     0  GH000203  4186471768       4     0  GH000212  4186471768
   4     5  GH000203  4169694553       4     4  GH000212  4169694553
   4   228  GH000203  4152917340       4   226  GH000212  4152917340
   5     0  GH000203  4136140159    #Blank area inserted by me
   6     0  GH000203  4119362945    #Blank area inserted by me
   7     0  GH000203  4102585826       5     0  GH000212  4102585826
   7   876  GH000203  4085808618       5   871  GH000212  4085808618
   7  1262  GH000203  4069031408       5  1256  GH000212  4069031408
   7  1425  GH000203  4052254194       5  1418  GH000212  4052254194
   7  2466  GH000203  4035476982       5  2454  GH000212  4035476982
   8     0  GH000203  4018699829       6     0  GH000212  4018699829
   9     0  GH000203  3985145460       7     0  GH000212  3985145460
   9     1  GH000203  4001922676       7     0  GH000212  4001922676
  10     0  GH000203  3951591090       8     0  GH000212  3951591090
  10     2  GH000203  3968368306       8     1  GH000212  3968368306
  11     0  GH000203  3934813898    #Blank area inserted by me
  12     0  GH000203  3918036683    #Blank area inserted by me
  13     0  GH000203  3901259566       9     0  GH000212  3901259566
  13  1119  GH000203  3884482358       9  1113  GH000212  3884482358
  13  1412  GH000203  3867705145       9  1405  GH000212  3867705145
  13  2224  GH000203  3850927934       9  2213  GH000212  3850927934
  13  2426  GH000203  3834150721       9  2415  GH000212  3834150721
  13  2609  GH000203  3817373506       9  2597  GH000212  3817373506
  13  3166  GH000203  3800596292       9  3151  GH000212  3800596292
  14     0  GH000203  3767041922      10     0  GH000212  3767041922
  14     1  GH000203  3783819137      10     0  GH000212  3783819137
  15     0  GH000203  3750264767      11     0  GH000212  3750264767
  16     0  GH000203  3733487563    #Blank area inserted by me
  17     0  GH000203  3716710348    #Blank area inserted by me
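If you would rather not hand-edit the second file to make the rows line up, the two listings can be joined on ssid instead. The Python sketch below is one way to do that, assuming each report line has the four columns used above (mediafile, mediarec, volume, ssid). All function names are hypothetical, and the sample data is a slice of the listing above. It also checks that savesets sharing a tape file on the original share one on the clone, and reports the largest difference in starting record.

```python
def parse_by_ssid(text):
    """Map ssid -> (mediafile, mediarec), keeping the order of the report."""
    rows = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip blank lines, headers, and anything without four columns.
        if len(parts) < 4 or not parts[0].isdigit():
            continue
        rows[parts[3]] = (int(parts[0]), int(parts[1]))
    return rows


def side_by_side(original, clone):
    """Pair each original saveset with its clone position, or None if absent."""
    return [(ssid, pos, clone.get(ssid)) for ssid, pos in original.items()]


def file_groups(rows, only):
    """Group the ssids in `only` by tape file, listed in file order."""
    groups = {}
    for ssid, (mediafile, _) in rows.items():
        if ssid in only:
            groups.setdefault(mediafile, set()).add(ssid)
    return [frozenset(groups[f]) for f in sorted(groups)]


def same_grouping(original, clone):
    """True if cloned savesets share tape files the same way on both volumes."""
    cloned = set(original) & set(clone)
    return file_groups(original, cloned) == file_groups(clone, cloned)


def max_record_drift(original, clone):
    """Largest difference in starting record among savesets on both volumes."""
    return max(abs(original[s][1] - clone[s][1]) for s in original if s in clone)


ORIGINAL = """\
4 0 GH000203 4186471768
4 5 GH000203 4169694553
4 228 GH000203 4152917340
5 0 GH000203 4136140159
6 0 GH000203 4119362945
7 0 GH000203 4102585826
7 876 GH000203 4085808618
"""

CLONE = """\
4 0 GH000212 4186471768
4 4 GH000212 4169694553
4 226 GH000212 4152917340
5 0 GH000212 4102585826
5 871 GH000212 4085808618
"""

if __name__ == "__main__":
    orig, clone = parse_by_ssid(ORIGINAL), parse_by_ssid(CLONE)
    for ssid, orig_pos, clone_pos in side_by_side(orig, clone):
        print(ssid, orig_pos, clone_pos if clone_pos else "(not cloned)")
    print("same grouping:", same_grouping(orig, clone))
    print("max record drift:", max_record_drift(orig, clone))
```

Comparing groups rather than raw file numbers matters here because the clone's tape file numbers shift once savesets are skipped, exactly as they do in the listing.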
----- Signature and Disclaimer -----
Written by W. Curtis Preston (@wcpreston). For those of you unfamiliar with my work, I've specialized in backup & recovery since 1993. I've written the O'Reilly books on backup and have worked with a number of native and commercial tools. I am now Chief Technologist at Druva, the leading provider of cloud-based data protection and data management tools for endpoints, infrastructure, and cloud applications. These posts reflect my own opinion and are not necessarily the opinion of my employer.