Kris Vassallo
Guest
|
 Speed up 400GB backup?
I am looking for some assistance in tweaking the bumpsize, bumpdays, and bumpmult items in amanda.conf. I am backing up 420GB+ worth of home directories to hard disks every night, and the backup is taking about 11 hours. I just changed the backup of one 400GB home drive from client compress best to client compress fast, which did seem to shave a bit of time off the backup. The disks that are being backed up are on the same RAID controller as the backup disks.
I really need to make the backup take a lot less time, because the network crawls when the developers come in to work in the morning while the home directory server is still blasting away with the backup. So, with a filesystem this large, what would be some good settings for the bump options? Also, are there any other things I can do to get this backup done faster without turning off compression altogether?
|
| Mon Jul 19, 2004 1:12 pm |
|
 |
Frank Smith
Guest
|
 Speed up 400GB backup?
Are you actually writing 420GB per night, or is that just the total
amount to be backed up? If most of your data isn't changing daily,
then breaking up your DLEs so that you don't have a single 400GB chunk
could spread the level 0s across more nights and shorten your nightly
backup time.
Are you sure it's the compression that is using up most of the time? You
probably need to add spindle numbers to your disklist to serialize
access to the DLEs that share common disks. Using a holding
disk that is not on the same controller would also speed things up.
If your DLEs and file backups share the same disks, and not just
the same controller, then the disks will waste quite a bit of time
seeking back and forth. You might also want to do some performance
testing on your RAID controller; perhaps it is the bottleneck, since
the model of controller (and the RAID level) can have a big impact
on throughput.
Posting your daily report and more details of the physical
layout would give us a better idea of where to start on suggestions
for improving your backup times.
Frank
--
Frank Smith fsmith < at > hoovers.com
Sr. Systems Administrator Voice: 512-374-4673
Hoover's Online Fax: 512-374-4501
|
| Mon Jul 19, 2004 2:24 pm |
|
 |
Kris Vassallo
Guest
|
 Speed up 400GB backup?
420GB is not the total amount per night. Something is bogging this down, though, and I don't know what. I am not using holding disks because the majority of the data is being backed up from one set of disks to another on the same machine. This one machine has a set of RAID 10 disks; these disks are backed up by amanda and put onto a set of RAID 5 disks. As far as assigning spindle numbers goes, I don't quite understand why I would set that. I have inparallel set to 4 and didn't define maxdumps, so I would assume that no more than 1 dumper would get started on a machine at once. Am I getting this right? Here is my email log from the backup this morning.
STATISTICS:
                           Total       Full      Daily
                         --------   --------   --------
Estimate Time (hrs:min)     7:30
Run Time (hrs:min)         10:35
Dump Time (hrs:min)         2:52       0:29       2:23
Output Size (meg)        12163.2     9094.3     3068.9
Original Size (meg)      29068.4    19177.4     9891.0
Avg Compressed Size (%)     41.8       47.4       31.0   (level:#disks ...)
Filesystems Dumped             3          1          2   (1:1 5:1)
Avg Dump Rate (k/s)       1207.5     5366.4      366.3
Tape Time (hrs:min)         0:17       0:13       0:05
Tape Size (meg)          12163.3     9094.3     3069.0
Tape Used (%)                1.8        1.3        0.4   (level:#disks ...)
Filesystems Taped              3          1          2   (1:1 5:1)
Avg Tp Write Rate (k/s)  11980.6    12287.9    11153.9
NOTES:
  driver: WARNING: /tmp: not 102400 KB free.
  planner: Incremental of venus.xxxx:/home bumped to level 5.
  planner: Full dump of bda1.xxxx:/home specially promoted from 13 days ahead.
  taper: tape DailySet111 kb 12455232 fm 3 [OK]
DUMP SUMMARY:
                                      DUMPER STATS                 TAPER STATS
HOSTNAME    DISK      L   ORIG-KB    OUT-KB  COMP%  MMM:SS    KB/s  MMM:SS     KB/s
----------------------- ------------------------------------------ ----------------
bda1.xxxx   /home     0  19637690   9312576   47.4   28:55  5366.4   12:38  12287.9
bda2.xxxx   /var/www  1      3210       480   15.0    0:01   364.4    0:00  28399.0
venus.xxxx  /home     5  10125160   3142176   31.0  142:59   366.3    4:42  11152.8
|
| Mon Jul 19, 2004 4:24 pm |
|
 |
Frank Smith
Guest
|
 Speed up 400GB backup?
--On Monday, July 19, 2004 17:19:56 -0700 Kris Vassallo <kris < at > linuxcertified.com> wrote:
Since most items on this mailing list involve several back-and-forth
questions and answers, it's usually best to reply with comments in-line
to make the history easier to follow for anyone on the list that may
care to jump in with additional remarks.
> 420GB is not the total amount per night. Something is bogging this down
> though and I don't know what. I am not using holding disks because the
> majority of data is being backed up from one set of disks to another on
> the same machine. This one machine has a set of RAID 10 disks. These
> disks are backed up by amanda and put onto a set of RAID 5 disks.
OK, I was assuming a different setup. Having a holding disk would let
you run multiple dumps in parallel. It wouldn't help much (if at all) when
it's all on one machine, but it can really speed up your overall time if
you have multiple clients.
> As far
> as assigning spindle #s goes I don't quite understand why I would set
> that. I have inparallel set to 4 and then didn't define maxdumps, so I
> would assume that not more than 1 dumper would get started on a machine
> at once. Am I getting this right?
I think maxdumps defaults to 2, but I may be wrong (someone else should
jump in here). I usually define everything so I know for sure how it's
defined without digging into the source.
You're right, spindle numbers are only really useful with maxdumps > 1.
> Here is my email log from the backup
> this morning.
> STATISTICS:
>                            Total       Full      Daily
>                          --------   --------   --------
> Estimate Time (hrs:min)     7:30
Here's your runtime problem: 7.5 hours for estimates.
> Run Time (hrs:min)         10:35
> Dump Time (hrs:min)         2:52       0:29       2:23
Three hours for dumps doesn't seem too bad. It could probably
be improved some, but the estimates are what's killing you.
> Output Size (meg)        12163.2     9094.3     3068.9
> Original Size (meg)      29068.4    19177.4     9891.0
> Avg Compressed Size (%)     41.8       47.4       31.0   (level:#disks ...)
> Filesystems Dumped             3          1          2   (1:1 5:1)
> Avg Dump Rate (k/s)       1207.5     5366.4      366.3
> Tape Time (hrs:min)         0:17       0:13       0:05
> Tape Size (meg)          12163.3     9094.3     3069.0
> Tape Used (%)                1.8        1.3        0.4   (level:#disks ...)
> Filesystems Taped              3          1          2   (1:1 5:1)
> Avg Tp Write Rate (k/s)  11980.6    12287.9    11153.9
> NOTES:
>   driver: WARNING: /tmp: not 102400 KB free.
>   planner: Incremental of venus.xxxx:/home bumped to level 5.
>   planner: Full dump of bda1.xxxx:/home specially promoted from 13 days ahead.
>   taper: tape DailySet111 kb 12455232 fm 3 [OK]
> DUMP SUMMARY:
>                                       DUMPER STATS                 TAPER STATS
> HOSTNAME    DISK      L   ORIG-KB    OUT-KB  COMP%  MMM:SS    KB/s  MMM:SS     KB/s
> bda1.xxxx   /home     0  19637690   9312576   47.4   28:55  5366.4   12:38  12287.9
> bda2.xxxx   /var/www  1      3210       480   15.0    0:01   364.4    0:00  28399.0
> venus.xxxx  /home     5  10125160   3142176   31.0  142:59   366.3    4:42  11152.8
I'd suggest adding columnspec to your config and adjusting it so that
the columns don't all run together. It makes the report much easier to read.
I'm guessing that bda1:/home wrote 9.3GB to 'tape', taking about 29 min
to dump and almost 13 min. to tape.
venus:/home wrote 3GB, taking over 2 hours to dump and 5 min. to tape.
Which (if any) of these is the backup server itself?
The taper rates (about 12MB/sec if I'm parsing it right) seem OK, but
the 142 min dump time seems somewhat high for only 3GB of data.
Is that the 400GB filesystem you were talking about, and is it local
or remote?
As for the estimates, are you using dump or tar? Look in the
*debug files on the clients and see which one was taking all the time
(I'm guessing venus, since it looks like you did a force on bda1).
Does that filesystem have millions of small files?
I'm not sure of the best way to speed up estimates, other than a
faster disk system. Perhaps someone else on the list has some ideas.
Frank
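For anyone wanting to check the arithmetic behind the comments above, here is a small sketch (an illustration only, not part of Amanda). It recomputes the share of the run time spent on estimates and the dump rate for venus:/home. The times and sizes are read off the posted report; reading the run-together OUT-KB column for venus as 3142176 is an assumption, though it agrees with the reported 31.0% compression and 366.3 KB/s.

```python
def to_small_units(value: str) -> int:
    """Convert 'hrs:min' to minutes, or 'min:sec' to seconds."""
    major, minor = value.split(":")
    return int(major) * 60 + int(minor)

# Figures from the STATISTICS section (hrs:min).
run_min = to_small_units("10:35")
estimate_min = to_small_units("7:30")
dump_min = to_small_units("2:52")

estimate_share = estimate_min / run_min
dump_share = dump_min / run_min

# Figures from the DUMP SUMMARY line for venus:/home (assumed column split).
venus_out_kb = 3142176
venus_dump_sec = to_small_units("142:59")   # MMM:SS
venus_rate_kbs = venus_out_kb / venus_dump_sec

print(f"estimates: {estimate_share:.1%} of run time")
print(f"dumps:     {dump_share:.1%} of run time")
print(f"venus:/home dump rate: {venus_rate_kbs:.1f} KB/s")
```

With these inputs the estimates account for roughly 71% of the 10:35 run, and the venus dump rate comes out at about 366 KB/s, in line with the report.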
|
| Mon Jul 19, 2004 9:49 pm |
|
 |
Joshua Baker-LePain
Guest
|
 Speed up 400GB backup?
On Mon, 19 Jul 2004 at 5:19pm, Kris Vassallo wrote
> 420GB is not the total amount per night. Something is bogging this down
> though and I don't know what. I am not using holding disks because the
> majority of data is being backed up from one set of disks to another on
> the same machine. This one machine has a set of RAID 10 disks. These
> disks are backed up by amanda and put onto a set of RAID 5 disks. As far
Just as an aside, having your backup disks on the same controller as your
real data seems a bit risky to me -- what if the controller goes? What if
it takes multiple disks with it?
> as assigning spindle #s goes I don't quite understand why I would set
> that. I have inparallel set to 4 and then didn't define maxdumps, so I
> would assume that not more than 1 dumper would get started on a machine
> at once. Am I getting this right? Here is my email log from the backup
That's correct. maxdumps (dumps per host at a time) defaults to 1.
inparallel controls the total number of backups at once.
>                            Total       Full      Daily
>                          --------   --------   --------
> Estimate Time (hrs:min)     7:30
As Frank pointed out, this is a big part of your problem. What OS and FS
are we talking here, and what backup program? And, again, sendsize*debug
will tell you which DLEs are taking so long to estimate.
--
Joshua Baker-LePain
Department of Biomedical Engineering
Duke University
|
| Tue Jul 20, 2004 5:31 am |
|
 |
Geert Uytterhoeven
Guest
|
 Speed up 400GB backup?
On Tue, 20 Jul 2004, Frank Smith wrote:
> As for the estimates, are you using dump or tar? Look in the
> *debug files on the clients and see which one was taking all the time
> (I'm guessing venus since it looks like you did a force on bda1).
> Does that filesystem have millions of small files?
Or lots of hard links? I keep many quasi-identical source trees using hard
links (so identical files consume disk space only once), but it increases the
time to estimate a lot.
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert < at > linux-m68k.org
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
|
| Tue Jul 20, 2004 5:44 am |
|
 |
Kris Vassallo
Guest
|
 Speed up 400GB backup?
On Tue, 2004-07-20 at 04:24, Joshua Baker-LePain wrote:
> On Mon, 19 Jul 2004 at 5:19pm, Kris Vassallo wrote
>> 420GB is not the total amount per night. Something is bogging this down
>> though and I don't know what. I am not using holding disks because the
>> majority of data is being backed up from one set of disks to another on
>> the same machine. This one machine has a set of RAID 10 disks. These
>> disks are backed up by amanda and put onto a set of RAID 5 disks. As far
> Just as an aside, having your backup disks on the same controller as your
> real data seems a bit risky to me -- what if the controller goes? What if
> it takes multiple disks with it?
Having the backup host be the same machine as the file server no longer looks like a good idea. However, I am in it too deep to jump out now. I suppose that I could get a second controller in the box, but to me it seems as if that would only create another bottleneck: the PCI bus.
>> as assigning spindle #s goes I don't quite understand why I would set
>> that. I have inparallel set to 4 and then didn't define maxdumps, so I
>> would assume that not more than 1 dumper would get started on a machine
>> at once. Am I getting this right? Here is my email log from the backup
> That's correct. maxdumps (dumps per host at a time) defaults to 1.
> inparallel controls total number of backups at once.
>>                            Total       Full      Daily
>>                          --------   --------   --------
>> Estimate Time (hrs:min)     7:30
> As Frank pointed out, this is a big part of your problem. What OS and FS
> are we talking here, and what backup program? And, again, sendsize*debug
> will tell you which DLEs are taking so long to estimate.
The box is running Red Hat 9 with a 2.4.20 kernel and an ext3 filesystem.
Below is the most recent sendsize.debug:
sendsize: debug 1 pid 27717 ruid 33 euid 33: start at Tue Jul 20 01:00:00 2004
sendsize: version 2.4.3
sendsize[27747]: time 0.119: calculating for amname '/home', dirname '/home', spindle -1
sendsize[27747]: time 0.119: getting size via gnutar for /home level 0
sendsize[27717]: time 0.119: waiting for any estimate child
sendsize[27747]: time 0.156: spawning /usr/lib/amanda/runtar in pipeline
sendsize[27747]: argument list: /bin/tar --create --file /dev/null --directory /home --one-file-system --listed-incremental
/var/lib/amanda/gnutar-lists/venus.berkeley-da.com_home_0.new --sparse --ignore-failed-read --totals .
sendsize[27747]: time 9720.909: /bin/tar: ./qa/build-main-branch-rfexamples/rfexamples-20040719/customer_test/Nestoras4/Freq
Domain/.nfs0447c12d00037cfc: Warning: Cannot stat: No such file or directory
sendsize[27747]: time 9720.949: /bin/tar: ./qa/build-main-branch-rfexamples/rfexamples-20040719/customer_test/Nestoras4/Freq
Domain/Linux_temp-g/cpsys_hb.log: Warning: Cannot stat: No such file or directory
sendsize[27747]: time 9720.949: /bin/tar: ./qa/build-main-branch-rfexamples/rfexamples-20040719/customer_test/Nestoras4/Freq
Domain/Linux_temp-g/cpsys_hb.out: Warning: Cannot stat: No such file or directory
sendsize[27747]: time 11114.784: Total bytes written: 429923983360 (400GB, 37MB/s)
sendsize[27747]: time 11114.835: .....
sendsize[27747]: estimate time for /home level 0: 11114.679
sendsize[27747]: estimate size for /home level 0: 419847640 KB
sendsize[27747]: time 11114.835: waiting for /bin/tar "/home" child
sendsize[27747]: time 11114.835: after /bin/tar "/home" wait
sendsize[27747]: time 11114.882: getting size via gnutar for /home level 6
sendsize[27747]: time 11115.510: spawning /usr/lib/amanda/runtar in pipeline
sendsize[27747]: argument list: /bin/tar --create --file /dev/null --directory /home --one-file-system --listed-incremental
/var/lib/amanda/gnutar-lists/venus.berkeley-da.com_home_6.new --sparse --ignore-failed-read --totals .
sendsize[27747]: time 18793.272: /bin/tar: ./qa/build-main-branch-rfexamples/rfexamples-20040719/customer_test/makram_thesis
_oscillators/postlayout2/.nfs01c6011d00037d30: Warning: Cannot stat: No such file or directory
sendsize[27747]: time 18793.333: /bin/tar: ./qa/build-main-branch-rfexamples/rfexamples-20040719/customer_test/makram_thesis
_oscillators/postlayout2/Linux_temp-g/ui_quadc.log: Warning: Cannot stat: No such file or directory
sendsize[27747]: time 18793.334: /bin/tar: ./qa/build-main-branch-rfexamples/rfexamples-20040719/customer_test/makram_thesis
_oscillators/postlayout2/Linux_temp-g/ui_quadc.out: Warning: Cannot stat: No such file or directory
sendsize[27747]: time 18815.342: Total bytes written: 69237104640 (64GB, 8.6MB/s)
sendsize[27747]: time 18815.372: .....
sendsize[27747]: estimate time for /home level 6: 7699.861
sendsize[27747]: estimate size for /home level 6: 67614360 KB
sendsize[27747]: time 18815.372: waiting for /bin/tar "/home" child
sendsize[27747]: time 18815.372: after /bin/tar "/home" wait
sendsize[27747]: time 18815.409: done with amname '/home', dirname '/home', spindle -1
sendsize[27717]: time 18815.493: child 27747 terminated normally
sendsize: time 18815.503: pid 27717 finish time Tue Jul 20 06:13:36 2004
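To make the log above easier to digest, here is a minimal sketch that pulls the per-level estimate times and sizes out of sendsize debug output. It is an illustration, not an Amanda tool: the regular expressions are written against the two line shapes visible in this particular log ("estimate time for ..." and "estimate size for ..."), and `summarize` is a hypothetical helper name.

```python
import re

# Lines copied from the sendsize debug output above.
LOG = """\
sendsize[27747]: estimate time for /home level 0: 11114.679
sendsize[27747]: estimate size for /home level 0: 419847640 KB
sendsize[27747]: estimate time for /home level 6: 7699.861
sendsize[27747]: estimate size for /home level 6: 67614360 KB
"""

TIME_RE = re.compile(r"estimate time for (\S+) level (\d+): ([\d.]+)")
SIZE_RE = re.compile(r"estimate size for (\S+) level (\d+): (\d+) KB")

def summarize(log_text):
    """Return {(disk, level): {"seconds": float, "kb": int}}."""
    results = {}
    for line in log_text.splitlines():
        m = TIME_RE.search(line)
        if m:
            key = (m.group(1), int(m.group(2)))
            results.setdefault(key, {})["seconds"] = float(m.group(3))
        m = SIZE_RE.search(line)
        if m:
            key = (m.group(1), int(m.group(2)))
            results.setdefault(key, {})["kb"] = int(m.group(3))
    return results

summary = summarize(LOG)
total_seconds = sum(entry["seconds"] for entry in summary.values())

for (disk, level), entry in sorted(summary.items()):
    hours = entry["seconds"] / 3600
    rate_mib = entry["kb"] / 1024 / entry["seconds"]
    print(f"{disk} level {level}: {hours:.2f} h at {rate_mib:.1f} MiB/s")
print(f"total estimate time: {total_seconds / 3600:.2f} h")
```

On this log the two passes add up to a little over five hours, which matches the start (01:00:00) and finish (06:13:36) timestamps. The level 0 pass and the level 6 pass each walk the whole of /home, so the filesystem is scanned twice before any dump starts.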
|
| Tue Jul 20, 2004 1:20 pm |
|
 |
Kris Vassallo
Guest
|
 Speed up 400GB backup?
On Mon, 2004-07-19 at 22:41, Frank Smith wrote:
> I'd suggest adding columnspec to your config and adjusting it so that
> all the columns don't run together. It makes it much easier to read.
Good idea, done!
> I'm guessing that bda1:/home wrote 9.3GB to 'tape', taking about 29 min
> to dump and almost 13 min. to tape.
> venus:/home wrote 3GB, taking over 2 hours to dump and 5 min. to tape.
> Which (if any) of these is the backup server itself?
The backup server itself, as well as the fileserver the data is coming from, is called venus.
> The taper rates (about 12MB/sec if I'm parsing it right) seem ok, but
> the 142 min dump time seems somewhat high for only 3GB of data.
> Is that the 400GB filesystem you were talking about, and is it local
> or remote?
Those disks are local to the backup server.
> As for the estimates, are you using dump or tar? Look in the
> *debug files on the clients and see which one was taking all the time
> (I'm guessing venus since it looks like you did a force on bda1).
> Does that filesystem have millions of small files?
I am using tar to do this. The bda1 system is a CVS server which gets hammered on all day long and does have tons of smaller files as well as a decent number of larger ones.
> I'm not sure of the best way to speed up estimates, other than a
> faster disk system.
The disks in the venus box are all SATA 150 drives; SCSI is way out of the price range for this amount of space. If venus is the machine that is taking forever to do the estimates, is it possible that:
1. estimates start on all machines;
2. the estimates finish on the smaller remote file systems first, and these systems begin to dump;
3. now, along with the backup server trying to do an estimate on its own disks, it's also dealing with dumps coming in from remote systems, and all of this together is slowing it down?
Do I have any valid ideas here?
-Kris
> Perhaps someone else on the list has some ideas.
|
| Tue Jul 20, 2004 1:38 pm |
|
 |
Stefan G. Weichinger
Guest
|
 Speed up 400GB backup?
Hi, Frank,
on Tuesday, July 20, 2004 at 07:41 you wrote to amanda-users:
>> 420GB is not the total amount per night. Something is bogging this down
>> though and I don't know what. I am not using holding disks because the
>> majority of data is being backed up from one set of disks to another on
>> the same machine. This one machine has a set of RAID 10 disks. These
>> disks are backed up by amanda and put onto a set of RAID 5 disks.
FS> OK, I was assuming a different setup. Having a holding disk would let
FS> you run multiple dumps in parallel. Wouldn't help much (if any) when
FS> its all on one machine, but can really speed up your overall time if
FS> you have multiple clients.
Given Joshua's note about having data and backup on the same
controller I would just suggest adding a cheap'n'huge IDE-drive (and
controller, if necessary) for a holdingdisk.
This will speed things up locally, too. Think parallel dumping AND the
fact that people could access data at ~normal speed even while the
holdingdisk is still feeding the tape (while this is still not the
solution here, estimates ain't done on the holdingdisk ....)
Having a separate holdingdisk is never a bad thing with AMANDA IMHO.
>> As far
>> as assigning spindle #s goes I don't quite understand why I would set
>> that. I have inparallel set to 4 and then didn't define maxdumps, so I
>> would assume that not more than 1 dumper would get started on a machine
>> at once. Am I getting this right?
FS> I think maxdumps defaults to 2 but I may be wrong (someone else should
FS> jump in here).
It is 10. ( grep -r "define MAXDUMPS" amanda-2.4.4-p3 )
>> Estimate Time (hrs:min) 7:30
FS> Here's your runtime problem, 7.5 hours for estimates.
Yep.
>> Run Time (hrs:min) 10:35
>> Dump Time (hrs:min) 2:52 0:29 2:23
FS> Three hours for dumps doesn't seem too bad. It could probably
FS> be improved some, but the estimates are what's killing you.
Yep again.
FS> As for the estimates, are you using dump or tar? Look in the
FS> *debug files on the clients and see which one was taking all the time
FS> (I'm guessing venus since it looks like you did a force on bda1).
FS> Does that filesystem have millions of small files?
FS> I'm not sure of the best way to speed up estimates, other than a
FS> faster disk system. Perhaps someone else on the list has some ideas.
My idea is to request more details here.
Relevant dumptype-definition, local/remote-info, df venus:/home, etc
...
--
best regards,
Stefan
|
| Tue Jul 20, 2004 1:40 pm |
|
 |
Frank Smith
Guest
|
 Speed up 400GB backup?
--On Tuesday, July 20, 2004 14:42:23 -0700 Mike Fedyk <mfedyk < at > matchmail.com> wrote:
> Joshua Baker-LePain wrote:
>> As Frank pointed out, this is a big part of your problem. What OS and FS
>> are we talking here, and what backup program? And, again, sendsize*debug
> Amanda works with other clients besides Amanda? Or are you asking Amanda version?
I'm assuming by 'backup program' he was referring to what program amanda
was using to read the disks (GNU tar vs dump, ufsdump, vxdump, etc.).
It was determined to be GNU tar in a later email.
Frank
--
Frank Smith fsmith < at > hoovers.com
Sr. Systems Administrator Voice: 512-374-4673
Hoover's Online Fax: 512-374-4501
|
| Tue Jul 20, 2004 1:53 pm |
|
 |
Stefan G. Weichinger
Guest
|
 Speed up 400GB backup?
Hi, Kris,
on Tuesday, July 20, 2004 at 23:14 you wrote to amanda-users:
KV> The box is running redhat 9 with 2.4.20 kernel and ext3 filesystem.
KV> Below is the most recent sendsize.debug
KV> sendsize[27747]: time 11114.784: Total bytes written: 429923983360 (400GB, 37MB/s)
ok ...
KV> sendsize[27747]: time 18815.342: Total bytes written: 69237104640 (64GB, 8.6MB/s)
not ok ---
I would:
- split venus:/home into several DLEs (via exclude/include)
- exclude unnecessary subdirs/files (./qa/build-main-branch-rfexamples/rfexamples-20040719/customer_test/Nestoras4/Freq
Domain/Linux_temp-g seems like a candidate to me)
--
This would spawn several sendsize-processes in parallel ...
--
best regards,
Stefan
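One way to choose the split suggested above is to group the top-level directories under /home into a few similarly sized buckets, one bucket per DLE. The sketch below is only an illustration under stated assumptions: the directory names and sizes are made up, `split_into_dles` is a hypothetical helper rather than anything Amanda provides, and turning each bucket into an include list in the disklist is left to the admin.

```python
import heapq

def split_into_dles(dir_sizes_gb, n_dles):
    """Greedy balancing: take directories largest first and put each one
    into whichever bucket currently holds the least data."""
    heap = [(0.0, i) for i in range(n_dles)]   # (bucket total, bucket index)
    heapq.heapify(heap)
    buckets = [[] for _ in range(n_dles)]
    by_size = sorted(dir_sizes_gb.items(), key=lambda kv: kv[1], reverse=True)
    for name, size in by_size:
        total, idx = heapq.heappop(heap)
        buckets[idx].append(name)
        heapq.heappush(heap, (total + size, idx))
    totals = [0.0] * n_dles
    for total, idx in heap:
        totals[idx] = total
    return buckets, totals

# Hypothetical sizes in GB for top-level directories under /home.
example_sizes = {
    "qa": 150.0, "alice": 90.0, "bob": 70.0,
    "carol": 50.0, "dave": 30.0, "erin": 10.0,
}
buckets, totals = split_into_dles(example_sizes, 3)
for names, total in zip(buckets, totals):
    print(f"{total:6.1f} GB: {', '.join(names)}")
```

The greedy approach does not guarantee a perfectly even split, but it is usually close enough for deciding which directories go into which DLE. Sizes drift over time, so the grouping would need an occasional review.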
|
| Tue Jul 20, 2004 2:03 pm |
|
 |
Frank Smith
Guest
|
 Speed up 400GB backup?
--On Tuesday, July 20, 2004 14:41:43 -0700 Mike Fedyk <mfedyk < at > matchmail.com> wrote:
> Hi,
> [ This is my first post to this list, and it looks like "reply to all" is accepted here, so that's what I'm doing...]
> Kris Vassallo wrote:
>> On Tue, 2004-07-20 at 04:24, Joshua Baker-LePain wrote:
>>> On Mon, 19 Jul 2004 at 5:19pm, Kris Vassallo wrote
>>>> 420GB is not the total amount per night. Something is bogging this down
>>>> though and I don't know what. I am not using holding disks because the
>>>> majority of data is being backed up from one set of disks to another on
>>>> the same machine. This one machine has a set of RAID 10 disks. These
>>>> disks are backed up by amanda and put onto a set of RAID 5 disks. As far
>>> Just as an aside, having your backup disks on the same controller as your
>>> real data seems a bit risky to me -- what if the controller goes? What if
>>> it takes multiple disks with it?
>> The whole thing of having the backup host being the same machine as
>> the file server no longer looks like a good idea. However, I am in it
>> too deep to jump out now. I suppose that I could get a second
>> controller in the box, but to me it seems as if that would only create
>> another bottleneck, the pci bus.
> Why?
> You have the compression done on the client anyway, so just take an older
> (probably Pentium II class or better) machine and use that as your Amanda
> server.
Generally true if you're using tape, but Kris is using the file driver
and backing up to disk, so his backup server would probably need over
a terabyte of space (to keep two fulls and the incrementals of a 400GB
filesystem). Although if the backup disks could easily be moved
to another box it would speed things up.
Frank
--
Frank Smith fsmith < at > hoovers.com
Sr. Systems Administrator Voice: 512-374-4673
Hoover's Online Fax: 512-374-4501
|
| Tue Jul 20, 2004 2:04 pm |
|
 |
Frank Smith
Guest
|
 Speed up 400GB backup?
--On Tuesday, July 20, 2004 14:35:53 -0700 Kris Vassallo <kris < at > linuxcertified.com> wrote:
NOTES:
driver: WARNING: /tmp: not 102400 KB free.
I overlooked this last night. I've never seen this message myself,
but perhaps it is relevant. Any thoughts, anyone?
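To see whether /tmp really is short on space, the free space can be compared against the 102400 KB figure from the warning. This is a minimal sketch. How Amanda itself performs the check is not known from this thread, and the function names are made up for illustration.

```python
import shutil


def free_kb(path):
    """Return the free space on the filesystem holding `path`, in KB."""
    return shutil.disk_usage(path).free // 1024


def has_room(path, needed_kb=102400):
    """True if at least `needed_kb` KB are free (default taken from the warning)."""
    return free_kb(path) >= needed_kb


if __name__ == "__main__":
    print("/tmp free KB:", free_kb("/tmp"), "enough:", has_room("/tmp"))
```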
I am using tar to do this. The bda1 system is a CVS server which gets
hammered on all day long and does have tons of smaller files as well as
a decent amount of larger ones.
As Stefan mentioned, there are probably subdirectories you could exclude
from the backup to speed things up. You mentioned part of it was used
for CVS; perhaps you can exclude some of the build trees and just back up
the source trees.
The disks in the venus box are all SATA 150 drives, SCSI is way out of
the price range for this amount of space. If venus is the machine that
is taking forever to do the estimates, is it possible that 1. estimates
start on all machines, 2. the estimates finish on the smaller remote
file systems first; these systems begin to dump. 3. now along with the
backup server trying to do an estimate on its own disks, it's also
dealing with a dump coming in from remote systems and all of this
together is slowing it down? Do I have any valid ideas here?
Possible, although the estimates write to /dev/null, so the remote
dumps shouldn't be slowing them down unless it's your controller
limiting you and not the disks themselves. You could try commenting
out all the other filesystems in your disklist and see if the estimate
still takes as long.
Is the system otherwise idle when you are running Amanda? If
the disks are fairly active (whether from user activity or perhaps
automated nightly builds) it will slow down your backups considerably.
It could also be kernel related. Our first attempt at Linux
fileservers had problems under heavy load; the system would slow to
a crawl (and sometimes appear to hang) under concurrent loads (a
CVS build and an rsync of the filesystem in our case). Moving from
a 2.4 kernel to 2.6 solved the problem completely.
Frank
-Kris
--
Frank Smith fsmith < at > hoovers.com
Sr. Systems Administrator Voice: 512-374-4673
Hoover's Online Fax: 512-374-4501
|
| Tue Jul 20, 2004 3:08 pm |
|
 |
Mike Fedyk
Guest
|
 Speed up 400GB backup?
Frank Smith wrote:
--On Tuesday, July 20, 2004 14:41:43 -0700 Mike Fedyk <mfedyk < at > matchmail.com> wrote:
The whole thing of having the backup host being the same machine as
the file server no longer looks like a good idea. However, I am in it
too deep to jump out now. I suppose that I could get a second
controller in the box, but to me it seems as if that would only create
another bottleneck, the pci bus.
Why?
You have the compression done on the client anyway, so just take an older
(probably Pentium II class or better) machine and use that as your Amanda
server.
Generally true if you're using tape, but Kris is using the file driver
and backing up to disk, so his backup server would probably need over
a terabyte of space (to keep two fulls and the incrementals of a 400GB
filesystem). Although if the backup disks could easily be moved
to another box it would speed things up.
Frank
That depends on the compressibility of the data of course.
That said, a few IDE hard drives are quite cheap these days.
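The space estimate can be put into numbers with a small sketch. The 400GB full size and the two retained fulls come from the thread. The incremental size, the number of incrementals kept, and the compression ratio are assumptions chosen only to show the arithmetic.

```python
def backup_space_gb(full_gb, fulls_kept, incr_gb, incrs_kept, ratio=1.0):
    """Estimate disk space needed on the backup server, in GB.

    ratio is compressed size divided by original size
    (1.0 means the data does not compress at all).
    """
    raw = full_gb * fulls_kept + incr_gb * incrs_kept
    return raw * ratio


# Two 400GB fulls plus twelve hypothetical 20GB incrementals.
print(backup_space_gb(400, 2, 20, 12))        # no compression
print(backup_space_gb(400, 2, 20, 12, 0.5))   # assumed 2:1 compression
```

With incompressible data the total is just over a terabyte, in line with Frank's estimate. With data that compresses well it could be much less, which is Mike's point.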
|
| Tue Jul 20, 2004 3:41 pm |
|
 |
Mike Fedyk
Guest
|
 Speed up 400GB backup?
Kris Vassallo wrote:
The disks in the venus box are all SATA 150 drives, SCSI is way out of
the price range for this amount of space. If venus is the machine that
is taking forever to do the estimates, is it possible that 1.
estimates start on all machines, 2. the estimates finish on the
smaller remote file systems first; these systems begin to dump. 3. now
along with the backup server trying to do an estimate on its own
disks, it's also dealing with a dump coming in from remote systems and
all of this together is slowing it down? Do I have any valid ideas here?
On my Amanda 2.4.4p2 from Fedora Core 2, the dumps wait until all
estimates finish before starting.
Is there an option to change that?
|
| Tue Jul 20, 2004 3:53 pm |
|
 |
|
|
All times are GMT - 8 Hours