 |
| Author |
Message |
rory_f
Joined: 27 Oct 2008
Posts: 44
|
 Making sure AMANDA does not run "over" tapesize
Hey,
So I've noticed that Amanda will sometimes fill a tape with more than 400 GB (LTO3). I'm assuming this is down to compression? Is there a way to stop this from happening other than turning hardware and software compression off?
Thanks,
|
| Wed Jan 05, 2011 8:34 am |
|
 |
Brian Cuttler
Guest
|
 Making sure AMANDA does not run "over" tapesize
Rory,
When using SW compression amanda knows what the typical
compression is for any given DLE (disklist entry) and will
(I believe) use the expected compressed size of the data when
estimating overall tape usage.
When I use HW compression I often lie about the tape length,
extending the actual physical size by the expected compression
amount so that I am able to utilize the full physical tape.
This is very valuable for me in the couple of cases where I
have a non-spanning DLE that is larger than the physical tape
would be without compression; otherwise amanda would report
that the DLE was larger than the tape.
What goal/outcome are you seeking?
Brian
+----------------------------------------------------------------------
|This was sent by rory < at > mrxfx.com via Backup Central.
|Forward SPAM to abuse < at > backupcentral.com.
+----------------------------------------------------------------------
---
Brian R Cuttler brian.cuttler < at > wadsworth.org
Computer Systems Support (v) 518 486-1697
Wadsworth Center (f) 518 473-6384
NYS Department of Health Help Desk 518 473-0773
IMPORTANT NOTICE: This e-mail and any attachments may contain
confidential or sensitive information which is, or may be, legally
privileged or otherwise protected by law from further disclosure. It
is intended only for the addressee. If you received this in error or
from someone who was not authorized to send it to you, please do not
distribute, copy or use it or any attachments. Please notify the
sender immediately by reply e-mail and delete this from your
system. Thank you for your cooperation.
|
| Wed Jan 05, 2011 8:52 am |
|
 |
rory_f
Joined: 27 Oct 2008
Posts: 44
|
 Re: Making sure AMANDA does not run "over" tapesiz
I want to ensure tapes are filled 100% each time where possible. I've written a Python script that looks at a directory, works out the sizes, and creates disklist files of roughly equal size: it tries to build each disklist file from entries totalling about 400 GB, the size of a tape. I know amanda will fill a tape to 100% where possible, but when compression is in use this doesn't work out: the first two tapes take 500 GB+ each and the last tape is left with only 200 GB. That is a waste of 200 GB. I'm trying to make sure all tapes are full where possible and not waste any space.
I know I could take the tape that is half full and archive its contents again with added content, but this is time-consuming.
I just want to make sure amanda is working with my script so that all tapes are being filled.
Do you see what I'm getting at?
Thanks,
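For readers following along, here is a minimal sketch of the kind of grouping described in this post. It is not the poster's actual script, which is not shown in the thread. The function name `group_entries`, the example paths, and the first-fit-decreasing strategy are all assumptions made for illustration, and sizes are taken to be uncompressed GB.

```python
def group_entries(sizes_gb, target_gb=400.0):
    """Group entries into tape-sized sets (hypothetical helper).

    Uses a greedy first-fit-decreasing pass: the largest entries are
    placed first, each into the first group that still has room.
    An entry larger than target_gb ends up alone in its own group.
    """
    groups = []  # each group is {"total": float, "entries": [str, ...]}
    ordered = sorted(sizes_gb.items(), key=lambda kv: kv[1], reverse=True)
    for name, size in ordered:
        for group in groups:
            if group["total"] + size <= target_gb:
                group["entries"].append(name)
                group["total"] += size
                break
        else:
            # No existing group had room, so start a new one.
            groups.append({"total": size, "entries": [name]})
    return groups


if __name__ == "__main__":
    # Made-up directory sizes in GB, purely for illustration.
    example = {"/shots/a": 250, "/shots/b": 150, "/shots/c": 300,
               "/shots/d": 100, "/shots/e": 200}
    for number, group in enumerate(group_entries(example), start=1):
        print(number, group["total"], group["entries"])
```

Each resulting group could then be written out as one disklist file. Note that the grouping works on uncompressed sizes, which is exactly why tapes written with compression end up holding more than the 400 GB target.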
|
| Wed Jan 05, 2011 8:59 am |
|
 |
choogendyk
Joined: 27 Jul 2007
Posts: 294
|
 Making sure AMANDA does not run "over" tapesize
On 1/5/11 12:00 PM, rory_f wrote:
I want to ensure tapes are filled 100% each time where possible. I've written a script in python to look at directory, figure out size, and create a disklist which will ensure a round about size for each disklist file - so for instance it will try to create a disklist file that contains entries in groups of 400gb's - the size of a tape. I know amanda will fill a tape to 100% where possible but sometimes, if it is using compression, this doesn't work, and the first two tapes will fill 500gb+ and then the last tape will be left with 200gb. This is a waste of 200gb - I'm trying to make sure all tapes are full where possible and not waste any space.
Not to be rude, but that's a false economy.
It could just as easily be said that you would be wasting tape capacity by not using compression.
You are asking to not allow more than 400GB per tape, and thus no more than 1200GB on the set of 3.
Then you are complaining that the 1200GB is unevenly distributed across the 3 tapes, because
compression allowed more than 400GB on each of the first 2 tapes. So, stated another way, you are
asking that the "wasted" (or unused) 300GB (or so) of space be distributed across all 3 tapes,
rather than just being on the last tape, and/or to just not use compression so that you can imagine
that you are not wasting tape.
500GB per tape on a 400GB tape means that your data is shrinking by about 20% (a ratio of
roughly 1.25:1). If that is consistent, have your python script queue up somewhere between
1400GB and 1500GB for backup, the choice depending on how close you want to shave it (with a
higher risk of overrunning the last tape). Then you are being economical with your tape usage,
getting a couple hundred more GB on the set of tapes than you were originally thinking.
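The arithmetic behind those figures can be sketched as follows. This is only an illustration under the assumption that the observed 500GB per 400GB tape holds for all of the data; the function names and the 5% safety margin are hypothetical choices, not something from amanda.

```python
def space_saving(native_gb, observed_gb):
    """Fraction by which the data shrinks, e.g. 0.2 for 500 GB on a 400 GB tape."""
    return 1.0 - native_gb / observed_gb


def queue_target_gb(native_gb, observed_gb, tapes, safety_margin=0.05):
    """Uncompressed GB to queue for a set of tapes (hypothetical helper).

    native_gb     -- uncompressed capacity of one tape
    observed_gb   -- uncompressed data that actually fit on one tape
    tapes         -- number of tapes in the set
    safety_margin -- fraction held back to reduce the risk of overrunning
    """
    ratio = observed_gb / native_gb           # 1.25 for 500 GB on 400 GB
    full_capacity = native_gb * ratio * tapes
    return full_capacity * (1.0 - safety_margin)


if __name__ == "__main__":
    print(round(space_saving(400, 500), 3))
    print(round(queue_target_gb(400, 500, 3), 1))
    print(round(queue_target_gb(400, 500, 3, safety_margin=0.0), 1))
```

With no margin the target for three tapes is the full 1500GB; a margin of a few percent lands inside the 1400GB to 1500GB range suggested above.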
Of course, compressibility varies widely. Huge directories of TIFF and JPEG files can be essentially
incompressible. Typical unix directories of predominantly text-based stuff, like log files or
configuration files, are highly compressible, and repetitive things like Apache access logs can
compress as much as 10:1. So, you have to know your data to efficiently plan what you are trying to do.
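One rough way to get to know your data is to compress a sample and look at the ratio. The sketch below uses Python's standard `zlib` module as a stand-in; that is an assumption made for illustration, since a tape drive's hardware compression uses a different algorithm and will give different numbers. The helper names and the 1 MiB sample size are hypothetical.

```python
import zlib


def compression_ratio(data, level=6):
    """Return original size divided by zlib-compressed size.

    A value near 1.0 (or below) means the data is essentially
    incompressible; higher values mean it compresses well.
    """
    if not data:
        return 1.0
    return len(data) / len(zlib.compress(data, level))


def sample_ratio(path, sample_bytes=1 << 20):
    """Estimate the ratio for a file from its first sample_bytes bytes."""
    with open(path, "rb") as handle:
        return compression_ratio(handle.read(sample_bytes))


if __name__ == "__main__":
    # Repetitive log-like text compresses very well.
    log_like = b'127.0.0.1 - - "GET /index.html HTTP/1.1" 200 512\n' * 2000
    print(compression_ratio(log_like) > 5)
    # 256 distinct bytes with no repetition do not compress at all.
    print(compression_ratio(bytes(range(256))) < 1.05)
```

Sampling only the start of each file is a shortcut; image formats with uncompressed headers followed by packed pixel data may look more compressible than they really are.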
--
---------------
Chris Hoogendyk
-
O__ ---- Systems Administrator
c/ /'_ --- Biology& Geology Departments
(*) \(*) -- 140 Morrill Science Center
~~~~~~~~~~ - University of Massachusetts, Amherst
<hoogendyk < at > bio.umass.edu>
---------------
Erdös 4
|
| Wed Jan 05, 2011 10:14 am |
|
 |
rory_f
Joined: 27 Oct 2008
Posts: 44
|
 Re: Making sure AMANDA does not run "over" tapesiz
OK, I totally see your point; you are quite right. The majority of the files being backed up are images (dpx, exr, cin, etc.), as we are a VFX house.
I guess with such a wide range of file types there's no real way to predict the amount of compression, is there?
Perhaps what I'm looking for is a guarantee of the most economical way of filling the tapes. It would be great to count on compression of 20% every time, but it seems to vary too much to rely on AMANDA to work this way.
Thanks for your view though. Initially I didn't look at it that way.
|
| Wed Jan 05, 2011 10:19 am |
|
 |
Brian Cuttler
Guest
|
 Making sure AMANDA does not run "over" tapesize
Rory,
On Wed, Jan 05, 2011 at 12:00:40PM -0500, rory_f wrote:
I want to ensure tapes are filled 100% each time where possible. I've
written a script in python to look at directory, figure out size, and
create a disklist which will ensure a round about size for each disklist
file - so for instance it will try to create a disklist file that contains
entries in groups of 400gb's - the size of a tape. I know amanda will fill
a tape to 100% where possible but sometimes, if it is using compression,
this doesn't work, and the first two tapes will fill 500gb+ and then the
last tape will be left with 200gb. This is a waste of 200gb - I'm trying
to make sure all tapes are full where possible and not waste any space.
I'm not certain I understand your example. If you have 1200 GB to
write and you write 500 GB to each of the first two tapes and 200 GB
to the last tape, you are using the same three tapes you'd use if you
wrote 400 GB to each. You either waste the space equally at the end
of all three tapes or all at once on the last tape.
If a tape will hold 500 GB and you only put 400 GB on it, the tape
isn't full. This "overfull" is a hazard of HW compression that
you don't experience with SW compression. If you know from
experience that your data will compress by 20% and you have a
400 GB tape and you are using HW compression, you can lie in
the tapetype definition and claim it's a 500 GB tape so amanda
will be able to estimate usage better, but it's not going to be
an exact fit, because even with SW compression, where amanda can
track the data, there is always a little real-life flux in the
data compression (unless of course the data is static, in which
case you might want to archive it rather than back it up).
There are settings within amanda that help to fill tapes.
Admittedly I've only used "taperalgo" and have been very happy
with the "largestfit" setting, but there are also the values
flush-threshold-dumped
flush-threshold-scheduled
taperflush
which have example settings in the amanda.conf file.
These settings should help you fill tapes, and will even
delay flushing of the work area to tape if the tape
is not expected to be filled.
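To give a feel for what a "largestfit" choice means, here is a small sketch of the idea: of the dumps waiting to be written, pick the biggest one that still fits in the space left on the tape. This is an illustration only and not amanda's actual code; the function names and the dump sizes are hypothetical.

```python
def largest_fit(pending_gb, space_left_gb):
    """Return the name of the largest pending dump that fits, or None."""
    fitting = {name: size for name, size in pending_gb.items()
               if size <= space_left_gb}
    if not fitting:
        return None
    return max(fitting, key=fitting.get)


def fill_tape(pending_gb, tape_gb):
    """Repeatedly apply largest_fit until nothing else fits.

    Returns the list of dumps written, in order, and the space left over.
    """
    remaining = dict(pending_gb)  # work on a copy, leave the input alone
    space = tape_gb
    written = []
    while True:
        choice = largest_fit(remaining, space)
        if choice is None:
            break
        written.append(choice)
        space -= remaining.pop(choice)
    return written, space


if __name__ == "__main__":
    # Made-up dump sizes in GB.
    pending = {"home": 180, "db": 150, "logs": 90, "mail": 60, "etc": 20}
    print(fill_tape(pending, 400))
```

In this made-up example the 90 GB dump is skipped once only 70 GB remain, and a smaller dump is written instead, which is how a largest-fit policy tends to leave less unused space at the end of a tape.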
For my money the more you can delegate to amanda and the
less fiddling with the disklist or other files the better.
I know I could take the tape that is half full and archive the contents again with added content but this is time consuming.
I just want to make sure amanda is working with my script to make sure all tapes are being filled.
Yah... but you can't fill them by not filling them.
Do you see what i'm getting at?
I think so, do you see where I'm going ?
Thanks,
You're welcome,
Brian
|
| Wed Jan 05, 2011 10:31 am |
|
 |
Charles Curley
Guest
|
 Making sure AMANDA does not run "over" tapesize
On Wed, 05 Jan 2011 13:19:42 -0500
rory_f <amanda-forum < at > backupcentral.com> wrote:
Perhaps what i'm looking to do is guarantee the most economical way
of filling the tapes.
Which is more valuable, a few terabytes of tape space, or your time?
--
Charles Curley /"\ ASCII Ribbon Campaign
Looking for fine software \ / Respect for open standards
and/or writing? X No HTML/RTF in email
http://www.charlescurley.com / \ No M$ Word docs in email
Key fingerprint = CE5C 6645 A45A 64E4 94C0 809C FFF6 4C48 4ECD DFDB
|
| Wed Jan 05, 2011 10:45 am |
|
 |
rory_f
Joined: 27 Oct 2008
Posts: 44
|
 Re: Making sure AMANDA does not run "over" tapesiz
I guess that depends if you ask me (as it is my time) or the boss (who pays for the tapes ;-])
|
| Wed Jan 05, 2011 10:52 am |
|
 |
|
|
The time now is Thu May 24, 2012 7:02 am | All times are GMT - 8 Hours
|