"all estimate failed"
Post "all estimate failed" 
I use amanda successfully to back up a number of systems remotely, one
via VPN across the Internet. I'm having trouble, however, with my
local system (also the amanda server host) where it seems to be
difficult to get any backups to work. The box is running Gentoo
Linux, xinetd, Amanda version 2.4.4p2, Linux kernel 2.4.20. A typical
report shows:

FAILURE AND STRANGE DUMP SUMMARY:
vishnu hdb2 lev 0 FAILED [disk hdb2, all estimate failed]
vishnu hdb1 lev 0 FAILED [disk hdb1, all estimate failed]

... and the backup fails.

This particular notice is from a test backup on a drive with 2
partitions, one ext2 and the other reiserfs. The dumptype on both is
comp-user or comp-user-tar. All files and directories on both
partitions are mode ugo+r. The amanda xinetd config contains
'only-from' entries for both the private IP address of the box and
'localhost'. /etc/amandahosts likewise contains entries for both the
hostname of the box and 'localhost':

$ cat /etc/amandahosts
localhost amanda
localhost root
[snip snap!]
vishnu.fmp.com amanda
vishnu.fmp.com root

Does anyone have any ideas what I might try to solve this problem?

--
Lindsay Haisley | "Everything works | PGP public key
FMP Computer Services | if you let it" | available at
512-259-1190 | (The Roadie) | <http://www.fmp.com/pubkeys>
http://www.fmp.com | |

Post "all estimate failed" 
On Sunday 11 July 2004 15:49, fmouse < at > fmp.com wrote:
Thus spake Gene Heskett on Sun, Jul 11, 2004 at 04:29:31AM CDT

On Sunday 11 July 2004 03:11, fmouse-amanda < at > fmp.com wrote:
I use amanda successfully to back up a number of systems
remotely, one via VPN across the Internet. I'm having trouble,
however, with my local system (also the amanda server host)
where it seems to be difficult to get any backups to work. The
box is running Gentoo Linux, xinetd, Amanda version 2.4.4p2,
Linux kernel 2.4.20. A typical report shows:

FAILURE AND STRANGE DUMP SUMMARY:
vishnu hdb2 lev 0 FAILED [disk hdb2, all estimate failed]
vishnu hdb1 lev 0 FAILED [disk hdb1, all estimate failed]

... and the backup fails.

This particular notice is from a test backup on a drive with 2
partitions, one ext2 and the other reiserfs. The dumptype on
both is comp-user or comp-user-tar. All files and directories
on both partitions are mode ugo+r. The amanda xinetd config
contains 'only-from' entries for both the private IP address of
the box and 'localhost'. /etc/amandahosts likewise contains
entries for both the hostname of the box and 'localhost':

$ cat /etc/amandahosts
localhost amanda
localhost root
[snip snap!]
vishnu.fmp.com amanda
vishnu.fmp.com root

Does anyone have any ideas what I might try to solve this
problem?

First, one should never use localhost, it is not a unique name,
and amanda must have unique, no mistaking which machine, names,
resolvable in the appropriate /etc/hosts file (or by a local dns
server),

It's unique on this box.

$ dig +short localhost
127.0.0.1

It _will_ bite you at some point. Amanda is very very network aware.
Please use the FQDN of the machine even though that might resolve to a
192.168.xx.xx address.

No ambiguity there.

and likewise, the ~/.amandahosts file must be similarly
treated. I note above that you didn't have the leading dot in the
filename you catted above.

/etc/amandahosts doesn't use a leading dot. ~amanda/.amandahosts
does, and is a symlink to /etc/amandahosts.

Amanda was built as user:

and make install was done as user:

And its perms (~/.amandahosts) are?

This may not be your whole problem, but it will remove another
often encountered source of recovery time failures. Please read
the TOP-TEN-QUESTIONS and FAQ in the tarball.

This isn't a recovery-time issue, it's a backup-time issue. I have
no problem with either backup or recovery on any of the other
systems I have for which I use amanda.

The problem is more complex than this. There is _one_ filesystem on
this box with which amanda has no problems. That's my boot
partition, which is a small (c.a. 15 megs) ext2 boot partition
which I _can_ back up. It's both the smallest and least important
partition on the box, but the fact that I can back it up tells me
that the problem isn't one of access permissions or name
resolution. Amanda doesn't care about file permissions,
apparently, since I can remove read permissions from all files on
this partition and still back it up.

Amstatus shows:

vishnu:hda2 1 7810k dump done (14:35:22), wait for writing to
tape vishnu:hda4 0 planner: [disk hda4, all estimate failed]

hda2 succeeds, hda4 fails. Both are set to a dumptype of
'nocomp-boot-tar' (not appropriate for hda4, but temporarily
changing this eliminates a variable). The only thing which
distinguishes hda2 from any other filesystem on the box is its
small size. I tried a test backup on a spare drive, mounted
temporarily with two nearly identical sets of files on nearly
identically sized partitions (c.a. 1.5G), one ext2 and one
reiserfs, and both failed, which eliminates the filesystem type,
apparently. Amanda has no problem getting estimates on and backing
up filesystems many times this size on other boxes.

The clues point to serious slowdown someplace.

I'll take a look for the TOP-TEN-QUESTIONS FAQ and see if it helps,
but I'm guessing that it probably won't. I'm wondering if this is
some kind of kernel issue or something equally weird.

The only other item that comes to mind is the etimeout value in your
amanda.conf. Just for testing, multiply it by 10 for tonight, that
big a filesystem may be too slow for the default 5 minutes. It's in
seconds BTW. If that fixes the estimates, you may have to expand the
dtimeout value also.
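
Before changing anything it can help to confirm what the config actually sets. The following is only a sketch: `conf_value` is a helper written for this illustration (it is not part of Amanda), the config path is hypothetical, and the 300 second fallback is the default quoted in this thread. It assumes simple "key value" lines in amanda.conf.

```shell
#!/bin/sh
# Print the value of a simple "key value" parameter from an
# amanda.conf-style file, or a fallback when the key is absent.
# usage: conf_value FILE KEY FALLBACK   (hypothetical helper)
conf_value() {
    awk -v key="$2" -v fallback="$3" '
        { sub(/#.*/, "") }            # drop trailing comments
        $1 == key { value = $2 }      # the last setting in the file wins
        END { print (value != "" ? value : fallback) }
    ' "$1"
}

# Hypothetical path; point this at your own configuration.
conf=${1:-/etc/amanda/DailySet1/amanda.conf}
if [ -r "$conf" ]; then
    echo "etimeout: $(conf_value "$conf" etimeout 300) seconds"
    echo "dtimeout: $(conf_value "$conf" dtimeout unset)"
fi
```

If etimeout turns out to be absent, adding a line such as `etimeout 3000` to amanda.conf would be the "multiply it by 10" test suggested above.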

--
Cheers, Gene
There are 4 boxes to be used in defense of liberty.
Soap, ballot, jury, and ammo.
Please use in that order, starting now. -Ed Howdershelt, Author
Additions to this message made by Gene Heskett are Copyright 2004,
Maurice E. Heskett, all rights reserved.

Post "all estimate failed" 
On Sunday 11 July 2004 16:36, fmouse < at > fmp.com wrote:
Thus spake Gene Heskett on Sun, Jul 11, 2004 at 03:10:06PM CDT

$ cat /etc/amandahosts
localhost amanda
localhost root
[snip snap!]
vishnu.fmp.com amanda
vishnu.fmp.com root

Does anyone have any ideas what I might try to solve this
problem?

First, one should never use localhost, it is not a unique name,
and amanda must have unique, no mistaking which machine, names,
resolvable in the appropriate /etc/hosts file (or by a local dns
server),

It's unique on this box.

$ dig +short localhost
127.0.0.1

It _will_ bite you at some point. Amanda is very very network
aware. Please use the FQDN of the machine even though that might
resolve to a 192.168.xx.xx address.

The fqdn (vishnu.fmp.com) _is_ in amandahosts and resolves. Putting
localhost in the same file shouldn't make any difference, it just
expands access permissions, and amanda doesn't, or shouldn't rely
on it. I'll explore this issue later, but it obviously isn't the
problem here since amanda _can_ resolve the box, and deal with one
of the filesystems on it.

No ambiguity there.

and likewise, the ~/.amandahosts file must be similarly
treated. I note above that you didn't have the leading dot in
the filename you catted above.

/etc/amandahosts doesn't use a leading dot. ~amanda/.amandahosts
does, and is a symlink to /etc/amandahosts.

Amanda was built as user:

and make install was done as user:

Amanda was built and installed as part of a Gentoo build.

Since I'm on an FC1 box, that doesn't answer the question.

And its perms (~/.amandahosts) are?

$ l /etc/amandahosts
-rw-r----- 1 amanda amanda 233 Jan 7 2004 /etc/amandahosts

Looks ok

The amanda user can read it, no problem.

changing this eliminates a variable). The only thing which
distinguishes hda2 from any other filesystem on the box is its
small size. I tried a test backup on a spare drive, mounted
temporarily with two nearly identical sets of files on nearly
identically sized partitions (c.a. 1.5G), one ext2 and one
reiserfs, and both failed, which eliminates the filesystem type,
apparently. Amanda has no problem getting estimates on and
backing up filesystems many times this size on other boxes.

The clues point to serious slowdown someplace.

Possibly. The system is speedy enough in other respects. It's the
fastest box I have, and even on the network it's a real performer!

I'll take a look for the TOP-TEN-QUESTIONS FAQ and see if it
helps, but I'm guessing that it probably won't. I'm wondering
if this is some kind of kernel issue or something equally weird.

The only other item that comes to mind is the etimeout value in
your amanda.conf. Just for testing, multiply it by 10 for
tonight, that big a filesystem may be too slow for the default 5
minutes. It's in seconds BTW. If that fixes the estimates, you
may have to expand the dtimeout value also.

I'll check this out. None of these timeouts are spec'd in the
amanda.conf file for the box, and etimeout defaults to 300 seconds,
or 5 minutes. The failure comes back in less than a minute.

That's something else then, probably in the networking somewhere. Is
there a firewall restricting the higher port numbers maybe?

--
Cheers, Gene
There are 4 boxes to be used in defense of liberty.
Soap, ballot, jury, and ammo.
Please use in that order, starting now. -Ed Howdershelt, Author
Additions to this message made by Gene Heskett are Copyright 2004,
Maurice E. Heskett, all rights reserved.

Post "all estimate failed" 
On Sunday 11 July 2004 17:04, fmouse < at > fmp.com wrote:
Thus spake Gene Heskett on Sun, Jul 11, 2004 at 03:10:06PM CDT

BTW, I've been Cc: in the list as this really should be a public
exchange. That's what the list is for, and it's searchable I believe.

The clues point to serious slowdown someplace.

I'll take a look for the TOP-TEN-QUESTIONS FAQ and see if it
helps, but I'm guessing that it probably won't. I'm wondering
if this is some kind of kernel issue or something equally weird.

The only other item that comes to mind is the etimeout value in
your amanda.conf. Just for testing, multiply it by 10 for
tonight, that big a filesystem may be too slow for the default 5
minutes. It's in seconds BTW. If that fixes the estimates, you
may have to expand the dtimeout value also.

The time to failure is about 50 seconds, with etimeout set to 600
seconds, so it's not a slowdown or timeout issue unless there's
some other timing factor involved. I really wish there were
better logging for this kind of thing. Running it under strace
_might_ help.

BTW there are logfiles being kept, fairly verbose ones at that,
wherever you have it set to in your amanda.conf, or in the default
dir, probably in /var/amanda-dbg or similar. They will tell you a
lot more precisely what is going on. I'd start there rather than on
gentoo.


I'll check with the Gentoo devs and see if any of them have any
experience with this sort of thing.

--
Cheers, Gene
There are 4 boxes to be used in defense of liberty.
Soap, ballot, jury, and ammo.
Please use in that order, starting now. -Ed Howdershelt, Author
Additions to this message made by Gene Heskett are Copyright 2004,
Maurice E. Heskett, all rights reserved.

Post "all estimate failed" 
Thus spake Gene Heskett on Sun, Jul 11, 2004 at 04:49:03PM CDT
On Sunday 11 July 2004 17:04, fmouse < at > fmp.com wrote:
Thus spake Gene Heskett on Sun, Jul 11, 2004 at 03:10:06PM CDT

BTW, I've been Cc: in the list as this really should be a public
exchange. That's what the list is for, and it's searchable I believe.

Sorry. Dealing with Majordomo lists is a PITA, and I didn't notice the CC
either. This one's cc'd.

BTW there are logfiles being kept, fairly verbose ones at that,
wherever you have it set to in your amanda.conf, or in the default
dir, probably in /var/amanda-dbg or similar. They will tell you a
lot more precisely what is going on. I'd start there rather than on
gentoo.


I'll check with the Gentoo devs and see if any of them have any
experience with this sort of thing.

I know about the amanda logfiles, and they're uninformative. For instance,

/var/log/amanda/Fmouse# cat log.20040711.2
DISK planner vishnu /dev/fasttrak/fmouse_gen
DISK planner vishnu hda1
START planner date 20040711
WARNING planner tapecycle (5) <= runspercycle (28)
START driver date 20040711
INFO planner Adding new disk vishnu:hda1.
FAIL planner vishnu hda1 20040711 0 [disk hda1, all estimate failed]
FAIL planner vishnu /dev/fasttrak/fmouse_gen 20040711 0 [disk /dev/fasttrak/fmouse_gen, all estimate failed]
FINISH planner date 20040711
WARNING driver WARNING: got empty schedule from planner
STATS driver startup time 17.290
ERROR taper no-tape [rewinding tape: No medium found]
FINISH driver date 20040711 time 30.142

"all estimate failed", which is where I started with this. I'm looking for
a reason _why_ "all estimate failed" and this doesn't help, nor are there
any other logfiles which help.
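
For what the run log does offer, a rough sketch follows. It pulls the failed disks and the total driver time out of a log in the format shown above and compares that time with etimeout. `summarize_log` is a name made up for this illustration, and the conclusion it prints is a heuristic, not something Amanda itself reports.

```shell
#!/bin/sh
# usage: summarize_log LOGFILE ETIMEOUT_SECONDS   (hypothetical helper)
# Lists disks whose estimates failed and reports how long the run took.
summarize_log() {
    awk -v etimeout="$2" '
        # FAIL planner <host> <disk> <date> <level> [reason]
        $1 == "FAIL" && $2 == "planner" { failed++; print "failed: " $3 ":" $4 }
        # FINISH driver date <date> time <seconds>
        $1 == "FINISH" && $2 == "driver" { elapsed = $NF }
        END {
            printf "%d estimate failure(s), run finished after %s s\n", failed, elapsed
            if (elapsed != "" && elapsed + 0 < etimeout + 0)
                print "run ended before etimeout (" etimeout " s): probably not a timeout"
        }
    ' "$1"
}
```

Called as `summarize_log log.20040711.2 300`, it would report a run that finished well inside the estimate timeout, which fits the observation that the failure comes back in under a minute.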

Believe me, Gene, I didn't fall off the turnip truck yesterday on this
stuff. I've been doing professional system administration for many years
and I've been to all the obvious places on this looking for clues. I know
how to segregate a problem on the basis of clues, and set up test situations
and deduce something about what the problem is _not_, which I've done here,
but I'm up against a stone wall and am looking for information on where I
might go that I haven't already gone.

--
Lindsay Haisley | "Everything works | PGP public key
FMP Computer Services | if you let it" | available at
512-259-1190 | (The Roadie) | <http://www.fmp.com/pubkeys>
http://www.fmp.com | |

Post "all estimate failed" 
On Sun, Jul 11, 2004 at 05:21:55PM -0500, fmouse-amanda < at > fmp.com wrote:

BTW there are logfiles being kept, fairly verbose ones at that,
wherever you have it set to in your amanda.conf, or in the default
dir, probably in /var/amanda-dbg or similar. They will tell you a
lot more precisely what is going on.


I know about the amanda logfiles, and they're uninformative. For instance,

/var/log/amanda/Fmouse# cat log.20040711.2
DISK planner vishnu /dev/fasttrak/fmouse_gen
...
FINISH driver date 20040711 time 30.142

"all estimate failed", which is where I started with this.
I'm looking for a reason _why_ "all estimate failed" and this
doesn't help, nor are there any other logfiles which help.


Do you have the /var/amanda-dbg directory that Gene mentioned
and you overlooked? In my case it is /tmp/amanda. That
directory is where amanda places its debugging logfiles.
Gene was not referring to the "logs", but the debug logs.
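
A small sketch for looking at the newest client-side debug files. It assumes the files are named like `sendsize.*.debug` and `amandad.*.debug`, and that the directory is /tmp/amanda unless overridden. `newest_debug` and the `AMANDA_DBGDIR` variable are made up for this illustration; the real directory is whatever your build was configured with.

```shell
#!/bin/sh
# usage: newest_debug DIR PROGRAM   (hypothetical helper)
# Prints the most recently modified debug file for PROGRAM, if any.
newest_debug() {
    # ls -t sorts newest first; an unmatched pattern just yields no output.
    ls -t "$1"/"$2".*debug 2>/dev/null | head -n 1
}

# AMANDA_DBGDIR is not an Amanda setting, only a convenience for this sketch.
dir=${AMANDA_DBGDIR:-/tmp/amanda}
for prog in amandad sendsize; do
    f=$(newest_debug "$dir" "$prog")
    if [ -n "$f" ]; then
        echo "== $f"
        tail -n 20 "$f"
    fi
done
```

The sendsize file is usually the one to read for estimate failures, since it records the estimate command that was run on the client and what came back.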

--
Jon H. LaBadie jon < at > jgcomp.com
JG Computing
4455 Province Line Road (609) 252-0159
Princeton, NJ 08540-4322 (609) 683-7220 (fax)

Post "all estimate failed" 
On Sunday 11 July 2004 21:02, Jon LaBadie wrote:
On Sun, Jul 11, 2004 at 05:21:55PM -0500, fmouse-amanda < at > fmp.com
wrote:
BTW there are logfiles being kept, fairly verbose ones at that,
wherever you have it set to in your amanda.conf, or in the
default dir, probably in /var/amanda-dbg or similar. They will
tell you a lot more precisely what is going on.

I know about the amanda logfiles, and they're uninformative. For
instance,

/var/log/amanda/Fmouse# cat log.20040711.2
DISK planner vishnu /dev/fasttrak/fmouse_gen

...

FINISH driver date 20040711 time 30.142

"all estimate failed", which is where I started with this.
I'm looking for a reason _why_ "all estimate failed" and this
doesn't help, nor are there any other logfiles which help.

Do you have the /var/amanda-dbg directory that Gene mentioned
and you overlooked? In my case it is /tmp/amanda. That
directory is where amanda places its debugging logfiles.
Gene was not referring to the "logs", but the debug logs.

I didn't make myself clear enough on that I guess. :(

That said, I've dug as deep as I feel comfortable with, so jump right
in Jon and see if you can help out.

There is also an env var that can be set which will multiply the
verbosity of the debugging logs considerably. But I haven't used it
in ages, so I don't know the name or the exact syntax to use it.
There could be a reference to it in the docs dir of the archive.

--
Cheers, Gene
There are 4 boxes to be used in defense of liberty.
Soap, ballot, jury, and ammo.
Please use in that order, starting now. -Ed Howdershelt, Author
Additions to this message made by Gene Heskett are Copyright 2004,
Maurice E. Heskett, all rights reserved.

Post "all estimate failed" 
Gene, thanks for your help. I finally seem to be on the track of fixing the
problem. I completely nuked amanda and let Gentoo rebuild it from scratch,
after providing the ebuild with the proper env vars. This didn't help, but
it gave me a clean slate.

After that, I rewrote my amanda.conf and disklist files for the affected
configs using the templates provided with the current version of amanda (my
old configs have been in use for about 5 years, through several versions of
amanda, and were somewhat hacked and outdated). I'm not sure exactly which
parameter(s) made the difference, but a backup of the main local system
seems to be proceeding well.

I note that the Gentoo build defaults to 'localhost' for the server address,
which should _always_ be 127.0.0.1, and hence unambiguous. I trust the
Gentoo devs implicitly. IMHO, they're right up there with the Debian
package maintainers in knowledge and experience. I did, however, use the
FQDN for the server in the ebuild, which I prefer in any event.

--
Lindsay Haisley | "Everything works | PGP public key
FMP Computer Services | if you let it" | available at
512-259-1190 | (The Roadie) | <http://www.fmp.com/pubkeys>
http://www.fmp.com | |

Post "all estimate failed" 
On Mon, Jul 12, 2004 at 05:21:40PM -0500, fmouse-amanda < at > fmp.com enlightened us:
I note that the Gentoo build defaults to 'localhost' for the server address,
which should _always_ be 127.0.0.1, and hence unambiguous. I trust the
Gentoo devs implicitly. IMHO, they're right up there with the Debian
package maintainers in knowledge and experience. I did, however, use the
FQDN for the server in the ebuild, which I prefer in any event.

It is not unambiguous since *every* machine is localhost with an address of
127.0.0.1. Unfortunately that's a problem with binary packages and amanda in
that yes, backing up with localhost/127.0.0.1 will always work, but it can
fail at restore time and can cause other crazy problems occasionally. It is
the same situation in the redhat packages, which is why I always grab the
source rpm and rebuild with the right options for my server. Sounds like
that's what you did and is IMHO the best way to do it if you want to keep
the tracking of packages while using amanda in the best manner.
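
One quick way to see where localhost has crept into a setup is to scan files whose first field is a host name, such as a disklist or .amandahosts. This is a sketch only: `localhost_entries` is a name invented for this illustration, and it assumes whitespace-separated fields with `#` comments.

```shell
#!/bin/sh
# usage: localhost_entries FILE   (hypothetical helper)
# Prints "LINE: text" for each entry whose host field is localhost
# or a 127.x.x.x address.
localhost_entries() {
    awk '
        /^[ \t]*(#|$)/ { next }       # skip comments and blank lines
        $1 == "localhost" || $1 ~ /^localhost\./ || $1 ~ /^127\./ {
            print NR ": " $0
        }
    ' "$1"
}
```

For example, `localhost_entries /etc/amandahosts` would list the two localhost lines from the file shown at the top of this thread.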

Matt

--
Matt Hyclak
Department of Mathematics
Department of Social Work
Ohio University
(740) 593-1263

Post "all estimate failed" 
On Monday 12 July 2004 18:21, fmouse-amanda < at > fmp.com wrote:
Gene, thanks for your help. I finally seem to be on the track of
fixing the problem. I completely nuked amanda and let Gentoo
rebuild it from scratch, after providing the ebuild with the proper
env vars. This didn't help, but it gave me a clean slate.

After that, I rewrote my amanda.conf and disklist files for the
affected configs using the templates provided with the current
version of amanda (my old configs have been in use for about 5
years, through several versions of amanda, and were somewhat hacked
and outdated). I'm not sure exactly which parameter(s) made the
difference, but a backup of the main local system seems to be
proceeding well.

I note that the Gentoo build defaults to 'localhost' for the server
address, which should _always_ be 127.0.0.1, and hence unambiguous.
I trust the Gentoo devs implicitly. IMHO, they're right up there
with the Debian package maintainers in knowledge and experience. I
did, however, use the FQDN for the server in the ebuild, which I
prefer in any event.

The huge majority of the amanda users here, and even Jean-Louis, will
re-iterate about not using localhost because it's not a unique name.
The problem being that the data can be taken to any machine that is
also localhost and installed. This is, among other things, a huge
security hole.

I'm not sure how the ebuilder works, but from the tarball's unpacking,
to a fully installed update here is not more than 5 to 6 minutes
unless you use the checkinstall utility to install it on an rpm based
system. Checkinstall can be handy in that you do the standard
routine of unpack, cd to the unpacked directory, ./configure; make,
doing this as the user amanda, then revert to root or become root,
and do the make install or the checkinstall. Checkinstall takes the
package's built directory, makes a binary rpm out of it, and then
installs the rpm. One can then remove an errant package with the rpm
-e command should it become required.

Configuring the tarball to make it can be confusing, and error prone
if you don't have a photographic memory, so I only do these steps to
build and install it:

(1) tar xzvf name_of_tarball
(2) cp gh.cf name_of_tarball/
(3) chown -R amanda:disk name_of_tarball
(4) become the user 'amanda'
(5) cd name_of_tarball(minus the tar.gz of course)
(6) ./gh.cf (which configures it with my std config options, then
'make's it)
(7) become root
(8) cd name_of_tarball
(9) make install
(10) become user amanda
(11) amcheck configname (to error check the install)

Steps 1-9 take about 5 minutes here.

Here is that gh.cf file:
--------------------
#!/bin/sh
# since I'm always forgetting to su amanda...
if [ "$(whoami)" != 'amanda' ]; then
    echo
    echo "!!!!!!!!!!!! Warning !!!!!!!!!!!!"
    echo "Amanda needs to be configured and built by the user amanda,"
    echo "but must be installed by user root."
    echo
    exit 1
fi
# start from a clean tree so stale configure results are not reused
make clean
rm -f config.status config.cache
./configure --with-user=amanda \
    --with-group=disk \
    --with-owner=amanda \
    --with-tape-device=/dev/nst0 \
    --with-changer-device=/dev/sg1 \
    --with-gnu-ld --prefix=/usr/local \
    --with-debugging=/tmp/amanda-dbg/ \
    --with-tape-server=coyote.coyote.den \
    --with-amandahosts \
    --with-configdir=/usr/local/etc/amanda

make
---------------
I didn't sanitize it any, that hostname won't resolve more than 100
feet away in my woodshop, as my firewall is bulletproof in practice.
Adjust your changer device if required.

--
Cheers, Gene
There are 4 boxes to be used in defense of liberty.
Soap, ballot, jury, and ammo.
Please use in that order, starting now. -Ed Howdershelt, Author
Additions to this message made by Gene Heskett are Copyright 2004,
Maurice E. Heskett, all rights reserved.
