Intermittent connection failures
Post Intermittent connection failures 
I have something like a terabyte of data to back up. I do it each day
with more than ten rdiff-backup scripts (one for each directory).

Lately I have been getting these errors:

Received disconnect from nn.nn.nn.nn: 2: Timeout, your session not responding.
Fatal Error: Lost connection to the remote system

(nn.nn.nn.nn stands for source ip).

the script is of this type:

rdiff-backup --exclude-special-files --force \
  --remote-schema "ssh -i /root/.ssh/myserver.dsa %s rdiff-backup --server" \
  root@nn.nn.nn.nn::/path/to/dir /path/to/dir 1>> logfile 2>> logfile

I have tried several things:
- simply rerunning the script
- throwing away the rdiff-backup-data directory, then trying again
- the same as above, then using rsync, then rdiff-backup again
- adding ServerAliveInterval 180 to /etc/ssh/ssh_config on the target
  and ClientAliveInterval 180 to sshd_config on the source machine
- adding ConnectTimeout 300 to ssh_config on the target machine
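
A note on the ssh settings in the list above: ConnectTimeout only limits how long ssh waits while a connection is being established, so it would not be expected to help with a session that drops later. The keepalive options can also be passed per command through --remote-schema instead of ssh_config, which makes it easier to confirm that they are really in effect. A minimal Python sketch, assuming the key path from the script above; the function name and the interval values are illustrative, not recommendations:

```python
def remote_schema(identity_file, alive_interval=60, alive_count_max=5):
    """Return a --remote-schema string with ssh keepalive options set inline."""
    ssh_options = [
        "-i", identity_file,
        "-o", "ServerAliveInterval=%d" % alive_interval,
        "-o", "ServerAliveCountMax=%d" % alive_count_max,
    ]
    # %%s becomes the literal %s that rdiff-backup replaces with the host.
    return "ssh %s %%s rdiff-backup --server" % " ".join(ssh_options)


print(remote_schema("/root/.ssh/myserver.dsa"))
```

The printed string can be pasted after --remote-schema in the existing script. Whether a shorter interval actually keeps the session alive through the router is something only a test run can show.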

Adjusting ssh_config seems to have no effect. These errors seem to
center on some dirs and not on others. But the errors don't occur every
time, which makes it hard to find the reason.

Finally, here is some version information (same versions on source and
target):
- CentOS 5.6 (source 64-bit, target 32-bit)
- rdiff-backup 1.2.8

I could check versions of dependencies, if necessary. There may be
version differences because one system is 32- and the other 64-bit.

One theory is that the router for my target machine (my modem is in
bridge mode, so the router is owned by the cable operator) is misbehaving.
They changed all the IP addresses recently, and I think these errors
started to appear around the same time. Probably the whole machine was
replaced, because the routing capabilities for my LAN changed as well
(no more Bonjour printer sharing, etc.).

Any thoughts about the cause of the rdiff-backup errors, and a remedy?

Thanks,
Jussi

_______________________________________________
rdiff-backup-users mailing list at rdiff-backup-users < at > nongnu.org
https://lists.nongnu.org/mailman/listinfo/rdiff-backup-users
Wiki URL: http://rdiff-backup.solutionsfirst.com.au/index.php/RdiffBackupWiki

Post Intermittent connection failures 
Hi Jussi,

You can increase the verbosity with the -v<value> parameter; maybe this will give you some more insight into what's going on. Are there any patterns? Is it always the same backup script where these failures occur?

Patrick.
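
One way to look for such patterns is to count the failures per log file. The sketch below is only an illustration: it assumes each backup script appends to its own log file (the original post does not say this), and it matches the "Fatal Error" line quoted in the first message. The function name is made up for the example.

```python
FAILURE_MARKER = "Fatal Error: Lost connection to the remote system"


def count_failures(log_paths):
    """Map each log file path to the number of lost-connection errors in it."""
    counts = {}
    for path in log_paths:
        with open(path, errors="replace") as handle:
            # One marker line is written per failed session.
            counts[path] = sum(1 for line in handle if FAILURE_MARKER in line)
    return counts
```

Calling it with the list of log files and sorting the result by count shows whether the drops cluster on particular directories, which would fit the observation that some directories fail more often than others.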


Post Intermittent connection failures 
Hi,

Another way to find out what's going on could be to add -v to the ssh
command (remote schema).
Right now, all we know is that it fails, but it would be good to know
whether the problem is with rdiff-backup or simply what the
subject line says: a connection failure.

Side note: please remove the --force argument from the command line. It
has a habit of breaking things. Use it with caution.
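
Putting these suggestions together, a diagnostic run could look roughly like the command built below: ssh gets -v inside the remote schema, rdiff-backup gets a verbosity level, and --force is left out. This is a sketch only. The function name and the default verbosity of 5 are assumptions; the host, paths and key file are the placeholders from the original post.

```python
def diagnostic_command(host, remote_dir, local_dir, identity_file, verbosity=5):
    """Build an rdiff-backup argument list for a verbose run without --force."""
    # -v makes ssh print connection debugging output on stderr.
    schema = "ssh -v -i %s %%s rdiff-backup --server" % identity_file
    return [
        "rdiff-backup",
        "-v%d" % verbosity,
        "--exclude-special-files",
        "--remote-schema", schema,
        "root@%s::%s" % (host, remote_dir),
        local_dir,
    ]


cmd = diagnostic_command("nn.nn.nn.nn", "/path/to/dir", "/path/to/dir",
                         "/root/.ssh/myserver.dsa")
print(" ".join(cmd))
```

The list can be handed to subprocess.call with stdout and stderr pointed at the log file, so the ssh debug lines end up next to the rdiff-backup messages.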

You mention having tried with rsync. Did you do an rsync to a local copy,
or to the rdiff-backup tree? With the rdiff-backup (not up-to-date) tree,
rsync would only update some files, and possibly not suffer from randomly
occurring network outages.
When you subsequently use rdiff-backup to turn the tree into an
rdiff-backup managed backup tree, rdiff-backup has to process each and
every file (to create the rdiff-backup-data/ metadata files), which can
take much longer, with an increased risk of breaking halfway if the
network is unstable.


My money would be on the network stability.
Maybe rdiff-backup could somehow be made more robust, but I doubt the
current design can be patched to support reconnection attempts when
disconnected unexpectedly.
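
Since reconnecting inside a session is not available, the remaining option is to rerun the whole command from outside, which is what "simply rerun the script" does by hand. A minimal retry wrapper might look like the sketch below. All names and defaults are illustrative, it does nothing about the underlying network problem, and every retry starts the session again from the beginning.

```python
import subprocess
import time


def run_with_retries(command, attempts=3, delay_seconds=300,
                     runner=subprocess.call, sleep=time.sleep):
    """Run command until it exits with status 0 or the attempts run out.

    Returns the exit status of the last run.
    """
    status = None
    for attempt in range(1, attempts + 1):
        status = runner(command)
        if status == 0:
            break
        if attempt < attempts:
            # Wait before retrying so a short outage has time to pass.
            sleep(delay_seconds)
    return status
```

The runner and sleep parameters exist only so the loop can be exercised without a real backup; in normal use the defaults apply and command is the rdiff-backup argument list.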


--
Maarten




Post Intermittent connection failures 
On 07/04/2011 06:48 AM, Maarten Bezemer wrote:
Hi,

Another way to find out what's going on, could be to add -v to the ssh
command (remote schema).

I had a similar problem and kludged my way through it with a Python
script that writes an rdiff-backup shell script, which adds top-level
directories one at a time. By checkpointing in this way I avoided having
to start over each time I lost the connection. Once I had the full
directory tree backed up, the incrementals were small enough to avoid
whatever problem was causing the drops.

Specifically, the script below writes a file called write_backup.sh
that looks like this:

#!/bin/bash -v

echo '+++climate_data/gpcp+++'
/home/phil/usr64/bin/rdiff-backup --include climate_data/gpcp --exclude '**' \
  --remote-schema 'ssh -C %s /home/phil/usr64/bin/rdiff-backup --server' \
  -v 8 climate_data backup_phil::/media/green1/climate_data

echo '+++climate_data/agcm3+++'
/home/phil/usr64/bin/rdiff-backup --include climate_data/gpcp --include climate_data/agcm3 --exclude '**' \
  --remote-schema 'ssh -C %s /home/phil/usr64/bin/rdiff-backup --server' \
  -v 8 climate_data backup_phil::/media/green1/climate_data

echo '+++climate_data/modis+++'
/home/phil/usr64/bin/rdiff-backup --include climate_data/gpcp --include climate_data/agcm3 --include climate_data/modis --exclude '**' \
  --remote-schema 'ssh -C %s /home/phil/usr64/bin/rdiff-backup --server' \
  -v 8 climate_data backup_phil::/media/green1/climate_data

etc.

import glob
import os.path

# Shell snippet for one backup step. %%s survives formatting as the literal
# %s that rdiff-backup replaces with the remote host name.
the_template = """
echo '+++%s+++'
/home/phil/usr64/bin/rdiff-backup %s --exclude '**' \\
  --remote-schema 'ssh -C %%s /home/phil/usr64/bin/rdiff-backup --server' \\
  -v 8 climate_data backup_phil::/media/green1/climate_data
"""

include_string = ' '
with open('write_backup.sh', 'w') as outfile:
    outfile.write('#!/bin/bash -v\n')
    for in_file in glob.glob('climate_data/*'):
        if os.path.isdir(in_file):
            # Each step keeps every earlier directory and adds one more,
            # so a completed step does not have to be transferred again.
            include_string = include_string + '--include %s ' % in_file
            outfile.write(the_template % (in_file, include_string))


Post Intermittent connection failures 
On 04/07/2011 14:48, Maarten Bezemer wrote:
...My money would be on the network stability.
Maybe rdiff-backup could somehow be made more robust, but I doubt the
current design can be patched to support reconnection attempts when
disconnected unexpectedly...

I agree. IMO rdiff-backup should never be run over an internet
connection, only on a stable local network. If you want to get the
rdiff-backup data offsite, create your rdiff-backup repository locally
and then use rsync to mirror it to the offsite location. This is the
technique used for TimeDicer primary/mirror servers.

Dominic
http://www.timedicer.co.uk


Post Intermittent connection failures 
On 5.7.2011 16.13, Dominic Raferd wrote:
I agree. IMO rdiff-backup should never be run over an internet
connection, only on a stable local network. If you want to get the
rdiff-backup data offsite, create your rdiff-backup repository locally
and then use rsync to mirror it to the offsite location. This is the
technique used for TimeDicer primary/mirror servers.

Thanks for all the input. I am now trying Dominic's solution, modified
(because I have easier hands-on access to the target machine):

First I rsync the data from source to target machine, then I use that to
make a local rdiff-backup archive.
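
As a rough illustration of this two-step setup, the commands could be assembled as below. This is a sketch under assumptions: the staging and archive directory arguments, the function name, and the choice of rsync options (-a, --delete, --partial) are not from the thread; the key file and root login follow the original script.

```python
def staged_backup_commands(host, remote_dir, staging_dir, archive_dir,
                           identity_file):
    """Return [rsync command, rdiff-backup command] for a two-step backup."""
    rsync_cmd = [
        "rsync", "-a", "--delete", "--partial",
        "-e", "ssh -i %s" % identity_file,
        # Trailing slashes make rsync copy the directory contents.
        "root@%s:%s/" % (host, remote_dir.rstrip("/")),
        staging_dir.rstrip("/") + "/",
    ]
    # The second step is purely local, so a network drop cannot interrupt it.
    rdiff_cmd = ["rdiff-backup", staging_dir, archive_dir]
    return [rsync_cmd, rdiff_cmd]
```

Only the rsync step crosses the network, and an interrupted rsync can simply be run again. The cost is the extra disk space for the staging copy.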

(Luckily, 2 TB hard drives are already quite inexpensive.)

If it works, then we will know that the errors were related to an unstable
network connection, as I believe they were. Too bad I did not get more
exact information on the causes. I did increase verbosity in both
rdiff-backup and ssh, but all I found out was that the connection seems
to break now and then, quite irregularly.

- Jussi

