rsnapshot dies during backup
Post rsnapshot dies during backup 
I use rsnapshot to back up roughly 300 GB of data from 4 different servers (3 on the LAN, 1 over the internet).
During the backup, rsnapshot dies, leaving its job partially done and its pidfile behind on the system.
So I ran strace on the rsync process, but the output looks normal:

----snip --------
select(2, NULL, [1], [1], {60, 0}) = 1 (out [1], left {60, 0})
write(1, "\v{=\211N\365\307\235w\303\200Iz\272#\313\322\24\247nG"..., 4092) = 4092
close(3) = 0
open("/var/www/munin/localdomain/localhost.localdomain-vmstat.html", O_RDONLY) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=3142, ...}) = 0
read(3, "<?xml version=\"1.0\" encoding=\"is"..., 3142) = 3142
close(3) = 0
open("/var/www/munin/localdomain/localhost.localdomain.html", O_RDONLY) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=18310, ...}) = 0
read(3, "<?xml version=\"1.0\" encoding=\"is"..., 18310) = 18310
select(2, NULL, [1], [1], {60, 0}) = 1 (out [1], left {60, 0})
write(1, "\374\17\0\7", 4) = 4
select(2, NULL, [1], [1], {60, 0}) = 1 (out [1], left {60, 0})
write(1, "\352\362=\340N\203\332\355\366^{\3555w\323M7\271_\377\365"..., 4092) = 4092
close(3) = 0
select(2, NULL, [1], [1], {60, 0}) = 1 (out [1], left {60, 0})
write(1, "\265\0\0\7\377\377\377\360\377\377\377\357\377\377\377"..., 185) = 185
select(2, NULL, [1], [1], {60, 0}) = 1 (out [1], left {60, 0})
write(1, "\4\0\0\7\377\377\377\377", 8) = 8
select(1, [0], [], NULL, {60, 0}) = 1 (in [0], left {59, 960000})
read(0, "\377\377\377\377\377\377\377\377", 8184) = 8
select(2, NULL, [1], [1], {60, 0}) = 1 (out [1], left {60, 0})
write(1, "\4\0\0\7\377\377\377\377", 8) = 8
select(2, NULL, [1], [1], {60, 0}) = 1 (out [1], left {60, 0})
write(1, "\4\0\0\7\377\377\377\377", 8) = 8
time(NULL) = 1208873960
select(2, NULL, [1], [1], {60, 0}) = 1 (out [1], left {60, 0})
write(1, "\34\0\0\7\30\257\n\0\203\370]\r\377\377\377\377#qV\212"..., 32) = 32
select(1, [0], [], NULL, {60, 0}) = 1 (in [0], left {13, 244000})
read(0, "\377\377\377\377", 8184) = 4
rt_sigaction(SIGUSR1, {SIG_IGN}, NULL, 8) = 0
rt_sigaction(SIGUSR2, {SIG_IGN}, NULL, 8) = 0
select(2, NULL, [1], [1], {60, 0}) = 1 (out [1], left {60, 0})
write(1, "l\0\0\trsync warning: some files va"..., 112) = 112
exit_group(24) = ?
Process 5905 detached

-- snap ----
After this there is no more output; the process no longer exists.
rsnapshot.log says this:

var/www/munin/localdomain/localhost.localdomain-swap-week.png
var/www/munin/localdomain/localhost.localdomain-swap.html
var/www/munin/localdomain/localhost.localdomain-vmstat-day.png
var/www/munin/localdomain/localhost.localdomain-vmstat-month.png
var/www/munin/localdomain/localhost.localdomain-vmstat-week.png
var/www/munin/localdomain/localhost.localdomain-vmstat.html
var/www/munin/localdomain/localhost.localdomain.html

sent 700184 bytes received 224262275 bytes 96903.92 bytes/sec
total size is 212774318371 speedup is 945.82
Tue Apr 22 16:20:39 CEST 2008
end rsnapshot
pidfile existiert noch


The date and the two lines following it are not from rsync/rsnapshot but debugging output from my script ("pidfile existiert noch" means "pidfile still exists").
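
For readers following the trace: it ends with exit_group(24), and the rsync man page documents 24 as a partial transfer due to vanished source files. That matches the impression that rsync itself finished fairly normally. Below is a minimal sketch of a lookup helper; the table holds only a few of the documented codes, and the function name is made up for illustration.

```python
# A few exit codes as listed in the EXIT VALUES section of rsync(1).
# This is a partial table for illustration, not the full list.
RSYNC_EXIT_CODES = {
    0: "Success",
    12: "Error in rsync protocol data stream",
    23: "Partial transfer due to error",
    24: "Partial transfer due to vanished source files",
    30: "Timeout in data send/receive",
}


def describe_rsync_exit(code: int) -> str:
    """Return the documented meaning of an rsync exit code, if it is in the table."""
    return RSYNC_EXIT_CODES.get(code, f"exit code {code} (see rsync(1))")


if __name__ == "__main__":
    # The trace above ends with exit_group(24).
    print(describe_rsync_exit(24))
```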

Does anyone have an idea where to look?

Something I wondered about: every time I started rsnapshot by hand (on a command line, in a screen session),
it ran fine to the end. But this is only an impression.

thanks for help

stephan


--
Kind regards
Stephan Augustin
dgx new media GmbH
Saalbaustr. 8-10
64283 Darmstadt
Tel. +49 6151 8508-003
Fax +49 6151 8508-111

_______________________________________________
rsnapshot-discuss mailing list
rsnapshot-discuss < at > lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rsnapshot-discuss

Post  
We see identical symptoms.
The process dies and the pid file remains. The logfile shows success ... up to a random host. Every night.
Running the 'daily' manually works just fine.

We had rsnapshot running happily for months on a large number of hosts, with several TB of data.

But right now we have run out of ideas on how to debug this further.

Post rsnapshot dies during backup 
Hello Haas,

You wrote on 15.04.11:

> Symptoms of this problem are identical for us.
> The process dies, the pid file remains. Logfile shows success ... up
> to a random host. Every night. Running the 'daily' manually will work
> just fine.
>
> We had rsnapshot running with joy, for months, on a large number of
> hosts. Some TB of data.
>
> But right now, we are even lacking the ideas on how to debug this
> further.

Maybe the target disk is dying. Or the connection (SATA-SATA, net, ...)
is dying.
A colleague and I have seen this problem on one machine: changing the
motherboard cured it.

Best regards!
Helmut


Post rsnapshot dies during backup 
I have seen a system where rsnapshot is run from cron, and if it produced
no output it ran fine. But once it produced a certain amount of output,
rsnapshot was killed.

Importantly, this system did not have a "mail" program installed.
I think the cron job (rsnapshot) was killed when cron tried to email
that output to root (owner of the crontab), but that failed. I think
rsnapshot got a SIGPIPE when it was writing to a pipe that was closed
at the reading end.
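
To make the suspected mechanism concrete, here is a minimal sketch (POSIX only) of a process being killed by SIGPIPE when the reader of its output pipe goes away. It does not involve cron or rsnapshot; the child program and the function name are invented for the demonstration, and the parent closing the pipe merely stands in for a mailer that never reads.

```python
import signal
import subprocess
import sys

# Child program: Python ignores SIGPIPE at startup, so restore the default
# action first, then keep writing to stdout until something stops it.
CHILD = r"""
import os, signal
signal.signal(signal.SIGPIPE, signal.SIG_DFL)
chunk = b"x" * 4096
while True:
    os.write(1, chunk)
"""


def run_with_closed_reader() -> int:
    """Start the child with stdout on a pipe, close the read end, return its status."""
    proc = subprocess.Popen([sys.executable, "-c", CHILD], stdout=subprocess.PIPE)
    proc.stdout.close()  # the reading end disappears while the child is writing
    return proc.wait()   # negative value means "killed by that signal number"


if __name__ == "__main__":
    status = run_with_closed_reader()
    if status < 0:
        print("child killed by", signal.Signals(-status).name)
    else:
        print("child exited with status", status)
```

A job that produces no output never writes to the pipe, which would fit the observation that quiet runs survive.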

See also:
https://bugs.launchpad.net/ubuntu/+source/cron/+bug/151231

The link gives an example setup you can use to test whether cron jobs
die when they produce a certain amount of output. A quick test would
be to check if you can send mail from that system.
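
As a rough first check, one could look for a sendmail-compatible binary before trying to send a real message. The sketch below assumes cron hands its mail to a program called sendmail in one of the listed locations; those paths and the function name are assumptions for illustration, and finding a binary does not prove that delivery works.

```python
import os
import shutil
from typing import Iterable, Optional

# Assumed locations. The mailer a given cron build actually calls is decided
# at build time or in its configuration, so adjust this list for your system.
DEFAULT_CANDIDATES = ("/usr/sbin/sendmail", "/usr/lib/sendmail")


def find_mailer(candidates: Iterable[str] = DEFAULT_CANDIDATES,
                search_path: bool = True) -> Optional[str]:
    """Return the first candidate that is an executable file, else None."""
    for path in candidates:
        if os.path.isfile(path) and os.access(path, os.X_OK):
            return path
    # Optionally fall back to whatever "sendmail" resolves to on PATH.
    return shutil.which("sendmail") if search_path else None


if __name__ == "__main__":
    mailer = find_mailer()
    print("mailer found:" if mailer else "no mailer found", mailer or "")
```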

If you have this problem here are some things you could do to fix it:
(1) install mail, so that cron can send you email if a cron job has output
(2) Add MAILTO="" to that crontab (and any other crontabs) so no mail is sent
(3) Redirect stdout and stderr on every cron job that might produce output
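
Option (3) is normally done with shell redirection in the crontab line itself. The same idea, as a minimal Python sketch: run the job with stdout and stderr appended to a log file, so that cron sees no output and has nothing to mail. The function name and the log path in the comment are hypothetical.

```python
import subprocess


def run_logged(command, log_path):
    """Run command with stdout and stderr appended to log_path; return its exit status."""
    with open(log_path, "ab") as log:
        # stderr is merged into stdout, so nothing is left on the
        # descriptors that cron would otherwise collect for mailing.
        return subprocess.call(command, stdout=log, stderr=subprocess.STDOUT)


# Hypothetical use from a small wrapper script started by cron:
#   run_logged(["rsnapshot", "daily"], "/var/log/rsnapshot-cron.log")
```

The command's exit status is passed through, so a wrapper can still report failure.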

On Fri, Apr 15, 2011 at 03:43:45AM -0700, Haas wrote:
> Symptoms of this problem are identical for us.
> The process dies, the pid file remains. Logfile shows success ... up to a random host. Every night.
> Running the 'daily' manually will work just fine.
>
> We had rsnapshot running with joy, for months, on a large number of hosts. Some TB of data.
>
> But right now, we are even lacking the ideas on how to debug this further.

--
___________________________________________________________________________
David Keegel <djk < at > cyber.com.au> http://www.cyber.com.au/users/djk/
Cyber IT Solutions (ABN 13 053 904 082) Linux & Unix Systems Administration


Post rsnapshot dies during backup 
In the last few days I committed a change to rsnapshot's CVS that catches
SIGPIPE and logs an error message with a hint for this situation.

If anyone out there is having trouble with rsnapshot dying part way
through a cron job and is comfortable with making and installing a
new version of rsnapshot from CVS themselves, please try the new CVS
rsnapshot and see whether it now logs an error that helps you diagnose
the problem.

On Sat, Apr 16, 2011 at 02:32:41PM +1000, David Keegel wrote:
> I have seen a system where rsnapshot is run from cron, and if it produced
> no output it ran fine. But once it produced a certain amount of output,
> rsnapshot was killed.

--
___________________________________________________________________________
David Keegel <djk < at > cyber.com.au> http://www.cyber.com.au/users/djk/
Cyber IT Solutions (ABN 13 053 904 082) Linux & Unix Systems Administration


Post  
Going through your recommendations one by one: our problem disappeared immediately after we installed a postfix package on that machine.

So the solution was as simple as "apt-get install postfix".

Thank you for all the input, guys. You rock.
