Let me get to the real point
Post Let me get to the real point 
Ok, let me get to the point.

The server is F15. The client was just upgraded to F16 (the upgrade itself
failed, so it is now a clean install).

The backups had been working. Now they're not. I'm trying to troubleshoot.

The symptom: rsync starts on the client, but then nothing happens. Note the CPU
time used:

ps aux | grep rsync
root 13636 0.5 0.1 119300 4528 ? Ss 13:18 0:25 /usr/bin/rsync --server --sender --p...

It just hangs.

No useful log messages on client or server.


------------------------------------------------------------------------------
RSA(R) Conference 2012
Save $700 by Nov 18
Register now
http://p.sf.net/sfu/rsa-sfdev2dev1
_______________________________________________
BackupPC-users mailing list
BackupPC-users < at > lists.sourceforge.net
List: https://lists.sourceforge.net/lists/listinfo/backuppc-users
Wiki: http://backuppc.wiki.sourceforge.net
Project: http://backuppc.sourceforge.net/

Post Let me get to the real point 
On Tue, Nov 15, 2011 at 1:40 PM, Neal Becker <ndbecker2 < at > gmail.com> wrote:
[...]

Did you do this as a manual backup[1]? Perhaps that will give you more
information.

Richard

[1] http://backuppc.sourceforge.net/faq/debugXfer.html


Post Let me get to the real point 
On Tue, Nov 15, 2011 at 02:40:24PM -0500, Neal Becker wrote:
[...]

What does lsof against the rsync on the client side show as open files?

If you run strace against rsync on the client side, is it sitting in a
select loop?

Is the server-side perl process accumulating any CPU time?

--
-- rouilj

John Rouillard System Administrator
Renesys Corporation 603-244-9084 (cell) 603-643-9300 x 111
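
For anyone following along, the client-side checks asked about above could be gathered in one go with something like the sketch below. `inspect_proc` is a made-up helper name, not part of rsync or BackupPC, and the sketch assumes only that `ps` and `lsof` may or may not be installed on the client.

```shell
# Hypothetical helper (not part of BackupPC or rsync): gather basic facts
# about a process that appears to be hung, given its PID.
inspect_proc() {
    pid="$1"
    # kill -0 delivers no signal; it only tests that the PID exists and
    # that we are allowed to signal it.
    if ! kill -0 "$pid" 2>/dev/null; then
        echo "no such process (or no permission): $pid" >&2
        return 1
    fi
    echo "== process $pid =="
    # State and accumulated CPU time. Run this twice, a minute apart:
    # a TIME value that never grows suggests the process is blocked.
    if command -v ps >/dev/null 2>&1; then
        ps -o pid= -o stat= -o time= -o args= -p "$pid" || true
    fi
    # Open files, pipes and sockets, when lsof is installed.
    if command -v lsof >/dev/null 2>&1; then
        lsof -p "$pid" || true
    fi
    return 0
}
```

Run as root on the client, `inspect_proc 13636` would use the PID from the ps output in the original post. To see which system call the process is sitting in, `strace -p 13636` attaches to it; a process parked in select() is normally waiting for the other end of the connection.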


Post Let me get to the real point 
John Rouillard wrote:

On Tue, Nov 15, 2011 at 02:40:24PM -0500, Neal Becker wrote:
[...]

What does lsof against the rsync on the client side show as open files?

If you run strace against rsync on the client side is it sitting in a
select loop?

Is the server side perl process accumulating any time?


Actually, I was wrong. The server-side perl process is using 100% CPU. I'll try
running it again tomorrow when I have more time.
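
One way to confirm that a process really is burning CPU, rather than eyeballing top, is to sample its accumulated CPU time twice. The sketch below is Linux-only because it reads /proc, and the helper names `cpu_ticks` and `cpu_ticks_delta` are invented for this example.

```shell
# Hypothetical helpers; Linux only, since they read /proc.
cpu_ticks() {
    [ -r "/proc/$1/stat" ] || return 1
    # /proc/PID/stat is "pid (comm) state ...". comm may contain spaces,
    # so drop everything through the last ") " before splitting fields.
    # In what remains, utime and stime are fields 12 and 13.
    sed 's/^.*) //' "/proc/$1/stat" | awk '{ print $12 + $13 }'
}

cpu_ticks_delta() {
    # $1 = PID, $2 = seconds to wait between the two samples.
    before=$(cpu_ticks "$1") || return 1
    sleep "$2"
    after=$(cpu_ticks "$1") || return 1
    echo $((after - before))
}
```

For example, `cpu_ticks_delta <PID> 10` prints the clock ticks consumed in ten seconds; divide by `getconf CLK_TCK` (commonly 100) to get seconds. A result near ten seconds' worth means the process is pinned at 100% CPU, while a result near zero means it is waiting on something.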



Post Let me get to the real point 
OK, so I did a new test today.

1. I did a complete reinstall of BackupPC on the server and removed all
existing data from /var/lib/backuppc.

2. I ran backup manually:
/usr/share/BackupPC/bin/BackupPC_dump -v -f nbecker1

[ blah blah ...]

.: md4 doesn't match: will retry in phase 1; file removed
Out of memory!
Parent read EOF from child: fatal error!
Done: 0 files, 0 bytes
Got fatal error during xfer (Child exited prematurely)
cmdSystemOrEval: about to system /bin/ping -c 1 -w 3 nbecker1
cmdSystemOrEval: finished: got output PING nbecker1.hughes.com (10.32.112.120) 56(84) bytes of data.
64 bytes from nbecker1.hughes.com (10.32.112.120): icmp_req=1 ttl=63 time=0.819 ms

--- nbecker1.hughes.com ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.819/0.819/0.819/0.000 ms

cmdSystemOrEval: about to system /bin/ping -c 1 -w 3 nbecker1
cmdSystemOrEval: finished: got output PING nbecker1.hughes.com (10.32.112.120) 56(84) bytes of data.
64 bytes from nbecker1.hughes.com (10.32.112.120): icmp_req=1 ttl=63 time=0.840 ms

--- nbecker1.hughes.com ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.840/0.840/0.840/0.000 ms

CheckHostAlive: returning 0.840
Backup aborted (Child exited prematurely)
Not saving this as a partial backup since it has fewer files than the prior one (got 0 and 0 files versus 0)
dump failed: Child exited prematurely


Out of memory, huh? This machine has 8 GB. And this same server has been
backing up this same client for the last year, right up until the client was
updated to F16.
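
An "Out of memory!" failure means an allocation failed inside one process, which can happen well below the machine's physical RAM if a per-process limit applies. A rough sketch for ruling that out on the server follows; it is Linux-only, `mem_kb` is a made-up helper name, and it should be run as the same user and from the same kind of shell that launches BackupPC_dump.

```shell
# Hypothetical helper; Linux only. Prints one /proc/meminfo field in kB.
mem_kb() {
    awk -v key="$1:" '$1 == key { print $2 }' /proc/meminfo
}

echo "MemTotal:  $(mem_kb MemTotal) kB"
echo "MemFree:   $(mem_kb MemFree) kB"
echo "SwapFree:  $(mem_kb SwapFree) kB"
# Per-process address-space cap inherited from the invoking shell;
# "unlimited" means there is no such cap.
echo "ulimit -v: $(ulimit -v)"
```

If `ulimit -v` reports a number rather than "unlimited", the perl process can run out of memory at that cap regardless of how much RAM is installed.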



Post Let me get to the real point 
On Wed, Nov 16, 2011 at 02:42:51PM -0500, Neal Becker wrote:
[...]
Out of memory, huh? This machine has 8Gb.

8Gb for the server or client?

And, this same server has been backing up this same client for the
last year, until now the client was updated to f16.

Hmm, I wonder if it's the client that is running out of memory. Are
there a lot more files in the F16 install than there were in the prior OS
install? I assume this is rsync over ssh, and the rsync process can run
out of memory on the client while enumerating the files.

Are you dumping the entire system (say /) or just a small part of it
(e.g. /etc)?

--
-- rouilj

John Rouillard System Administrator
Renesys Corporation 603-244-9084 (cell) 603-643-9300 x 111
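
To put a number on "a lot more files", the entries under the share being backed up can be counted on the client. This is a sketch only: `count_entries` is an invented name, and it assumes a find that supports `-print0` (GNU and BSD find both do).

```shell
# Hypothetical helper: count filesystem entries under a directory.
count_entries() {
    # -xdev keeps find on one filesystem. Entries are NUL-terminated and
    # the NULs counted, so a name containing newlines is counted once.
    find "$1" -xdev -print0 2>/dev/null | tr -dc '\000' | wc -c | tr -d ' '
}
```

Running `count_entries /` versus `count_entries /etc` as root shows how much larger a whole-system dump is than a small share. The count includes the starting directory itself.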

