Memory usage during regressions
Post Memory usage during regressions 
Hi there,

I'm experiencing quite high memory usage during regressions; I have a
backup server with only 2G of RAM. I'm doing daily backups. Sometimes a
backup fails, and then, of course, rdiff-backup first recovers the most
recent backup which did not fail. During this process, rdiff-backup
blows up to approx. 3G of RAM. Then things start to slow down
(swapping). It's a quite large backup set, about 400G, with a large
history. It doesn't seem to be a memory leak, as the memory usage stays at
3G. Just seems a little bit too much in principle.

Has anyone experienced similar luxurious memory (ab)use during
regressions to the previous backup?

Kind regards,

Claus


_______________________________________________
rdiff-backup-users mailing list at rdiff-backup-users < at > nongnu.org
https://lists.nongnu.org/mailman/listinfo/rdiff-backup-users
Wiki URL: http://rdiff-backup.solutionsfirst.com.au/index.php/RdiffBackupWiki

Post Memory usage during regressions 
On 08/06/2011 08:35 AM, Claus-Justus Heine wrote:
Hi there,

I'm experiencing quite high memory usage during regressions; I have a backup
server with only 2G of RAM. I'm doing daily backups. Sometimes a backup fails,
and then, of course, rdiff-backup first recovers the most recent backup which
did not fail. During this process, rdiff-backup blows up to approx. 3G of RAM.
Then things start to slow down (swapping). It's a quite large backup set, about
400G, with a large history. It doesn't seem to be a memory leak, as the memory
usage stays at 3G. Just seems a little bit too much in principle.

Regression is concerned with only the two most recent sessions, so
the amount of history should be irrelevant. What is the total
number of files being backed up and the size of the uncompressed
mirror_metadata snapshot?

zcat file_statistics.{latest_timestamp}.data.gz | tr '\0' '\n' | wc -l
zcat mirror_metadata.{latest_timestamp}.snapshot.gz | wc -c

That would be more indicative of the amount of data that needs to be
kept in memory during the regression. FWIW, I'm seeing memory usage
of about 480MB during regression of a backup of about 250,000 files,
though the number of changed files needing to be regressed is quite
small (~1000).
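
For readers without zcat at hand, the two pipelines above can be approximated in Python. This is only a sketch of the same counting logic: the function names are made up for illustration, and the paths are whatever files sit in your own rdiff-backup-data directory. Both functions stream the file in chunks, so they should not need much memory themselves.

```python
import gzip


def count_entries(path, chunk_size=1 << 20):
    # Rough equivalent of: zcat FILE | tr '\0' '\n' | wc -l
    # After tr, every NUL has become a newline, so wc -l counts both.
    count = 0
    with gzip.open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            count += chunk.count(b"\0") + chunk.count(b"\n")
    return count


def uncompressed_size(path, chunk_size=1 << 20):
    # Equivalent of: zcat FILE | wc -c
    total = 0
    with gzip.open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            total += len(chunk)
    return total
```

Call count_entries() on the file_statistics file and uncompressed_size() on the mirror_metadata snapshot of the latest session.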

--
Bob Nichols "NOSPAM" is really part of my email address.
Do NOT delete it.



Post Memory usage during regressions 
On 08/06/2011 07:13 PM, Robert Nichols wrote:
On 08/06/2011 08:35 AM, Claus-Justus Heine wrote:
[snip]
Then things start to slow down (swapping). It's a quite large backup
set, about 400G, with a large history. It doesn't seem to be a memory
leak, as the memory usage stays at 3G. Just seems a little bit too much
in principle.

Regression is concerned with only the two most recent sessions, so

Well, yes. So that piece of information was irrelevant, as rdiff-backup
keeps the most recent snapshot literally, and the older ones as deltas.

the amount of history should be irrelevant. What is the total
number of files being backed up and the size of the uncompressed
mirror_metadata snapshot?

zcat file_statistics.{latest_timestamp}.data.gz | tr '\0' '\n' | wc -l
zcat mirror_metadata.{latest_timestamp}.snapshot.gz | wc -c

That would be more indicative of the amount of data that needs to be

Quite a bit of data:

backup@NAS rdiff-backup-data $ \
zcat file_statistics.2011-07-24T20:02:14+02:00.data.gz | \
tr '\0' '\n' | \
wc -l
2698977
backup@NAS rdiff-backup-data $ \
zcat mirror_metadata.2011-07-30T05\:56\:36+02\:00.snapshot.gz | \
wc -c
732509974

So roughly 2.7 million files, metadata is about 700M.

kept in memory during the regression. FWIW, I'm seeing memory usage
of about 480MB during regression of a backup of about 250,000 files,

Well, mine is larger by a factor of 10, roughly (good that memory
consumption does not scale linearly ... ;-)
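
As a back-of-the-envelope check of that remark, the figures quoted in this thread can be turned into memory per file. This is plain arithmetic, not a measurement: it assumes "3G" and "480MB" are decimal units, and it ignores that the two regressions had different numbers of changed files.

```python
MB = 1000 ** 2
GB = 1000 ** 3


def bytes_per_file(resident_bytes, n_files):
    # Average memory attributed to each backed-up file.
    return resident_bytes / n_files


# Figures taken from the posts above.
small = bytes_per_file(480 * MB, 250_000)    # the 250,000-file backup
large = bytes_per_file(3 * GB, 2_698_977)    # the 2,698,977-entry backup

file_ratio = 2_698_977 / 250_000             # how many more files
memory_ratio = (3 * GB) / (480 * MB)         # how much more memory

print(f"{small:.0f} vs {large:.0f} bytes per file")
print(f"files x{file_ratio:.1f}, memory x{memory_ratio:.2f}")
```

With these inputs the larger backup uses fewer bytes per file than the smaller one, so memory grows more slowly than the file count in this comparison.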

Still, this seems to be insane. Reading this month's archives (and given
that the last release of rdiff-backup was about '09), it seems that I
would have to fix that myself. Or live with it. Or buy a backup server
with more memory ...

Of course, in principle the core-size of rdiff-backup is not a problem,
on a decent OS the part of the core currently not in use would just be
swapped out. There is only a problem if the program constantly traverses
the allocated buffers. This is what I suspect. It takes ages (2 or 3
days) to finally finish the regression.
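
One way to check that suspicion on Linux is to watch how much of the process is resident and how much is swapped out while the regression runs. The sketch below only parses the VmSize, VmRSS and VmSwap lines of /proc/<pid>/status; the function names are made up for illustration, and it assumes a kernel that reports VmSwap there.

```python
def parse_vm_fields(status_text, fields=("VmSize", "VmRSS", "VmSwap")):
    # Extract selected "Key:   value kB" lines as integers (in kB).
    result = {}
    for line in status_text.splitlines():
        key, sep, rest = line.partition(":")
        if sep and key in fields:
            parts = rest.split()
            if parts:
                result[key] = int(parts[0])
    return result


def read_vm_fields(pid):
    # Linux only: read the status file of a running process.
    with open(f"/proc/{pid}/status") as fh:
        return parse_vm_fields(fh.read())
```

Sampling read_vm_fields() for the rdiff-backup process every few seconds shows the resident and swapped-out sizes over time. If VmSwap is large and the machine swaps heavily in both directions for the whole run, that would fit the idea that the program keeps revisiting all of its buffers.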

Thanks for your quick response!

Best,

Claus


Post Memory usage during regressions 
On Sat, 6 Aug 2011, Claus-Justus Heine wrote:

So roughly 2.7 million files, metadata is about 700M.

Still, this seems to be insane. Reading this month's archives (and given that
the last release of rdiff-backup was about '09), it seems that I would have to
fix that myself. Or live with it. Or buy a backup server with more memory ...

Of course, in principle the core-size of rdiff-backup is not a problem, on a
decent OS the part of the core currently not in use would just be swapped
out. There is only a problem if the program constantly traverses the
allocated buffers. This is what I suspect. It takes ages (2 or 3 days) to
finally finish the regression.

I don't know the internals of rdiff-backup, but I do know that there is a
runtime option to disable compression of .snapshot and .diff files. When
file operations are done properly, this might be helpful (using memory
mapped file access instead of decompressing/compressing, so effectively
using pointers to disk instead of ram).

But I don't know whether it would work that way. The only thing I do know
is that it will at best help you in the future, as it is not going to
change anything wrt historic files.
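
To illustrate the general idea of memory-mapped access (not what rdiff-backup does, which this post explicitly leaves open), here is a minimal Python sketch. The function name is hypothetical. It works only on an uncompressed file: a gzip stream has to be decompressed sequentially, so it cannot be addressed by offset this way.

```python
import mmap


def read_slice_mmap(path, offset, length):
    # Map the file read-only and copy out only the requested bytes.
    # The OS pages data in on demand, so the whole file is not read
    # into the process up front. Note: mapping an empty file raises
    # ValueError.
    with open(path, "rb") as fh:
        with mmap.mmap(fh.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            return mm[offset:offset + length]
```

Whether this would help here depends on how the program actually accesses the data, which is exactly the open question above.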


--
Maarten


Post Memory usage during regressions 
On 08/06/2011 07:51 PM, Maarten Bezemer wrote:

I don't know the internals of rdiff-backup, but I do know that there is
a runtime option to disable compression of .snapshot and .diff files.
When file operations are done properly, this might be helpful (using
memory mapped file access instead of decompressing/compressing, so
effectively using pointers to disk instead of ram).

But I don't know whether it would work that way. The only thing I do
know is that it will at best help you in the future, as it is not going
to change anything wrt historic files.


Thanks, Maarten, as Robert pointed out: only the two most recent backups
are needed for regression, so it is at least worth trying. I was not
aware of the switch (I probably should RTFM with more concentration).

Best,

Claus





Post Memory usage during regressions 
On 08/06/2011 07:54 PM, Claus-Justus Heine wrote:
On 08/06/2011 07:51 PM, Maarten Bezemer wrote:

I don't know the internals of rdiff-backup, but I do know that there is
a runtime option to disable compression of .snapshot and .diff files.
When file operations are done properly, this might be helpful (using
memory mapped file access instead of decompressing/compressing, so
effectively using pointers to disk instead of ram).

But I don't know whether it would work that way. The only thing I do
know is that it will at best help you in the future, as it is not going
to change anything wrt historic files.


Thanks, Maarten, as Robert pointed out: only the two most recent backups
are needed for regression, so it is at least worth trying. I was not
aware of the switch (I probably should RTFM with more concentration).

I have started reading through the rdiff-backup sources.
--no-compression doesn't seem to be fully implemented, and in particular
might not work at all with regressions. Also, the 700MB meta-data should
not be a problem. RB simply has a sliding 32kb window on that file, so
this cannot be the reason for the memory usage (32kb is already a waste of
CPU time, a record in the meta-data file is something around 200 bytes).
Maybe I will try to dig into this. Perhaps one should first port the thing to
python3, at least before starting to hack around. And maybe this changes
already something (according to Murphy this should even increase the
memory hungriness of RB).
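
For illustration, reading a large compressed file through a fixed-size window looks roughly like the sketch below. This is not rdiff-backup's code: the function name is made up, and treating a newline as the record separator is an assumption made only to keep the example short. The point is that memory stays bounded by the window plus one partial record, however large the file is.

```python
import gzip


def iter_records(path, window=32 * 1024, sep=b"\n"):
    # Yield sep-delimited records, reading at most `window` bytes at a time.
    tail = b""  # incomplete record carried over from the previous chunk
    with gzip.open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(window), b""):
            pieces = (tail + chunk).split(sep)
            tail = pieces.pop()  # last piece may be cut off mid-record
            yield from pieces
    if tail:
        yield tail  # final record without a trailing separator
```

A consumer that handles each record and then drops it never holds more than a few tens of kilobytes, which is why a 700MB metadata file alone would not explain 3G of memory use.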

Best,

Claus

