Re: Millions of files
Troels Arvin <troels < at > arvin.dk>
wrote the following on Fri, 27 May 2005 21:42:36 +0200

A new backup setup I'm working on isn't completely finished yet. But one
backup job with more than 13 million files (totalling 233 GB, lasting
45 hours) ran fine. Nice.

233 GB in 45 hours is around 1.5 MB/s if I'm calculating right. I'm not
sure I'm satisfied with that. Both hard disks and modern LANs should
(in principle) be able to sustain higher throughput, shouldn't they?
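
The arithmetic behind that figure can be checked with a short sketch. The function names are made up for illustration, 1 GB is assumed to mean 1024 MB, and the file count is taken as exactly 13 million even though the job had somewhat more.

```python
# Rough check of the numbers quoted in the post.
# Assumptions: 1 GB = 1024 MB, and exactly 13 million files
# (the post only says "more than 13 million").

SECONDS_PER_HOUR = 3600


def throughput_mb_per_s(total_gb, hours, mb_per_gb=1024):
    """Average throughput in MB/s for a job of total_gb lasting hours."""
    return total_gb * mb_per_gb / (hours * SECONDS_PER_HOUR)


def files_per_s(n_files, hours):
    """Average number of files processed per second."""
    return n_files / (hours * SECONDS_PER_HOUR)


def avg_file_size_kb(total_gb, n_files):
    """Average file size in KB (1 GB = 1024 * 1024 KB)."""
    return total_gb * 1024 * 1024 / n_files


if __name__ == "__main__":
    print(f"{throughput_mb_per_s(233, 45):.2f} MB/s")
    print(f"{files_per_s(13_000_000, 45):.1f} files/s")
    print(f"{avg_file_size_kb(233, 13_000_000):.1f} KB per file")
```

Under those assumptions this comes to roughly 1.47 MB/s, about 80 files per second, and an average file size of about 19 KB, so the job consists mostly of many small files.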

I'm thinking:
- Could SSH's crypto work be significant, or is it really peanuts
for today's fast CPUs?
- Is rdiff-backup performing calculations where Python's
slowness could be a problem?

I doubt SSH is the bottleneck; presumably it can do a lot more than
1.5 MB/s. I'd guess the bottleneck is the large number of files and the
CPU and HD overhead rdiff-backup has on each (statting, writing,
fsyncing, etc.). So I think Python's slowness/my programming could be
a problem (especially if you're using 0.13.x, which hasn't been
optimized yet).

But these are just guesses. I haven't seen firsthand any setup
similar to yours.
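
One way to put a rough number on the per-file cost guessed at above is a small timing sketch like the following. It is not rdiff-backup's code: the function name, the default file count, and the 18 KB file size are assumptions for illustration, and it only times the write, fsync, and stat steps on the local disk.

```python
import os
import tempfile
import time


def measure_per_file_overhead(n_files=200, size_bytes=18 * 1024):
    """Time writing, fsyncing and statting n_files small files.

    Hypothetical micro-benchmark; it does not reproduce what
    rdiff-backup actually does per file.
    """
    payload = b"x" * size_bytes
    with tempfile.TemporaryDirectory() as tmp:
        start = time.perf_counter()
        for i in range(n_files):
            path = os.path.join(tmp, f"file{i:06d}")
            with open(path, "wb") as fh:
                fh.write(payload)
                fh.flush()
                os.fsync(fh.fileno())  # force the data to disk
            os.stat(path)  # read the metadata back
        elapsed = time.perf_counter() - start
    return {
        "files": n_files,
        "seconds_per_file": elapsed / n_files,
        "mb_per_second": n_files * size_bytes / (1024 * 1024) / elapsed,
    }


if __name__ == "__main__":
    print(measure_per_file_overhead())
```

If the MB/s figure this reports on the destination disk is in the same range as the observed backup rate, per-file overhead is a plausible culprit; if it is far higher, the cost more likely sits elsewhere.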


--
Ben Escoto
