Troels Arvin <troels < at > arvin.dk>
wrote the following on Fri, 27 May 2005 21:42:36 +0200
> A new backup setup I'm working on isn't completely finished yet. But
> one backup job with more than 13 million files (totalling 233GB,
> lasting 45 hours) ran fine. Nice.
> 233GB in 45 hours is around 1½ MB/sec if I'm calculating right. I'm not
> sure if I'm satisfied with that. Shouldn't both hard disks and modern
> LANs be able, in principle, to sustain higher throughput?
> I'm thinking:
> - Could SSH's crypto work be significant, or is it really peanuts
>   for today's fast CPUs?
> - Is rdiff-backup performing calculations where Python's
>   slowness could be a problem?
I doubt SSH is the bottleneck; presumably it can do a lot more than
1.5 MB/s. I'd guess the bottleneck is the large number of files and the
CPU and HD overhead rdiff-backup incurs on each one (statting, writing,
fsyncing, etc.). So I think Python's slowness or my programming could be
a problem (especially if you're using 0.13.x, which hasn't been
optimized yet).
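A rough back-of-the-envelope check of the per-file guess, using only the
figures reported for the job (233GB, more than 13 million files, 45
hours). Decimal units (1 GB = 1e9 bytes) are an assumption, and the
function name is made up for illustration:

```python
# Figures from the reported backup job. Decimal units are an assumption.
TOTAL_BYTES = 233e9           # 233GB
TOTAL_FILES = 13_000_000      # "more than 13 million", so a lower bound
ELAPSED_SECONDS = 45 * 3600   # 45 hours

def backup_rates(total_bytes, total_files, elapsed_seconds):
    """Return (MB/s, files/s, milliseconds per file, average KB per file)."""
    mb_per_sec = total_bytes / elapsed_seconds / 1e6
    files_per_sec = total_files / elapsed_seconds
    ms_per_file = 1000.0 / files_per_sec
    avg_kb = total_bytes / total_files / 1e3
    return mb_per_sec, files_per_sec, ms_per_file, avg_kb

mb_s, files_s, ms_file, avg_kb = backup_rates(
    TOTAL_BYTES, TOTAL_FILES, ELAPSED_SECONDS
)
print(f"{mb_s:.2f} MB/s, {files_s:.0f} files/s, "
      f"{ms_file:.1f} ms/file, {avg_kb:.1f} KB/file")
```

With these inputs it works out to roughly 1.4 MB/s, about 80 files per
second, or around 12 ms per file, with an average file size near 18 KB.
A budget of about 12 ms per file is in the same ballpark as a random
access on a typical spinning disk, which would be consistent with
per-file overhead mattering more than raw bandwidth.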
But these are just guesses. I haven't seen firsthand any setup
similar to yours.
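One way to turn the guess into a measurement would be to time the
per-file operations on the destination disk directly. The following is
only a sketch under stated assumptions: it is not how rdiff-backup
itself works, the function name and defaults are hypothetical, and the
18,000-byte payload is just the approximate average file size implied by
233GB spread over 13 million files.

```python
import os
import tempfile
import time

def time_small_files(n_files=200, size=18_000, do_fsync=True):
    """Write, optionally fsync, and stat n_files small files.

    Returns the mean wall-clock seconds spent per file.
    """
    payload = b"x" * size
    with tempfile.TemporaryDirectory() as tmpdir:
        start = time.perf_counter()
        for i in range(n_files):
            path = os.path.join(tmpdir, f"f{i:06d}")
            with open(path, "wb") as fh:
                fh.write(payload)
                if do_fsync:
                    # Push Python's buffer to the OS, then force it to disk.
                    fh.flush()
                    os.fsync(fh.fileno())
            os.stat(path)
        elapsed = time.perf_counter() - start
    return elapsed / n_files

for flag in (True, False):
    per_file = time_small_files(do_fsync=flag)
    print(f"fsync={flag}: {per_file * 1000:.2f} ms per file")
```

The temporary directory is created wherever the system default points,
so to test a particular disk one would pass `dir=` to
`tempfile.TemporaryDirectory`. Comparing the two printed figures against
the roughly 12 ms per file observed in the real job gives a feel for how
much of the time could be plain filesystem overhead.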
--
Ben Escoto
