reducing i/o
Post reducing i/o 
Hi,

Is there a way to reduce i/o load on the backup servers significantly?

We have been using BackupPC for years on many different combinations of
hardware and filesystems, and i/o wait is always the killer.

We are now running 8 BackupPC servers holding ~16TB of backup data
(quickly changing) and the handling is getting tricky (which server is a
client backed up on? is there a backup of every host? when do I have the
time to finally start programming backuppc-hq?).

So we are willing to do anything to reduce the number of backup servers
(best would be only one).

E.g. we could give up deduplication and compression, increase RAM and
CPU power, change filesystem and OS (Debian and XFS now), change
RAID level (none, RAID-0, RAID-1 and RAID-10 now) and so on.

What we can't do for financial reasons is drop the cheap SATA drives.
Changing to e.g. 15k SAS would be much more expensive (even when
counting the rack space, power, machines, manpower and so on of the
current backup pool of 8 backup servers).

Any tips?

ys
Peter


------------------------------------------------------------------------------
RSA(R) Conference 2012
Save $700 by Nov 18
Register now
http://p.sf.net/sfu/rsa-sfdev2dev1
_______________________________________________
BackupPC-users mailing list
BackupPC-users < at > lists.sourceforge.net
List: https://lists.sourceforge.net/lists/listinfo/backuppc-users
Wiki: http://backuppc.wiki.sourceforge.net
Project: http://backuppc.sourceforge.net/

Post reducing i/o 
Hi,

Being a BackupPC 'newb', I can only say I haven't found much information
on i/o tuning. Naively, it would seem that the bulk of BackupPC i/o is
spent comparing hashes. *If* that's true, I think it would be
interesting to see if splitting hashes from the actual data and putting
them on much faster storage such as SSDs would benefit overall
throughput. I'd suspect that's only possible at the code level.

But I also notice that you didn't provide much architectural info, and
that's an area that can have a large impact. Here are some thoughts,
presented more as 'food for thought' than as recommendations.

I'd think there are two options. The first option is to use network
storage, but then you quickly hit bandwidth limitations, so you need
fatter (and possibly lower latency) pipes like 10Gb Ethernet or Myrinet.

The newer crop of network storage such as GlusterFS (being purchased by
Red Hat) is nice for several reasons. It scales nearly linearly in i/o,
available storage, and redundancy into the petabyte range. It makes the
server the redundant unit and allows you to remove the redundancy at the
disk level; and the more parts you can "throw out the window" and recover
from, the better.

The other option is to put more disks in each server. Making big
assumptions, if you've got 16TB of data spread over 8 servers and 50%
utilization on 1TB disks, that's only 4 disks per server. That doesn't
buy you much in raid 10 and is probably quite slow in raid 5.
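
The back-of-the-envelope estimate above can be written out as a few lines of Python. This is only a rough sketch: the function name and parameters are made up for illustration, and like the estimate itself it ignores RAID overhead (mirroring or parity would need more raw disks).

```python
import math

def disks_per_server(total_tb, servers, utilization, disk_tb):
    """Raw disks needed per server, ignoring RAID overhead (an assumption)."""
    data_per_server_tb = total_tb / servers            # e.g. 16TB over 8 servers
    raw_needed_tb = data_per_server_tb / utilization   # leave headroom on each disk
    return math.ceil(raw_needed_tb / disk_tb)

# The figures from the thread: 16TB, 8 servers, 50% utilization, 1TB disks.
print(disks_per_server(16, 8, 0.5, 1))
```

Calling it with `servers=1` gives a feel for how many bays a single consolidated server would need under the same assumptions.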

Putting all your disks in one server with 12 or 16 bays plus an eSata
chassis would significantly increase throughput.

A note about raid levels. Raid 5 is really really not the thing to use
as it has very poor performance for some realistic workloads and I'd
expect BackupPC to be one of those workloads. If you can, use raid 10,
which is both very fast and _ought_ to be more robust. That is, in the
event of a disk failure and rebuild, raid 10 only requires a read from
one disk, lessening the chance that the rebuild crashes other disks as
occasionally happens in raid 5.
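
To make the comparison concrete, here is a small sketch of the textbook model behind it. It assumes two-way mirrors for raid 10 and single parity for raid 5, ignores controller caches and full-stripe writes, and the function names are invented for this example.

```python
def rebuild_source_disks(level, n_disks):
    """How many surviving disks must be read to rebuild one failed disk."""
    if level == "raid10":
        return 1              # only the failed disk's mirror partner
    if level == "raid5":
        return n_disks - 1    # every surviving member, to recompute from parity
    raise ValueError("unsupported RAID level: %r" % level)

def small_write_ios(level):
    """Disk I/Os caused by one small random write in the classic model."""
    penalties = {
        "raid10": 2,  # write the block to both mirrors
        "raid5": 4,   # read old data + old parity, write new data + new parity
    }
    return penalties[level]

for level in ("raid10", "raid5"):
    print(level, rebuild_source_disks(level, 8), small_write_ios(level))
```

The rebuild figure is the point made above: in an 8-disk raid 5 set all seven survivors are read end to end, while a raid 10 rebuild touches a single disk.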

Lastly, a note about disk choice. I too use cheap SATA drives.
Optimizing for cost, i/o, and throughput, it's pretty easy to choose a
1TB drive and only use e.g. 20% of the space versus a much more
expensive SAS drive. The beauty of this is you still have a lot of
unused disk should you need it or should you migrate disks to other
uses.

If development on BackupPC should go in that direction (or should it be
possible today), splitting off the higher-i/o part of the workload onto
SSDs would make those larger but slower SATA disks much more attractive
relative to faster but smaller SAS disks.

Eric


On Sat, 2011-11-05 at 08:40 +0100, pv wrote:
> [snip: full quote of the original post]


Post reducing i/o 
> fatter (and possibly lower latency) pipes like 10Gb Ethernet or Myrinet.

And IB (InfiniBand), which I think is cheaper than either of the above.

> The newer crop of network storage such as GlusterFS (being purchased by
> Red Hat) is nice for several reasons.  It scales nearly linearly in i/o,
> available storage, and redundancy into the petabyte range.  It makes the
> server the redundant unit and allows you to remove the redundancy at the
> disk level; and the more parts you can "throw out the window" and recover
> from, the better.

If you can run backuppc on gluster and not have it crash, then I think
gluster will be ready for enterprise deployments that need high
availability and robustness. I know many people run it in production
as something more than "scratch" or experimental storage, but I
wouldn't; it's really flaky. I tested backuppc on gluster v3.0.x and
gluster didn't hold up for long. I've tested performance with the
newer 3.2.x versions, and you lose bandwidth and get increased latency
going over TCP or RDMA over IB compared with going straight to local
disk. If you want good performance, I wouldn't recommend sending the
backup data out over the network again. Using gluster in replicate mode
makes it even slower. Stripe mode sounds like it should be fast, but it's
not unless you're using a low-latency network like IB.

> We have been using BackupPC for years on many different combinations of
> hardware and filesystems, and i/o wait is always the killer.
>
> We are now running 8 BackupPC servers holding ~16TB of backup data

We had 7 servers backing up ~30TB+ of data over gigE from a high end
NAS. Most of the incrementals would finish nightly. We've recently
stopped using backuppc to do this after the IT dept purchased enough
FC tape drives for the NDMP backup window to drop to an acceptable
level.

> (quickly changing) and the handling is getting tricky (which server is a
> client backed up on? is there a backup of every host? when do I have the
> time to finally start programming backuppc-hq?).

Most of this information is given in the web gui and you can probably
get it using scripts. I don't think we ever had a host with a backup
that was more than 1.3 days old.
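
As a rough sketch of what such a script could look like: BackupPC 3.x keeps a tab-separated `backups` file per host under its data directory (usually `pc/<host>/backups`). The code below assumes the first four columns are backup number, type, start time and end time as Unix timestamps, so check that against your own installation before relying on it. The function names, host names and sample data are made up.

```python
import time

def newest_backup_age_days(backups_text, now=None):
    """Age in days of the most recent backup listed in a `backups` file.

    Assumes tab-separated columns: num, type, startTime, endTime, ...
    Returns None if no backup is listed.
    """
    now = time.time() if now is None else now
    end_times = []
    for line in backups_text.splitlines():
        fields = line.split("\t")
        if len(fields) >= 4 and fields[3].isdigit():
            end_times.append(int(fields[3]))
    if not end_times:
        return None
    return (now - max(end_times)) / 86400.0

def stale_hosts(ages_by_host, max_age_days=1.3):
    """Hosts whose newest backup is missing or older than max_age_days."""
    return sorted(host for host, age in ages_by_host.items()
                  if age is None or age > max_age_days)

# Made-up sample: one full and one incremental backup.
sample = "0\tfull\t1000000\t1003600\n1\tincr\t1086400\t1088000\n"
age = newest_backup_age_days(sample, now=1088000 + 86400)
print(age, stale_hosts({"web1": age, "db1": 2.5, "new1": None}))
```

Run once per backup server and merged, the per-host ages would answer both "which server is this client on?" and "does every host have a recent backup?".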

> So we are willing to do anything to reduce the number of backup servers
> (best would be only one).
>
> E.g. we could give up deduplication and compression, increase RAM and

I would start by getting rid of the de-dup and compression, but you
probably have to buy more disk in that case.

> CPU power, change filesystem and OS (Debian and XFS now), change
> RAID level (none, RAID-0, RAID-1 and RAID-10 now) and so on.

We also used cheap SATA drives and didn't use high-end servers with
lots of memory, except for the NAS from which we were doing backups.
We had two 16-disk RAID6 arrays and five 6-drive RAID5 arrays on the
backup servers.

Post reducing i/o 
On Sat, Nov 5, 2011 at 2:40 AM, pv <peter < at > vratny.at> wrote:
> [...]
> What we can't do for financial reasons is drop the cheap SATA drives.
> Changing to e.g. 15k SAS would be much more expensive (even when
> counting the rack space, power, machines, manpower and so on of the
> current backup pool of 8 backup servers).
>
> Any tips?

Until a couple of weeks ago you could get a few models of large SAS
drives for only a little more than their SATA counterparts (like $250
for 2 TB), but that is at least temporarily gone. Setting up RAID-0
with a large number of drives should increase your speed, at least for
large files. Mostly, though, the killer is the seek time for small-full
directory updates anyway, and each RAID-0 drive multiplies the risk of
failure of the whole set.
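
The failure risk can be put into numbers with a short sketch. It assumes disks fail independently and all with the same probability over the period of interest; the 3% figure is purely hypothetical and the function name is made up.

```python
def raid0_failure_probability(p_disk, n_disks):
    """Probability that a RAID-0 set loses data, i.e. at least one disk fails.

    Assumes independent failures with the same probability p_disk per disk.
    """
    return 1.0 - (1.0 - p_disk) ** n_disks

# Hypothetical 3% per-disk failure probability over some period.
for n in (1, 4, 8, 16):
    print(n, round(raid0_failure_probability(0.03, n), 3))
```

For small per-disk probabilities the result is close to `n_disks * p_disk`, which is the "multiplies the risk" effect described above.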

--
Les Mikesell
lesmikesell < at > gmail.com
