
Improving Replication performance

Zoltan Forray
Improving Replication performance
April 26, 2018 11:59AM
As we get deeper into Replication, my boss wants to use it more and more
as an offsite recovery platform.

As we try to reach the "best practice" of replicating everything, we are
finding it difficult, if not impossible, to achieve due to the resource
demands.

In total, we eventually want to replicate around 700TB from 5 source servers
to 1 target server that is dedicated to replication.

So the big question is, can this be done?

We recently rebuilt the offsite target server to be as big as we could afford
($38K). It has 256GB of RAM and 64 CPU threads. Storage is primarily 500TB
of Isilon/NFS. Connectivity is via quad 10G (2 for IP traffic from the
source servers and 2 for Isilon/NFS).

Yet we can only replicate around 3TB daily, while we back up around 7TB.

Looking for suggestions/thoughts/experiences?

All boxes are RHEL Linux running Spectrum Protect 7.1.7.300.

--
*Zoltan Forray*
Spectrum Protect (p.k.a. TSM) Software & Hardware Administrator
Xymon Monitor Administrator
VMware Administrator
Virginia Commonwealth University
UCC/Office of Technology Services
www.ucc.vcu.edu
zforray@vcu.edu - 804-828-4807
Don't be a phishing victim - VCU and other reputable organizations will
never use email to request that you reply with your password, social
security number or confidential personal information. For more details
visit http://phishing.vcu.edu/
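
For context, a rough back-of-the-envelope sketch of the rates involved
(decimal TB, ignoring dedup and compression, so purely illustrative): keeping
up with 7TB of daily ingest needs only about 80 MB/s sustained, well below a
single 10GbE link, which suggests raw bandwidth alone is not the limit.

# Rough feasibility math for the numbers quoted in this thread (illustrative only).
TB = 10**12                         # decimal terabytes, in bytes
SECONDS_PER_DAY = 24 * 3600

daily_backup = 7 * TB               # new data backed up per day
observed_repl = 3 * TB              # what replication currently moves per day
total_occupancy = 700 * TB          # eventual amount to replicate

required_mb_s = daily_backup / SECONDS_PER_DAY / 1e6
observed_mb_s = observed_repl / SECONDS_PER_DAY / 1e6
link_mb_s = 1250                    # one 10GbE link, line rate before overhead

print(f"rate needed to keep up : {required_mb_s:,.0f} MB/s")   # ~81 MB/s
print(f"current average rate   : {observed_mb_s:,.0f} MB/s")   # ~35 MB/s
print(f"single 10GbE line rate : {link_mb_s:,.0f} MB/s")
print(f"initial sync of 700TB at 3TB/day: {total_occupancy / observed_repl:,.0f} days")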
Sergio O. Fuentes
Re: Improving Replication performance
April 26, 2018 12:59PM
The totals you're mentioning (e.g. "we can only replicate around 3TB daily
while we back up around 7TB"): are these deduplicated replication totals, or
non-deduplicated, non-compressed totals?

Are you replicating deduplicated data?

We've been replicating about 10TB of nightly non-deduplicated data across
our two datacenters successfully for years now, and it's only been getting
better. We get about 60% deduplication, so what we're really transferring
over the wire is about 4TB of data. It used to take 6 to 8 hours to
replicate that much data across our network, but we moved the DB storage to
SSD (as part of a separate project) and the code base for TSM replication
has become stable enough that it now takes only about two to three hours.

One difference I notice from what you've provided is that we're doing
1-to-1 replication: we have 4 servers, 2 of which are primaries, and each
primary replicates to its own dedicated replica server. We're now moving
those replicas to the cloud to see whether we can use AWS EC2 and S3
instances as our DR servers (plus the AWS TSM instance will be our AWS TSM
target; yes, there are still reasons to have backups in the cloud). That's
still early in production, so we haven't taxed that replication network much
yet.

We also use dedicated disk for both DBs and stgpools (still on file pools,
so we're not doing any PROTECT STGPOOL commands). We don't have another
ethernet network to traverse for the file-pool traffic; that dedicated
storage is backed by old 8Gb FC switches on a pair of Dell MD arrays. It's
nice to have that low-latency backbone while we still have it. We're on TSM
8.latest, and there might have been some performance bugs at the 7.1 level
if I remember correctly.

The one thing I did notice is that the more you can spread your I/O across
your storage arrays (whether DB or STGPOOL targets), the better the
performance. For example, our TSM server is set up with 16 filesystems for
the database and 16 mountpoints for the file-pool directories. Do you see
your Isilon backend as a bottleneck at all? Is the TSM server pushing load
to as many of the Isilon nodes as possible? Or is it really the enumeration
of the replication data that takes a long time (that's possibly a DB
bottleneck)? More questions than answers from me, but I hope I've pointed
you in the right direction.

Thanks and good luck!
Sergio
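
Following up on Sergio's point about spreading I/O across mountpoints, here
is a minimal sketch (the /tsm/stgpoolNN paths are placeholders, adjust to
your layout) that writes to several storage-pool directories in parallel and
reports per-mountpoint and aggregate throughput. If the aggregate barely
exceeds a single stream, the NFS/Isilon side is the likely bottleneck; if it
scales, the limit is elsewhere (DB, CPU, replication threads).

# Hedged sketch: does aggregate write throughput scale when several storage-pool
# mountpoints are driven in parallel? Paths below are placeholders.
import os
import time
from concurrent.futures import ThreadPoolExecutor

MOUNTPOINTS = [f"/tsm/stgpool{i:02d}" for i in range(1, 5)]   # adjust to your layout
FILE_SIZE = 2 * 1024**3                                       # 2 GiB per mountpoint
BLOCK = os.urandom(1024 * 1024)                               # 1 MiB of incompressible data

def write_test(path):
    target = os.path.join(path, "throughput_test.tmp")
    start = time.time()
    with open(target, "wb") as f:
        for _ in range(FILE_SIZE // len(BLOCK)):
            f.write(BLOCK)
        f.flush()
        os.fsync(f.fileno())          # make sure the data actually hits the backend
    elapsed = time.time() - start
    os.remove(target)
    return path, FILE_SIZE / elapsed / 1e6   # MB/s

if __name__ == "__main__":
    with ThreadPoolExecutor(max_workers=len(MOUNTPOINTS)) as pool:
        results = list(pool.map(write_test, MOUNTPOINTS))
    for path, mb_s in results:
        print(f"{path}: {mb_s:,.0f} MB/s")
    print(f"aggregate: {sum(m for _, m in results):,.0f} MB/s")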

Zoltan Forray
Re: Improving Replication performance
April 26, 2018 01:59PM
Currently, all source servers are non-dedup, since they don't have the
horsepower/CPU/threads to handle it. The target server is deduping.

Currently, the source servers have internal disk, but a lot of the daily
backups flow to tape, which I know is slow; the DBs are on internal disk.

The target server DB is on SSD, and we are monitoring it since it has
already grown to over 1.8TB and we are nowhere close to replicating more
than half of what we need to process.

We can't afford a 1-to-1 source/replica server like you have.

We are looking to move to 8.x in the future, but have no target yet since
we are in the middle of hardware upgrades of 2 ISP servers right now.
Upgrading 5 local servers to forced TLS will be a big morass of problems due
to the levels and mix of TSM clients (everything from 5.4 to 8.1.2), and
having to upgrade 2 LM servers simultaneously will be an adventure.
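
On watching that DB growth: a minimal sketch of how the target server's
database size could be polled from cron, assuming dsmadmc is installed on
the box; the server stanza name and admin credentials are placeholders.

# Hedged sketch: pull database size/utilization from the replication target via
# dsmadmc so growth can be logged over time. Stanza/credentials are placeholders.
import subprocess

def query_db(stanza="TSMTARGET", admin="admin", password="secret"):
    cmd = [
        "dsmadmc",
        f"-se={stanza}",          # server stanza from dsm.sys
        f"-id={admin}",
        f"-password={password}",
        "-dataonly=yes",          # suppress headers, return just the data
        "query", "db", "format=detailed",
    ]
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout

if __name__ == "__main__":
    print(query_db())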

Skylar Thompson
Re: Improving Replication performance
April 26, 2018 01:59PM
Are you CPU-bound or disk-bound on the source or target servers? Even if
you have lots of CPUs, replication might be running on a single thread and
using just one CPU.
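
A minimal sketch of how to check for that pattern on RHEL without extra
tooling: sample /proc/stat twice during a replication window and print
per-core utilization. One core pegged near 100% while the rest sit idle
points at a single-threaded stage (mpstat -P ALL 5 shows the same thing).

# Hedged sketch: sample per-CPU utilization from /proc/stat to spot one core
# pegged while the rest idle during replication.
import time

def cpu_times():
    stats = {}
    with open("/proc/stat") as f:
        for line in f:
            if line.startswith("cpu") and line[3].isdigit():
                name, *vals = line.split()
                vals = list(map(int, vals))
                idle = vals[3] + vals[4]        # idle + iowait
                stats[name] = (sum(vals), idle)
    return stats

before = cpu_times()
time.sleep(5)
after = cpu_times()

for cpu in sorted(before, key=lambda c: int(c[3:])):
    total = after[cpu][0] - before[cpu][0]
    idle = after[cpu][1] - before[cpu][1]
    busy = 100.0 * (total - idle) / total if total else 0.0
    print(f"{cpu}: {busy:5.1f}% busy")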


--
-- Skylar Thompson (skylar2@u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine
Stefan Folkerts
Re: Improving Replication performance
April 27, 2018 01:59AM
We have just built a setup at a large university that replicates (on
average) 6.9TB per hour, but that data is deduplicated on the source server.
If you have 10Gb/s+ bandwidth and no performance limitation on the source,
you should be able to handle 7TB per day (mixed workload, not only tiny
files) on a target server without any issue, if it's an M blueprint or
faster model.
The compute hardware at this customer has less memory than you specify; we
are spot on the M blueprint, but with NVMe drives for the database.
If your source server can deliver the data fast enough, it's all about super
fast database and active-log performance on the target: it needs to do an
insane number of IOPS to chunk, hash and check all the chunks you are
throwing at it. And yes, memory helps, but 256GB isn't as important as
database speed and raw CPU power to plow through that data.

Did you run benchmarks on the database and active-log volumes on your
target server? We reach about 110,000 IOPS on the database volumes using
NVMe, and we have found that to be the key to unlocking Spectrum Protect's
ludicrous mode.

https://imgur.com/a/SAA7OAZ
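
The database/active-log benchmark Stefan mentions is usually done with fio;
as a rough standalone alternative, here is a single-threaded sketch (the
path is hypothetical) that measures small random reads against a test file
on a DB filesystem. It will report far less than fio's multi-queue numbers,
but it still separates SSD-class from NFS-class latency clearly.

# Hedged sketch: crude random-read probe for a database or active-log filesystem.
# fio is the better tool; this only gives a ballpark, single-threaded number.
# Make FILE_SIZE much larger than RAM, or drop caches first
# (echo 3 > /proc/sys/vm/drop_caches), or you will mostly measure the page cache.
import os
import random
import time

TEST_FILE = "/tsmdb/db01/iops_test.dat"   # hypothetical path on a DB filesystem
FILE_SIZE = 8 * 1024**3                   # 8 GiB test file
BLOCK = 8192                              # 8 KiB reads as a generic DB-like workload
DURATION = 30                             # seconds to run

if not os.path.exists(TEST_FILE):         # create real (non-sparse) test data once
    with open(TEST_FILE, "wb") as f:
        for _ in range(FILE_SIZE // 1024**2):
            f.write(os.urandom(1024**2))

fd = os.open(TEST_FILE, os.O_RDONLY)
ops = 0
deadline = time.time() + DURATION
while time.time() < deadline:
    offset = random.randrange(0, FILE_SIZE // BLOCK) * BLOCK
    os.pread(fd, BLOCK, offset)
    ops += 1
os.close(fd)
print(f"{ops / DURATION:,.0f} random {BLOCK}-byte reads/s (single-threaded)")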


Rhodes, Richard L.
Re: [EXTERNAL] Improving Replication performance
April 27, 2018 07:59AM
So the pools are on Isilon via NFS. I assume the OS is Linux.

We've used DataDomain as an NFS target for file pools over 10G ethernet, and we've fought performance issues with it for years. The AIX NFS stack is really bad/slow: about 100MB/s through a single mount point, and even then it varies all over the place. Our admins have tested Linux to DD and it's better, but I don't remember the numbers; it's not something we actually use with TSM. On the other hand, Oracle writing directly to DataDomain over NFS through Oracle's internal DNFS stack is fast, 600-700MB/s, so the Oracle workload shows it's not the DD that's the bottleneck.

You might want to test NFS performance.

Rick
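
A quick way to act on that suggestion outside of Spectrum Protect: a minimal
sketch (the mount path is a placeholder) that pushes a single sequential
stream of incompressible data through an NFS mount and reports MB/s, which
is comparable to the per-mount-point figures Rick describes.

# Hedged sketch: single-stream sequential write/read throughput on one NFS mount,
# measured outside of Spectrum Protect. The mount path is a placeholder.
import os
import time

MOUNT = "/isilon/tsmpool01"                # hypothetical NFS mount point
TEST_FILE = os.path.join(MOUNT, "nfs_speed_test.tmp")
SIZE = 4 * 1024**3                         # 4 GiB
BLOCK = os.urandom(1024 * 1024)            # 1 MiB blocks of incompressible data

start = time.time()
with open(TEST_FILE, "wb") as f:
    for _ in range(SIZE // len(BLOCK)):
        f.write(BLOCK)
    f.flush()
    os.fsync(f.fileno())                   # include the flush to the filer in the timing
print(f"write: {SIZE / (time.time() - start) / 1e6:,.0f} MB/s")

start = time.time()
with open(TEST_FILE, "rb") as f:
    while f.read(len(BLOCK)):
        pass
print(f"read : {SIZE / (time.time() - start) / 1e6:,.0f} MB/s (may be served from cache)")

os.remove(TEST_FILE)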




Zoltan Forray
Re: [EXTERNAL] Improving Replication performance
April 30, 2018 08:59AM
The replication target server's storage is Isilon. The source servers use
internal disk and/or tape (when data is migrated due to lack of disk space).
All are RHEL, and all ISP servers are at 7.1.7.300. Yes, we are looking at
upgrading to 8.x, but we are seriously concerned about the TLS enforcement
impact.

Zoltan Forray
Re: Improving Replication performance
April 30, 2018 09:59AM
Probably CPU, since all of the current source servers are older 16-thread
boxes. None of the source data is deduped, and it is mostly on disk (DISK
devtype, not FILE).

Zoltan Forray
Re: Improving Replication performance
April 30, 2018 09:59AM
We were lucky to afford the pair of 3TB SSDs on the target server, which
are dedicated to the DB.

Since our primary disk these days is Isilon/NFS, I am trying to stuff as
much rotating disk into the servers as they are replaced. The one we are
about to switch to has 12 x 10TB internal disks (for LZ/primary storage, not
including OS, actlog and archlog), 72 threads, 256GB RAM and quad 10G. I
figure the ~100TB of internal storage will hold 1/3 of this server's onsite
occupancy, allowing me to dedup and start getting away from tape for onsite
storage.
