
Looking for suggestions to deal with large backups not completing in 24-hours

Posted by Zoltan Forray 
As I have mentioned in the past, we have gone through large migrations to
DFS-based storage on EMC ISILON hardware. As you may recall, we back up
these DFS mounts (about 90 at last count) using multiple Windows servers,
each running multiple ISP nodes (about 30 each), and they access each DFS
mount/filesystem via -object=\\rams.adp.vcu.edu\departmentname.
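
To illustrate, each node's scheduled backup effectively boils down to an
incremental of roughly this form ("departmentname" is just a placeholder):

   dsmc incremental \\rams.adp.vcu.edu\departmentname\* -subdir=yes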

This has led to lots of performance issues with backups, and some
departments are now complaining that their backups are running into
multiple days in some cases.

One such case is a department with 2 nodes, each with over 30 million
objects. In the past, their backups finished quicker since they were
accessed via dedicated servers and could use Journaling to reduce the scan
times. Unless things have changed, I believe Journaling is not an option
due to how the files are accessed.

FWIW, average backups are usually <50K files and <200GB once the scanning
finishes.....

Also, the idea of HSM/SPACEMANAGEMENT has reared its ugly head, since many
of these objects haven't been accessed in many years. But as I understand
it, that won't work either given our current configuration.

Given the current DFS configuration (previously CIFS), what can we do to
improve backup performance?

So, any-and-all ideas are up for discussion. There is even discussion of
replacing ISP/TSM due to these issues/limitations.

--
*Zoltan Forray*
Spectrum Protect (p.k.a. TSM) Software & Hardware Administrator
Xymon Monitor Administrator
VMware Administrator
Virginia Commonwealth University
UCC/Office of Technology Services
www.ucc.vcu.edu
zforray@vcu.edu - 804-828-4807
Don't be a phishing victim - VCU and other reputable organizations will
never use email to request that you reply with your password, social
security number or confidential personal information. For more details
visit http://phishing.vcu.edu/
This message was imported via the External PhorumMail Module
Hello,
I don’t know much about Isilon.
There might be a SAN-level snapshot backup option for Isilon.

For our Data Domain, we replicate from the main site to the DR site, then take a snapshot at the DR site every night. Each snapshot is considered a backup.

Thank you.

This message was imported via the External PhorumMail Module
Zoltan

I kind of agree with Ung Yi

What is the purpose of your TSM backups? DR? Long term retention for auditability/sarbox/other regulation?

A daily or even more frequent snapshot regime may well be the best way to get back that recently lost/deleted/corrupted file.
Use a TSM backup of a weekly point-of-consistency snapshot as your long-term strategy.

Of course a better option would be an embedded TSM client on the Isilon itself, but the commercial realities are that it will never happen.

Cheers

Steve
Steven Harris
TSM Admin
Canberra Australia

This message was imported via the External PhorumMail Module
We've implemented file count quotas in addition to our existing byte
quotas to try to avoid this situation. You can improve some things
(metadata on SSDs, maybe get an accelerator node if Isilon still offers
those) but the fact is that metadata is expensive in terms of CPU (both
client and server) and disk.

We chose 1 million objects/TB of allocated disk space. We sort of compete
with a storage system offered by our central IT organization, and picked a
limit higher than what they would provide.

To be honest, though, we're retiring our Isilon systems because the
performance/scalability/cost ratios just aren't as great as they used to
be. Our new storage is GPFS, and mmbackup works much better with huge
numbers of files, though it's still not great. In particular, the filelist
generation is based around UNIX sort, which is definitely a memory pig,
though it can be split across multiple systems and so scales out pretty
well.
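
For reference, a typical invocation looks something like this (node list
and server name are placeholders; exact options depend on the Scale
release):

   mmbackup /gpfs/research -t incremental -N backupnode1,backupnode2 --tsm-servers TSMSERVER1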

--
-- Skylar Thompson (skylar2@u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine
This message was imported via the External PhorumMail Module
A couple years ago we decided to replace dozens and dozens of big
Windows servers with a centralized Isilon NAS. The Windows servers,
holding tons of little files, were an ongoing pain to back up with TSM.
Our decision was NOT to back up the Isilon with TSM or any other external
program. Instead, we decided to use snapshots and replication to a DR
Isilon. In other words, we made a conscious decision to stop using
TSM to back up this data when we moved to Isilon. We took the opportunity
to standardize backup policies on a single snapshot retention
of just 32 days to help keep the snapshot disk space down.
Other than watching free disk space and a periodic check of
replication and snapshots, backup of this data is out of sight
and out of mind.


Rick






This message was imported via the External PhorumMail Module
Another possible idea is to look at General Storage dsmISI MAGS:

http://www.general-storage.com/PRODUCTS/products.html


Del


"ADSM: Dist Stor Manager" <ADSM-L@VM.MARIST.EDU> wrote on 07/05/2018
02:52:27 PM:

> From: Zoltan Forray <zforray@VCU.EDU>
> To: ADSM-L@VM.MARIST.EDU
> Date: 07/05/2018 02:53 PM
> Subject: Looking for suggestions to deal with large backups not
> completing in 24-hours
> Sent by: "ADSM: Dist Stor Manager" <ADSM-L@VM.MARIST.EDU>
>
> As I have mentioned in the past, we have gone through large migrations
to
> DFS based storage on EMC ISILON hardware. As you may recall, we backup
> these DFS mounts (about 90 at last count) using multiple Windows servers
> that run multiple ISP nodes (about 30-each) and they access each DFS
> mount/filesystem via -object=\\rams.adp.vcu.edu\departmentname.
>
> This has lead to lots of performance issue with backups and some
> departments are now complain that their backups are running into
> multiple-days in some cases.
>
> One such case in a department with 2-nodes with over 30-million objects
for
> each node. In the past, their backups were able to finish quicker since
> they were accessed via dedicated servers and were able to use Journaling
to
> reduce the scan times. Unless things have changed, I believe Journling
is
> not an option due to how the files are accessed.
>
> FWIW, average backups are usually <50k files and <200GB once it finished
> scanning.....
>
> Also, the idea of HSM/SPACEMANAGEMENT has reared its ugly head since
many
> of these objects haven't been accessed in many years old. But as I
> understand it, that won't work either given our current configuration.
>
> Given the current DFS configuration (previously CIFS), what can we do to
> improve backup performance?
>
> So, any-and-all ideas are up for discussion. There is even discussion
on
> replacing ISP/TSM due to these issues/limitations.
>
> --
> *Zoltan Forray*
> Spectrum Protect (p.k.a. TSM) Software & Hardware Administrator
> Xymon Monitor Administrator
> VMware Administrator
> Virginia Commonwealth University
> UCC/Office of Technology Services
> www.ucc.vcu.edu
> zforray@vcu.edu - 804-828-4807
> Don't be a phishing victim - VCU and other reputable organizations will
> never use email to request that you reply with your password, social
> security number or confidential personal information. For more details
> visit INVALID URI REMOVED
> u=http-3A__phishing.vcu.edu_&d=DwIBaQ&c=jf_iaSHvJObTbx-
> siA1ZOg&r=0hq2JX5c3TEZNriHEs7Zf7HrkY2fNtONOrEOM8Txvk8&m=5bz_TktY3-
> a432oKYronO-w1z-
> ax8md3tzFqX9nGxoU&s=EudIhVvfUVx4-5UmfJHaRUzHCd7Agwk3Pog8wmEEpdA&e=
>
This message was imported via the External PhorumMail Module
Thanks Del. Very interesting. Are they a VAR for IBM?

Not sure if it would work in the current configuration we are using to back
up ISILON. I have passed the info on.

BTW, FWIW, when I copied/pasted the info, Chrome spell-checker red-flagged
on "The easy way to incrementally backup billons of objects" (billions).
So if you know anybody at the company, please pass it on to them.

--
*Zoltan Forray*
Spectrum Protect (p.k.a. TSM) Software & Hardware Administrator
Xymon Monitor Administrator
VMware Administrator
Virginia Commonwealth University
UCC/Office of Technology Services
www.ucc.vcu.edu
zforray@vcu.edu - 804-828-4807
This message was imported via the External PhorumMail Module
They are a 3rd-party partner that offers an integrated Spectrum Protect
solution for large filer backups.


Del

This message was imported via the External PhorumMail Module
It is possible to do a parallel backup of file system parts.
https://www.gwdg.de/documents/20182/27257/GN_11-2016_www.pdf (German); have
a look at page 10.
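
A rough sketch of one way to do this (directory names are just examples;
see the article for the details): raise RESOURCEUTILIZATION in the
client's dsm.opt, e.g.

   RESOURCEUTILIZATION 10

and split the share into several file specifications in the schedule so
the client can work on them with multiple sessions, e.g.

   objects='\\rams.adp.vcu.edu\departmentname\dirA\* \\rams.adp.vcu.edu\departmentname\dirB\*'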

---
Jonas Jansen

IT Center
Gruppe: Server & Storage
Abteilung: Systeme & Betrieb
RWTH Aachen University
Seffenter Weg 23
52074 Aachen
Tel: +49 241 80-28784
Fax: +49 241 80-22134
jansen@itc.rwth-aachen.de
www.itc.rwth-aachen.de

This message was imported via the External PhorumMail Module
I will need to translate it to English, but I gather it is talking about
the RESOURCEUTILIZATION / MAXNUMMP values. While we have increased MAXNUMMP
to 5 on the server (will try going higher), I'm not sure how much good it
would do, since the backup schedule uses OBJECTS to point to a
specific/single mountpoint/filesystem (see below), but it is worth trying
to bump the RESOURCEUTILIZATION value on the client even higher...
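
In other words, the two knobs in play are roughly these (the node name and
values below are only illustrative):

   update node <ISILON-NODE-NAME> maxnummp=8        (on the ISP server)
   RESOURCEUTILIZATION 10                            (in the client's dsm.opt)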

We have checked the dsminstr.log file and it is spending 92% of the time in
PROCESS DIRS (no surprise)

7:46:25 AM SUN : q schedule * ISILON-SOM-SOMADFS1 f=d
Policy Domain Name: DFS
Schedule Name: ISILON-SOM-SOMADFS1
Description: ISILON-SOM-SOMADFS1
Action: Incremental
Subaction:
Options: -subdir=yes
Objects: \\rams.adp.vcu.edu\SOM\TSM\SOMADFS1\*
Priority: 5
Start Date/Time: 12/05/2017 08:30:00
Duration: 1 Hour(s)
Maximum Run Time (Minutes): 0
Schedule Style: Enhanced
Period:
Day of Week: Any
Month: Any
Day of Month: Any
Week of Month: Any
Expiration:
Last Update by (administrator): ZFORRAY
Last Update Date/Time: 01/12/2018 10:30:48
Managing profile:


--
*Zoltan Forray*
Spectrum Protect (p.k.a. TSM) Software & Hardware Administrator
Xymon Monitor Administrator
VMware Administrator
Virginia Commonwealth University
UCC/Office of Technology Services
www.ucc.vcu.edu
zforray@vcu.edu - 804-828-4807
This message was imported via the External PhorumMail Module
Zoltan, et al:
IF I understand the scenario you outlined originally, here at Cornell we are using two different approaches to backing up large storage arrays.

1. For backups of CIFS shares in our Shared File Share service hosted on a NetApp device, we rely on a set of PowerShell scripts to build a list of shares to back up, then invoke up to 5 SP clients at a time, each client backing up a share. As such, we are able to back up some 200+ shares on a daily basis. I'm not sure this is a good match to your problem...

2. For backups of a large Dell array containing research data, which does seem to be a good match, I have defined a set of 10 proxy nodes and 240 hourly schedules (one each hour for each proxy node) that allow us to divide the Dell array up into 240 pieces - pieces controlled by the specification of the "objects" in the schedule. That is, in your case, instead of associating node <ISILON-SOM-STORAGE> to the schedule ISILON-SOM-SOMADFS1 with object "\\rams.adp.vcu.edu\SOM\TSM\SOMADFS1\*", I would instead have something like
Node PROXY1.ISILON associated to PROXY1.ISILON.HOUR1 for object "\\rams.adp.vcu.edu\SOM\TSM\SOMADFS1\DIRA\SUBDIRA\*"
Node PROXY2.ISILON associated to PROXY2.ISILON.HOUR1 for object "\\rams.adp.vcu.edu\SOM\TSM\SOMADFS1\DIRA\SUBDIRB\*"
…
Node PROXY1.ISILON associated to PROXY1.ISILON.HOUR2 for object "\\rams.adp.vcu.edu\SOM\TSM\SOMADFS1\DIRB\SUBDIRA\*"

And so on. For known large directories, slots of multiple hours are allocated, up to the largest directory which is given its own proxy node with one schedule, and hence 24 hours to back up.
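
Roughly, each slice then amounts to server definitions of this shape (the
names, domain, and times are illustrative only; whether -asnodename is
used depends on how the proxy relationship is set up):

   grant proxynode target=ISILON-SOM-STORAGE agent=PROXY1.ISILON
   define schedule DFS PROXY1.ISILON.HOUR1 action=incremental -
      options='-subdir=yes -asnodename=ISILON-SOM-STORAGE' -
      objects='\\rams.adp.vcu.edu\SOM\TSM\SOMADFS1\DIRA\SUBDIRA\*' -
      starttime=01:00 duration=1 durunits=hours
   define association DFS PROXY1.ISILON.HOUR1 PROXY1.ISILON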

There are pros and cons to both of these, but they do enable us to perform the backups.

FWIW,
Bob

Robert Talda
EZ-Backup Systems Engineer
Cornell University
+1 607-255-8280
rpt4@cornell.edu


This message was imported via the External PhorumMail Module
Robert,

Thanks for the insight/suggestions. Your scenario is similar to ours, but
ours is on a larger scale when it comes to the amount of data/files to
process, hence the issue (an assumption, since you didn't list numbers).
Currently we have 91 ISILON nodes totaling 140M objects and 230TB of data.
The largest (our troublemaker) has over 21M objects and 26TB of data (this
is the one that takes 4-5 days). dsminstr.log from a recently finished run
shows it only backed up 15K objects.

We agree that this and other similarly large nodes need to be broken up so
there are fewer objects to back up per node. But the owner of this large
one is balking, since previously it was backed up via a solitary Windows
server using Journaling, so everything finished within a day.

We have never dealt with proxy nodes but might need to head in that
direction, since our current method of allowing users to perform their own
restores relies on the now-deprecated Web Client. Our current setup is
numerous Windows VM servers with 20-30 nodes defined on each.

How do you handle restore requests?

On Wed, Jul 11, 2018 at 2:56 PM Robert Talda <rpt4@cornell.edu> wrote:

> Zoltan, et al:
> :IF: I understand the scenario you outline originally, here at Cornell
> we are using two different approaches in backing up large storage arrays.
>
> 1. For backups of CIFS shares in our Shared File Share service hosted on a
> NetApp device, we rely on a set of Powershell scripts to build a list of
> shares to backup, then invoke up to 5 SP clients at a time, each client
> backing up a share. As such, we are able to backup some 200+ shares on a
> daily basis. I’m not sure this is a good match to your problem....
>
> 2. For backups of a large Dell array containing research data that does
> seem to be a good match, I have defined a set of 10 proxy nodes and 240
> hourly schedules (once each hour for each proxy node) that allows us to
> divide the Dell array up into 240 pieces - pieces that are controlled by
> the specification of the “objects” in the schedule. That is, in your case,
> instead of associating node <ISILON-SOM-STORAGE> to the schedule
> ISILON-SOM-SOMDFS1 with object " \\rams.adp.vcu.edu\SOM\TSM\SOMADFS1\*”,
> I would instead have something like
> Node PROXY1.ISILON associated to PROXY1.ISILON.HOUR1 for object " \\
> rams.adp.vcu.edu\SOM\TSM\SOMADFS1\DIRA\SUBDIRA\*”
> Node PROXY2.ISILON associated to PROXY1.ISILON.HOUR1 for object " \\
> rams.adp.vcu.edu\SOM\TSM\SOMADFS1\DIRA\SUBDIRB\*”
> …
> Node PROXY1.ISILON associated to PROXY1.ISILON.HOUR2 for object " \\
> rams.adp.vcu.edu\SOM\TSM\SOMADFS1\DIRB\SUBDIRA\*”
>
> And so on. For known large directories, slots of multiple hours are
> allocated, up to the largest directory which is given its own proxy node
> with one schedule, and hence 24 hours to back up.
>
> There are pros and cons to both of these, but they do enable us to perform
> the backups.
>
> FWIW,
> Bob
>
> Robert Talda
> EZ-Backup Systems Engineer
> Cornell University
> +1 607-255-8280
> rpt4@cornell.edu
>
>
> > On Jul 11, 2018, at 7:49 AM, Zoltan Forray <zforray@VCU.EDU> wrote:
> >
> > I will need to translate to English but I gather it is talking about the
> > RESOURCEUTILZATION / MAXNUMMP values. While we have increased MAXNUMMP
> to
> > 5 on the server (will try going higher), not sure how much good it would
> > do since the backup schedule uses OBJECTS to point to a specific/single
> > mountpoint/filesystem (see below) but is worth trying to bump the
> > RESOURCEUTILIZATION value on the client even higher...
> >
> > We have checked the dsminstr.log file and it is spending 92% of the time
> in
> > PROCESS DIRS (no surprise)
> >
> > 7:46:25 AM SUN : q schedule * ISILON-SOM-SOMADFS1 f=d
> > Policy Domain Name: DFS
> > Schedule Name: ISILON-SOM-SOMADFS1
> > Description: ISILON-SOM-SOMADFS1
> > Action: Incremental
> > Subaction:
> > Options: -subdir=yes
> > Objects: \\rams.adp.vcu.edu\SOM\TSM\SOMADFS1\*
> > Priority: 5
> > Start Date/Time: 12/05/2017 08:30:00
> > Duration: 1 Hour(s)
> > Maximum Run Time (Minutes): 0
> > Schedule Style: Enhanced
> > Period:
> > Day of Week: Any
> > Month: Any
> > Day of Month: Any
> > Week of Month: Any
> > Expiration:
> > Last Update by (administrator): ZFORRAY
> > Last Update Date/Time: 01/12/2018 10:30:48
> > Managing profile:
> >
> >
> > On Tue, Jul 10, 2018 at 4:06 AM Jansen, Jonas <jansen@itc.rwth-aachen.de
> >
> > wrote:
> >
> >> It is possible to da a parallel backup of file system parts.
> >> https://www.gwdg.de/documents/20182/27257/GN_11-2016_www.pdf (german)
> >> have a
> >> look on page 10.
> >>
> >> ---
> >> Jonas Jansen
> >>
> >> IT Center
> >> Gruppe: Server & Storage
> >> Abteilung: Systeme & Betrieb
> >> RWTH Aachen University
> >> Seffenter Weg 23
> >> 52074 Aachen
> >> Tel: +49 241 80-28784
> >> Fax: +49 241 80-22134
> >> jansen@itc.rwth-aachen.de
> >> www.itc.rwth-aachen.de
> >>
> >> -----Original Message-----
> >> From: ADSM: Dist Stor Manager <ADSM-L@VM.MARIST.EDU> On Behalf Of Del
> >> Hoobler
> >> Sent: Monday, July 9, 2018 3:29 PM
> >> To: ADSM-L@VM.MARIST.EDU
> >> Subject: Re: [ADSM-L] Looking for suggestions to deal with large backups
> >> not
> >> completing in 24-hours
> >>
> >> They are a 3rd-party partner that offers an integrated Spectrum Protect
> >> solution for large filer backups.
> >>
> >>
> >> Del
> >>
> >> ----------------------------------------------------
> >>
> >> "ADSM: Dist Stor Manager" <ADSM-L@VM.MARIST.EDU> wrote on 07/09/2018
> >> 09:17:06 AM:
> >>
> >>> From: Zoltan Forray <zforray@VCU.EDU>
> >>> To: ADSM-L@VM.MARIST.EDU
> >>> Date: 07/09/2018 09:17 AM
> >>> Subject: Re: Looking for suggestions to deal with large backups not
> >>> completing in 24-hours
> >>> Sent by: "ADSM: Dist Stor Manager" <ADSM-L@VM.MARIST.EDU>
> >>>
> >>> Thanks Del. Very interesting. Are they a VAR for IBM?
> >>>
> >>> Not sure if it would work in the current configuration we are using to
> >> back
> >>> up ISILON. I have passed the info on.
> >>>
> >>> BTW, FWIW, when I copied/pasted the info, Chrome spell-checker
> >> red-flagged
> >>> on "The easy way to incrementally backup billons of objects"
> (billions).
> >>> So if you know anybody at the company, please pass it on to them.
> >>>
> >>> On Mon, Jul 9, 2018 at 6:51 AM Del Hoobler <hoobler@us.ibm.com> wrote:
> >>>
> >>>> Another possible idea is to look at General Storage dsmISI MAGS:
> >>>>
> >>>> http://www.general-storage.com/PRODUCTS/products.html
> >>>>
> >>>>
> >>>> Del
> >>>>
> >>>>
> >>>> "ADSM: Dist Stor Manager" <ADSM-L@VM.MARIST.EDU> wrote on 07/05/2018
> >>>> 02:52:27 PM:
> >>>>
> >>>>> From: Zoltan Forray <zforray@VCU.EDU>
> >>>>> To: ADSM-L@VM.MARIST.EDU
> >>>>> Date: 07/05/2018 02:53 PM
> >>>>> Subject: Looking for suggestions to deal with large backups not
> >>>>> completing in 24-hours
> >>>>> Sent by: "ADSM: Dist Stor Manager" <ADSM-L@VM.MARIST.EDU>
> >>>>>
> >>>>> As I have mentioned in the past, we have gone through large
> >> migrations
> >>>> to
> >>>>> DFS based storage on EMC ISILON hardware. As you may recall, we
> >> backup
> >>>>> these DFS mounts (about 90 at last count) using multiple Windows
> >> servers
> >>>>> that run multiple ISP nodes (about 30-each) and they access each DFS
> >>>>> mount/filesystem via -object=\\rams.adp.vcu.edu\departmentname.
> >>>>>
> >>>>> This has lead to lots of performance issue with backups and some
> >>>>> departments are now complain that their backups are running into
> >>>>> multiple-days in some cases.
> >>>>>
> >>>>> One such case in a department with 2-nodes with over 30-million
> >> objects
> >>>> for
> >>>>> each node. In the past, their backups were able to finish quicker
> >> since
> >>>>> they were accessed via dedicated servers and were able to use
> >> Journaling
> >>>> to
> >>>>> reduce the scan times. Unless things have changed, I believe
> >> Journling
> >>>> is
> >>>>> not an option due to how the files are accessed.
> >>>>>
> >>>>> FWIW, average backups are usually <50k files and <200GB once it
> >> finished
> >>>>> scanning.....
> >>>>>
> >>>>> Also, the idea of HSM/SPACEMANAGEMENT has reared its ugly head since
> >>>> many
> >>>>> of these objects haven't been accessed in many years old. But as I
> >>>>> understand it, that won't work either given our current
> >> configuration.
> >>>>>
> >>>>> Given the current DFS configuration (previously CIFS), what can we
> >> do to
> >>>>> improve backup performance?
> >>>>>
> >>>>> So, any-and-all ideas are up for discussion. There is even
> >> discussion
> >>>> on
> >>>>> replacing ISP/TSM due to these issues/limitations.
> >>>>>
> >>>>> --
> >>>>> *Zoltan Forray*
> >>>>> Spectrum Protect (p.k.a. TSM) Software & Hardware Administrator
> >>>>> Xymon Monitor Administrator
> >>>>> VMware Administrator
> >>>>> Virginia Commonwealth University
> >>>>> UCC/Office of Technology Services
> >>>>> www.ucc.vcu.edu
> >>>>> zforray@vcu.edu - 804-828-4807
> >>>>> Don't be a phishing victim - VCU and other reputable organizations
> >> will
> >>>>> never use email to request that you reply with your password, social
> >>>>> security number or confidential personal information. For more
> >> details
> >>>>> visit INVALID URI REMOVED
> >>>>> u=http-3A__phishing.vcu.edu_&d=DwIBaQ&c=jf_iaSHvJObTbx-
> >>>>> siA1ZOg&r=0hq2JX5c3TEZNriHEs7Zf7HrkY2fNtONOrEOM8Txvk8&m=5bz_TktY3-
> >>>>> a432oKYronO-w1z-
> >>>>> ax8md3tzFqX9nGxoU&s=EudIhVvfUVx4-5UmfJHaRUzHCd7Agwk3Pog8wmEEpdA&e=
> >>>>>
> >>>>
> >>>
> >>>
> >>> --
> >>> *Zoltan Forray*
> >>> Spectrum Protect (p.k.a. TSM) Software & Hardware Administrator
> >>> Xymon Monitor Administrator
> >>> VMware Administrator
> >>> Virginia Commonwealth University
> >>> UCC/Office of Technology Services
> >>> www.ucc.vcu.edu
> >>> zforray@vcu.edu - 804-828-4807
> >>> Don't be a phishing victim - VCU and other reputable organizations will
> >>> never use email to request that you reply with your password, social
> >>> security number or confidential personal information. For more details
> >>> visit INVALID URI REMOVED
> >>> u=http-3A__phishing.vcu.edu_&d=DwIBaQ&c=jf_iaSHvJObTbx-
> >>>
> >>
> >>
> siA1ZOg&r=0hq2JX5c3TEZNriHEs7Zf7HrkY2fNtONOrEOM8Txvk8&m=ofZM7gZ7p5GL1HFyHU75
> >> lwUZLmc_kYAQxroVCZQUCSs&s=umTd28h-
> >>> GlxqSvNShsNIqm8D1PcanVk0HPcP5KTurKw&e=
> >>>
> >>
> >
> >
> > --
> > *Zoltan Forray*
> > Spectrum Protect (p.k.a. TSM) Software & Hardware Administrator
> > Xymon Monitor Administrator
> > VMware Administrator
> > Virginia Commonwealth University
> > UCC/Office of Technology Services
> > www.ucc.vcu.edu
> > zforray@vcu.edu - 804-828-4807
> > Don't be a phishing victim - VCU and other reputable organizations will
> > never use email to request that you reply with your password, social
> > security number or confidential personal information. For more details
> > visit http://phishing.vcu.edu/
>
>

--
*Zoltan Forray*
Spectrum Protect (p.k.a. TSM) Software & Hardware Administrator
Xymon Monitor Administrator
VMware Administrator
Virginia Commonwealth University
UCC/Office of Technology Services
www.ucc.vcu.edu
zforray@vcu.edu - 804-828-4807
Don't be a phishing victim - VCU and other reputable organizations will
never use email to request that you reply with your password, social
security number or confidential personal information. For more details
visit http://phishing.vcu.edu/
This message was imported via the External PhorumMail Module
Hey Zoltan

Key points for backing up Isilon:
1 Each Isilon node is limited by its CPU/protocol overhead rather than by networking (other than the new G6 F800s).
2 To increase throughput to/from Isilon, increase the number of Isilon nodes you access from your clients.
3 To increase the number of Isilon nodes you access, you can either mount the storage multiple times from the same client using different IPs, or use TSM proxies.
4 Increase RESOURCEUTILIZATION to 10 (the max) to increase parallelisation.
5 Increase MAXNUMMP (max number of mount points) to be bigger than the number of client machines X the resource utilization X the number of SP clients you run per client machine. This ensures each session is actively working and not waiting for a mount point (see the sketch after this list).
6 Size your disk storage pool files/volumes so that you have at least 2 X the maximum number of mount points. That way, if you fill the disk storage pool, you do not get lock contention between migration and backup. Ideally you should have enough disk pool storage to hold a full nightly run.
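
As a rough sketch of what points 4 and 5 look like in practice (the node name is one from earlier in this thread and the numbers are only placeholders, not a sizing recommendation):

  On the SP server, for each filer node:
      update node ISILON-SOM-SOMADFS1 maxnummp=12

  In that node's dsm.opt on the backup client:
      resourceutilization 10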

We have a setup where we need to archive up to 50 TB a day, and we do this using over 24 dsmc instances running across 6 client VMs with a resource utilisation of 10.
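
In practice that just means several dsmc incrementals running at the same time, each against a different part of the tree, along the lines of the following (the subdirectory names and option file name are only examples):

  dsmc incremental "\\rams.adp.vcu.edu\SOM\TSM\SOMADFS1\dirA\*" -subdir=yes -optfile=dsm_somadfs1.opt
  dsmc incremental "\\rams.adp.vcu.edu\SOM\TSM\SOMADFS1\dirB\*" -subdir=yes -optfile=dsm_somadfs1.opt

with each command started in its own process so the scans overlap.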

HTH

Grant

________________________________________
From: ADSM: Dist Stor Manager <ADSM-L@VM.MARIST.EDU> on behalf of Zoltan Forray <zforray@VCU.EDU>
Sent: Thursday, 12 July 2018 5:59 AM
To: ADSM-L@VM.MARIST.EDU
Subject: Re: [ADSM-L] Looking for suggestions to deal with large backups not completing in 24-hours

Robert,

Thanks for the insight/suggestions. Your scenario is similar to ours but
on a larger scale when it comes to the amount of data/files to process,
thus the issue (assuming such since you didn't list numbers). Currently we
have 91 ISILON nodes totaling 140M objects and 230TB of data. The largest
(our troublemaker) has over 21M objects and 26TB of data (this is the one
that takes 4-5 days). dsminstr.log from a recently finished run shows it
only backed up 15K objects.

We agree that this and other similarly larger nodes need to be broken up
into smaller/less objects to backup per node. But the owner of this large
one is balking since previously this was backed up via a solitary Windows
server using Journaling so everything finished in a day.

We have never dealt with proxy nodes but might need to head in that
direction since our current method of allowing users to perform their own
restores relies on the now deprecated Web Client. Our current method is
numerous Windows VM servers with 20-30 nodes defined to each.

How do you handle restore requests?

On Wed, Jul 11, 2018 at 2:56 PM Robert Talda <rpt4@cornell.edu> wrote:

> Zoltan, et al:
> :IF: I understand the scenario you outline originally, here at Cornell
> we are using two different approaches in backing up large storage arrays.
>
> 1. For backups of CIFS shares in our Shared File Share service hosted on a
> NetApp device, we rely on a set of Powershell scripts to build a list of
> shares to backup, then invoke up to 5 SP clients at a time, each client
> backing up a share. As such, we are able to backup some 200+ shares on a
> daily basis. I’m not sure this is a good match to your problem...
>
> 2. For backups of a large Dell array containing research data that does
> seem to be a good match, I have defined a set of 10 proxy nodes and 240
> hourly schedules (once each hour for each proxy node) that allows us to
> divide the Dell array up into 240 pieces - pieces that are controlled by
> the specification of the “objects” in the schedule. That is, in your case,
> instead of associating node <ISILON-SOM-STORAGE> to the schedule
> ISILON-SOM-SOMDFS1 with object " \\rams.adp.vcu.edu\SOM\TSM\SOMADFS1\*”,
> I would instead have something like
> Node PROXY1.ISILON associated to PROXY1.ISILON.HOUR1 for object " \\
> rams.adp.vcu.edu\SOM\TSM\SOMADFS1\DIRA\SUBDIRA\*”
> Node PROXY2.ISILON associated to PROXY1.ISILON.HOUR1 for object " \\
> rams.adp.vcu.edu\SOM\TSM\SOMADFS1\DIRA\SUBDIRB\*”
> …
> Node PROXY1.ISILON associated to PROXY1.ISILON.HOUR2 for object " \\
> rams.adp.vcu.edu\SOM\TSM\SOMADFS1\DIRB\SUBDIRA\*”
>
> And so on. For known large directories, slots of multiple hours are
> allocated, up to the largest directory which is given its own proxy node
> with one schedule, and hence 24 hours to back up.
>
> There are pros and cons to both of these, but they do enable us to perform
> the backups.
>
> FWIW,
> Bob
>
> Robert Talda
> EZ-Backup Systems Engineer
> Cornell University
> +1 607-255-8280
> rpt4@cornell.edu
>
>
--
Grant Street
Senior Systems Engineer

T: +61 2 9383 4800 (main)
D: +61 2 8310 3582 (direct)
E: Grant.Street@al.com.au

Building 54 / FSA #19, Fox Studios Australia, 38 Driver Avenue
Moore Park, NSW 2021
AUSTRALIA

This message was imported via the External PhorumMail Module
Zoltan:
Finally getting a chance to answer you. I :think: I understand what you are getting at…

First, some numbers - recalling that each of these nodes is one storage device:
Node1: 358,000,000+ files totaling 430 TB of primary occupied space
Node2: 302,000,000+ files totaling 82 TB of primary occupied space
Node3: 79,000,000+ files totaling 75 TB of primary occupied space
Node4: 1,000,000+ files totaling 75 TB of primary occupied space
Node5: 17,000,000+ files totaling 42 TB of primary occupied space
There are more, but I think this answers your initial question.

Restore requests are handled by the local system admin or, for lack of a better description, data admin. (Basically, the research area has a person dedicated to all the various data issues related to research grants, from including proper verbiage in grant requests to making sure the necessary protections are in place).

We try to make it as simple as we can, because we do concentrate all the data in one node per storage device (usually a NAS). So restores are usually done directly from the node - while all backups are done through proxies. Generally, the restores are done without permissions so that the appropriate permissions can be applied to the restored data. (Oft times, the data is restored so a different user or set of users can work with it, so the original permissions aren’t useful)

There are some exceptions - of course, as we work at universities, there are always exceptions - and these we handle as best we can by providing proxy nodes with restricted privileges.

Let me know if I can provide more,
Bob


Robert Talda
EZ-Backup Systems Engineer
Cornell University
+1 607-255-8280
rpt4@cornell.edu


This message was imported via the External PhorumMail Module
Robert,

Thanks for the extensive details. You back up 5 nodes with more data than
we have across 90 nodes. So, my question is - what kind of connections do
you have to your NAS/storage devices to process that much data in such a
short period of time?

I am not sure what benefit a proxy node would give us, other than a way to
manage multiple nodes from one connection/GUI - or am I totally off base on
this?

Our current configuration is such:

7 Windows 2016 VMs (adding more to spread out the load).
Each of these 7 VMs handles the backups for 5-30 nodes. Each node is a
mountpoint for a user/department ISILON DFS mount -
i.e. \\rams\som\TSM\FC\*, \\rams\som\TSM\UR\*, etc. FWIW, the reason we are
using VMs is that the connection is actually faster than when we were using
physical servers, since those only had gigabit NICs.
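
(Each of those nodes is basically its own option file plus its own scheduler service on the VM - roughly like the following, where the node name, password and paths are just examples:

  dsm_fc.opt:
      nodename            ISILON-SOM-FC
      tcpserveraddress    <SP server>
      passwordaccess      generate
      resourceutilization 10

  dsmcutil install scheduler /name:"SP Scheduler ISILON-SOM-FC"
      /node:ISILON-SOM-FC /password:xxxxx
      /optfile:"C:\Program Files\Tivoli\TSM\baclient\dsm_fc.opt" /autostart:yes
)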

Even when we moved the biggest ISILON node (20,000,000+ files) to a new VM
with only 4 other nodes, it still took 4 days to scan and back up 102 GB out
of 32 TB. Below are recent end-of-session statistics (the current backup
started Friday and is still running):

07/09/2018 02:00:06 ANE4952I (Session: 21423, Node: ISILON-SOM-SOMADFS2)
Total number of objects inspected: 20,276,912 (SESSION: 21423)
07/09/2018 02:00:06 ANE4954I (Session: 21423, Node: ISILON-SOM-SOMADFS2)
Total number of objects backed up: 26,787 (SESSION: 21423)
07/09/2018 02:00:06 ANE4958I (Session: 21423, Node: ISILON-SOM-SOMADFS2)
Total number of objects updated: 31 (SESSION: 21423)
07/09/2018 02:00:06 ANE4960I (Session: 21423, Node: ISILON-SOM-SOMADFS2)
Total number of objects rebound: 0 (SESSION: 21423)
07/09/2018 02:00:06 ANE4957I (Session: 21423, Node: ISILON-SOM-SOMADFS2)
Total number of objects deleted: 0 (SESSION: 21423)
07/09/2018 02:00:06 ANE4970I (Session: 21423, Node: ISILON-SOM-SOMADFS2)
Total number of objects expired: 20,630 (SESSION: 21423)
07/09/2018 02:00:06 ANE4959I (Session: 21423, Node: ISILON-SOM-SOMADFS2)
Total number of objects failed: 36 (SESSION: 21423)
07/09/2018 02:00:06 ANE4197I (Session: 21423, Node: ISILON-SOM-SOMADFS2)
Total number of objects encrypted: 0 (SESSION: 21423)
07/09/2018 02:00:06 ANE4965I (Session: 21423, Node: ISILON-SOM-SOMADFS2)
Total number of subfile objects: 0 (SESSION: 21423)
07/09/2018 02:00:06 ANE4914I (Session: 21423, Node: ISILON-SOM-SOMADFS2)
Total number of objects grew: 0 (SESSION: 21423)
07/09/2018 02:00:06 ANE4916I (Session: 21423, Node: ISILON-SOM-SOMADFS2)
Total number of retries: 124 (SESSION: 21423)
07/09/2018 02:00:06 ANE4977I (Session: 21423, Node: ISILON-SOM-SOMADFS2)
Total number of bytes inspected: 31.75 TB (SESSION: 21423)
07/09/2018 02:00:06 ANE4961I (Session: 21423, Node: ISILON-SOM-SOMADFS2)
Total number of bytes transferred: 101.90 GB (SESSION: 21423)
07/09/2018 02:00:06 ANE4963I (Session: 21423, Node: ISILON-SOM-SOMADFS2)
Data transfer time: 115.78 sec (SESSION: 21423)
07/09/2018 02:00:06 ANE4966I (Session: 21423, Node: ISILON-SOM-SOMADFS2)
Network data transfer rate: 922,800.00 KB/sec (SESSION: 21423)
07/09/2018 02:00:06 ANE4967I (Session: 21423, Node: ISILON-SOM-SOMADFS2)
Aggregate data transfer rate: 271.46 KB/sec (SESSION: 21423)
07/09/2018 02:00:06 ANE4968I (Session: 21423, Node: ISILON-SOM-SOMADFS2)
Objects compressed by: 30% (SESSION: 21423)
07/09/2018 02:00:06 ANE4976I (Session: 21423, Node: ISILON-SOM-SOMADFS2)
Total data reduction ratio: 99.69% (SESSION: 21423)
07/09/2018 02:00:06 ANE4969I (Session: 21423, Node: ISILON-SOM-SOMADFS2)
Subfile objects reduced by: 0% (SESSION: 21423)
07/09/2018 02:00:06 ANE4964I (Session: 21423, Node: ISILON-SOM-SOMADFS2)
Elapsed processing time: 109:19:48 (SESSION: 21423)
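
To put that in perspective: 20,276,912 objects inspected over an elapsed
time of 109:19:48 (roughly 393,600 seconds) works out to only about 51
objects examined per second, and the actual data transfer accounted for
less than 2 minutes of the 4.5 days - virtually all of the time is spent
walking the directory tree.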




--
*Zoltan Forray*
Spectrum Protect (p.k.a. TSM) Software & Hardware Administrator
Xymon Monitor Administrator
VMware Administrator
Virginia Commonwealth University
UCC/Office of Technology Services
www.ucc.vcu.edu
zforray@vcu.edu - 804-828-4807
Don't be a phishing victim - VCU and other reputable organizations will
never use email to request that you reply with your password, social
security number or confidential personal information. For more details
visit http://phishing.vcu.edu/
This message was imported via the External PhorumMail Module
Zoltan:
I wish I could give you more details about the NAS/storage device connections, but either a) I’m not privy to that information; or b) I know it only as the SAN fabric. That is, our largest backups are from systems in our server farm that are part of the same SAN fabric as both the system running the SP client doing the backups AND the system hosting the TSM server. There is a 10 GB pipe connecting the two physical systems but that hasn’t ever been the bottleneck. And the system running the SP client is a VM as well.

Our bigger challenge was filesystems or shares with lots of files. This is where the proxy node strategy came into play. We were able to work with the system admins to split the backup of those filesystems into many smaller (in terms of number of files) backups that started deeper in the filesystem. That is, instead of running a backup against
\\rams\som\TSM\FC\*
We would have one backup running through PROXY.NODE1 for
\\rams\som\TSM\FC\dir1\*
While another was running through PROXY.NODE2 for
\\rams\som\TSM\FC\dir2\*
And so on and so forth.

We did this using a set of client schedules that used the "objects" option to specify the directory in question:

Def sched DOMAIN PROXY.NODE1.HOUR01 action=incr options='-subdir=yes -asnodename=DATANODE' objects='\\rams\som\TSM\RC\dir1\*' startt=01:00 dur=1 duru=hour

Where DATANODE is the target for agent PROXY.NODE1.
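
If it helps, the proxy relationship behind those schedules is just the standard target/agent setup - something like the following, where the node names are the same placeholders as above:

  grant proxynode target=DATANODE agent=PROXY.NODE1,PROXY.NODE2
  define association DOMAIN PROXY.NODE1.HOUR01 PROXY.NODE1

The scheduler on the client machine runs as PROXY.NODE1, and the -asnodename=DATANODE option in the schedule makes the backed-up data land under the single DATANODE node.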

Currently, we are running up to 144 backups (6 Proxy nodes, 24 hourly backups) for our largest devices.

HTH,
Bob

This message was imported via the External PhorumMail Module
Robert,

Again, thanks for the information. It fills in a lot of missing pieces for
me. From what I gather, you are probably doing backups via SAN, not via IP
like we do. Plus, as you suggested, you are breaking up the backup targets
into multiple filesystems/directories to reduce the number of files each
backup has to scan/manage. I am pushing this issue right now.

I have always been confused by the whole proxy process, but from what I
gather it isn't that much different from what we are doing right now,
except that it gives you a central management point for restores and
backups vs. us using the web client to give departments a way to manage
their own restores. We could adapt our process to use proxies, the biggest
hurdle being what you have accomplished ("*work with the system admins to
split the backup*") and making restore management a function of the
University Computer Center (where I work) vs. everyone doing their own
thing. Until we get through this reconfiguration effort, we won't be able
to move forward on the clients, since that would immediately kill the web
client.
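
(If I understand the proxy setup correctly, once the agent nodes are granted
proxy authority we could run restores centrally from any of the agent
machines with something like

  dsmc restore "\\rams\som\TSM\FC\dir1\*" "D:\restores\dir1\" -subdir=yes -asnodename=DATANODE

where the destination path and node names are just examples - rather than
each department using the web client.)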

So, do I understand correctly - each of your 144 target nodes
("-asnodename=DATANODE") is a Windows VM? If so, what specs are you using
for each VM?


--
*Zoltan Forray*
Spectrum Protect (p.k.a. TSM) Software & Hardware Administrator
Xymon Monitor Administrator
VMware Administrator
Virginia Commonwealth University
UCC/Office of Technology Services
www.ucc.vcu.edu
zforray@vcu.edu - 804-828-4807
Don't be a phishing victim - VCU and other reputable organizations will
never use email to request that you reply with your password, social
security number or confidential personal information. For more details
visit http://phishing.vcu.edu/
This message was imported via the External PhorumMail Module