million files backup
Post million files backup 
Hi everybody,

Does anyone know the best way to back up a filesystem that contains millions of files?

An image backup is not possible because it is a GPFS filesystem, which is not supported.

Thanks in advance

Jorge

Post million files backup 
Jorge,

On Unix systems:
I've done it in two steps. Create a tar file of the file system and zip
it. Create a second file listing all the files in the tarred directory.
The tar extract command allows single files to be recalled if the
absolute path name is available.

I've used this to back up a Sterling Commerce flat-file database with
multiple subdirectories holding more than 1.5 million files. The
tar/gzip command took 5 or 6 hours for a 500 GB file system.

I have not tried to do this on a Windows system.
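
The two steps described above (a compressed tar plus a separate listing of its members, then recalling one file by its recorded name) can be sketched roughly as follows. This is only an illustration in Python using the standard tarfile module, not the actual commands used on the system mentioned in this post; the function names are made up for the example.

```python
import shutil
import tarfile
from pathlib import Path


def archive_with_index(src_dir: str, tar_path: str, index_path: str) -> int:
    """Step 1: tar+gzip src_dir. Step 2: write one file name per line to index_path."""
    src = Path(src_dir)
    with tarfile.open(tar_path, "w:gz") as tar:
        # Member names are stored relative to the parent of src_dir.
        tar.add(src, arcname=src.name)
    count = 0
    with tarfile.open(tar_path, "r:gz") as tar, open(index_path, "w") as index:
        for member in tar:
            if member.isfile():
                index.write(member.name + "\n")
                count += 1
    return count


def restore_one(tar_path: str, member_name: str, dest_dir: str) -> Path:
    """Recall a single file using the exact name recorded in the index.

    Only the file contents are restored here, not permissions or timestamps.
    """
    target = Path(dest_dir) / member_name
    target.parent.mkdir(parents=True, exist_ok=True)
    with tarfile.open(tar_path, "r:gz") as tar:
        source = tar.extractfile(member_name)
        if source is None:
            raise ValueError(member_name + " is not a regular file")
        with source, open(target, "wb") as out:
            shutil.copyfileobj(source, out)
    return target
```

Keep in mind that a gzip-compressed tar has no random access, so even a single-file recall has to read through the archive until it reaches that member.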

Jim Schneider

-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ADSM-L < at > vm.marist.edu] On Behalf Of
Jorge Amil
Sent: Thursday, February 02, 2012 8:30 AM
To: ADSM-L < at > vm.marist.edu
Subject: [ADSM-L] million files backup

Hi everybody,

Does anyone know the best way to back up a filesystem that
contains millions of files?

An image backup is not possible because it is a GPFS filesystem, which is not
supported.

Thanks in advance

Jorge

Post million files backup 
Is it organized into subdirectory trees? If so, virtual mountpoints might be a way to go.



Gary Lee
Senior System Programmer
Ball State University
phone: 765-285-1310



Post million files backup 
Hi Jim,

Thank you very much for your answer.

That is what we are currently doing: a tar of the filesystem. It was a great solution when the filesystem was 500 GB to 1 TB, but our filesystem is now 14 TB. The
tar/gzip command took 10-12 days... :(

So we need another approach.

Thanks
Jorge


Post million files backup 
Be sure to check the mail archives (www.adsm.org or www.mail-archive.com/adsm-l < at > vm.marist.edu/) for common issues that have been discussed in the past. A lot of people have contributed a lot of information over the years.

See "Many Small Files challenge" in http://people.bu.edu/rbs/ADSM.QuickFacts for collected notes for dealing with a lot of files.

Richard Sims

Post million files backup 
If you are not using HSM, the virtual mountpoint approach is a good
one. Since we do have an integrated TSM/HSM system on GPFS, we
can't do that. It takes a bit over 3 days to wade through 130+
million files. Not enough disk to use mmbackup, which basically
'journals' the changes. I do run incrementals on directories
that are critical while the overall incremental runs.

What version of GPFS and TSM are you using? We are a bit backlevel
at GPFS 3.3 and TSM 5.5, but are in the process of upgrading. I
think a more current combination will allow for faster backups.

Gretchen Thiele
Princeton University


Post million files backup 
Hi,
it depends on how many changes happen daily and
on how fast the TSM client can scan the filesystem.
We have some clients with more than 30 million files, and TSM can scan more than
5 million files per hour on them, but there is not much change in the files.

On some clients with a similar number of files but
more changes, we mix the scans: a 'normal' incremental
on the weekend, when activity is lower, and a schedule using 'incremental by date' on weekdays,
so the client is associated with two schedules like

Policy Domain Name: U15040_SM
Schedule Name: MO_FR_MAIL
Description: Mo-Fr Backup ab 22:00:00 Uhr ... incrbydate
Action: Incremental
Options: -incrbydate
Objects:
Day of Week: Weekday
...
and the normal one
Policy Domain Name: U15040_SM
Schedule Name: SA_SO_MAIL
Description: Wochenends Backup ab 22:00:00 Uhr
Action: Incremental
Options:
Objects:
Priority: 5
Start Date/Time: 04/18/08 22:00:00
Duration: 2 Hour(s)
Schedule Style: Classic
Period: 1 Day(s)
Day of Week: Weekend

That's nothing special, just classic TSM ... with much bigger filesystems
you may have to split them, or TSM might not be suitable.
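
The weekday/weekend split above is done with two server-side schedules. As a rough client-side illustration of the same decision, the sketch below picks the dsmc arguments by day of week. The function name and the path /gpfs/fs1 are placeholders for the example, and this is not how the schedules shown above are implemented.

```python
import datetime
from typing import List


def backup_command(day: datetime.date, filespace: str) -> List[str]:
    """Return the dsmc command line to run for the given calendar day."""
    cmd = ["dsmc", "incremental"]
    if day.weekday() < 5:  # Monday=0 ... Friday=4
        # Weekdays: faster incremental by date.
        cmd.append("-incrbydate")
    # Weekends: plain (full) incremental, no extra option.
    cmd.append(filespace)
    return cmd


# Example: 2 Feb 2012 was a Thursday, so this selects the incrbydate variant.
thursday_cmd = backup_command(datetime.date(2012, 2, 2), "/gpfs/fs1")
```

As far as I know, an incremental by date only compares modification dates with the date of the last backup and does not expire deleted files, which is why the regular incremental still has to run on the weekend.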

best regards
Rainer



--
------------------------------------------------------------------------
Rainer Wolf eMail: rainer.wolf < at > uni-ulm.de
kiz - Abt. Infrastruktur Tel/Fax: ++49 731 50-22482/22471
Universitaet Ulm

Post million files backup 
Jorge,

on AIX there is a Journal-based backup for GPFS with a journal daemon process.

Best regards,
Kirsten




-------------------------------------------------------

Fachinformationszentrum Karlsruhe, Gesellschaft für wissenschaftlich-technische Information mbH.
Registered office: Eggenstein-Leopoldshafen; Mannheim District Court, HRB 101892.
Managing Director: Sabine Brünger-Weilandt.
Chairman of the Supervisory Board: MinDirig Dr. Thomas Greiner.

Post million files backup 
One million files shouldn't be a problem with some planning; we have a
GPFS filesystem with 31 million files that we can back up in 18 hours. We
have some other non-GPFS filesystems with several times that number of
files and we can still get backups done in under 24 hours.
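
To put the figures quoted in this thread side by side, here is a small back-of-the-envelope calculation. It assumes that "a bit over 3 days" means roughly 72 hours; the function names are just for illustration.

```python
def files_per_hour(file_count: float, hours: float) -> float:
    """Average scan/backup rate in files per hour."""
    return file_count / hours


def hours_needed(file_count: float, rate_per_hour: float) -> float:
    """Time to walk file_count files at a given rate."""
    return file_count / rate_per_hour


# 31 million files in 18 hours: roughly 1.7 million files per hour.
rate_31m = files_per_hour(31e6, 18)

# 130 million files in about 72 hours (assumed): roughly 1.8 million files per hour.
rate_130m = files_per_hour(130e6, 72)

# 30 million files at 5 million files per hour: about 6 hours.
scan_30m = hours_needed(30e6, 5e6)
```

These are averages only; the real rate depends on how much data has changed and on where the bottleneck is.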

Here are some suggestions:

1. Make sure the TSM server is not a bottleneck. Make sure you have
plenty of CPU, RAM, and that the database is on the fastest disks you
can afford. Make sure the database is spread across multiple volumes
with dedicated disks.

2. If you can, run TSM 6. Running DB2 was a huge win for small file
backup performance for us.

3. Work with the people generating data to split their data up into
multiple directory trees. Add each directory as a virtualmountpoint.

4. Use a network fabric that supports RDMA if you can (e.g., InfiniBand or
newer 10GbE). This won't help with metadata lookups, but will make data
transfers much more efficient. You can see whether GPFS is using RDMA by
running "mmfsadm dump verbs".

5. Like with any other backup, work with the users to figure out what
actually needs to be backed up.
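
For suggestion 3, the virtualmountpoint entries for the client options file can be generated rather than typed by hand. The following is a minimal sketch that assumes every top-level directory of the filesystem should become its own virtual mount point; the function name is hypothetical and the output still needs to be reviewed before it goes into dsm.sys.

```python
from pathlib import Path
from typing import List


def virtual_mountpoint_lines(fs_root: str) -> List[str]:
    """Build one VirtualMountPoint line per top-level directory under fs_root."""
    root = Path(fs_root)
    # Skip plain files and symlinks; only real directories become mount points.
    subdirs = sorted(p for p in root.iterdir() if p.is_dir() and not p.is_symlink())
    return ["VirtualMountPoint " + str(p) for p in subdirs]
```

Printing the returned lines gives text that can be pasted into the server stanza. You would probably also need to name the new mount points in the domain or on the command line so that they are actually backed up, so check the client documentation for your level.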

-- Skylar Thompson (skylar2 < at > u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine


Post million files backup 
Since it is GPFS, use mmbackup with TSM; there is a white paper, Redpaper, or wiki page somewhere about that.


--
Kind regards,

Remco Post
r.post < at > plcs.nl
+31 6 248 21 622

Post million files backup 
I find that "incrbydate" is often a win in making the first pass across a
filesystem with a large count of files to be backed up.

Doing a "dirsonly" pass is sometimes a win as well.

Thanks,
[RC]






Post million files backup 
Have you tried using the memoryefficient=diskcache method? I'm assuming you have. Takes a while, but on fast systems I've had pretty decent results. You may need to go the virtual mount point route though with a file system that large.


See Ya'
Howard Coles Jr.
John 3:16!



Post million files backup 
I'm able to back up several systems with hundreds of millions of files with the diskcache method.

If your backup is running that slowly on a GPFS filesystem, you probably have a performance
problem with the filesystem. Can you give a quick overview of the GPFS environment?
That is, are the disks SATA, SAS, or FC-attached? In what topology? What interconnect is on the nodes?

----
Daniel Murphy-Olson
Systems Administrator
Mathematics & Computer Science Division
Argonne National Laboratory
630-252-0055


Post million files backup 
Thank you very much for all your answers. I'm going to read all your mails carefully to evaluate the best solution.

Thanks

Jorge