tar is needed, but deleted files not needed
Post tar is needed, but deleted files not needed 
Hi!

I started out using BackupPC with rsync, but backing up servers with millions of small files took a very long time (days). That would be acceptable, but the incremental backups took almost as long, which is not normal.

I then read that I should use tar as the backup method, because incremental backups would be much faster. I tested this and it is true: much, much, much faster!
I then realized that tar does not store information about deleted files in the incremental backups, which is a real pain in the *ss, because when my clients ask me to restore a given state, they will surely hate me if I restore the deleted files too.

Am I right that there is no way to keep track of deleted files?

If so, isn't there some kind of alternative? For instance, a directory-tree pre-script that uses catalogs? :-)
The best would be if BackupPC+tar could use catalog files or something similar.
Or if BackupPC were able to use dar as a backup method; that would be really nice, because dar uses catalogs.

Please comment on my problem!

Thank you,
Daniel
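
The "directory tree pre-script that uses catalogs" idea from this post could look roughly like the sketch below. This is only an illustration, not something BackupPC provides: the function names and the JSON catalog format are assumptions made up for the example. The script records every path under a directory, and comparing the catalog from an earlier run with a later one yields the paths that have been deleted in between.

```python
import json
import os
import tempfile


def build_catalog(root):
    """Return a sorted list of all file and directory paths, relative to root."""
    entries = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            full = os.path.join(dirpath, name)
            entries.append(os.path.relpath(full, root))
    return sorted(entries)


def save_catalog(catalog, path):
    """Write a catalog to disk as JSON (hypothetical on-disk format)."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(catalog, fh)


def load_catalog(path):
    """Read back a catalog written by save_catalog."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def deleted_since(old_catalog, new_catalog):
    """Paths present in the older catalog but missing from the newer one."""
    return sorted(set(old_catalog) - set(new_catalog))


if __name__ == "__main__":
    # Small self-contained demonstration on a temporary directory tree.
    with tempfile.TemporaryDirectory() as root, tempfile.TemporaryDirectory() as store:
        open(os.path.join(root, "keep.txt"), "w").close()
        open(os.path.join(root, "gone.txt"), "w").close()
        save_catalog(build_catalog(root), os.path.join(store, "catalog.json"))

        os.remove(os.path.join(root, "gone.txt"))
        old = load_catalog(os.path.join(store, "catalog.json"))
        print(deleted_since(old, build_catalog(root)))
```

At restore time, the list returned by `deleted_since` for the wanted backup would tell you which restored files to remove again. Walking millions of small files still takes time, so this does not remove the cost of the directory scan; it only keeps the record of deletions next to the tar backup.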

Post tar is needed, but deleted files not needed 
On Fri, Jan 6, 2012 at 12:19 PM, Daniel <dandadude < at > gmail.com> wrote:

> I have started using backuppc at first with rsync, but backing up servers
> with millions of small files took very much time (days). This would be
> normal, but the incremental backups took almost the same time, which is not
> normal.

Rsync incrementals still compare the directories to detect deletions
and new files with old timestamps. But it should still be much faster
than a full unless much of the data has changed since the previous
full. Is there any way you can break the runs up into smaller ones
of separate directories?

> I have then read that I should use tar as backup method, because the
> incremental backups will be much faster. I have tested this and it is true,
> much-much-much faster!!!
> I have then realized that tar does not store deleted file information in the
> incremental backups which is a real pain in the *ss, because when my clients
> ask me to restore a state, they will surely hate me if I restore the deleted
> files too.
>
> Am I right that there is no way to keep track of deleted files?

Only rsync does it in backuppc.

> If yes, aren't there some kind of alternative ways? For instance a directory
> tree pre-script that uses catalogs? :-)
> The best would be if backuppc+tar could use catalog files or something.
> Or if backuppc would be able to utilize dar as a backup solution, now that
> would really be nice, because dar uses catalogs.
>
> Please comment on my problem!

GNUtar has a mechanism to track deletions and which files are included
(the --listed-incremental= option) but backuppc doesn't use it.
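
For reference, the GNU tar mechanism mentioned above can be sketched as follows. Since BackupPC does not use it, these commands would be run outside BackupPC. The helper names, archive names, and paths are hypothetical; only the tar options themselves (`--create`, `--extract`, `--file`, `--directory`, `--listed-incremental`) come from GNU tar. The helpers only build the command lines and do not execute anything.

```python
import shlex


def dump_command(source_dir, archive, snapshot):
    """Build the argv for a GNU tar dump that tracks state in a snapshot file.

    If the snapshot file does not exist yet, tar produces a full (level 0) dump
    and creates it. Reusing the snapshot on a later run produces an incremental
    dump that also records which files have disappeared from each directory.
    """
    return [
        "tar",
        "--create",
        f"--file={archive}",
        f"--listed-incremental={snapshot}",
        source_dir,
    ]


def restore_command(archive, target_dir):
    """Build the argv for restoring one archive in incremental mode.

    Passing /dev/null as the snapshot file tells GNU tar to treat the archive
    as an incremental one without needing the original snapshot file. In this
    mode tar removes files from the target that did not exist in their
    directories when the archive was created.
    """
    return [
        "tar",
        "--extract",
        f"--file={archive}",
        "--listed-incremental=/dev/null",
        f"--directory={target_dir}",
    ]


def restore_plan(archives, target_dir):
    """Restore commands for a chain of archives, oldest (full) first."""
    return [restore_command(archive, target_dir) for archive in archives]


if __name__ == "__main__":
    # Hypothetical names: a full dump followed by one incremental dump.
    print(shlex.join(dump_command("/home", "home.0.tar", "home.snar")))
    print(shlex.join(dump_command("/home", "home.1.tar", "home.snar")))
    for cmd in restore_plan(["home.0.tar", "home.1.tar"], "/restore"):
        print(shlex.join(cmd))
```

The archives have to be restored in the order they were made, full dump first, so that the deletions recorded in each incremental are applied on top of the right state.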

--
Les Mikesell
lesmikesell < at > gmail.com

_______________________________________________
BackupPC-users mailing list
BackupPC-users < at > lists.sourceforge.net
List: https://lists.sourceforge.net/lists/listinfo/backuppc-users
Wiki: http://backuppc.wiki.sourceforge.net
Project: http://backuppc.sourceforge.net/

Post tar is needed, but deleted files not needed 
Hi Les!

Thank you for the input.

Well, all I see is that an incremental backup with rsync can take 3 hours for 750 MB (30 Mbit/s bandwidth, strong computers), which is absolutely crazy. With tar this takes 5 minutes at most.

So you can see my concerns :-( And I have a server where the full backup took about 72 hours.

I wanted the BackupPC machine (with a 6 TB HDD) to be able to back up all my servers (there are many), but if backing up one server can take days with incrementals, then this is not a solution.

Separation is possible; I had many shares for rsync to handle, but there are some where separation would be a real pain in the *ss, for instance /home etc.

On the other hand, tar would have been really nice, but the deleted-files problem makes it almost useless for my purposes.

I am really sad about this, because the software is really cool: I like the web interface, I like the command-line scripts, I like everything. Only this deletion problem... This should be top priority, and should already be in the software :-) (I mean, I know it is open source etc. and that I did not develop it, but it's a feature I think many of us would like and that would really make it a much heavier software for the whole world.)

The truth is that I have been experimenting with many backup solutions in the past, and they are either very complicated to handle (bacula) or aren't reliable. BackupPC was the numero uno, until I discovered this. It can even work together with autoloaders (there are scripts on the net for this), so it would have been really neat.

I will have to use dar, which is a solution that has never let me down, but it needs a BackupPC-like interface too :-) It uses catalogs, so incrementals are really fast. Everything is nice with it; it just takes a bit more time to set up nicely and, in my case, needs software on both the client and the server side.

Thanks for taking the time. I will be watching BackupPC's features in the future; I hope it will handle this tar problem soon :-)

Regards,
Daniel

Post tar is needed, but deleted files not needed 
On Sat, Jan 7, 2012 at 2:25 PM, Daniel <dandadude < at > gmail.com> wrote:

> Well all I see is that incremental backup with rsync can take 3 hours for
> 750 MB (30 mbit/sec bandwidth, strong computers), which is absolutely crazy.
> With tar this is 5 minutes at max.

That seems slightly extreme. What has to happen is that the target
machine has to send its whole directory listing before anything starts, then
the backuppc server walks both its previous full of that machine and
the RAM copy from the target looking for directory differences. In an
incremental, it will skip anything where the filename, timestamp, and
length match. Where they don't match, the two sides walk through the file
contents exchanging block checksums to find the differences. Each
directory entry takes a small but finite amount of time, and updating
large files with changes can be slow because it is partly
uncompressed/copied from the existing version and partly copied over
the network. You might improve things a bit if you can find something
to exclude (tmp/cache areas with a lot of filenames, database files
that are better handled in other ways, etc.).
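
The comparison described above can be illustrated with a simplified sketch. This is not BackupPC's or rsync's actual code: the names are invented for the example, and the block comparison uses fixed offsets, whereas real rsync uses rolling checksums so that it can also match data that has shifted within a file.

```python
import hashlib
from typing import Dict, List, NamedTuple


class Entry(NamedTuple):
    """Attributes used for the quick check of a file."""
    mtime: int
    size: int


def classify(previous: Dict[str, Entry], current: Dict[str, Entry]) -> Dict[str, List[str]]:
    """Compare the previous backup's listing with the client's current listing.

    Files whose name, timestamp, and length all match are skipped (unchanged).
    Names only in the old listing are deletions; names only in the new one are
    new files; everything else needs a content comparison.
    """
    result: Dict[str, List[str]] = {"unchanged": [], "changed": [], "new": [], "deleted": []}
    for path, entry in current.items():
        if path not in previous:
            result["new"].append(path)
        elif previous[path] == entry:
            result["unchanged"].append(path)
        else:
            result["changed"].append(path)
    result["deleted"] = [path for path in previous if path not in current]
    for paths in result.values():
        paths.sort()
    return result


def differing_blocks(old: bytes, new: bytes, block_size: int = 4096) -> List[int]:
    """Indexes of fixed-size blocks whose checksums differ between two versions."""
    n_blocks = (max(len(old), len(new)) + block_size - 1) // block_size
    differing = []
    for i in range(n_blocks):
        old_block = old[i * block_size:(i + 1) * block_size]
        new_block = new[i * block_size:(i + 1) * block_size]
        if hashlib.sha256(old_block).digest() != hashlib.sha256(new_block).digest():
            differing.append(i)
    return differing


if __name__ == "__main__":
    prev = {"a": Entry(1, 10), "b": Entry(1, 20), "c": Entry(1, 30)}
    cur = {"a": Entry(1, 10), "b": Entry(2, 20), "d": Entry(1, 5)}
    print(classify(prev, cur))
    print(differing_blocks(b"aaaabbbbcccc", b"aaaaXbbbcccc", block_size=4))
```

Note that `classify` touches every entry in both listings even when nothing has changed, which is why a tree with millions of small files costs time on every incremental, and why excluding directories with many filenames helps.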

But, 3 hours would still be a reasonable backup window for a lot of purposes.

> So you can see my concerns :-( And I have a server where the full backup
> took about 72 hours.

The first full is a special case since you copy everything, and the
2nd run will uncompress everything to do the block checksum compares.
If you use checksum caching the 3rd full will actually be the one to
time.

> I wanted the backuppc machine (with 6 TB HDD) to be able to backup all my
> servers (there are many), but if backing up of 1 server can take days with
> incremental, then this is not a solution.

If you have sufficient ram, you can run at least a couple in parallel.
And you can skew the fulls/incrementals so they don't all end up
doing the long runs the same day.

> The truth is that I have been experimenting with many backup solutions in
> the past, and they are either very complicated to handle (bacula), or aren't
> reliable. BackupPC was the numero uno, until I discovered this. It can even
> work together with autoloader

You could try amanda if you are going to tape. It knows how to use
the --listed-incremental feature of GNUtar.

--
Les Mikesell
lesmikesell < at > gmail.com


Post tar is needed, but deleted files not needed 
Daniel wrote at about 21:25:02 +0100 on Saturday, January 7, 2012:
> Only this deletion problem... This should be top priority, and should
> already be in the software :-) (I mean I know it is open source etc
> and that I did not develop it, but it's a feature I think many of us
> would like and that would really make it
> a much heavier software for the whole world.

No problem. As you said, it's open source. So make it *your* top
priority and write the code to 'fix' the 'problem' that you believe to
exist. If it were a top priority for the lead developer (Craig) or for
any of the other many users and contributors, they would have for sure
written the code themselves. Alternatively, you can of course offer to
pay someone to write the code for you...

Meanwhile, the rest of us have been quite content with 'rsync' for
tracking deleted files or with tar without deleted file coverage...

> I will have to use dar, which is a solution that never let me down,
> but needs a BackupPC-like interface too :-)
> It uses catalogs, thus incrementals are real fast, everything is nice
> with it, just it's a bit more time consuming to set up nicely and in
> my case, needs software on client and server side.

Feel free to write a module to handle 'dtar'. Other users have
contributed modules in the past (I believe the 'ftp' interface was
written that way). Beyond that, I haven't seen anyone else mention
'dtar' let alone express an interest in using it so I doubt you will
find much interest in anyone else developing it.

> Thanks for taking the time, I will be watching BackupPC features in
> the future, I hope it will handle this tar problem soon :-)

Don't hold your breath unless you plan to contribute yourself...



Post tar is needed, but deleted files not needed 
Dear Les!

Thanks for the valuable information. I might try rsync again keeping in mind what you have told me. Thank you!

----

Dear Jeffrey!

Yup, you are right!
BTW, I meant dar (disk archiver), which is also mentioned in the BackupPC documentation. But the same thing applies here too (if I want something, I should develop it).
If I ever develop a module, I will contribute it!

Thx very much!
Daniel
