
Parallelism and deduplication

Posted by Anonymous 
Parallelism and deduplication
May 07, 2016 09:48AM
Hello, Gandalf,

You wrote on 07.05.16:

[quote]4) hardlinks must be on the same filesystem. This is clear. So,
something like this:
$ cp daily.4 /mnt/other_file_system
would *resolve* all hardlinks and copy the whole backup (without
hardlinks) to the other filesystem, right?
[/quote]
"That depends!" ...

That depends on the filesystem of the target.

rsync -axH $Source/. $Target

can rebuild/copy all hard links, provided the target filesystem supports them.

Newer versions of "cp" can also do this job.

It's necessary if you need a full backup for a bigger (or a new) disk
...
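
As an illustration of what a hard-link-preserving copy has to do, here is a minimal Python sketch. It is not how rsync or cp are implemented; the function name copy_tree_preserving_hardlinks is made up for this example. It assumes regular files only need their first occurrence copied, and that every later path with the same (device, inode) pair can be recreated with a link. If the target filesystem cannot create hard links, it falls back to a plain copy, which is the "resolved" result discussed above.

```python
import os
import shutil
import stat


def copy_tree_preserving_hardlinks(source, target):
    """Copy source into target, recreating hard links between regular files.

    Hypothetical helper for illustration. Symlinks to directories and
    special files (devices, sockets) are not handled here.
    """
    seen = {}  # (st_dev, st_ino) of a source file -> first copy made in target
    for dirpath, _dirnames, filenames in os.walk(source):
        rel = os.path.relpath(dirpath, source)
        dest_dir = os.path.normpath(os.path.join(target, rel))
        os.makedirs(dest_dir, exist_ok=True)
        for name in sorted(filenames):
            src = os.path.join(dirpath, name)
            dst = os.path.join(dest_dir, name)
            st = os.lstat(src)
            key = (st.st_dev, st.st_ino)
            if stat.S_ISREG(st.st_mode) and st.st_nlink > 1 and key in seen:
                try:
                    # Same inode seen before: link to the copy already made.
                    os.link(seen[key], dst)
                    continue
                except OSError:
                    # Target filesystem cannot hard-link: copy the data instead.
                    pass
            shutil.copy2(src, dst, follow_symlinks=False)
            seen.setdefault(key, dst)
```

Calling copy_tree_preserving_hardlinks("daily.4", "/mnt/other_file_system/daily.4") would copy one snapshot. Links to files outside the copied tree are necessarily lost, because the other end of the link is never visited.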

Best regards!
Helmut

------------------------------------------------------------------------------
Find and fix application performance issues faster with Applications Manager
Applications Manager provides deep performance insights into multiple tiers of
your business applications. It resolves application problems quickly and
reduces your MTTR. Get your free trial!
https://ad.doubleclick.net/ddm/clk/302982198;130105516;z
_______________________________________________
rsnapshot-discuss mailing list
rsnapshot-discuss < at > lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rsnapshot-discuss
Parallelism and deduplication
May 09, 2016 06:01AM
On Fri, May 06, 2016 at 10:53:35AM +0200, Gandalf Corvotempesta wrote:

[quote]Absolutely not. This is not the case.

I'm trying to replace a Bacula backup environment (too many things to
check: huge databases (mine is 980GB), too many backup levels, too
prone to failures, ....) with
a smarter system like rsnapshot, where everything could be summarized
with: "rsync + cp + mv". No backup levels, no databases, files always
available for restore (and even for looking, like "cat my_lost_file")

I would like to copy the weekly backups to a tape library. Obviously I
need to resolve all hard links when copying or I'll end up with
inconsistent data in case of restore.
My question is: how can I distinguish a hard link made by rsnapshot from a
hard link that was on the source server? rsnapshot's hard links must be
dereferenced when copying to tape; original hard links must be preserved to
restore the system to its original state.
[/quote]
Hard links will be preserved within a snapshot if you tell rsync to use
-S. Apart from that rsnapshot will never create a hard link between two
files within a snapshot.

Unfortunately the only way to tell the difference between those and the
hard links between snapshots is to grovel over the filesystem and
compare every single file's inode number and path.

Consider this:
inode number  filename
1             snapshot_root/weekly.0/a
1             snapshot_root/weekly.0/b
2             snapshot_root/weekly.0/c
3             snapshot_root/weekly.0/d
1             snapshot_root/weekly.1/a
1             snapshot_root/weekly.1/b
2             snapshot_root/weekly.1/c
4             snapshot_root/weekly.1/d

What this means is that on your source, a and b are hardlinks to each
other. You want to preserve that hard link. c has only one link on the
source, but hasn't changed recently so has extra links in your backups.
d also has only one link on the source, and has changed recently so only
has one link in your backups.
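
The comparison described above can be sketched in Python. This is only an illustration under stated assumptions: classify_links is a made-up name, it looks at a single snapshot directory, and it treats any link count beyond the paths found inside that snapshot as "links elsewhere". Those are presumably rsnapshot's links to other snapshots, but the sketch cannot prove that; they could be any other hard links on the same filesystem.

```python
import os
import stat
from collections import defaultdict


def classify_links(snapshot_dir):
    """Group regular files in one snapshot by inode (hypothetical helper).

    Returns a list of dicts, one per inode, sorted by path:
      paths            - paths inside the snapshot sharing this inode
      linked_in_source - True if more than one path in the snapshot shares
                         the inode (a hard link that existed on the source)
      links_elsewhere  - link count not accounted for inside this snapshot
    """
    paths_by_inode = defaultdict(list)
    nlink_by_inode = {}
    for dirpath, _dirnames, filenames in os.walk(snapshot_dir):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)
            if not stat.S_ISREG(st.st_mode):
                continue  # only regular files matter for this comparison
            key = (st.st_dev, st.st_ino)
            paths_by_inode[key].append(os.path.relpath(path, snapshot_dir))
            nlink_by_inode[key] = st.st_nlink

    report = []
    for key, paths in paths_by_inode.items():
        report.append({
            "paths": sorted(paths),
            "linked_in_source": len(paths) > 1,
            "links_elsewhere": nlink_by_inode[key] - len(paths),
        })
    return sorted(report, key=lambda entry: entry["paths"])
```

Run against weekly.0 in the example above, a and b would come back as one entry with linked_in_source set, c as a single path with one link elsewhere, and d as a single path with none. Every file in the snapshot still has to be visited, which is why this gets expensive on large backups.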

--
David Cantrell | Minister for Arbitrary Justice

I remember when computers were frustrating because they did
exactly what you told them to. That seems kinda quaint now.
-- JD Baldwin, in the Monastery

Parallelism and deduplication
May 09, 2016 06:08AM
On Fri, May 06, 2016 at 02:13:00PM +0200, Helmut Hullen wrote:

[quote]On a tape?
[/quote]
There's nothing preventing backup software from keeping track of hard
links when writing to tape. Even Ye Olde tar knows about them:

$ ls -il foo bar
45432431 -rw-r--r-- 2 dc staff 0 9 May 14:00 bar
45432431 -rw-r--r-- 2 dc staff 0 9 May 14:00 foo
$ tar cf baz.tar foo bar
$ tar tvf baz.tar
-rw-r--r-- 0 dc staff 0 9 May 14:00 foo
hrw-r--r-- 0 dc staff 0 9 May 14:00 bar link to foo
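
The same behaviour can be reproduced with Python's standard tarfile module, shown here as a small sketch rather than as anything rsnapshot itself does. The helper name archive_and_list and the archive name baz.tar are just for this example. With default settings, tarfile stores the second path that shares an inode as a hard-link member pointing at the first.

```python
import os
import tarfile


def archive_and_list(directory, names, archive_name="baz.tar"):
    """Archive the named files from directory and report each member.

    Hypothetical helper: returns (name, is_hard_link, link_target) tuples
    in archive order.
    """
    archive_path = os.path.join(directory, archive_name)
    with tarfile.open(archive_path, "w") as tar:
        for name in names:
            # arcname keeps member names relative, like "tar cf baz.tar foo bar"
            tar.add(os.path.join(directory, name), arcname=name)
    with tarfile.open(archive_path) as tar:
        return [(m.name, m.islnk(), m.linkname) for m in tar.getmembers()]
```

For two hard-linked files foo and bar, archive_and_list(some_dir, ["foo", "bar"]) would report foo as a regular member and bar as a link to foo, matching the "tar tvf" listing above.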

--
David Cantrell | top google result for "internet beard fetish club"

Just because it is possible to do this sort of thing
in the English language doesn't mean it should be done

Parallelism and deduplication
May 09, 2016 06:52AM
On Mon, May 09, 2016 at 01:57:52PM +0100, David Cantrell wrote:

[quote]Hard links will be preserved within a snapshot if you tell rsync to use
-S.
[/quote]
That should of course be -H. -S is to tell it to create sparse files
when possible.

--
David Cantrell | A machine for turning tea into grumpiness

" In My Egotistical Opinion, most people's ... programs should be
indented six feet downward and covered with dirt. "
--Blair P. Houghton

Parallelism and deduplication
May 09, 2016 08:30AM
On 9 May 2016 at 14:05, David Cantrell <david < at > cantrell.org.uk> wrote:
[quote]On Fri, May 06, 2016 at 02:13:00PM +0200, Helmut Hullen wrote:

[quote]On a tape?
[/quote]
There's nothing preventing backup software from keeping track of hard
links when writing to tape. Even Ye Olde tar knows about them:

$ ls -il foo bar
45432431 -rw-r--r--  2 dc  staff  0  9 May 14:00 bar
45432431 -rw-r--r--  2 dc  staff  0  9 May 14:00 foo
$ tar cf baz.tar foo bar
$ tar tvf baz.tar
-rw-r--r--  0 dc     staff       0  9 May 14:00 foo
hrw-r--r--  0 dc     staff       0  9 May 14:00 bar link to foo
[/quote]

Once again: keeping track of hard links is trivial. Distinguishing between hard links present in the original data and those created by the backup system is expensive. Tar doesn't create any new links that weren't already there in the data, so it's not the same as rsnapshot.

poc