archive inflation from file moves.
Post archive inflation from file moves. 
As I understand it, moving data around a file system registers with rsnapshot as old files being deleted and new files being created.

For example I have two disks /home and /snapshot

I download a 5 GB movie to /home/michael/tmp/Macho_Women_with_Guns_2.mp4
and rsnapshot runs, creating

/snapshot/daily.0/home/michael/tmp/Macho_Women_with_Guns_2.mp4

The next day I move it to /home/archive/movies/Macho_Women_with_Guns_2.mp4, so after the next run the snapshots hold two separate copies:

/snapshot/daily.0/home/archive/movies/Macho_Women_with_Guns_2.mp4
/snapshot/daily.1/home/michael/tmp/Macho_Women_with_Guns_2.mp4

So /snapshot will contain 10 GB until the tmp file falls off the end of the snapshot list.
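
The effect described above can be reproduced on a small scale. The following is only a simulation under stated assumptions: it does not call rsnapshot or rsync, it imitates the rotate-then-sync cycle with GNU `cp -al` plus a manual copy, and all paths and file names are made up for the demonstration.

```sh
set -eu
work=$(mktemp -d)
src="$work/home"
snap="$work/snapshot"
mkdir -p "$src/michael/tmp" "$src/michael/docs" "$src/archive/movies" "$snap"
printf 'movie bytes\n' > "$src/michael/tmp/movie.mp4"
printf 'notes\n' > "$src/michael/docs/notes.txt"

# Day 1: the first snapshot is a full copy of the source tree.
cp -a "$src" "$snap/daily.0"

# Day 2: the user moves the movie, then the backup rotates and syncs.
mv "$src/michael/tmp/movie.mp4" "$src/archive/movies/movie.mp4"
mv "$snap/daily.0" "$snap/daily.1"
cp -al "$snap/daily.1" "$snap/daily.0"          # hard-link copy (GNU cp)
rm "$snap/daily.0/michael/tmp/movie.mp4"        # old path no longer exists in the source
cp -p "$src/archive/movies/movie.mp4" \
      "$snap/daily.0/archive/movies/movie.mp4"  # new path is written as a fresh file

# Report whether two paths refer to the same inode.
same_inode() { [ "$1" -ef "$2" ] && echo shared || echo separate; }
notes=$(same_inode "$snap/daily.0/michael/docs/notes.txt" "$snap/daily.1/michael/docs/notes.txt")
movie=$(same_inode "$snap/daily.0/archive/movies/movie.mp4" "$snap/daily.1/michael/tmp/movie.mp4")
echo "notes.txt: $notes"   # prints "notes.txt: shared"
echo "movie.mp4: $movie"   # prints "movie.mp4: separate"
```

The unchanged file costs no extra space because both snapshots point at one inode, while the moved file exists twice until the older snapshot is rotated out.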

I guess the fix is simply "don't do it", but I was wondering if there was a solution.

(Getting a bigger disk is not really an option. I'm actually working with multi-terabyte
datasets, and a perfect storm could easily fill the snapshot partition :-) )

--
Michael Lush

Post archive inflation from file moves. 
Hello Michael Lush,

On Thursday, 17 November 2011 at 11:47, Michael Lush wrote:
As I understand it moving a data around a file system registers
to rsnapshot as new files being created/deleted.

For example I have two disks /home and /snapshot

I download a 5Gb movie to
/home/michael/tmp/Macho_Women_with_Guns_2.mp4 rsnapshot runs
creating

/snapshot/daily.0/home/michael/tmp/Macho_Women_with_Guns_2.mp4

Next day I move it to
/home/archive/movies/Macho_Women_with_Guns_2.mp4

/snapshot/daily.0/home/archive/movies/Macho_Women_with_Guns_2.mp4
/snapshot/daily.1/home/michael/tmp/Macho_Women_with_Guns_2.mp4

So /snapshot will contain 10Gb until the tmp file falls off the
end of the snapshot list.

I exclude all the tmp folders, because I never want to have them
backed up.

--
Kind regards!
Rolf Muth
My addresses may not be used for advertising!
PGP Public Key:
http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xF8DC41935544C89A

------------------------------------------------------------------------------
All the data continuously generated in your IT infrastructure
contains a definitive record of customers, application performance,
security threats, fraudulent activity, and more. Splunk takes this
data and makes sense of it. IT sense. And common sense.
http://p.sf.net/sfu/splunk-novd2d
_______________________________________________
rsnapshot-discuss mailing list
rsnapshot-discuss < at > lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rsnapshot-discuss

Post archive inflation from file moves. 
Dear Michael Lush,
You can run a command in your post-backup script to detect the file moves.
My own choice was "hardlink" (http://jak-linux.org/projects/hardlink/),
but there are probably others.
I posted some caveats about it just a few days ago:
http://sourceforge.net/mailarchive/forum.php?thread_name=4EBB93F1.5030400%40numerigraphe.com&forum_name=rsnapshot-discuss
Yours,
Lionel Sausin
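
As a rough illustration of what such a post-backup step does, here is a minimal sketch. It is not the "hardlink" program and makes no claim about how that tool works internally. The function name `relink_moved` is hypothetical. It assumes both snapshots live on one filesystem and that file names contain no newlines, and unlike a real tool it ignores ownership, permissions and timestamps, which a hard link forces the two names to share.

```sh
# relink_moved OLD_SNAPSHOT NEW_SNAPSHOT
# For every file that was written fresh in NEW_SNAPSHOT (link count 1),
# look for a byte-identical file in OLD_SNAPSHOT and replace the fresh
# copy with a hard link to it.
relink_moved() {
    old=$1
    new=$2
    find "$new" -type f -links 1 -size +0c | while IFS= read -r f; do
        size=$(($(wc -c < "$f")))
        # Candidates are files of exactly the same size in the older snapshot.
        find "$old" -type f -size "${size}c" | while IFS= read -r cand; do
            if cmp -s "$cand" "$f"; then
                ln -f "$cand" "$f"
                printf 'relinked %s -> %s\n' "$f" "$cand"
                break
            fi
        done
    done
}
```

It would be called after each run as, for example, `relink_moved /snapshot/daily.1 /snapshot/daily.0`. The nested search is quadratic in the worst case, so for multi-terabyte trees a dedicated tool that indexes files by size and checksum is the more realistic choice.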

As I understand it moving a data around a file system registers to
rsnapshot as new files being created/deleted.(...)I was wondering if there was a
solution

--
Michael Lush



Post archive inflation from file moves. 
Hello, Michael,

You wrote on 17.11.11:

As I understand it moving a data around a file system registers to
rsnapshot as new files being created/deleted.

For example I have two disks /home and /snapshot

I download a 5Gb movie to /home/michael/tmp/Macho_Women_with_Guns_2.mp4
rsnapshot runs creating

/snapshot/daily.0/home/michael/tmp/Macho_Women_with_Guns_2.mp4

Next day I move it to /home/archive/movies/Macho_Women_with_Guns_2.mp4

/snapshot/daily.0/home/archive/movies/Macho_Women_with_Guns_2.mp4
/snapshot/daily.1/home/michael/tmp/Macho_Women_with_Guns_2.mp4

So /snapshot will contain 10Gb until the tmp file falls off the end
of the snapshot list.

I guess the fix is simply "don't do it" but I was wondering if there
was a solution

Where is the backup problem?
A backup has to "mirror" an actual state. If you change the position of
a file or directory on the original system then the backup has to follow
this change - nothing else.

If you want to replace real duplicates somewhere (in this case: in the
backup) then you may use a duplicate-file program; I prefer "hardlink"
for my rsnapshot backups.

<http://helmut.hullen.de/filebox/Linux/slackware//ap/hardlink-1.2-i486-1hln.tgz>

Best regards!
Helmut


Post archive inflation from file moves. 
On Thu, Nov 17, 2011 at 2:47 AM, Michael Lush <mjlush < at > gmail.com> wrote:
As I understand it moving a data around a file system registers to rsnapshot
as new files being created/deleted.

For example I have two disks /home and /snapshot

I download a 5Gb movie to /home/michael/tmp/Macho_Women_with_Guns_2.mp4
rsnapshot runs creating

/snapshot/daily.0/home/michael/tmp/Macho_Women_with_Guns_2.mp4

Next day I move it to /home/archive/movies/Macho_Women_with_Guns_2.mp4

/snapshot/daily.0/home/archive/movies/Macho_Women_with_Guns_2.mp4
/snapshot/daily.1/home/michael/tmp/Macho_Women_with_Guns_2.mp4

So /snapshot will contain 10Gb until the tmp file falls off the end of the
snapshot list.

I guess the fix is simply "don't do it" but I was wondering if there was a
solution

In the past when I've shifted a very large directory from one place to
another, instead of moving it I did cp -al. This hardlinks the files,
and the snapshot should extend those hardlinks to the new location (I
can't recall if this requires a special rsync flag or not). The next
day after a few backups had run, I circled back and removed the
originals. It works alright as a one-shot thing.
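
The two-step relocation described above might look like the following sketch. The directory names are invented for the demonstration, `cp -al` is a GNU cp feature, and both locations must be on the same filesystem. Whether the backup carries the links over depends on how rsync is invoked, which this snippet does not cover.

```sh
set -eu
work=$(mktemp -d)
old="$work/home/michael/tmp/bigdir"
new="$work/home/archive/bigdir"
mkdir -p "$old" "$work/home/archive"
printf 'payload\n' > "$old/data.bin"

# Step 1: recreate the tree at the new location with hard links
# instead of moving it.
cp -al "$old" "$new"

linked=no
if [ "$old/data.bin" -ef "$new/data.bin" ]; then linked=yes; fi
echo "same inode at both paths: $linked"

# Step 2, a few backup runs later: drop the original names.
rm -r "$old"
```

Between the two steps the data is visible at both paths but stored only once, so a backup that preserves hard links can pick up the new path without a second copy.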

-scott


Post archive inflation from file moves. 
On Thu, Nov 17, 2011 at 8:35 PM, Helmut Hullen <Hullen < at > t-online.de> wrote:
Hello, Michael,

You wrote on 17.11.11:
Where is the backup problem?
A backup has to "mirror" an actual state. If you change the position of
a file or directory on the original system then the backup has to follow
this change - nothing else.

rsnapshot is not just mirroring the actual state of a file system; it also stores previous states, and normally it does
this efficiently.

However, if I have two 500 GB drives (one production, one snapshot) and I move 300 GB of data from one directory
into another on production, the snapshot drive will simply fill up and stop mirroring the actual state
(until human intervention, or until the old version falls off the end of the snapshot list), which I think is a problem.
 
If you want to replace real dupes somewhere (in this case: in the
backup) then you may use some dupe file program ; I prefer "hardlink"
for my rsnapshot backups.

  <http://helmut.hullen.de/filebox/Linux/slackware//ap/hardlink-1.2-i486-1hln.tgz>

Thanks, I'll have a look at that.

--
Michael

Post archive inflation from file moves. 
On Thu, Nov 17, 2011 at 1:13 PM, Scott Hess <scott+rsnapshot < at > doubleu.com> wrote:
On Thu, Nov 17, 2011 at 2:47 AM, Michael Lush <mjlush < at > gmail.com> wrote:
As I understand it moving a data around a file system registers to rsnapshot
as new files being created/deleted.

For example I have two disks /home and /snapshot

I download a 5Gb movie to /home/michael/tmp/Macho_Women_with_Guns_2.mp4
rsnapshot runs creating

/snapshot/daily.0/home/michael/tmp/Macho_Women_with_Guns_2.mp4

Next day I move it to /home/archive/movies/Macho_Women_with_Guns_2.mp4

/snapshot/daily.0/home/archive/movies/Macho_Women_with_Guns_2.mp4
/snapshot/daily.1/home/michael/tmp/Macho_Women_with_Guns_2.mp4

So /snapshot will contain 10Gb until the tmp file falls off the end of the
snapshot list.

I guess the fix is simply "don't do it" but I was wondering if there was a
solution

In the past when I've shifted a very large directory from one place to
another, instead of moving it I did cp -al.  This hardlinks the files,
and the snapshot should extend those hardlinks to the new location (I
can't recall if this requires a special rsync flag or not).  The next
day after a few backups had run, I circled back and removed the
originals.  It works alright as a one-shot thing.

Addendum: You need -H in your rsync flags to get this kind of
hard-link behavior, I think.
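
The effect of `-H` (`--hard-links`) can be checked in a scratch directory. This is a sketch under assumptions: the paths are invented, and the check is skipped when rsync is not installed. How the flag reaches rsync in a given rsnapshot setup (for example through the `rsync_short_args` setting) depends on the local configuration and is not shown here.

```sh
set -eu
work=$(mktemp -d)
mkdir -p "$work/src/tmp" "$work/src/archive"
printf 'payload\n' > "$work/src/tmp/data.bin"
ln "$work/src/tmp/data.bin" "$work/src/archive/data.bin"   # two names, one inode

result=skipped
if command -v rsync >/dev/null 2>&1; then
    rsync -a  "$work/src/" "$work/plain/"    # without -H: links become separate files
    rsync -aH "$work/src/" "$work/linked/"   # with -H: links are recreated
    if [ "$work/linked/tmp/data.bin" -ef "$work/linked/archive/data.bin" ] &&
       ! [ "$work/plain/tmp/data.bin" -ef "$work/plain/archive/data.bin" ]; then
        result=preserved
    else
        result=unexpected
    fi
fi
echo "hard links with -H: $result"
```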

-scott


Post archive inflation from file moves. 
On Thu, Nov 17, 2011 at 1:15 PM, Michael Lush <mjlush < at > gmail.com> wrote:
On Thu, Nov 17, 2011 at 8:35 PM, Helmut Hullen <Hullen < at > t-online.de> wrote:
You wrote on 17.11.11:
Where is the backup problem?
A backup has to "mirror" an actual state. If you change the position of
a file or directory on the original system then the backup has to follow
this change - nothing else.

rsnapshot is not just mirroring the actual state of a file system, it is
also storing previous states, normally it does
this efficiently.

However, if I have two 500 GB drives (one production, one snapshot) and I
move 300 GB of data from one directory into another on production, the
snapshot drive will simply fill up and stop mirroring the actual state
(until human intervention, or until the old version falls off the end of
the snapshot list), which I think is a problem.

The problem is that you've asked rsnapshot to mirror two states which,
in aggregate, exceed the capacity of your backup device. It's a
problem, but it's not really rsnapshot's problem. Conceivably rsync
could be extended to find a link target anywhere within the
filesystem, but that would be prohibitively expensive, because it
would have to traverse the whole tree.

I think it would be an interesting extension to rsync to record the
meta-info of all deleted files, to check before creating a file. It
would require completion of a complete meta-info pass first, which
might be incompatible with how it currently operates, though it might
make sense in -H mode (presumably it already has to check in that case
in case a new file is a hardlink to an existing file?).

-scott


Post archive inflation from file moves. 
On 11/17/2011 04:29 PM, Scott Hess wrote:
On Thu, Nov 17, 2011 at 1:13 PM, Scott Hess<scott+rsnapshot < at > doubleu.com> wrote:
On Thu, Nov 17, 2011 at 2:47 AM, Michael Lush<mjlush < at > gmail.com> wrote:
As I understand it moving a data around a file system registers to rsnapshot
as new files being created/deleted.

For example I have two disks /home and /snapshot

I download a 5Gb movie to /home/michael/tmp/Macho_Women_with_Guns_2.mp4
rsnapshot runs creating

/snapshot/daily.0/home/michael/tmp/Macho_Women_with_Guns_2.mp4

Next day I move it to /home/archive/movies/Macho_Women_with_Guns_2.mp4

/snapshot/daily.0/home/archive/movies/Macho_Women_with_Guns_2.mp4
/snapshot/daily.1/home/michael/tmp/Macho_Women_with_Guns_2.mp4

So /snapshot will contain 10Gb until the tmp file falls off the end of the
snapshot list.

I guess the fix is simply "don't do it" but I was wondering if there was a
solution
In the past when I've shifted a very large directory from one place to
another, instead of moving it I did cp -al. This hardlinks the files,
and the snapshot should extend those hardlinks to the new location (I
can't recall if this requires a special rsync flag or not). The next
day after a few backups had run, I circled back and removed the
originals. It works alright as a one-shot thing.
Addendum: You need -H in your rsync flags to get this kind of
hard-link behavior, I think.

-scott

Another option that I've used in the past is to apply the same move or
rename on the backup server. In your case I would have executed
the following command on the backup server before the next backup run:
mv /snapshot/daily.0/home/michael/tmp/Macho_Women_with_Guns_2.mp4 \
   /snapshot/daily.0/home/archive/movies/

This is, of course, of limited value in a large, multi-user
environment. It only works when I'm doing some housekeeping as the
system admin. But my users don't normally warn me ahead of time when
they rename or move directories around... :-)
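
The manual move on the backup server could be wrapped in a small helper so that the target directory is created when the snapshot does not have it yet. The function name `mirror_move` is hypothetical, and the sketch assumes the snapshot layout from this thread, where the full source path is reproduced under the snapshot root.

```sh
# mirror_move SNAPSHOT_ROOT OLD_PATH NEW_PATH
# Repeat inside a snapshot the move that was made on the production
# filesystem. Run it before the next backup.
mirror_move() {
    snap_root=$1
    src="$snap_root$2"
    dst="$snap_root$3"
    if [ ! -e "$src" ]; then
        echo "not in snapshot: $src" >&2
        return 1
    fi
    mkdir -p "$(dirname "$dst")"
    mv -- "$src" "$dst"
}
```

For the example in this thread the call would be `mirror_move /snapshot/daily.0 /home/michael/tmp/Macho_Women_with_Guns_2.mp4 /home/archive/movies/Macho_Women_with_Guns_2.mp4`. A rename within one filesystem keeps the inode, so the file should stay shared with the older snapshots, and the next run should find it already in place at the new path.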

Steve


Post archive inflation from file moves. 
Hello, Michael,

You wrote on 17.11.11:

Where is the backup problem?
A backup has to "mirror" an actual state. If you change the position
of a file or directory on the original system then the backup has to
follow this change - nothing else.


rsnapshot is not just mirroring the actual state of a file system, it
is also storing previous states, normally it does
this efficiently.

The previous states are renamed older backups.

However, if I have two 500 GB drives (one production, one snapshot) and
I move 300 GB of data from one directory into another on production, the
snapshot drive will simply fill up and stop mirroring the actual state
(until human intervention, or until the old version falls off the end of
the snapshot list), which I think is a problem.

???
It mirrors the new "actual state".

If a program runs out of space: that's another problem.

Best regards!
Helmut


Post archive inflation from file moves. 
While it doesn't analyze metadata, "rsync --fuzzy" could be helpful:
it tries to find a file that "looks like" the source.
But as far as I remember, it can only detect files renamed within the
same directory, not files moved to another directory. It's also pretty
slow, from what I remember.
Lionel.

On 17/11/2011 at 22:39, Scott Hess <scott+rsnapshot < at > doubleu.com> wrote:
I think it would be an interesting extension to rsync to record the
meta-info of all deleted files, to check before creating a file. It
would require completion of a complete meta-info pass first, which
might be incompatible with how it currently operates, though it might
make sense in -H mode (presumably it already has to check in that case
in case a new file is a hardlink to an existing file?).


