<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom">
	<channel>
		<title>Two different types of de-duplication</title>
		<description>Discuss Two different types of de-duplication</description>
		<link>http://www.backupcentral.com/mr-backup-blog-mainmenu-47/13-mr-backup-blog/129-de-dupe-types.html</link>
		<lastBuildDate>Fri, 10 Feb 2012 20:56:55 +0000</lastBuildDate>
		<generator>JComments</generator>
		<atom:link href="http://www.backupcentral.com/component/jcomments/feed/com_content/129/10.html" rel="self" type="application/rss+xml" />
		<item>
			<title>Richard Lewis says:</title>
			<link>http://www.backupcentral.com/mr-backup-blog-mainmenu-47/13-mr-backup-blog/129-de-dupe-types.html#comment-987</link>
			<description><![CDATA[When you list the disadvantages of target de-duplication and say 'Considered a "band-aid" by some to help backup software that was designed to use disk', do you mean "...designed to use tape"? I.e., legacy apps such as TPM? Or have I misunderstood something? :-)]]></description>
			<dc:creator>Richard Lewis</dc:creator>
			<pubDate>Thu, 02 Sep 2010 15:08:09 +0000</pubDate>
			<guid>http://www.backupcentral.com/mr-backup-blog-mainmenu-47/13-mr-backup-blog/129-de-dupe-types.html#comment-987</guid>
		</item>
		<item>
			<title>I didn't miss the point</title>
			<link>http://www.backupcentral.com/mr-backup-blog-mainmenu-47/13-mr-backup-blog/129-de-dupe-types.html#comment-403</link>
			<description><![CDATA[I said "Source de-duplication requires you to use different backup software on the client(s) where you want to use it." Also, you cannot completely dismiss Symantec like that. I realize you're a competitor and you need to position against them, but they are absolutely NOT "worthless." I know of several very large installations that are very happy. Also, how you slam a very successful product with a beta product is beyond me. I fixed the link to your blog.]]></description>
			<dc:creator>W. Curtis Preston</dc:creator>
			<pubDate>Thu, 13 Nov 2008 15:28:34 +0000</pubDate>
			<guid>http://www.backupcentral.com/mr-backup-blog-mainmenu-47/13-mr-backup-blog/129-de-dupe-types.html#comment-403</guid>
		</item>
		<item>
			<title>Source Vs Target</title>
			<link>http://www.backupcentral.com/mr-backup-blog-mainmenu-47/13-mr-backup-blog/129-de-dupe-types.html#comment-401</link>
			<description><![CDATA[Curtis, you missed a point. Target/inline de-duplication doesn't need a custom/modified backup agent, i.e. the backup agent sends the incremental data without bothering about the duplicates. This saves storage but not bandwidth. Source de-duplication has a modified agent and sends hashes before sending the data. This saves time, bandwidth and storage. And BTW, the simple checksum matching technique used in Puredisk is of no good use. A simple byte insertion can shift all the blocks. An old post from my blog - http://blog.druvaa.com/2008/06/15/data-de-duplication/]]></description>
			<dc:creator>Jaspreet</dc:creator>
			<pubDate>Thu, 13 Nov 2008 15:21:51 +0000</pubDate>
			<guid>http://www.backupcentral.com/mr-backup-blog-mainmenu-47/13-mr-backup-blog/129-de-dupe-types.html#comment-401</guid>
		</item>
		<item>
			<title>You got it!</title>
			<link>http://www.backupcentral.com/mr-backup-blog-mainmenu-47/13-mr-backup-blog/129-de-dupe-types.html#comment-108</link>
			<description><![CDATA[I did say that none of the features that people typically debate matter (inline vs post process, MD5 vs SHA-1 vs custom, reverse vs forward referencing, etc). What matters is: 1. How big is it? (i.e. how much disk do you give me and what de-dupe ratio do I get with my data?) 2. How fast is it? (i.e. how fast are backups, restores, and the overall de-dupe process?) 3. How much does it cost? All of your arguments against post-process above are aimed at #2. My opinion is that neither in-line nor post-process de-dupe systems can claim any kind of victory. Both have advantages and disadvantages that have to be tested out with your data and your servers. Then, when all that testing is done, you get to compare how big, fast, and expensive the systems are. THAT's all that matters. Having said that, I'd like to comment on some of your statements, as I think they represent common misunderstandings about the process. Instead of doing it here, I'm going to do it in another blog post.]]></description>
			<dc:creator>cpreston</dc:creator>
			<pubDate>Fri, 24 Aug 2007 14:51:30 +0000</pubDate>
			<guid>http://www.backupcentral.com/mr-backup-blog-mainmenu-47/13-mr-backup-blog/129-de-dupe-types.html#comment-108</guid>
		</item>
		<item>
			<title>inline or (later) batch target</title>
			<link>http://www.backupcentral.com/mr-backup-blog-mainmenu-47/13-mr-backup-blog/129-de-dupe-types.html#comment-107</link>
			<description><![CDATA[In a recent podcast (sorry, I forget which one), Curtis said "inline or delayed de-dupe doesn't matter when selecting a backup appliance" (my words and estimation of Curtis' intent). Although he went on at length to discuss some effects, I thought he left out a few important considerations. First, some or all delayed de-dupe systems seem to require scheduling of a de-dupe batch process where the data or data location may not have full functionality during the de-dupe process. Second, the de-dupe process is extremely resource (especially CPU) hungry. While it runs, the appliance may not perform its other functions reasonably well. Of course this will change for the better over time. Inline de-dupe appliances are built to withstand high resource consumption during backup ... there really are no significant back-end processes. Third, if the de-duped data is sent "off site", such as with a proprietary system sending the de-dupe information to a similar box, ... the movement off site is necessarily delayed until de-dupe can be accomplished. No such delay is required with inline de-dupe. Fourth, if de-dupe is done "inline", the process of getting data from its source to off site is simpler. Simple is good. For these reasons, I see the vendors with "inline" de-dupe as having a significant advantage ... one that shouldn't be waved off as unimportant. Of course, it's possible that I didn't realize my hearing aid batteries needed changing at the time I listened to Curtis' podcast. :wink: ... and my generalizations may well be worse than I think Curtis' was. cheers, wayne]]></description>
			<dc:creator>wts</dc:creator>
			<pubDate>Fri, 24 Aug 2007 10:17:07 +0000</pubDate>
			<guid>http://www.backupcentral.com/mr-backup-blog-mainmenu-47/13-mr-backup-blog/129-de-dupe-types.html#comment-107</guid>
		</item>
		<item>
			<title>Nice try.</title>
			<link>http://www.backupcentral.com/mr-backup-blog-mainmenu-47/13-mr-backup-blog/129-de-dupe-types.html#comment-85</link>
			<description><![CDATA[As far as I can tell, Puredisk will not be built into the base product. It'll be an option just like it is now. It's just that it will be more integrated with the base product. Nice try, though. :wink:]]></description>
			<dc:creator>cpreston</dc:creator>
			<pubDate>Thu, 02 Aug 2007 10:15:07 +0000</pubDate>
			<guid>http://www.backupcentral.com/mr-backup-blog-mainmenu-47/13-mr-backup-blog/129-de-dupe-types.html#comment-85</guid>
		</item>
		<item>
			<title>Cost Effective?</title>
			<link>http://www.backupcentral.com/mr-backup-blog-mainmenu-47/13-mr-backup-blog/129-de-dupe-types.html#comment-84</link>
			<description><![CDATA[:-?: So, which will be the most cost-effective route to take for data de-dupe: using NBU 6.0 and purchasing the Pure Disk option, or upgrading to NBU 6.5 with Pure Disk already built in? I have been looking at options other than NBU for de-dupe, and there are some good ones and bad ones in comparison to Pure Disk.]]></description>
			<dc:creator>tsufan</dc:creator>
			<pubDate>Thu, 02 Aug 2007 09:23:46 +0000</pubDate>
			<guid>http://www.backupcentral.com/mr-backup-blog-mainmenu-47/13-mr-backup-blog/129-de-dupe-types.html#comment-84</guid>
		</item>
		<item>
			<title>It goes both ways</title>
			<link>http://www.backupcentral.com/mr-backup-blog-mainmenu-47/13-mr-backup-blog/129-de-dupe-types.html#comment-81</link>
			<description><![CDATA[I just re-read the article, and I can see why you think I'm saying that de-dupe is file-level, but that's not what I'm saying. I just gave a duplicated file as an example. I'll re-edit the blog entry and give another example. BTW, there is file-level de-dupe and sub-file-level de-dupe. File-level de-dupe is also called CAS, or content-addressable storage, and yes -- they do the hash at the file level. But what we're talking about here is sub-file-level de-dupe. This catches not only duplicated files, but also duplicated pieces of files (i.e. blocks) that have already been seen. So when you back up a spreadsheet every day because it gets updated every day, each day you should back up only the new blocks in that spreadsheet.]]></description>
			<dc:creator>cpreston</dc:creator>
			<pubDate>Tue, 31 Jul 2007 17:00:43 +0000</pubDate>
			<guid>http://www.backupcentral.com/mr-backup-blog-mainmenu-47/13-mr-backup-blog/129-de-dupe-types.html#comment-81</guid>
		</item>
		<item>
			<title>File level?</title>
			<link>http://www.backupcentral.com/mr-backup-blog-mainmenu-47/13-mr-backup-blog/129-de-dupe-types.html#comment-80</link>
			<description><![CDATA[You seem to suggest, rather explicitly, that dedupes are functioning on the file level. Not only am I pretty sure that's not true (including not of Puredisk, at least, that's not what they said at the EDPF in Minnesota in 2005), but I think it'd be pretty boneheaded to hash files rather than blocks. Was that just for ease of understanding for the reader?]]></description>
			<dc:creator>grammar</dc:creator>
			<pubDate>Tue, 31 Jul 2007 16:18:49 +0000</pubDate>
			<guid>http://www.backupcentral.com/mr-backup-blog-mainmenu-47/13-mr-backup-blog/129-de-dupe-types.html#comment-80</guid>
		</item>
		<item>
			<title>You got it</title>
			<link>http://www.backupcentral.com/mr-backup-blog-mainmenu-47/13-mr-backup-blog/129-de-dupe-types.html#comment-79</link>
			<description><![CDATA[You can already buy the Puredisk product now, and yes, it is source de-dupe. NBU 6.5 will offer greater integration between the two products. _I_ talk about remote restores! Check out my next blog entry: "De-duplication & remote restores."]]></description>
			<dc:creator>cpreston</dc:creator>
			<pubDate>Tue, 31 Jul 2007 11:01:37 +0000</pubDate>
			<guid>http://www.backupcentral.com/mr-backup-blog-mainmenu-47/13-mr-backup-blog/129-de-dupe-types.html#comment-79</guid>
		</item>
	</channel>
</rss>

