The opinions contained within this website, its blog(s), forums, and Wikis, are those of the original poster and do not represent the position of my (or any other) employer.
I think that the concept that deduplication vendors "rehydrate" data is completely wrong and needs to be abandoned. This "rehydration" or "reduping" of data is blamed for the read penalty that some dedupe vendors have, but I think the concept is completely incorrect. Click Read More to see what I'm talking about.
Anyone who attended SNW this year knows that it isn't the show it used to be. At the same hotel as usual, there were plenty of seats available at the bar, plenty at the Starbucks, and some very conspicuous companies were absent from the entire ordeal. The BD Event, however, was very different. Click Read More to see more.
When NetApp made a $1.5B bid for Data Domain, I was shocked. When EMC announced a $1.8B bid to take the deal away from them, I almost hit the floor. Click Read More to see more.
In a surprise reverse of their longstanding "inline is the ONLY way to go and anyone that does post processing is stupid" position, Data Domain today announced that Post Process is A-OK with them. Or at least that's how I'm reading the acquisition of Data Domain for $1.5B by NetApp. Click Read More to see more.
This post has a little to do with backup and everything to do with Star Trek & SCUBA. Feel free to ignore or read on. I attended a special screening of Star Trek XI on Catalina Island this weekend. In attendance were... well. Click Read More to see!
While beginning work on a series of articles entitled "NetBackup Best Practices," I was reminded of the arguments I've had with customers and coworkers about such things. Some argue that there is no such thing as a best practice. Others argue that they don't apply to them. Click Read More to see what I think.
I just read Tony Asaro's blog entry entitled "EMC Anti-Social Media Gang." He says that the EMC trio of Chuck Hollis, Barry Burke, and Mark Twomey has targeted him with personal attacks using terms like "industry streetwalker" and "pimping his services." I read also with interest Chuck Hollis' reply to the post, and Tony's response to, well, you know how that works. I have some thoughts on this subject. Click Read More to see them.
Are you interested in purchasing a target deduplication system (NAS or VTL)? Do you want to perform full testing before you buy, but have been unable to do so due to lack of manpower? Did you get a quote from a consulting company on what it would cost for them to do the testing, and almost die of shock? What if I and my cohorts could do the test for free? Would that be interesting to you? Click Read More for details.
I updated the "Disk Targets" directory in the Backup Central Wiki. Check it out if you want to see who does what now. I've updated the "global dedupe" column and added a "deduped replication" column.
Some of the vendors I'm dealing with in the blogosphere are having a difficult time with my frankness. They think I'm picking on them. I've got news for them: it's not just you. Click Read More to see what I'm talking about.
In a blog post while I was down under, Storagezilla said that EMC doesn't do free, in response to comments from Frank Slootman that they were giving away dedupe to get footprint. Storagezilla basically says that this is untrue. Click Read More to see what I think.
I have just returned from an 18-day part-work, part-vacation trip to wonderful Australia. That's why you didn't see any entries in that time! Click Read More to see more.
As you’re probably aware, my blog post “The real deal on the 3D4000” drew some rather harsh criticism from Mark Twomey of EMC. He didn't give a title, but he said that he is the, "owner of every DL in EMEA marked engineering sample since product introduction. Setter upper of systems from the cardboard box to production." The only thing I got when I asked EMC was that he was "in the sales organization." Apparently I'm the last person to figure this out, but Mark Twomey is Storagezilla. I have read his blog before, but did not equate the two together. His blog says that he is the "Information Protection Subject Matter Expert for Ireland."
Mark and I have never spoken or met before, but he claims to be “the voice of authority,” and he made several statements that can be summarized as “Preston has no idea what he’s talking about.” Since my credibility is the only thing I have going for me, I felt it was important to make a second post that proves that what I said was true. Click Read More to see the proof.
My previous post on dedupe performance illustrated the impact that global dedupe has on the effective performance of different dedupe appliances. I received a lot of comments from vendors that didn’t have global dedupe saying one of two things. One thing they would say is that the vendors that claimed to have global dedupe didn’t really have it. I know too much to believe that. The other thing they’d say is that global dedupe wasn’t as important as I was making it out to be. Well, that’s definitely not the case, and that’s what this post is all about. Click Read More to see why I think global dedupe is critical for larger environments.
In his blog, Scott Waterhouse (from EMC), asked the question “Why did we build a DL 4000 3D?” He rightly notes that I have been a vocal critic, but he thinks I’m “dead wrong” on my thoughts about the 3D 4000. He’s saying “nothing personal,” and I’ll say that this post is the same thing – nothing personal. He said his piece, now I’ll say mine. This is another long post, but I think it's an important one. Click Read More to see what my problems are with the EMC 3D 4000. (Update: After you've read this post, make sure to read this follow-on post.)
[This article was slightly updated May 7, 2009. New comments are in brackets.]
This blog entry is, to my knowledge, the first article or blog entry to compare the performance numbers of various [target] dedupe vendors side by side. I decided to do this comparison while writing my response to Scott Waterhouse's post about how wonderful the 3DL 4000 is, but then I realized that this part was enough that it should be in a separate post. Click Read More to see a table that compares backup and dedupe performance of the various dedupe products.
As I'm generally a fan of cloud backup services for the home user (and for some corporate users), it is sad that stories are reporting that both HP Upline and Yahoo! Briefcase are shutting down at the end of this month. Click Read More to know more.
Thirteen years ago, two companies accomplished the impossible and created NDMP. It's become such a standard way to back up NAS that you may have forgotten just how revolutionary it was when it came out. I'm going to remind you of its history and say that history needs to repeat itself with dedupe & virtualization. Click Read More to see what in the world I'm talking about.
I just finished the St. Louis and San Antonio dedupe schools, and had a great time. I actually ate at 3, count them, THREE restaurants covered on The Food Channel's "Diners, Drive-ins, and Dives." Wanna hear what I thought of them? Click Read More.
IBM announced TSM 6.1 a few days ago, and it's supposed to be GA in March. The two features of DB2 and dedupe have been long awaited. What do I think about them? Click Read More to see more.
If you're not a UNIX geek, you won't care, but a milestone just happened, where UNIX time was 1234567890. You could have counted down with me, but it's over now. If you have no idea what I'm talking about, click Read More to be enlightened.
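For anyone who wants to check the milestone, here is a minimal Python sketch that converts the UNIX timestamp 1234567890 into a calendar date. The constant and function names are made up for illustration; the only fact relied on is that UNIX time counts seconds since 1970-01-01 00:00:00 UTC.

```python
from datetime import datetime, timezone

# The milestone value mentioned in the post: seconds since the UNIX epoch
# (1970-01-01 00:00:00 UTC). The name MILESTONE is purely illustrative.
MILESTONE = 1234567890


def unix_to_utc(seconds: int) -> datetime:
    """Convert a UNIX timestamp to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


print(unix_to_utc(MILESTONE).isoformat())  # 2009-02-13T23:31:30+00:00
```

Your local wall-clock time at that moment depends on your time zone, which is why the sketch pins the conversion to UTC.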
A Tech Target, that is. You may have already received an email from Tech Target that started out like this:
We're pleased to announce that W. Curtis Preston -- unquestionably the leading independent expert on the topic of backup and recovery -- has joined TechTarget in the role of Executive Editor. Curtis will continue to pursue his career as an independent consultant, while expanding his already prolific writing and speaking efforts with TechTarget.
For more about how this will (or will not) impact Backup Central, as well as the business idea I'm working on, click Read More.
I recently had my first (and probably my last) experience with godaddy services. The interaction left me with a taste nasty enough that I thought I'd blog about it.
After reading yet another story of a vendor in our space trying to stack the deck, I am reminded of my opinion that user-written reviews must be looked at with a very large grain of salt. Click Read More to find out who the guilty party is this time.
Tech Target has announced the cities and dates of all my speaking events this year, and we've finally worked out the details of our new (post-GH) relationship. So I thought I'd tell you when/where I'll be speaking this year. Click Read More to see the schedule!
For me there have always been two storage events: Storage Decisions & Storage Networking World. There are others, but these are the two I have historically missed only if forced to. Now I'm switching one of them out for a show I'll bet you've never heard of and I'll tell you why in this blog entry. Click Read More for more.
Bacula Systems is now the commercial arm for the popular open-source backup product Bacula. "It roams the datacenter at night and sucks the vital essence from your computers." I'm not making that up. This is the second open-source backup product to do such, with Zmanda preceding them. Click Read More to see what I think about this.
I've now seen two vendors saying things to the effect that "once you get 10:1, dedupe ratio doesn't matter." They state that 10:1 saves 90% of disk, and 20:1 saves 95% of disk, so the difference is only 5% -- so why is everyone so concerned about dedupe ratio? To this 90%/95% comment I say, "balderdash!" Click Read More to see more.
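One way to see why the "only 5%" framing can mislead is to look at the disk you actually have to buy rather than the percentage saved. The sketch below is plain arithmetic under an assumed 100 TB of backup data; the function names and the 100 TB figure are hypothetical and do not come from any vendor.

```python
def disk_needed_tb(logical_tb: float, ratio: float) -> float:
    """Physical disk required to hold logical_tb of backups at a given dedupe ratio."""
    return logical_tb / ratio


def percent_saved(ratio: float) -> float:
    """Percentage of disk saved compared with storing the data un-deduped."""
    return (ratio - 1) / ratio * 100


LOGICAL_TB = 100.0  # assumed amount of backup data, for illustration only

for ratio in (10, 20):
    print(
        f"{ratio}:1 saves {percent_saved(ratio):.0f}% "
        f"and needs {disk_needed_tb(LOGICAL_TB, ratio):.1f} TB of disk"
    )

# Compare the disk purchased at each ratio, not the percentages saved.
shrink = disk_needed_tb(LOGICAL_TB, 10) / disk_needed_tb(LOGICAL_TB, 20)
print(f"Going from 10:1 to 20:1 cuts the disk you buy by a factor of {shrink:.1f}")
```

Under these assumptions the savings percentages differ by five points, but the 20:1 system needs half the physical disk of the 10:1 system.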
This is what happens when you lose thousands of people's data in today's world. Someone makes a hate video about you and posts it on Youtube. Click Read More.
It took all of a few seconds for an entire company and its 16,000 bloggers to disappear. Now they're gone forever. Click Read More to see more, including an interesting note about one of the affected journalspace bloggers, who is actually quite famous among bloggers.
I'm building a very small focus group consisting of IT managers from "end-user" companies (i.e. not IT vendors). I'd like to see what you (the eventual company's target market) think about the idea before I start spending money to make it happen.
This will require a very small time commitment. A few emails, a few phone calls, and no travel -- at least not for now. If the idea gets funded and off the ground, things may change (if you want to continue the relationship). But for right now, I just need a few smart people who see things from the end-user point of view.
If you're interested, please email me.
Looks like at least one other GlassHouse employee was as surprised by my departure as I was. I had a good laugh when something arrived at my front door just now. (Click Read More to see more.)
I have been let go from GlassHouse Technologies, effective immediately, and am therefore free to pursue other opportunities. Click Read More to find out more.
I'm sitting here in Omaha, NE, waiting for my flight back to warmer weather in San Diego. Something just happened that just touched my heart. In these days of cynicism, it's so nice to see people do the Right Thing(tm).
Click Read More to find out what they did.
I went to Fry's today to buy something for the office, and was just about to pull out my personal MasterCard when I saw the AMEX logo! Click Read More.
After sitting through four vendor presentations at Storage Decisions yesterday, I felt the urge to blog about it. They ranged from mediocre to incredibly bad. I'm certainly not the best presenter in the world, but I do all right, and I do think I've got a few tips to pass on. So here I go. (Click Read More.)
I remember when I first started talking to Quantum about dedupe and they were trying to call their “immediate” deduplication “inline” because it’s happening at the same time as backup. They eventually stopped referring to it as inline, as it does not meet the definition of inline dedupe that was around long before they came out with their product. Unfortunately, now that EMC is selling their Quantum-based product, they’re apparently trying to do the same thing – or at least one of their bloggers is. As usual, I’m drawing a very thick line between inline and post-process. Click Read More to see why.
I just posted a comment in Scott Waterhouse’s The Backup Blog that I didn’t agree that NetApp was the last major vendor to come out with dedupe. Since that seems opposite to what seems to have happened, click Read More to see why I believe this, and why I think this is important.
Maybe you were like me when you first read about NetBackup's Open Storage Option (OST): you were underwhelmed. You also may have been waiting for vendors to jump on it. (It was announced two years ago.) Things have definitely changed since then. In fact, at a recent very large customer that was considering purchasing a dedupe target, they chose only to look at vendors that supported NetBackup's OST. Click Read More to see why this option is so important for NetBackup customers, and why other backup software vendors better come up with something like it real-quick-like.
Most of my recent blog activity has been spent commenting on other people's blogs. Specifically, vendor blogs. It takes some serious chutzpah, and you have to be very respectful (even if you don't feel like doing so), but it can be very rewarding as well. This blog entry talks about some of the blogs I've been commenting on and why.
NetApp has had an interesting line of data protection products for a long time, including snapmirror, snapvault, snapmanager, and open systems snapvault. What they didn't have was a centralized place to configure, manage, and report on all those things. That all changed with the release of Protection Manager several months ago. Click Read More to learn more.
The second installment of a Byte and Switch four-part series is out, and it's full of the same untrue statements found in the first installment. I will say the same thing I said in a comment I made on the first installment: "Is the author completely unaware [of the real facts] … Or is the author purposefully withholding information…?" Click Read More to see both sides of the story that he is only telling one side of.
I'm reversing a long-standing position of mine, and people who know me know I don't do that sort of thing very often. (My wife would tell you I never do.) Typical backup software installs an agent on each system to be protected. Agentless backup gets the job done without installing agents, choosing instead to log in to each server each time it does a backup. I've never liked this for security reasons, but someone has finally described to me an agentless setup that is at least as secure as the typical agent-based approach, if not more so. Click Read More to find out which one.
As I started working on making sure all my information was up to date on all the dedupe vendors, I thought about you! What have you always wanted to ask the dedupe vendors? Click read more to see what I'm talking about.
I occasionally hear TSM customers and sales reps tell me that TSM's tape format is so proprietary that even a TSM server can't read it if it doesn't have the database for it. In other words, some people believe that TSM tapes don't need to be encrypted because if someone got ahold of them, they couldn't read them without the TSM database. This is such a common belief that I have a TSM field manual from 2005 that says "There is no way to restore TSM backups (except for client backup sets) without the database." I would say that sentence would be correct if you added the phrase "in TSM" right after the phrase "TSM backups." I know of four different ways to read TSM tapes without using TSM at all. Click read more to see what they are.
What do you want to know about your favorite dedupe vendor? What do you think you know that you don't? What do you have no idea about? Which vendors' claims are exaggerated?
Occasionally people ask me if those who have regulations requiring the immutability and non-repudiation of certain types of data should be concerned about data deduplication. I've also seen a few blog entries and articles like this one asking the same question. Does dedupe change the data? Can you use deduped storage if you have immutability and/or non-repudiation requirements? Click Read More to see what I think.
With the release of NetBackup 6.5.2, Symantec has created a new watershed event: they have released (to my knowledge) the first mainstream backup features that require disk to use them. Click Read More to learn more.
Symantec turned off LIST access to their FTP server, forcing customers to go through their website to get patches, documentation, etc. Click Read More to see what I think of that.
I received an email today telling me about a whitepaper sponsored by the LTO group and written by The Clipper Group. I'm used to seeing such whitepapers, and used to seeing them state things in such a way that makes the point the sponsor of the paper is trying to make. I've even written a number of these whitepapers myself. But this one just takes the cake, and I'd like to tell you why.
This has nothing to do with backups, but I just finished watching the first episode of Carrier on PBS and I'm feeling proud... Click Read More to find out why.
I keep running into a particular problem at customers and I'm curious if any backup software products have addressed it. Do any backup products load-balance their use of tape drives across multiple Fibre Channel ports? Click Read more to see more details.
A CNET review of Mozy's online backup software, entitled "Everyone likes Mozy, Except Me." makes one or two good points, but IMHO misses the boat and makes no sense to me, a backup person. If you want to know what he said and why I disagree with it, read on.
If you're encrypting the data on your hard drive using OS-level software encryption (e.g. Windows EFS, Vista BitLocker, MacOS FileVault, Linux DM-Crypt, or TrueCrypt), then a research study at Princeton University, partially funded by the Department of Homeland Security, has shown how to read that data without your password. Well, that's just great.
I kept reading stories like this one that said that Quantum's dedupe is inline. Then I would hear from those "in the know" that said it was post-process. Different people at Quantum would say different things. Some would say that they run the dedupe at the same time as the ingest, so they considered it inline, although data is hitting disk before it's deduped. They say since it only hits disk for a few seconds, it's really inline. I said, "No it's not." So what's the scoop? Read on to see.
Odd title, you say? It was inspired by a Cybernetics whitepaper that I read this morning entitled "The Risks of a Disk-Only Backup Strategy." You can read the whitepaper yourself by following this link. While I actually have a fond place in my heart for Cybernetics, I think this paper isn't worth the paper I didn't print it on. Click "Read More" to see why.
Some of you may read this title and say, "No duh!" but I think it's worth talking about. From email to pictures to storage, if you're storing your data on somebody else's servers, it probably isn't getting backed up. Let's talk about that.
Tech Target and GlassHouse have agreed on the cities and dates for 2008, so I thought I'd share them with you as soon as possible. Maybe I'll be speaking at a city near you!
I've received a lot of questions about how the different tape drives are doing encryption, so I thought I'd get some details and get back to you. This is the first of a few posts on this topic, and it concentrates on Sun's T10000 tape drive. There will also be a post on IBM's TS1120 and their LTO-4 drive.
Did you know that over 8 billion dollars in gift cards will go unused this year? They get lost; you forgot you had them; you don't have them when you need them; you get the idea. Don't let that happen to you by backing up your gift cards.
After hanging up on the 50th phone call where I recommend personal computer stuff to friends and family, I figured I'd do the same for my backupcentral friends. This blog entry will highlight what I think you should do at home to ensure a nice smooth computing experience -- and yes -- to make sure it's all backed up. I'll also mention a few things I don't think you should do.
According to a recent post on the NetWorker discussion list, EMC messed up a long standing default behavior of NetWorker. I didn't believe it, and it turns out I was right. ;)
Customers and audience members often ask about the maturity of deduplication. Is it mature? Are all products mature? Should you buy it now or wait? I thought that would make a nice blog entry.
Back in August, I posted a blog entry that 10 GbE is a lie, as no one who I had talked to had ever gotten more than 400 MB/s (3200 Mb/s), and I asked anyone who had done better to write. Well, they wrote.
A lot of FUD has been passed around lately about the probability of a hash collision when using a hash-only de-duplication system. What is a hash-only de-dupe system? What is the real probability of a hash collision in such a system? Read on to find out.
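To put a rough number on the question, here is a minimal sketch of the standard birthday-bound approximation for the chance that any two distinct chunks share a hash. Everything specific is an assumption made for illustration and does not describe any particular product: a 160-bit hash (the size of SHA-1), 8 KiB chunks, 1 PiB of unique data, and hash outputs that behave as if uniformly random.

```python
import math


def collision_probability(num_chunks: int, hash_bits: int) -> float:
    """Birthday-bound approximation: p ~= 1 - exp(-n(n-1) / 2^(bits+1)).

    Assumes hash outputs are uniformly distributed over 2^hash_bits values.
    """
    exponent = -num_chunks * (num_chunks - 1) / 2 ** (hash_bits + 1)
    # expm1 keeps precision when the probability is extremely small.
    return -math.expm1(exponent)


# Hypothetical sizing, for illustration only.
UNIQUE_DATA_BYTES = 2 ** 50   # 1 PiB of unique (already deduped) data
CHUNK_BYTES = 8 * 1024        # assumed average chunk size of 8 KiB
HASH_BITS = 160               # assumed hash width (the size of SHA-1)

chunks = UNIQUE_DATA_BYTES // CHUNK_BYTES
p = collision_probability(chunks, HASH_BITS)
print(f"{chunks} chunks, {HASH_BITS}-bit hash: collision probability ~ {p:.2e}")
```

The result depends heavily on the assumed chunk count and hash width, so plug in the numbers for the system you are actually evaluating.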
According to a story on Linux.com, Symantec's lawyers sent Michael D. Setzer II, the leader of the open-source project formerly known as Ghost for Linux (G4L), an email "requesting" that he change the name of the project (which has been called simply "G4L" for some time) to something else. Although the only references to Ghost or Symantec on his website were statements that it had no relationship to Symantec or its product, that apparently wasn't good enough for Symantec. What do I think? Read on.
I did this podcast a few months ago, but it's now available on searchstorage.com. It's always interesting to hear what I said a few months ago to see if I still agree with it.
Data Domain has just announced that it is entering the nearline market with its latest OS release optimized for storing smaller files. What does this mean for the big CAS players on the block?
When speaking about de-dupe to general audiences, a very common question is, "Can TSM customers benefit from de-dupe?" The short answer is yes, but not to the same degree as customers of other backup products.
SysAdmin magazine was the first magazine to publish an article of mine, and they have closed their doors. (They've even notified their subscribers that they can get a refund on their 5-year subscriptions.)
In-line and post-process de-duplication are features -- not benefits. And I think that arguing about which one is better is like trying to argue which is better: synchronous or asynchronous replication. They both have benefits and drawbacks. What matters is whether or not the one you buy meets your requirements, right? I'll try my best to present both sides of the argument, and dispel a lot of what I believe to be misconceptions about this issue.
I know it may sound like an obvious statement, but I think all your backups should fit in one night. Not just 24 hours, but 12 hours or less. Besides the obvious RTO/RPO problems, it just messes up so many other parts of the design when backups go over 24 hours. A related issue is that I think it's a bad idea to push all your full backups to the weekend.
Live Free or Die Hard is a good movie and a lot of fun, but I don't think I'll design my backups around it. (Spoiler alert: Don't click Read more if you haven't seen the movie.)
I was surprised to learn that a lot of people responded to a 10-07 de-duplication survey to say that they didn't even know what de-dupe was. So I thought I'd write some blog entries on that. This first one is a basic primer on the subject.
Will these guys stop at nothing? Google's acquisition of Postini for $625 million sets it up to provide outsourced email and archiving services for large corporations. Things that make you go hmm....
I'm sitting here in my room at the London Heathrow Hilton awaiting a flight back to the US after spending two weeks in London and surrounding cities. (I was visiting GlassHouse's UK offices and clients, and giving the first London Backup School.) One night that I will never forget was being only a few blocks from one of the two unsuccessful car bombs that attempted to kill hundreds of people in central London.
If you query the ADSM-L archives for "multi-session" and "collocation," you'll see the latter is definitely discussed more than the former -- way more. Which is actually better for improving TSM restore speeds? This blog entry attempts to discuss the pros and cons of both, as well as what you need to do to activate either one.
I'm sitting here at the "Blogger's lounge" at Symantec Vision in Las Vegas, and I just got back from a session on Puredisk, where two customers talked about how they're using it. Turns out it actually works!