Last week I wrote a blog post about how I noticed that none of the post-process dedupe vendors had published their dedupe speeds. They publish only their ingest speeds. I’m happy to report that my bully pulpit worked and three of them have agreed to publish both speeds, and a fourth is considering it.
First, let me say that my first post was slightly incorrect, at least with regard to FalconStor. FalconStor's dedupe rate was available on their website in a white paper that you can download without registering. In their minds that meant they had published the number and that my claim about them was incorrect, but I still felt it was a little buried (I never found it until they showed it to me). FalconStor agreed that it could be displayed more prominently, and has now published their dedupe speed of 500 MB/s per node right on their VTL product page. (That's 4000 MB/s of dedupe speed in a single system, since they support up to eight nodes in a global dedupe system.) FalconStor says that this dedupe speed is supported during ingest.
SEPATON also agreed that they needed to publish their numbers and has done so on their DeltaStor Data Deduplication product page. They're publishing a 25 TB/day dedupe speed. (That's 289 MB/s of dedupe speed per node, for a total of 2312 MB/s of dedupe in a single system, since they support up to eight nodes in a global dedupe system.) SEPATON says that this dedupe speed is supported during ingest.
ExaGrid also agreed to publish their dedupe rates, but they are in the middle of a major website redesign and didn't want to make any updates to the current site. So Marc Crespi agreed to publish the numbers in the meantime on his blog. The blog mentions a dedupe speed of 200 MB/s per node, or 2000 MB/s per 10-node grid with global deduplication. ExaGrid supports doing dedupe during ingest, but they tell me that most of their customers prefer to do it after ingest, as that increases ingest performance, which in turn reduces backup time. Marc's blog post explains how their appliance automatically determines when dedupe should run.
Quantum gave me the number for their adaptive dedupe (dedupe performed while ingest is going on) speed for the 6500, which can dedupe data at 2.4 TB/hr (666 MB/s). When asked about whether they would publish the number on their website, Quantum’s Bob Wientzen said, “Quantum includes figures for concurrent deduplication speed in sales engagements and presentations and is happy to discuss this with customers. We haven’t put this figure on our website and we don’t release our future marketing plans.”
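For anyone who wants to check the arithmetic behind these figures, here is a quick sketch of the unit conversions I'm using (decimal units, 1 TB = 1,000,000 MB, which is how these vendors typically quote throughput):

```python
# Convert vendor-quoted throughput figures to MB/s.
# Assumes decimal units: 1 TB = 1,000,000 MB.
MB_PER_TB = 1_000_000
SECONDS_PER_DAY = 86_400
SECONDS_PER_HOUR = 3_600

def tb_per_day_to_mb_per_s(tb_per_day: float) -> float:
    return tb_per_day * MB_PER_TB / SECONDS_PER_DAY

def tb_per_hour_to_mb_per_s(tb_per_hour: float) -> float:
    return tb_per_hour * MB_PER_TB / SECONDS_PER_HOUR

# SEPATON: 25 TB/day per node is roughly 289 MB/s
print(round(tb_per_day_to_mb_per_s(25)))   # 289

# Quantum: 2.4 TB/hr is roughly 666 MB/s (truncated)
print(int(tb_per_hour_to_mb_per_s(2.4)))   # 666
```

Multiply the per-node figure by the node count (e.g., eight SEPATON nodes) to get the single-system numbers quoted above.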
I don’t know about you, but I’m pretty psyched about getting these numbers out there. They were never really hidden, because you could get them if you asked for them, but now they’re all out in the open and easily obtained.
Some of the vendors expressed concern about what the inline vendors would do with these numbers, since most of the post-process vendors have dedupe speeds that are no more than half their ingest speeds (with the exception of Quantum, where the two speeds are almost identical). First, let me say that most of the dedupe speeds published above are faster than any of the dedupe speeds offered by any of the inline vendors, and they can all dedupe while data is being ingested. And since their ingest rates are even faster than their dedupe rates, they offer a very compelling story when compared to the inline vendors. Publish your speeds and be proud, fellows!
----- Signature and Disclaimer -----
Written by W. Curtis Preston (@wcpreston). For those of you unfamiliar with my work, I've specialized in backup & recovery since 1993. I've written the O'Reilly books on backup and have worked with a number of native and commercial tools. I am now Chief Technical Architect at Druva, the leading provider of cloud-based data protection and data management tools for endpoints, infrastructure, and cloud applications. These posts reflect my own opinion and are not necessarily the opinion of my employer.