    Disclaimer

    The opinions contained within this website, its blog(s), forums, and Wikis, are those of the original poster and do not represent the position of my (or any other) employer.
    The real deal on the EMC 3D 4000
    Written by W. Curtis Preston   
    Friday, 06 March 2009
    In his blog, Scott Waterhouse (from EMC), asked the question “Why did we build a DL 4000 3D?”  He rightly notes that I have been a vocal critic, but he thinks I’m “dead wrong” on my thoughts about the 3D 4000.  He’s saying “nothing personal,” and I’ll say that this post is the same thing – nothing personal.  He said his piece, now I’ll say mine.  This is another long post, but I think it's an important one.  Click Read More to see what my problems are with the EMC 3D 4000.  (Update: After you've read this post, make sure to read this follow-on post.)


    First, I regret that some of my thoughts about the 4000 came out in a poorly worded, poorly formatted comment to Chuck’s blog at EMC.  Chuck just gets me so riled up sometimes, and my comments to his blog tend to get emotional.  So, without retracting what I said in those comments, let me say it again with a clear head and better grammar and format.

    The architecture

    The DL3D 4000 starts with an EDL 4106, 4206, or 4406.  This is the bigger/newer version of the same EDL that’s powered by FalconStor, has been around for years and has sold several thousand units for EMC.  I’ll call this piece the “front end,” “front end engine,” or the “FalconStor VTL.”  The 4106 has one front end engine; the 4206 and 4406 have two front end engines.  Then there is the DL3D piece, which is another VTL powered by the same software that powers Quantum’s DXi line. I’ll call this piece the “back end,” the “back end engine,” or the “Quantum VTL.”  Both the front end and the back end have their own storage and their own “head,” and according to Scott’s post, you can only have one back end engine for each front end engine.  Update: According to Mark Twomey in his comments on this blog entry, it's also a valid configuration to have just one back end engine connected to two front end engines.

    While a user manages both systems via a single interface, the actual interface between the two is what makes the 4000 so “unique.”  First, remember that the original EDL had the ability to copy virtual tapes to physical tapes using matching barcodes.  What happens when you “enable 3D” in the 3D 4000 is that tapes that are to be deduped are first copied from the FalconStor-based front end to the Quantum-based back end using the EDL’s ability to copy tapes from the VTL to physical tape (except in this case it’s actually copying to another VTL). 

    EMC may try to say that this is no different than SEPATON and FalconStor that have an ingest node front and a dedupe node back end.  However, it is very different in a very significant way.  In the case of SEPATON and FalconStor, the front and back ends actually share storage.  Once the front end is done writing something to disk, the back end node can simply read that same disk and dedupe the data.  The EMC box, however, has to physically copy the raw/native/non-deduped data from the front end node to the back end node, and then it can dedupe it.

    Update: The two paragraphs below have been updated from the initial version to remove confusion.  No facts have changed -- only my description of them.

    If the front end wrote the data to disk, and then fed the data to the back end that deduped it completely inline, it would be doing the same amount of I/O as any other post-processing dedupe system.  However, that's not the way the Quantum box works.  It's sort of an inline-post-processing hybrid.  The back end (Quantum VTL) always writes the native data (undeduped) to disk, but it doesn't always read the native data from disk.  The dedupe engine tries to read whatever it can out of RAM, but since it is possible to send data to the box faster than it can dedupe it, it often has to read the data from disk.  (Just how much is read from RAM and how much is read from disk is not known, but the data I have suggests that much of it is read from disk.)

    Whether the data is read from RAM or not, data sent through the 3D 4000 is written two to three times: once to the front end VTL (which EMC apparently calls the native pool), once to the back end native store (which EMC either doesn't know about or doesn't talk about), and if the data is new, one more time to the deduped data pool (which EMC and Quantum call the block pool). So, all data is written at least twice, and written three times if it's new data.

    This is different than a "normal" post-processing system that writes all data once (to the cache) and new data a second time (to the dedupe pool).  Assuming a 20:1 dedupe ratio, a normal system writes about 105% of the ingested data while the 3D 4000 writes about 205%.  That is roughly 95% more write I/O than a normal post-processing system ((205%-105%)/105%); put another way, about 49% of the 4000's writes ((205%-105%)/205%) are ones a normal system never performs.  And my point is that this extra I/O requires additional resources that other disk targets don’t need, and it’s got to add significant latency to the entire process – latency that other VTLs don’t have.
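    The write arithmetic above can be checked with a few lines of Python.  This is only a sketch under the assumptions stated in the post: a 20:1 dedupe ratio (so 1/20 of the ingested data is new), one native write for a normal post-processing system, and two native writes for the 3D 4000.  The function and variable names are made up for illustration.  Depending on which denominator you pick, the extra native write is about 95% of the normal system's writes or about 49% of the 4000's own writes.

```python
def writes_per_byte(native_copies: int, dedupe_ratio: float) -> float:
    """Bytes written to disk for every byte ingested.

    native_copies: how many times the undeduped stream is written.
    dedupe_ratio:  e.g. 20.0 for 20:1, meaning 1/20 of the data is new
                   and gets written once more to the dedupe (block) pool.
    """
    return native_copies + 1.0 / dedupe_ratio


# Assumption from the post: 20:1 dedupe ratio.
normal = writes_per_byte(native_copies=1, dedupe_ratio=20.0)  # cache + new data
hybrid = writes_per_byte(native_copies=2, dedupe_ratio=20.0)  # front end + back end + new data

# Extra writes relative to the normal system's total.
extra_vs_normal = (hybrid - normal) / normal
# Extra writes as a share of the hybrid system's own total.
extra_share_of_hybrid = (hybrid - normal) / hybrid

print(f"normal: {normal:.0%} of ingest, hybrid: {hybrid:.0%} of ingest")
print(f"extra vs normal: {extra_vs_normal:.0%}")
print(f"extra as share of hybrid writes: {extra_share_of_hybrid:.0%}")
```

    Changing dedupe_ratio shows that the result barely moves: the extra native copy dominates the total no matter how well the data dedupes.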

    Update: In the first version of this post, I suggested that data sent to the back end was also stored on the front end for some period of time.  Mark Twomey said this was never the case.  I then updated my post to reflect that.  But guess what?  I was right!  Consider the following quote from the manual: " At this point, the data resides on both the EDL and the 3D 4000." (More on this in my next post.)

    So data that you’re not going to dedupe can stay on the front end and data you’ll dedupe will be stored on the back end.  But you can store native and deduped data with a single Quantum 7500 (or a 1500/3000), albeit with somewhat lower performance (see yesterday’s blog post on performance), or with a competing solution from SEPATON.  Both vendors offer policy-based dedupe that doesn't require a dedicated set of disk.  That disk in the front is not free, and the disk required to store the cached copy in the back is not free.  Remember that only Quantum-based products require this cached copy in the back end.  Update: Mark Twomey also tries to say in his comments that this cache doesn't exist.  Look in the 3D 4000 manual and look for the word "truncate." Truncation refers to when this cached copy of the native/undeduped data is trimmed back when the block pool needs more room.  Trust me, the cached copy on the back end store exists.  (More on this in my next post.) Scott Waterhouse (an EMC blogger) also refers to this cached copy in this blog entry, and talks about it like it’s a great feature.  It’s simply a feature that helps mitigate the problem they have with restoring from deduplicated data in the block pool.  (I covered that issue in the restore section of my performance post.)  Quantum's process of storing and truncating the cached copy of the native version of the data is also well documented in their manuals.  So you can see why I don't believe Mr. Twomey when he says it doesn't exist.  Remember also that only the EMC 3D 4000 requires a front end system that has its own disk.

    Performance

    Scott’s defense for the 4000 is that they do all this to get industry-leading performance.  Here’s a summary table from my performance post. Please tell me which category they are leading in.  Even if you ignore the global dedupe issue (which I definitely do not), they don’t even have the fastest single-node dedupe rate.  FalconStor wins that one at 1500 MB/s.  (The 11000 is an 8-node cluster that dedupes everything together.  Read the other post for details on why I add some together but don’t add others together.)  In the dedupe rate category, they’re almost the worst.  (Oddly enough, FalconStor actually “wins” that dubious honor at 250 MB/s if you use SATA disk. The number in the table below is for FC disk.)

    Update: I changed the table below to reflect that Mark Twomey says that it's a valid configuration to have one back end engine with two front end engines.  Therefore, it's valid to have a 2200 MB/s ingest speed and a back end dedupe speed of 400 MB/s.  I'm not sure that really changes anything, though, as you won't be able to ingest data at 2200 MB/s for very long (like 4 hours) before your 400 MB/s dedupe engine can't dedupe the data within 24 hours.
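    Here is a rough Python sketch of the "about 4 hours" arithmetic.  It assumes the dedupe engine runs around the clock at a constant 400 MB/s and that ingest runs at a constant rate for the whole backup window; the function names are hypothetical and only there to illustrate the calculation.

```python
def max_ingest_hours(ingest_mb_s: float, dedupe_mb_s: float,
                     cycle_hours: float = 24.0) -> float:
    """Longest backup window (in hours) whose data can still be
    deduped within one cycle, assuming dedupe runs the whole cycle."""
    return cycle_hours * dedupe_mb_s / ingest_mb_s


def dedupe_hours(ingest_mb_s: float, dedupe_mb_s: float,
                 window_hours: float) -> float:
    """Hours needed to dedupe everything ingested during the window."""
    return window_hours * ingest_mb_s / dedupe_mb_s


# Two front end engines (2200 MB/s) feeding one back end engine (400 MB/s).
print(f"{max_ingest_hours(2200, 400):.1f} h of ingest fits in a 24 h dedupe cycle")

# A full 8-hour window at each ingest rate from the table.
for ingest in (1100, 2200):
    hours = dedupe_hours(ingest, 400, window_hours=8)
    print(f"8 h at {ingest} MB/s takes {hours:.0f} h to dedupe")
```

    Under these assumptions, a full 8-hour window at 2200 MB/s leaves more data than a 400 MB/s engine can dedupe before the next day's backups start.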

    Backup & dedupe rates for an 8-hour backup window

    Vendor          | Ingest Rate (MB/s) | Dedupe Rate (MB/s) | Caveats
    EMC             | 1100/2200          | 400/400            | You can have two nodes on the front, but they must share a single node in the back if you're to get global dedupe.  You can also put two dedupe nodes in the back, but they don't have global dedupe.  So you get either 1100 or 2200 MB/s, but always 400 MB/s in the back.
    Data Domain     | 750                | 750                | Max performance with OST only, NFS/CIFS/VTL performance appx 25% less
    FalconStor/Sun  | 11000              | 3200               | 8 node cluster, requires FC disk
    IBM/Diligent    | 900                | 900                | 2 node cluster, requires FC disk
    NetApp          | 600                | Not avail.         | 2 node data cut in half (no global dedupe)
    Quantum/Dell    | 880                | 500                | Ingest rate assumes fully deferred mode (would be 500 otherwise)
    SEPATON/HP      | 3000               | 1500               | 5 nodes with global dedupe

    Back to Scott’s Blog

    Now let’s look at some quotes from Scott’s blog in light of what I’ve said above.

    “That is the basic approach, and Mr. Preston hates it.”

    What I see is twice as much storage for not much more (and often less) performance when compared to competitors.  What exactly is there to like about that?

    “If you have a backup window that you care about, and want to stick to, then you care about ingest speed. And the DL4406 (with or without deduplication) gives you the ability to ingest over 8 TB per hour.”

    Update: This answer has been updated to reflect Mark Twomey's statement that you can have one back end engine to two front end engines.

    As I said in the performance post, if you have two back end engines, this is really two separate nodes that each do 4 TB/hr.  They don’t share disk or deduped information.  You cannot load balance backup jobs across both halves of that node and keep the same dedupe ratio.  I therefore say this is 4 TB/hr, not 8.  And that, unfortunately, is not an impressive number.

    It is also valid, according to Mark Twomey, to have one back end engine.  Then EMC can say that they have an ingest speed of 2200 MB/s, but then they can only dedupe it at 400 MB/s.  That's not very good, either, as it means you can only back up for four hours at that rate and still dedupe the data within 24 hours.
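    For reference, the TB/hr figures in these quotes line up with the MB/s numbers in the table.  A quick Python sketch of the conversion, assuming decimal units (1 TB = 1,000,000 MB); the helper name is made up for illustration:

```python
def mb_s_to_tb_hr(mb_per_second: float) -> float:
    """Convert MB/s to TB/hr using decimal units (1 TB = 1,000,000 MB)."""
    seconds_per_hour = 3600
    return mb_per_second * seconds_per_hour / 1_000_000


# Rates that appear in the table and in the quotes from Scott's blog.
for rate in (1100, 2200, 1600):
    print(f"{rate} MB/s is about {mb_s_to_tb_hr(rate):.2f} TB/hr")
```

    So 1100 MB/s is roughly 4 TB/hr, 2200 MB/s is roughly 8 TB/hr, and 1,600 MB/s is a little under 6 TB/hr.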

    “That is well beyond the capacity of any deduplication box currently available. From anybody.”

    Again, take a look at that table and tell me how this is the case.

    “The point is that you have the rest of the day to deduplicate, if you want. What you care about is making that window. Once you have done that, you can deduplicate at your leisure, or until the window rolls around again on the next day.”

    Update: The wording of the following paragraph has been changed to remove confusion.  The facts have not changed.

    Scott talks like dedupe speed doesn’t matter.  It totally matters.  It matters because you can’t get behind on deduping data or your dedupe rate goes in the crapper.  (That is, if you run out of hours in a day to dedupe, you’ll eventually not dedupe some data.)  It also matters because if you plan to replicate, you can’t replicate it until it’s deduped.  (While you can start replicating as soon as you start deduping, you can't finish replicating until you finish deduping.)  And if you take 24 hours to dedupe, the backups that you made this morning (say at 6 AM) won’t be replicated offsite until 6 AM tomorrow morning.  I’m sorry, but many companies don’t want that kind of delay imposed on getting data offsite.

    “If finishing your backup is your priority, the DL4000 3D is an excellent choice.”

    I’d say there are far better choices that don’t come with all the baggage of the 4000.

    “Restores from this [native] pool can be accomplished at up to 1,600 MB/s. Far faster than pretty much any other solution available today, from anybody. At 6 TB an hour, that is certainly much faster than any deduplication solution.”

    Well, except the two vendors and their two partners that go much, much faster than that, sure.

    “As to how much remains in cache: that is up to you. … there is no better choice than a DL4000.”

    How about a product that doesn’t require a cache of days and days of data and still gets better restore performance than the 4000?  Wouldn’t that be a better choice?

    Update: I know that Mark Twomey says this cache doesn't exist.  I'm sorry, but the facts, and his fellow employee's blog, are not in his favor.  Quantum does store cached native copies of the data, and they really need to (for now) to have decent restore performance.  It's also indirectly in the manual, although it doesn't use the word "cache."  It talks about the truncation process that leaves only deduplicated data after it is run (in other words, it deletes the non-deduplicated, native version, which I'm referring to as the cache), and it talks about how truncation is deferred as long as possible, and how this deferment "is intended for helping to improve the performance of applications that read the backed up data." 
    See Scott Waterhouse's blog entry for more information on this cache.  Look for the sentence, "The restore performance of a DL3D has two key metrics."

    As to my claim that "The FalconStor [front end engine] piece adds no value that additional storage on the Quantum part of it [back end engine] wouldn't add." He lists four features that it does add.  I’ll agree that I didn’t think about them.  But I’ll also say they don’t matter to most customers I’ve met.

    Tape caching

    I’m not a fan of back-end tape except for rare circumstances.  (Update: In this response I am specifically referring to VTLs that have the ability to do virtual to physical tape copying by mirroring bar codes and faking out the backup software.  I'm not saying I'm done with tape. I'm just saying I don't like tape caching in most circumstances.)  My experience has been that this is the sure-fire way to having an unreliable VTL and backup system, not to mention being unsupported by your backup software app.  (Tell a Symantec or TSM consultant, SE, or support person that you’re doing tape caching and watch their reaction.)  I think the backup app should control all this, and tape caching doesn’t support that.  In addition, it leads to poor media utilization.  But, hey, if you want this, then knock yourself out.  I’d also suggest looking at the “real” FalconStor box with their native dedupe.  It does tape caching too.

    ACSLS compatibility

    What, so I can plug into an ACSLS library?  That only matters if you’re doing tape caching.  And if I did do back end tape, I’d stay as far away from ACSLS as I could.  (Talk about products I can’t stand! Whew!)

    Embedded Media Server/Storage Node

    This buys me nothing in additional functionality.  All it does is allow me to do two things with one box.  I can still do exactly the same functionality by connecting the VTL to a regular media server/storage node.  IMNSHO, if there is enough CPU and I/O available in your VTL to run a media server or storage node, then you’re not using it hard enough.  And if you ARE pushing it to its limit, why the heck would you want to put that functionality in there and use up your valuable CPU cycles to do this?  I see this as an option only for small environments, and if it’s a small environment, why are they buying a 4000?

    Wide number of tape/library emulations

    I ask this all the time: who cares what “brand” your virtual tape is?  This only matters if you’re doing tape caching, where you have to match bar codes and tape capacities.

    iSeries connectivity

    OK, you got me.  I’ll give you that one.  I don’t see a lot of iSeries, though.  As in I’ve never actually seen one.  This is also available in the native FalconStor model.

    I also said the following in my original blog comment: "The 4200 & [4400 sic] may offer a faster ingest rate, but it will still be gated by the ingest rate of the 3D/Quantum box on the back end, which is approximately that of the 4100."

    Scott uses the same backup window argument to argue why this isn’t the case.  But I still argue that (a) there are other systems that can handle that backup window just fine and (b) there are only 24 hours in a day, so you can’t just shovel a ton of stuff in the front end and hope it gets deduped on the back end.  Therefore, I stand by my original comment.

    “The DL4000 3D offers you unmatched flexibility and performance to meet your windows”

    Other systems (Quantum, SEPATON & FalconStor) allow you to select and control what gets deduped, so no flexibility advantage.  As to unmatched performance, sure, I’ll give you that.  None of the other systems in that table have performance anywhere near as bad as the 4000.  (OK, I know that was mean, but when you make comments like “unmatched performance” that don’t go anywhere near reality, you just open the door way too wide for a joker like me.)

    “Saying that more flexibility and more choice are bad (as long as they don't come at some huge expense in terms of ease of use or manageability) is just silly.”

    I never said that.

    “So for an accurate cost comparison, we would have to look at 6 DL3000s vs. one DL4406 with 1 or 2 deduplication engines. And given the performance of other vendors' deduplication boxes, the choice is usually 5 or 6 of their appliances to. Or one DL4406. Now the proposition seems a little more sensible, doesn't it?”

    But the “5 or 6 appliances” to which you refer do not each come with disk, the way yours do.  They are just more heads to access and address the same disk.  AND when you buy 5 or 6 nodes from them, all that data is deduped together, where your system doesn’t work like that.  So a 5 or 6 node SEPATON or FalconStor system acts like a single system, where a two node system of yours (that is only about twice as fast as one or two of theirs) acts like two completely separate nodes.

    “With respect to price, all I will offer up is: call your sales representative if you are still with me and are interested in doing a comparison.”

    And this is where the rubber really meets the road and my real problem with the 3D 4000.  It’s two completely separate VTLs with two completely separate disk systems.  It literally is twice as much disk as the competitor.  Do you really think that extra disk is free?  Even if they gave it to you for free, do you think the power and cooling on twice as much disk is free, or the people-hours to maintain twice as much disk is free?  The 4000 has to cost more than competing solutions.  If it doesn’t, then EMC is dropping the price below a decent margin just to get in the door.  Gee, they’ve never done that before. (I’m working with a customer right now where EMC is literally giving him over a million dollars of stuff for free just to get his business.)

    Also, remember to include maintenance for the life of the depreciation cycle when comparing pricing.  Some vendors have a really nasty habit of giving things away on the front end to win the business, then showing the customer the real maintenance bill a year later, which is often based on the list price that you never paid.

    “This sounds like a good package to me.”

    As I’ve said before, it’s the only way EMC could fulfill their promise to those that bought the EDL 4000.  They told them “buy it now,” dedupe is coming.  They decided (for both technical and political reasons) not to use the native FalconStor dedupe, and this method of bolting another VTL to the back of the 4000, and copying tapes to be deduped to the other VTL is the only way they could bring dedupe without using the FalconStor dedupe code.

    But when I look at the performance numbers and I look at how much extra disk and internal complexity this thing has (regardless of whether that complexity is hidden from me or not), I don’t see why anyone would buy a 3D 4000 as a new system.  I also predict that within a year, EMC will agree and offer a “native” version of the Quantum-based product that is bigger/faster than the 3000, the 4000 hybrid will slowly disappear, and EMC’s relationship with FalconStor will end.  After all, as one anonymous industry source said, “it is somewhat problematic to manage a single product that uses two products from competing companies.”

    I think there are plenty of reasons to buy the 1500 or 3000.  I can see reasons why an existing 4000 customer would buy the 3D option.  (Although, since you're essentially buying a completely new system that EMC will bolt on to your existing system, you might want to check if there are other options that offer you better performance at less cost.)   But I definitely see no compelling reason to buy a new 3D 4000.  There are faster, less complex, systems that use far less disk and should cost much less.

    Comments
    Scott Waterhouse  - Thank you     |2009-03-07 08:38:16
    OK, I am definitely not taking that personally, and I definitely do appreciate the thought and reasoning that went into that. I have lots of thoughts, but more importantly, I have a plane to catch to a tropical destination.

    I will respond in a week or so when I get back. In the meantime, thanks again for the reasoned dialog. I think we benefit, and the user community benefits, from this sort of exchange of ideas--even if you and I don't necessarily come to agreement on each and every point.
    Mark Twomey  - Lots to correct.     |2009-03-10 19:03:53
    And now you'll read my piece.

    I've been working with the DL series since introduction and am the voice of authority.

    Without struggling with your awful text editor and quoting you, I have a number of corrections.

    This idea of Front End/Back End you have is wrong. Like in other post processing architectures there is a native pool and a block pool. The block pool is the addition the de-dup option for the 4000 series. It is two units managed as one logical entity, backups designated for de-duplication are defined by a policy set by the end user.

    Data to be de-duplicated is reduced and written to the block pool. Data is not moved "multiple times" as you incorrectly state; it is moved once from the native pool to the block pool. The use of multiple engines/CPUs is common in some of the other post processing architectures you've listed.

    All post processing architectures have to deal with latency and resource contention; the distribution across more than one node improves performance. Others on that table will agree.

    Cached copy? There is no cached copy. The cache as I know it is in the block pool and is 256MB of storage used for immediate de-duplication. There is only one copy of data. I have no idea where this multiple copies thing came from but it's wrong. If the backup data is in the native pool and you go to restore it'll be read from the native pool. If it's in the block pool it'll be read from the block pool. It'll either be one place or the other. Never both at the same time.

    For performance why in your table do we have a single engine compared to multiple engines in other systems? The 4206 and 4406 both have multiple nodes and just like the 4106 all have access to a block pool. How would failover work if all data was not available to all nodes? If your idea of global de-duplication is a unified block pool then I ask that you update all the relevant entries in those tables to 2,200MBs per second and eliminate the incorrect assertion of no global de-duplication. A block pool scaling to 148TB.

    Your replication comment and the idea of days and days is also incorrect. Policy based de-dup means I can begin replication immediately and don't have to wait until the de-dup jobs have completed. Your opinion on consolidated media management is out of touch as not only is it the most frequently sold option with the Disk Library it was the very first request for enhancement submitted to me by customers.

    Tape might be dead to you but it isn't for a lot of backup admins.

    I personally worked on qualifying the iSeries here in Cork and because you haven't seen it doesn't mean it's not a critical component of a lot of people's infrastructure. And a lot of high end infrastructures at that.

    Pricing and cost. Why would a system with a drive count split between native and block pools be priced differently than any of the other post processing solutions structured the same way?

    As for your assertions to why this is this and that is that, I don't recall you at any of the meetings. Indeed I don't think you have any relationship with EMC do you?

    I'd appreciate it if you made all the relevant corrections and I realise how difficult it can be to pick out facts when you're not building these things from the screws up.

    Regards,
    Zilla. (Owner of every DL in EMEA marked engineering sample since product introduction. Setter upper of systems from the cardboard box to production)
    W. Curtis Preston  - I also have been involved for a while   |2009-03-11 00:28:41
    I have had many, MANY conversations via official and unofficial channels during which I have been told that the path from the original 4000 (i.e. the Falconstor box) to the 3D 4000 (i.e. the Quantum box) is via the DL's "tape out" interface. Yes, this movement is "hidden" from the end user, and yes, the user only sees one system -- but that movement still happens.

    EMC has built a product where you have gotten two competitors' products (Falconstor & Quantum) to share data. How did you do that? Did you (a) have backups go to the FalconStor box, then somehow teach Quantum how to read the Falconstor tape format, (b) teach the FalconStor box to write backups directly to the Quantum box without stopping on the Falconstor side, or (c) send backups to the Falconstor box, then use the existing "tape out" interface to copy backups from the Falconstor box to the Quantum box?

    (A) just can't be the case. No way. (B) is unlikely, and does not handle tapes in an existing DL 4000 -- which is what the 3D 4000 was designed for. (C) is therefore the most likely case because it doesn't require major recoding of either box; it just needs a little glue to make it work. It also deals with existing virtual tapes in an existing DL 4000, and (C) is what I've been consistently told from EVERY EMC PERSON since the first time I heard how the 3DL box worked.

    If (C) is not how it works, please explain to me what happens in the following (probably pretty common) scenario. Consider a customer who is an existing DL 4000 customer and now they want to add the 3D option. (The exact customer the 3D 4000 was made for.) Once EMC installs the Quantum node and its storage, how do the previous tapes that were stored on the original 4000 get deduped? My answer is that they are copied from the original 4000 to the 3D 4000 via the tape out interface. Yes, hidden from the customer, but copied none the less.

    On to your other comments...

    You do realize that I've had EXTENSIVE in-depth conversations with the Quantum folks, and so understand how the Quantum piece works VERY well, right? Their headquarters are a lot closer than Hopkinton (45 mins from my house), after all. (Quantum and several other vendors also answered every question in an RFI I did for backupcentral, where EMC declined to answer many questions. So I do say that if I misunderstand your box, it is EMC's fault, not mine.)

    What you call the "native copy" of the data, I call the cache. It is the native copy; its PURPOSE is a cache used to enable restores many times faster than what is possible from the block pool.

    What about my replication comment is wrong? You're telling me a 3D 4000 can replicate a block BEFORE it's deduped? So if a backup is made at 6 AM (at the end of the backup window), and the box has so much data to dedupe that this backup doesn't get deduped until 24 hours later, you're saying that it'll get replicated before that? You're kidding, right?

    As for days and days of data... It is absolutely the way the Quantum piece is designed. It's designed to keep as many days of "native versions" as it can. The only question is how many days of this native data (what I call cache) a user is going to keep. If you keep only one day's worth of data in its native format, and you do a full restore of a large filesystem, the backups from last night will come from the native pool and the other 6 days (assuming weekly fulls) will come from the block pool. Considering the difference in restore speeds (as much as 75%), I would think you wouldn't want that to happen. Therefore, if you want to have decent restore speeds, you're going to want to keep the latest full and all incrementals since the full in native format. Hence the days and days comment.

    As to the data not being in two places at the same time, I'm going to say again that what you're saying just can't be correct. Once a block of data is identified as new/unique, it is copied into the block pool, but that block is still in the "native" pool. So it's in two places at once, until the copy in the native pool is truncated.

    The reason I don't allow you to combine your numbers is that you have two completely separate Quantum boxes behind your two separate Falconstor boxes. Each Quantum box has its own hash table and they do NOT compare data against each other. It is NOT a single block pool; Quantum's software simply doesn't support that and I don't know how you could say that. They are two islands of dedupe and are therefore not much more than two separate appliances in the same box.

    As to your question about how failover happens, that has nothing to do with global dedupe. It is managed via Falconstor's failover mechanism. Both FalconStor boxes can see both Quantum boxes (which really just appear to them as two tape libraries). They normally only write to the one they're controlling, but in a failover situation, they can write/read to/from either one, just like if you had connected two REAL tape libraries behind the two heads of the DL. The two Quantum boxes know as much about each other as two physical tape libraries do. Sorry, they do NOT have global dedupe.

    As to me not liking the media server inside a VTL, the fact that a bunch of people asked for it doesn't sway me in the least. Saying I'm out of touch is funny, though, given that I'm probably the backup industry's biggest independent proponent of moving things FORWARD when they make sense. (Consider my post on OST/NDMP, for example.) I've pushed CDP and near-CDP in the right setting, source dedupe in the right setting, but I don't happen to like that particular option and I have (IMHO) really good reasons for not liking (the fact that Symantec hates it is a big part of it).

    I never said I've abandoned tape. Not once. I said I don't want to do tape via the Falconstor-style tape-out functionality. I think tape copies should be controlled via the backup software. (I am cool doing it via the tape out functionality if it's controlled by the backup software, but you can do that in the 1500 and 3000; no need to go to the 4000 to get that.)

    What I said about iSeries is (while I've never seen one), if you want that functionality, see if you can get it from FalconStor directly. Test their dedupe and see if it works for you. (I'm speaking of the customer, of course, not EMC.) Again, no reason to go the 4000 hybrid system to get that.

    As to the pricing question, it is NOT my understanding that the 3D 4000 is just disk split between native and block pools. That's what you'll find in the _Quantum_ side of the equation, of course. But with the 3D 4000, there is also the Falconstor side of the equation, where you'll find a bunch of disk, too -- disk that's not needed in other vendors' implementations.

    As to "I wasn't in the meetings" all I really said was that this odd design was the only way you could fulfill your promise to bring dedupe to the 4000. I know you promised to do that. I was there at customers when you did it. As to it being the only way, well, the Falconstor route would have been the obvious one, but you chose not to do it for various reasons. The Avamar route was nowhere near fast enough. So the only choice left was to use an outside solution. And you chose the only one that wasn't being resold by a competitor. Seems obvious enough to me.

    As to me making any corrections, I don't see any to make. It's going to take more than a "no we don't" post to make that happen. I am, however, happy to have a phone conference (sooner than later) with anyone at EMC that can answer my questions with what you're saying. (We'll be typing until we're blue in the face if we do it here.)
    Mark Twomey  - No title   |2009-03-24 01:20:15
    What part of there's only one copy of data in the system is unclear?

    There is no cache. There is a native pool and a block pool. There are no cached copies of data. It's either in native format because you've chosen not to de-dup it or it's de-dup'd in the block pool.

    Here's what you said about replication.

    "It also matters because if you plan to replicate, you can't replicate it until it's deduped."

    Correct, but the point I was trying to make was that continuous replication replicates data from the first de-duped block onwards. It does not wait until the end of a datastream to replicate data.

    "The reason I don't allow you to combine your numbers is that you have two completely separate Quantum boxes behind your two separate Falconstor boxes. Each Quantum box has its own hash table and they do NOT compare data against each other. It is NOT a single block pool; Quantum's software simply doesn't support that and I don't know how you could say that. They are two islands of dedupe and are therefore not much more than two separate appliances in the same box."

    But the de-dup option is one box. Not two or ten or some other number, one unit. Regardless if it's a single node 4100, dual node 4200 or quad node 4400 it's one box with one block store accessible to all engines.

    Why does everyone else with clustered nodes and a single block store get to post their max number but EMC doesn't?

    *All engines. One hash table.*

    And when this is pointed out you still won't make the correction.

    You've now been tagged as "Anybody But EMC" and as such I wouldn't wait by the phone if I were you.
    W. Curtis Preston   |2009-03-12 22:58:16
    I've read and re-read your comments trying to understand why we're so far apart. Now I think I get it. If you'll verify that my new understanding of what you're saying is correct, I'll be happy to correct the post to reflect that.

    First, let's talk about the native pool/block pool thing. When YOU say native pool and block pool, I now understand (hopefully correctly) that you mean the disk attached to the DL 4000 (AKA Falconstor node) and the disk attached to the 3D 4000 (AKA Quantum node). Now I understand AND AGREE that data will be in either one and not both.

    So you understand what _I_ was saying, I wasn't talking about those pools. I was talking about the Quantum box's ability to store native copies of the data as a cache for restores. They're stored there until the box hits 75% full, at which point it starts "truncating" the native copies ON THE QUANTUM BOX. These are the copies to which I refer when _I_ say "native copies" or "cache copies." Does that make sense now?

    As to your comments about all engines using one hash table, I now understand that it is a valid configuration (in your eyes) to have a single 3D engine plugged into the back ends of two DL engines. Scott Waterhouse said in his blog that "you can have one deduplication engine per DL Engine. (A DL4106 has one, a DL4206 or DL4406 has two DL Engines.)" I assumed that meant that you would ALWAYS have two engines. What I'm hearing YOU say is that you can have one engine.

    So... If you have one engine (with an ingest/dedupe rate of 1.5 TB/hr, according to Scott), then I'll agree, you can combine the ingest speeds of the two front end systems to have an ingest speed of 8 TB/hr (2200 MB/s) and a dedupe speed of 1.5 TB/hr (400 MB/s). I'm not really sure that's any better, though. If your ingest speed is almost 6 times your dedupe speed, you'll only be able to USE the front end for 4 hours and still dedupe within 24 hours. Yuck.
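The arithmetic behind that estimate can be checked with a minimal sketch. The 8 TB/hr and 1.5 TB/hr figures come from the comment above; the function name and the assumption that dedupe runs at a constant rate for the whole 24-hour window are illustrative only.

```python
def max_ingest_hours(ingest_tb_per_hr, dedupe_tb_per_hr, dedupe_window_hr=24.0):
    """Hours of full-speed ingest that the dedupe engine can still process
    within the window, assuming both rates are constant (a simplification)."""
    dedupe_capacity_tb = dedupe_tb_per_hr * dedupe_window_hr
    return dedupe_capacity_tb / ingest_tb_per_hr


INGEST_TB_PER_HR = 8.0   # combined front end ingest rate quoted above
DEDUPE_TB_PER_HR = 1.5   # single back end dedupe rate quoted above

ratio = INGEST_TB_PER_HR / DEDUPE_TB_PER_HR
hours = max_ingest_hours(INGEST_TB_PER_HR, DEDUPE_TB_PER_HR)
print(f"ingest is {ratio:.1f}x dedupe; usable ingest window is {hours:.1f} hours")
```

With the TB/hr figures the ratio works out to roughly 5.3 and the usable ingest window to about 4.5 hours, in line with the rough numbers given in the comment.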

    As to your comment on replication, I know that replication starts when dedupe starts. But what I'm saying is that replication can't FINISH until dedupe FINISHES, which means that if your dedupe speed is so slow that it takes 24 hours to dedupe it, your "tapes" will be sent offsite MUCH later than they would have if you were doing the old tape and truck method. HOWEVER, if your dedupe speed matches your incoming throughput (the way it can with inline systems and global dedupe post processing systems), then you can have an experience that more closely resembles the old days, with all backups being offsite by 9-10 am, with backups finishing at 8 am.
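The point about replication finishing late can be sketched the same way. This is a simplified model, not a description of any product: it assumes constant rates, assumes dedupe starts when ingest starts, and ignores replication bandwidth entirely, so the lag it reports is only the earliest moment replication could finish relative to the end of backups. The 36 TB nightly volume is a made-up example.

```python
def dedupe_lag_hours(total_tb, ingest_tb_per_hr, dedupe_tb_per_hr):
    """Hours between the end of backups and the end of dedupe.

    Replication cannot finish before dedupe finishes, so this is a lower
    bound on how long after backups complete the data can be fully offsite.
    """
    ingest_hours = total_tb / ingest_tb_per_hr
    # Dedupe can never finish before the data has finished arriving.
    dedupe_hours = max(ingest_hours, total_tb / dedupe_tb_per_hr)
    return dedupe_hours - ingest_hours


NIGHTLY_TB = 36.0  # hypothetical nightly backup volume

slow_lag = dedupe_lag_hours(NIGHTLY_TB, 8.0, 1.5)     # dedupe much slower than ingest
matched_lag = dedupe_lag_hours(NIGHTLY_TB, 8.0, 8.0)  # dedupe keeps pace with ingest
print(f"slow dedupe: {slow_lag:.1f} h after backups end; matched: {matched_lag:.1f} h")
```

Under these assumptions a dedupe rate that keeps up with ingest adds no lag, while the mismatched rates push the earliest possible offsite time most of a day past the end of backups.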

    I never said "anything but EMC." I _am_ saying "anything but the 3D 4000." That means I'm not talking about the DL 4000 (assuming you don't want dedupe), the 3D 1500 or 3000, and I'm not talking about the myriad other products you offer. I just think that this particular product makes no sense, and I'm standing by that until someone changes my mind. If you're correct and EMC wants to ignore me and let me stay "confused," then that's their choice. I actually think they're bigger than that. We shall see.
    NetBackup User  - The REAL real deal   |2009-03-20 15:15:38
    The REAL real deal on the 3D 4000 is that people who are buying it are returning it. I was a potential customer of both Data Domain and EMC, and asked for references from both. Data Domain gave me several that checked out just fine. EMC gave me four, so I called them. Out of the four references THAT EMC GAVE ME, THREE of them had returned the 3D 4000 and bought Data Domain appliances.
    W. Curtis Preston  - I decided I needed another post   |2009-03-20 15:21:15
    I thought my response to Mark Twomey's comments was important enough that I should write it in a separate post: http://www.backupcentral.com/content/view/232/47/
    W. Curtis Preston  - re: No title     |2009-03-24 01:16:25
    Not sure what happened to Mark's comment that I was responding to in the comment that starts "I've read and re-read...". I'm quoting them here so they don't disappear.

    Mark Twomey wrote:
    What part of there's only one copy of data in the system is unclear?

    There is no cache. There is a native pool and a block pool. There are no cached copies of data. It's either in native format because you've chosen not to de-dup it or it's de-dup'd in the block pool.

    Here's what you said about replication.

    "It also matters because if you plan to replicate, you can't replicate it until it's deduped."

    Correct, but the point I was trying to make was that continuous replication replicates data from the first de-duped block onwards. It does not wait until the end of a datastream to replicate data.

    "The reason I don't allow you to combine your numbers is that you have two completely separate Quantum boxes behind your two separate Falconstor boxes. Each Quantum box has its own hash table and they do NOT compare data against each other. It is NOT a single block pool; Quantum's software simply doesn't support that and I don't know how you could say that. They are two islands of dedupe and are therefore not much more than two separate appliances in the same box."

    But the de-dup option is one box. Not two or ten or some other number, one unit. Regardless if it's a single node 4100, dual node 4200 or quad node 4400 it's one box with one block store accessible to all engines.

    Why does everyone else with clustered nodes and a single block store get to post their max number but EMC doesn't?

    *All engines. One hash table.*

    And when this is pointed out you still won't make the correction.

    You've now been tagged as "Anybody But EMC" and as such I wouldn't wait by the phone if I were you.
    Todd A Johnston  - Unmatched Performance "potential?"     |2009-03-24 09:40:52
    Perhaps the linguistic disconnect for EMC is the age old storage BANDWIDTH vs THROUGHPUT argument.

    They seem to see their solution as unmatched... I'm not sure it's a good idea nor would it be ideal to spend capital today to compete with similar architecture.

    Blade based Proxy engine similar to Oracle RMAN is a more "viable" and ASIC optimized option. Less bundled hardware and more flexible for hashing/ shared cache and futures.


    Lastly: Look at the Birth of a technology and then match its evolution -
    Ex.
    . VMWARE - started as 2 -4 cpu max Virtualization engine - Never designed for 64 processor consolidation. It was to consolidate applications off multiple bare metal's into one "multi-app" server. "save money" simpler admin.

    Check the facts - It's true. It's been challenged for 3 years to master replication/DR and scalability--- It wasn't designed to go there.

    Blessings all rest with their API and Virtual Appliance community -- it's still NO MVS and scalability is still lacking.

    Point in Check - Storage Arrays that required TWO boxes --- forever will be two boxes - DataDomain wasn't packaged as software only solution -

    However, Decisions change when EMC can increase a partners margin... perhaps this was more deciding than "technology" or products agility.

    IMHO - Just another shoe shine with more smoke and mirrors. Humans may need to re-familiarize themselves with the Delete key and "self Control"

    That's my CLOUD Computing deduplication strategy for 2010 -beyond.
    SolutionsARchitect.com
    -TAJ