Server virtualization does NOT cause a storage explosion

Server virtualization doesn’t kill storage.  People kill storage.  That’s all I’m saying.

I get hot under the collar when I hear people say things like “server virtualization increases storage requirements by huge amounts.”  They slam server virtualization with this comment, as if changing a server from being a physical one to being a virtual one somehow magically increases its size.  They list it as a reason that you shouldn’t use server virtualization.

So I got a little irked when I heard the CEO of Symantec, Enrique Salem, say exactly that in his keynote this week at Symantec Vision. (It was a great show, by the way.)  “Server virtualization increases storage use by 200%-800%,” he said.  When we had the media Q&A with him, this was the first question out of my mouth: “What is it about moving a server from physical to virtual that increases storage requirements?”  I asked a similar question of every other Symantec person I met with that day, as well as of VMware CTO Steve Herrod when I met him.

In retrospect, I was probably a little hard on Mr. Salem during my Q&A.  Even Steve Herrod from VMware verified that the typical VMware customer does see such a storage explosion.  However, I still stand by my statement that this is not VMware’s fault.  Moving to VMware does not cause your storage to magically explode.  Moving to VMware probably does “help” it happen, though.  Here are my thoughts on that.

VMware’s design actually reduces storage use

The average virtual machine image (VMDK in VMware-speak) is significantly smaller than the smallest disk drive you can buy to put into a server.  The smallest hard drive I can configure in a Dell server is 250 GB. You can create a thin-provisioned VMDK and it will consume only as much storage as it actually needs, which is going to be far less than 250 GB.  I don’t know Hyper-V as well as I do VMware, but I’m guessing it’s similar.  I would also say that moving servers into VMware/Hyper-V means you can put all those largely duplicate images on a single storage volume that supports deduplication, which erases much of that supposed storage explosion.  You can’t do that if you’re using physical servers with discrete hard drives.
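To see the thin-provisioning effect for yourself, here is a minimal sketch in Python.  It assumes a Linux host where the guest disk image is a sparse file (the path is hypothetical; a thin VMDK on VMFS behaves similarly, though you would read the numbers from the vSphere client instead):

    import os

    def provisioned_vs_allocated(path):
        # st_size is the capacity the guest sees; st_blocks counts the
        # 512-byte blocks the underlying storage has actually allocated.
        st = os.stat(path)
        return st.st_size, st.st_blocks * 512

    # Hypothetical image path; point this at one of your own.
    prov, alloc = provisioned_vs_allocated("/var/lib/libvirt/images/vm1.img")
    print(f"provisioned: {prov / 2**30:.1f} GiB, allocated: {alloc / 2**30:.1f} GiB")

On a freshly installed guest, the allocated number is typically a small fraction of the provisioned one, which is the whole point.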

Many people buy their first “real” storage array when they buy VMware/Hyper-V

They may feel that this “forces” them to increase their storage costs, because they’re used to just buying discrete hard drives (often with no RAID or monitoring).  They then blame this increase in cost on VMware/Hyper-V.  I don’t buy that either.  First, they didn’t have to do it.  They could have bought a nice HP/Dell/IBM server with internal storage and run VMware on that; the decision to buy a storage array is a separate decision.  Second, if VMware “forces” them into the 21st century as far as storage management is concerned, so be it.  It’s about time they had real storage.

Server virtualization often means a lot of test/dev VMs

This was Mr. Salem’s point.  VMware/Hyper-V makes it really easy to keep many, many images of different configurations, so people create dozens or hundreds of VMs in their test/dev environment, and that causes a huge increase in storage.  I again say that you could continue to do in your dev/test lab whatever it was you did before you had VMware/Hyper-V, so it isn’t VMware/Hyper-V’s fault that your lab now uses 10 times more storage than it used to.  It sure does make it easy, though, doesn’t it?  I would also say that this increase in storage is accompanied by a huge increase in the usability of the lab.

VM sprawl is evil and real and it eats up storage

This was the comment I heard from almost everyone I talked to.  When we step out of the test/dev world, the reality is that buying physical servers tends to involve much more of an approval process.  When all you have to do to create a new server is click the right button on your mouse, you tend to create new “servers” very quickly.  The next thing you know, you have far more servers (and images of Windows/Linux) than you ever would have had with physical servers. VM sprawl is real, and it should be addressed with process and procedure.
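Process doesn’t have to be heavyweight, either.  Here is a minimal sketch in Python of a recurring sprawl report, assuming a hypothetical CSV export of your VM inventory (the name, owner, and last_powered_on columns are my invention, not any particular tool’s): anything idle for 90 days gets flagged for a human to justify or delete.

    import csv
    from datetime import datetime, timedelta

    STALE_AFTER = timedelta(days=90)  # arbitrary threshold; tune to taste

    def stale_vms(inventory_csv):
        # Yield VMs whose last power-on is older than the cutoff.
        cutoff = datetime.now() - STALE_AFTER
        with open(inventory_csv, newline="") as f:
            for row in csv.DictReader(f):
                if datetime.fromisoformat(row["last_powered_on"]) < cutoff:
                    yield row

    for vm in stale_vms("vm_inventory.csv"):
        print(f'{vm["name"]} (owner: {vm["owner"]}): justify or delete')

Run it monthly and make owners answer for their VMs; that alone kills a surprising amount of sprawl.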

VMware and Hyper-V are not the problem here.  What we do with them is the problem.  Yes, they make it much easier to do dumb things like VM sprawl, but blaming your storage explosion on VMware and Hyper-V is like blaming Ferrari for your speeding tickets.  Just saying.


Written by W. Curtis Preston (@wcpreston), four-time O'Reilly author, and host of The Backup Wrap-up podcast. I am now the Technology Evangelist at Sullivan Strickler, which helps companies manage their legacy data.

4 comments
  • If your SAN has primary storage deduplication, virtualization can actually reduce the amount of real storage you utilize. Imagine the OS blocks of all those servers resolving to pointers while the actual blocks sit in cache: you end up with reduced storage and faster access times. This of course assumes that you are deploying from common VM templates.

    -DD
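    To make DD’s point concrete, here is a minimal block-hashing sketch in Python (mine, not the commenter’s) that estimates a dedup ratio across a set of image files; identical OS blocks collapse to a single stored copy:

        import hashlib, sys

        BLOCK = 4096  # assumed dedup block size; real arrays vary

        def dedup_ratio(paths):
            # Count logical blocks vs unique block fingerprints.
            total, unique = 0, set()
            for path in paths:
                with open(path, "rb") as f:
                    while chunk := f.read(BLOCK):
                        total += 1
                        unique.add(hashlib.sha256(chunk).digest())
            return total, len(unique)

        total, uniq = dedup_ratio(sys.argv[1:])
        print(f"{total} logical blocks, {uniq} unique "
              f"-> {total / max(uniq, 1):.1f}:1 dedup ratio")

    Point it at several VMs cloned from the same template and the ratio climbs quickly.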

  • I don’t think I have heard anyone blame VMware/Hyper-V itself, but at the same time, you wouldn’t want to call someone’s baby ugly, would you?

    Virtualization as a whole is a disruptive technology, IMO. Backup strategies need to change, and so do server and storage purchasing. New change management processes, new patching procedures… and then there’s the human element you point out that causes sprawl and such. It all makes for a new frontier in the modern datacenter.

  • On the topic of how moving to virtual servers increases storage requirements, there’s one other big reason, possibly the biggest one, why this occurs. And it has nothing to do with VMware alone, since the same thing happens on Hyper-V and XenServer, and probably the other type 1 hypervisors as well.

    Here’s a typical scenario. The VI admin sizes the storage requirements based on experiences gleaned from the client/server world where a single application typically runs on a dedicated server. That application generates an I/O pattern which, once profiled, probably won’t change a lot over time (although it might grow on an easily predictable linear basis). So you can really tune the storage configuration you use for it to perform at its optimum.

    Now shift to the virtual world. Here you may have 8-12 individual VMs, each with its own completely unrelated I/O pattern, all running on a single physical machine. All that I/O gets dumped into the hypervisor, whose job it is to write it out to disk. This combined I/O pattern is significantly more random and significantly more write-intensive than what you generally get with the client/server model. In laying this pattern down on disk, the rotational latencies and seek times of spinning disks start to dominate the total data transfer times, and the storage slows down (see the rough math at the end of this comment). This has everything to do with the randomness of the writes, and almost nothing to do with whether it’s VMware, Microsoft, or Citrix in the mix.

    Faced with this storage slowdown, the admin correctly diagnoses the problem as “not enough IOPS” and starts to throw hardware at it: more disk spindles, SSDs, higher-end storage, etc., all of which increase the cost of the storage. If the admin brings performance back into the desired range before running out of budget, they’re still left with an unexpectedly high bill for storage relative to what they originally budgeted. And if they run out of budget before they get there, well… that project might get shelved.

    In my opinion, this is one of the main reasons why VDI projects are having a bit of a tough time getting off the ground. And with VDI you have another problem contributing to high storage costs as well: you’re taking a bunch of files that were formerly stored on IDE drives (I can buy 1 TB of IDE storage at Fry’s for about $120) and centralizing them on SAN-based storage where, for $120, you probably can’t even buy the manual…! So VDI gets hit even harder by this “storage cost” issue than virtual server environments do.
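    To put rough numbers on the “not enough IOPS” problem above, here is a back-of-the-envelope sketch in Python (illustrative figures of my own, not a benchmark) showing why random writes crater a spinning disk:

        AVG_SEEK_MS = 8.5                # typical 7200 RPM SATA drive
        ROTATE_MS = (60_000 / 7200) / 2  # on average, half a revolution

        # Fully random I/O pays a seek plus rotational latency per request.
        random_iops = 1000 / (AVG_SEEK_MS + ROTATE_MS)
        # Sequential I/O rarely moves the head; call it 0.1 ms per request.
        sequential_iops = 1000 / 0.1

        print(f"random: ~{random_iops:.0f} IOPS, sequential: ~{sequential_iops:.0f} IOPS")
        # random: ~79 IOPS, sequential: ~10000 IOPS

    Blend 8-12 unrelated VM streams together and nearly every request looks random, which is exactly why the admin ends up buying spindles or SSDs just to get the IOPS back.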

  • Server virtualization fans are wildly enthusiastic, but even some true believers are worried about how quickly scads of virtual machines (VMs) are being added to corporate IT environments.

    “We love VMware,” said Tom Dugan, director of tec