VMware Storage Sudoku

Last Friday I was brainstorming with Gabrie van Zanten about the optimal placement of our VMDKs across our LUNs. We tried to come up with an algorithm that could give us insight into what our optimal storage layout would be.

First read the complete post here.

This weekend I kept thinking about this challenge and came up with some more requirements:

  • Don’t focus on VMs, but on individual VMDKs. We have VMs that won’t fit completely in a single 500GB LUN, and a VM’s VMDKs may show different IO behaviour.
  • Take the VM memory size into account: since we don’t use memory reservations, each VM gets a swap file of that size.
  • Some VMs, like SQL Servers, need their VMDKs spread over different LUNs for better performance, so we will need some kind of affinity rules.
  • Check whether the average IO load exceeds the maximum recommended by VMware or the storage box vendor. Maybe we will need more LUNs.
  • Take the current storage layout as a starting point, because some VMDKs might already be in the right place and we don’t want to reshuffle everything.
  • Maximum LUN fill rate will be 90%.
  • Allow a deviation of +/-10% on the average IO per LUN (see the rough sketch after this list).
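
To make this a bit more concrete, below is a rough greedy sketch in Python of how such a placement pass could look. The names (Vmdk, Lun, place_vmdks) and the "heaviest IO first" strategy are just my own illustration of the requirements above, not an existing tool or API.

```python
from dataclasses import dataclass, field

@dataclass
class Vmdk:
    name: str
    size_gb: float            # provisioned size of the VMDK
    avg_io: float             # measured average IOPS for this VMDK
    swap_gb: float = 0.0      # VM memory size (swap file), counted once per VM
    affinity_group: str = ""  # VMDKs in the same group must land on different LUNs
    current_lun: str = ""     # where the VMDK lives today

@dataclass
class Lun:
    name: str
    capacity_gb: float = 500.0
    used_gb: float = 0.0
    io_load: float = 0.0
    groups: set = field(default_factory=set)

    def fits(self, vmdk: Vmdk, max_fill: float = 0.9) -> bool:
        # 90% maximum fill rate, including the swap file for the VM.
        space_ok = self.used_gb + vmdk.size_gb + vmdk.swap_gb <= self.capacity_gb * max_fill
        # Affinity rule: only one VMDK per group on any given LUN.
        group_ok = not vmdk.affinity_group or vmdk.affinity_group not in self.groups
        return space_ok and group_ok

    def add(self, vmdk: Vmdk) -> None:
        self.used_gb += vmdk.size_gb + vmdk.swap_gb
        self.io_load += vmdk.avg_io
        if vmdk.affinity_group:
            self.groups.add(vmdk.affinity_group)

def place_vmdks(vmdks, luns, io_band=0.10, max_lun_io=None):
    """Greedy pass: keep a VMDK on its current LUN when that stays inside the
    +/-10% IO band, otherwise move it to the least loaded LUN that still fits."""
    target_io = sum(v.avg_io for v in vmdks) / len(luns)
    by_name = {l.name: l for l in luns}
    placement = {}
    # Heaviest IO consumers first, so they anchor the layout.
    for vmdk in sorted(vmdks, key=lambda v: v.avg_io, reverse=True):
        current = by_name.get(vmdk.current_lun)
        if (current and current.fits(vmdk)
                and current.io_load + vmdk.avg_io <= target_io * (1 + io_band)):
            candidates = [current]  # keep it where it is, no reshuffling needed
        else:
            candidates = sorted((l for l in luns if l.fits(vmdk)),
                                key=lambda l: l.io_load)
        if not candidates:
            raise RuntimeError(f"No LUN can take {vmdk.name}; more LUNs are needed")
        chosen = candidates[0]
        chosen.add(vmdk)
        placement[vmdk.name] = chosen.name
    # Check against the recommended maximum IO per LUN, if known.
    if max_lun_io is not None:
        overloaded = [l.name for l in luns if l.io_load > max_lun_io]
        if overloaded:
            print("LUNs above the recommended IO maximum:", overloaded)
    return placement
```

Feeding it the measured sizes and IO figures per VMDK plus the list of existing LUNs would give a layout that honours the fill-rate, swap, affinity and "stay where you are" requirements, although a real solver could certainly do better than a single greedy pass.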

Then a completely different thought crossed my mind.

Why do we still have VMFS?
If we can assign LUNs directly to the VM, we can let the storage box balance the IO load.
First we can think of using RDMs, but you will quickly find yourself running out of SCSI IDs, because of the maximum of 256 LUNs per host.
You could add extra HBAs, but do you have sufficient PCI slots?

In the next version of ESX there’s the new Cisco Nexus 1000V to give back the network management tasks we took from the network guys.
So where is the virtual Fibre Channel switch to give back the storage management tasks we took from the storage guys?


4 Comments on “VMware Storage Sudoku”

  1. #1 duncan
    on Feb 8th, 2009 at 6:41 pm

    Do a search for vStorage on Chad Sakac’s website and check out the preview of what kind of integration will be possible in the near future!

  2. #2 Steve Chambers
    on Feb 8th, 2009 at 7:18 pm

    Hey Arnim, we need VMFS because we need a clustered filesystem for multiple hosts to access the same LUN. Without VMFS there is no vMotion.

    Is it so bad that you can’t make more than 256 LUNs available to one host, or one cluster of hosts? I suppose if you have a cluster that can host more than 256 VMs and ALL VMs need this performance… but do all the VMs on that cluster need RDMs for the IO load balancing you mention?

    If, as you say in your article, you think about the VMDKs – then out of all the VMs on your cluster, how many of those VMDKs need the performance you mention? If it’s less than 256 in the cluster, which it probably is, then some mix is in order?

    Storage management in virtualization is an interesting topic and I’ve run many an emotional workshop :-) I always go for the Occam’s Razor approach myself… :-)

  3. #3 Omar Sultan
    on Feb 8th, 2009 at 11:53 pm

    Arnim:

    The Cisco Nexus 1000V was part of a broader set of virtual machine-related products and technologies we termed VN-Link. You can check out this video to see some of the things we are doing on the storage side: http://snurl.com/bjdx4 — check out starting at about the 2 minute mark. Beyond this, we are continuing to develop storage solutions for VM environments.

    Omar Sultan
    Cisco
    Data Center Solutions

  4. #4 Arnim van Lieshout
    on Feb 10th, 2009 at 2:55 pm

    Thanks for your comments.
    Steve is right about the need for a clustered file system, but do we need a clustered file system to store VMDKs?
    I was not thinking about RDMs as they are implemented now, but of some kind of virtual storage switch that integrates the ESX host into the SAN fabric.

    The reason why we were talking about IO load balancing is that we saw some hot spots on our storage box. So we took a look at the ESX LUNs and found out that some LUNs had a much higher IO load than others. There was no VM with a particular performance problem, but we opted to distribute the load evenly across our LUNs.
