Scott Lowe

Virtualization Short Take #36

Scott Lowe's Blog - 9 hours 33 min ago

It’s been a busy couple of weeks! I was in Vienna, Austria, all last week, and I’m on the US West Coast this week. Even though I’ve been on the go, I’ve still been collecting various virtualization-related posts and tidbits. Here they are for you in Virtualization Short Take #36! I hope you find something useful.

  • You might recall that in early 2008 I wrote about how thin provisioned VMDKs on NFS storage tend to inflate. In a recent post, Chad Sakac pointed out that VMware has addressed this problem, which is caused by the use of the eagerzeroedthick VMDK format instead of the zeroedthick format. The fix requires ESX 3.5 Update 5 and VirtualCenter 2.5 Update 6, plus a configuration change that is outlined in this VMware KB article. Kudos to VMware for fixing the underlying issue instead of just forcing customers to upgrade to vSphere.
  • VMware’s Scott Drummonds provides a bit more information on the memory compression technology previewed by Steve Herrod at Partner Exchange 2010 a few weeks ago. In my opinion, anyone who says that the hypervisor is a commodity isn’t paying attention to the fact that VMware is still innovating in this space.
  • If you perform virtualization assessments using VMware’s Capacity Planner tool, you’ll find Gabe’s Capacity Planner troubleshooting tips helpful.
  • Gabe also published a “wish list” for VMware datastores. If you take a deeper look at what Gabe is really trying to address, though, a great deal of the functionality he’s looking for could be achieved through a combination of policy-based storage tiering and greater integration between VMware and the storage array. Would you really need category labels on VMware datastores if the underlying storage was tiering data effectively based on utilization? Probably not. It might still make sense in some cases, but I think the vast majority of cases would be addressed. I think that you are going to see some very cool innovation in this space over the course of this year.
  • Simon Gallagher recently asked this question: with the move to ESXi, is NFS more useful than VMFS? It appears that a large part of Simon’s argument centers around the speed at which files can be transferred into VMFS using ESXi, and it seems to me that VMware needs to do some optimization there. I’m not knocking NFS (I’ve used it extensively in the past, and I have recommended it and continue to recommend it to customers where it is appropriate), but I’m not sure that you can build an argument for NFS based on ESXi’s file transfer performance. My friend and former colleague Aaron Delp (whose blog was recently added to Planet V12n; congrats!) points out that fixing VMDK alignment using ESXi could be an issue; now that’s a great point. Even third-party utilities like vOptimizer don’t work with ESXi. In my opinion, these points underscore the need for VMware to concentrate very heavily on ESXi if that is indeed going to be the “platform moving forward”.
  • I came across an interesting VMware KB article while browsing the weekly VMware KB digest for the week ending 2/28/10. The article, which discusses a situation in which VMware HA would fail to configure at 90% completion, describes how some network switches—HP ProCurve 1810G switches with automatic denial-of-service protection enabled and Cisco Catalyst 4948 switches with ICMP rate limiting enabled—can drop packets that are necessary for VMware HA to configure and start correctly.
  • Unfortunately, the latest VMware KB weekly digest (for the week ending 3/6/2010) didn’t include links to the actual articles that were published. Bummer! Still, it’s easy enough to simply look up the articles directly.
  • EMC today released a couple of plug-ins for vCenter Server. The Celerra plug-in for VMware Environments brings Celerra NFS provisioning into the vSphere Client. The Celerra Failback Plug-in for SRM automates failback in VMware SRM environments. The official press release is here, which contains links to more information on the individual plug-ins. (Disclaimer: I work for EMC.)
  • Newly-minted VCDX #029 Frank Denneman posted a good article on using reservations on resource pools to bypass slot sizing. As Frank points out, it’s not a recommended practice necessarily, but it might be warranted depending on customer requirements.
  • Duncan’s recent article on the behavior of CPU and memory reservations is also helpful, especially for those who might not be familiar with the differences between the two types of reservations.
  • Similarly, this guest post on Duncan’s site by VCDX Craig Risinger also helps explain how shares on a resource pool work. This is good information to have if you are unfamiliar with the topic.
  • I’m not a security geek, but I did think that the RSA-Intel-VMware announcement at RSAC 2010 (third-party coverage here) was pretty cool. Security experts, I’d love to hear your thoughts on the matter. What was good about the announcement? What was missing?
  • If you will be working with distributed vSwitches, this post by EMC’s Gregg Robertson might help; it underscores the need to ensure that your environment is being consistently and thoroughly patched and maintained. vCenter Update Manager, anyone?

I do have a few other articles in my “things to read” list that I haven’t yet gotten around to reading:

How to Integrate ThinApp with Quest vWorkspace 7.0 (The Official Quest Software Desktop Virtualization Group Blog)
DRS Resource Distribution Chart
HP Flex-10 versus Nexus 5000 & Nexus 1000V with 10GE passthrough

That’s it for now. I hope that you’ve found something useful here, and—as always—I’d love to hear your thoughts in the comments below.

This article was originally posted on blog.scottlowe.org. Visit the site for more information on virtualization, servers, storage, and other enterprise technologies.

Categories: Scott Lowe

Links for 2010-03-08 [del.icio.us]

Scott Lowe's Blog - 16 hours 38 min ago
  • VMware Labs
    The VMware Labs site contains some unofficial and unsupported (but quite cool) tools from VMware engineers.
Categories: Scott Lowe

VMware ESX, EMC CLARiiON Arrays, and Multiple Protocols

Scott Lowe's Blog - Fri, 03/05/2010 - 12:01

I was browsing through an EMC technical document titled “EMC CLARiiON Integration with VMware ESX Server” (download it here) a little while ago and I came across a phrase in the document that caught my attention:

“VMware ESX/ESXi support both Fibre Channel and iSCSI storage. However, VMware and EMC do not support connecting VMware ESX/ESXi servers to CLARiiON Fibre Channel and iSCSI devices on the same array simultaneously.”

What? No Fibre Channel and iSCSI from the same array to a VMware ESX/ESXi host simultaneously? That piqued my curiosity, so I contacted a few people within EMC to question the veracity of that statement. It turns out that the answer is more complicated than it might seem at first glance.

For those of you who aren’t interested in the deep technical details, here’s the short explanation behind this behavior:

  • VMware fully supports the use of both Fibre Channel and iSCSI from the same array to the same VMware ESX/ESXi host simultaneously.
  • VMware does not support presenting the same LUN via both protocols concurrently to the same host. (I qualified this directly with VMware.)
  • For a Celerra, you can use both Fibre Channel (via the CLARiiON side of the array) and iSCSI (via the Celerra side of the array) simultaneously. This is a fully supported configuration.
  • A CLARiiON array can easily present the same LUN via both Fibre Channel and iSCSI, but then VMware wouldn’t support it (see earlier bullet).
  • With a CLARiiON array, it is possible to present some LUNs via Fibre Channel and some LUNs via iSCSI to the same VMware ESX/ESXi host (i.e., LUN A via Fibre Channel and LUN B via iSCSI), but EMC will only support it if you file an RPQ. Without an RPQ, it’s an unsupported configuration. An RPQ, by the way, is a request to qualify a certain configuration for support.

I’m confident that some other array vendors out there will be very quick to jump on this post and harp on this limitation until the cows come home. I would just ask this question: is it really as big of a limitation as it seems? I’ll come back to that question in a moment.

With the short explanation in mind, here are the more in-depth details. If you like the longer, more technical explanation, then read on!

From EMC’s side, the root of the restriction about using both Fibre Channel and iSCSI devices on the same array simultaneously stems from the interaction of host registration and storage groups.

Host registration is a requirement in the CLARiiON world. In order to present storage to a host from a CLARiiON array, you must first register the host’s initiators with the array in Navisphere. Once the host has been registered, then you can proceed with presenting storage to that host. In theory the CLARiiON could operate without registering hosts and initiators, but EMC chose to require registration. EMC made this choice in order to help simplify host management.

Requiring host registration is a bit different from how some of the other storage arrays on the market work. It’s not better or worse, just different. (Remember, pros and cons come from every technology decision.)

If you’re like me, you’re probably wondering at this point how requiring host registration simplifies anything. Instead of having to manage multiple paths, multiple initiators, and individual hosts every time you want to present storage to a host, you only need to register the host—and all of its initiators—and then you can refer to that same object (the host) over and over again as needed. Yes, host registration does mean a bit more work up front, but the idea is that it will save some work down the road. I guess you can think of host registration kind of like defining aliases in your Fibre Channel zoning configuration: it’s a bit more work up front, but it simplifies things later down the road. If you didn’t create device aliases in your Fibre Channel switch, you’d end up having to re-enter Fibre Channel WWPNs multiple times. You create the aliases so that it’s easier later. The same applies to host registration. Again, it’s a matter of choices.

One might also say that registration is a security measure, albeit a weak one. Rather than allow just any Fibre Channel-attached or iSCSI-attached host to see storage, the array requires that it know about the host (via host registration) in order to present storage to the host. This provides an additional layer of security to ensure that only authorized hosts are presented storage from the array.

Now you have a fairly decent idea of why host registration is necessary. So how does host registration occur? Host registration can occur either manually or automatically. Starting with version 4.0, both VMware ESX and VMware ESXi will automatically register with a CLARiiON array running any recent version of FLARE (ESX 3i version 3.5 also supports this form of push registration). FLARE release 28 and earlier will show these hosts as “Manually registered, unmanaged”; starting with FLARE 29, these hosts are listed as “Manually registered, managed”. In either case, the registration occurs automatically. If the host is Fibre Channel-attached, then the Fibre Channel initiators will be included in the automatic registration. The same goes for iSCSI initiators. Normally, this is a good thing because it saves the administrator the extra steps of registering the host with the storage array. (Also, because VMware ESX/ESXi hosts register automatically, there is no need to install the Navisphere Agent.)

In this case, though, the automatic registration causes a problem. Why? This goes back to the second item I said I needed to discuss: storage groups. Specifically, storage groups have two characteristics that come into play here:

  1. First, any given host—not just VMware ESX/ESXi hosts, but all types of hosts—can only be connected to a single storage group at any given time.
  2. Second, while the CLARiiON can present Fibre Channel LUNs and iSCSI LUNs simultaneously (including presenting the same LUN via both protocols simultaneously), there is no way within a single storage group to specify which LUNs should be accessed via Fibre Channel and which LUNs should be accessed via iSCSI. That control is necessary because VMware won’t support accessing the same LUN via both protocols at the same time (see earlier VMware support statement).

Do you see how all the pieces come together? The only way to control which LUNs should be presented via which protocol is to use multiple storage groups, but a host can only be in a single storage group at a time. With only a single host object for any given VMware ESX/ESXi host, that host can only see either Fibre Channel LUNs (by being in a storage group containing Fibre Channel LUNs) or iSCSI LUNs (by being in a storage group containing iSCSI LUNs), but not both. Hence the statement in the CLARiiON document I referenced at the very beginning of this post, which allows either Fibre Channel or iSCSI but not both. This behavior is required to enforce the single-protocol LUN access required by VMware.

As with all things, there is a workaround. Because it is a workaround, an RPQ is necessary to get full support.

To work around this problem, you’ll need to ignore the automatic host registration (or disable the automatic host registration) and instead create two manually registered “pseudo-hosts”: one with the Fibre Channel initiators and one with the iSCSI initiators. These “pseudo-hosts” will need fake IP addresses (if they both use the same IP address, Navisphere will treat them as the same host, thus defeating the purpose of the workaround). Put the Fibre Channel “pseudo-host” into the Fibre Channel storage group, and put the iSCSI “pseudo-host” into the iSCSI storage group. Each “pseudo-host” will be able to see the LUNs presented to its own storage group, and therefore the physical host will see both Fibre Channel and iSCSI LUNs at the same time. And, as required by VMware, any given LUN would be accessed only via Fibre Channel or iSCSI but not both. Remember that you need to file an RPQ in order to get support on this configuration.
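To make the storage group logic a bit more concrete, here is a toy model in Python. It is only a sketch of the reasoning above, not anything that talks to Navisphere; the class names, host names, and IP addresses are all made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class StorageGroup:
    name: str
    protocol: str                      # "FC" or "iSCSI"; a label used only by this model
    luns: Set[str] = field(default_factory=set)


@dataclass
class RegisteredHost:
    name: str
    ip: str                            # placeholder address for a pseudo-host
    storage_group: Optional[StorageGroup] = None   # a host is in at most one group


def visible_luns(pseudo_hosts: List[RegisteredHost]) -> Dict[str, str]:
    """Map each LUN the physical host would see to the protocol it arrives on."""
    ips = [h.ip for h in pseudo_hosts]
    if len(ips) != len(set(ips)):
        # Mirrors the point above: identical IPs collapse into one host object.
        raise ValueError("pseudo-hosts sharing an IP would be treated as one host")
    seen: Dict[str, str] = {}
    for host in pseudo_hosts:
        group = host.storage_group
        if group is None:
            continue
        for lun in group.luns:
            if lun in seen and seen[lun] != group.protocol:
                # The VMware restriction: one LUN, one protocol.
                raise ValueError(f"{lun} would be presented via two protocols")
            seen[lun] = group.protocol
    return seen


# Hypothetical example: one physical ESX host registered as two pseudo-hosts.
fc_group = StorageGroup("esx01-fc", "FC", {"LUN A"})
iscsi_group = StorageGroup("esx01-iscsi", "iSCSI", {"LUN B"})
hosts = [
    RegisteredHost("esx01-fc", "192.0.2.11", fc_group),
    RegisteredHost("esx01-iscsi", "192.0.2.12", iscsi_group),
]
print(visible_luns(hosts))
```

With a single registered host object there is only one `storage_group` slot, which is exactly why the second pseudo-host is needed.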

For VMware ESX/ESXi 4.0 hosts (and ESX 3i version 3.5 hosts), you can disable automatic registration using the Disk.EnableNaviReg advanced configuration option. Setting this value to 0 disables the automatic registration with Navisphere. (Here are screenshots for VMware ESX 3i and VMware ESX/ESXi 4.) If you disable the automatic registration, then you only need to manually register the Fibre Channel and iSCSI initiators as separate “pseudo-hosts” and you’re ready to go.

Let me reiterate that if you are presenting iSCSI LUNs via the Celerra and not the CLARiiON, none of this applies. Presenting Fibre Channel LUNs via the CLARiiON and iSCSI LUNs via the Celerra to the same VMware ESX/ESXi host is fine. This workaround that I’ve described only applies when you want to present some LUNs via Fibre Channel and some LUNs via iSCSI from a CLARiiON to a single VMware ESX/ESXi host.

Earlier you’ll recall that I asked this question: is this really a limitation? There are a couple of viewpoints:

  • One viewpoint states there is no need for both Fibre Channel and iSCSI connectivity to the same array. Since you already have Fibre Channel connectivity to the array, what’s the point in using iSCSI? Conversely, if you already have iSCSI connectivity to an array, why invest in establishing Fibre Channel connectivity? Since you can’t use it for failover (that would violate the VMware support position), running another block protocol against the same array and same sets of disks doesn’t add a great deal of value.
  • A second viewpoint argues that the ability to provide a differentiation of service based on the different performance characteristics of Fibre Channel and iSCSI (and NFS, but we’re focusing on block protocols for this discussion) is valuable, and thus the need to be able to easily present LUNs via either protocol from the same array to the same host is a worthwhile function. There are a number of potential use cases here—test/development environments, Tier 2 applications, varying SLAs, etc. This is especially true if you are using different disk pools (fast Fibre Channel drives or EFDs vs. slower SATA drives) on the same array.

I can see both sides of the coin. Personally, I tend to side more with the second viewpoint and would prefer to see the CLARiiON have the ability to easily present Fibre Channel and iSCSI to the same host, especially when multiple disk pools are involved. I think that CLARiiON engineering is now evaluating this possibility; as more information emerges, I’ll be sure to keep you posted.

Courteous and professional comments, clarifications, or corrections are always welcome!

Categories: Scott Lowe

Links for 2010-03-04 [del.icio.us]

Scott Lowe's Blog - Fri, 03/05/2010 - 09:00
Categories: Scott Lowe

PXE Booting VMware ESX 4.0

Scott Lowe's Blog - Wed, 03/03/2010 - 00:20

I recently had the opportunity to work on a proof of concept (PoC) in which we wanted to help a customer streamline the processes needed to deploy new hosts and reduce the amount of time it took overall. One of the tools we used in the PoC for this purpose was PXE booting VMware ESX for an automated installation. Here are the details on how we made this work.

Before I get into the details, I’ll provide this disclaimer: there are probably easier ways of making this work. I specifically didn’t use UDA or similar because I wanted to gain the experience of how to do this the “old fashioned” way. I also wanted to be able to walk the customer through the “old fashioned” way and explain all the various components.

With that in mind, here are the components you’ll need to make this work:

  1. You’ll need a DHCP server to pass down the PXE boot information. In this particular instance, I used an existing Windows-based DHCP server. Any DHCP server should work; feel free to use the Linux ISC DHCP server if you prefer.
  2. You’ll need an FTP server to host the kickstart script and VMware ESX 4.0 Update 1 installation files. In this case, I used a third-party FTP server running on the same Windows-based server as DHCP. Again, feel free to use a Linux-based FTP server if you prefer.
  3. You will need a TFTP server to provide the boot files. The third-party FTP server used in the previous step also provided TFTP functionality. Use whatever TFTP server you prefer.

Make sure that each of these components is working as expected before proceeding. Otherwise, you’ll spend time troubleshooting problems that aren’t immediately apparent.

Preparing for the Automated ESX Installation

First, copy the contents of the VMware ESX 4.0 Update 1 DVD (not the actual ISO, but the contents of the ISO) to a directory on the FTP server. Test it to make sure that the files can be accessed via an anonymous FTP user.
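If you’d rather script the anonymous FTP check than test it by hand, something along these lines should do it. This is a minimal sketch using Python’s standard ftplib; the function name, the host address, and the directory name in the comment are placeholders of mine, not anything from the PoC.

```python
from ftplib import FTP


def anonymous_ftp_listing(host, directory, ftp_factory=FTP):
    """Log in anonymously and return the names found in `directory`.

    `ftp_factory` exists only so the function can be exercised without a
    real server; by default it is ftplib.FTP.
    """
    ftp = ftp_factory(host, timeout=10)
    try:
        ftp.login()                 # no arguments means an anonymous login
        return ftp.nlst(directory)
    finally:
        ftp.close()


# Hypothetical usage (address and directory are placeholders):
#   for name in anonymous_ftp_listing("192.0.2.10", "esx40u1"):
#       print(name)
```

If the login or the listing fails, ftplib raises an exception, which is the signal to fix the FTP server before moving on.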

Also go ahead and create a simple kickstart script that automates the installation of VMware ESX. I won’t bother to go into detail on this step here; it’s been quite adequately documented elsewhere. You’ll need to put this kickstart script on the FTP server as well.

At this point, you’re ready to proceed with gathering the PXE boot files.

Gathering the PXE Boot Files

The first task you’ll need to complete is gathering the necessary files for a PXE boot environment.

First, copy the vmlinuz and initrd.img files from the VMware ESX 4.0 Update 1 ISO image. Since I use a Mac, for me this was a simple case of mounting the ISO image and copying out the files I needed. Linux or Windows users, it might be a bit more complicated for you. These files, by the way, are in the ISOLINUX folder on the DVD image.

Next, you’ll need the PXE boot files. Specifically, you’ll need the menu.c32 and pxelinux.0 files. These files are not on the DVD ISO image; you’ll have to download Syslinux from this web site. Once you download Syslinux, extract the files into a temporary directory. You’ll find menu.c32 in the com32/menu folder; you’ll find pxelinux.0 in the core folder. Copy both of these files, along with vmlinuz and initrd.img, into the root directory of the TFTP server. (If you don’t know the root directory of the TFTP server, double-check its configuration.)
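The file gathering described above can be sketched as a short Python helper. The folder layout (vmlinuz and initrd.img in the ISO’s ISOLINUX folder, menu.c32 under com32/menu, pxelinux.0 under core) comes from the steps above; the function name, the paths in the usage comment, and the lowercase `isolinux` default are my assumptions, so adjust them to match how the ISO shows up on your system.

```python
import shutil
from pathlib import Path


def stage_pxe_files(iso_mount, syslinux_dir, tftp_root, iso_boot_dir="isolinux"):
    """Copy the four PXE boot files into the TFTP root; return the new paths."""
    iso_mount = Path(iso_mount)
    syslinux_dir = Path(syslinux_dir)
    tftp_root = Path(tftp_root)
    sources = [
        iso_mount / iso_boot_dir / "vmlinuz",
        iso_mount / iso_boot_dir / "initrd.img",
        syslinux_dir / "com32" / "menu" / "menu.c32",
        syslinux_dir / "core" / "pxelinux.0",
    ]
    # Fail early with a clear message rather than copying a partial set.
    missing = [str(p) for p in sources if not p.is_file()]
    if missing:
        raise FileNotFoundError("missing PXE boot files: " + ", ".join(missing))
    tftp_root.mkdir(parents=True, exist_ok=True)
    return [Path(shutil.copy2(src, tftp_root / src.name)) for src in sources]


# Hypothetical usage (all three paths are placeholders):
#   stage_pxe_files("/Volumes/ESX", "/tmp/syslinux", "/srv/tftp")
```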

You’re now ready to configure the PXE boot process.

Configuring the PXE Boot Environment

Once the necessary files have been placed into the root directory of the TFTP server, you’re ready to configure the PXE boot environment. To do this, you’ll need to create a PXE configuration file on the TFTP server.

The file should be placed into a folder named pxelinux.cfg under the root of the TFTP server. The PXE configuration file should be named something like this:

01-<MAC address of network interface on host>

If the MAC address of the host was 01:02:03:04:05:06, the name of the text file in the pxelinux.cfg folder on the TFTP server would be:

01-01-02-03-04-05-06
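If you have a long list of MAC addresses to deal with, the naming rule is easy to script. Here is a minimal Python sketch; the function name is mine, and it assumes the lowercase, dash-separated form that PXELINUX looks for.

```python
import re


def pxelinux_config_name(mac: str) -> str:
    """Return the pxelinux.cfg filename for a MAC address."""
    digits = re.sub(r"[^0-9A-Fa-f]", "", mac)      # drop colons, dashes, dots
    if len(digits) != 12:
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    octets = [digits[i:i + 2].lower() for i in range(0, 12, 2)]
    # The leading "01" is the hardware type (Ethernet), not part of the MAC.
    return "01-" + "-".join(octets)


print(pxelinux_config_name("01:02:03:04:05:06"))   # 01-01-02-03-04-05-06
```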

The PoC in which I was engaged involved Cisco UCS, so we knew in advance what the MAC addresses were going to be (the MAC address is assigned in the UCS service profile).

The contents of this file should look something like this (lines have been wrapped here for readability and are marked by backslashes; don’t insert any line breaks in the actual file):

default menu.c32
menu title Custom PXE Boot Menu Title
timeout 30
 
label scripted
menu label Scripted installation
kernel vmlinuz
append initrd=initrd.img mem=512M ksdevice=vmnic0 \
  ks=ftp://A.B.C.D/ks.cfg
IPAPPEND 1

You’ll want to replace ftp://A.B.C.D/ks.cfg with the correct IP address and path for the kickstart script on the FTP server.
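Because the append line must not be wrapped in the real file, it can be handy to generate the file rather than type it. The sketch below simply reproduces the configuration shown above with the append line on a single line; the function name and the example address are placeholders of mine.

```python
def render_pxe_config(kickstart_url, ks_device="vmnic0",
                      menu_title="Custom PXE Boot Menu Title"):
    """Return the PXE configuration text with the append line unwrapped."""
    append = (
        f"append initrd=initrd.img mem=512M ksdevice={ks_device} "
        f"ks={kickstart_url}"
    )
    lines = [
        "default menu.c32",
        f"menu title {menu_title}",
        "timeout 30",
        "",
        "label scripted",
        "menu label Scripted installation",
        "kernel vmlinuz",
        append,                      # one physical line, no backslash
        "IPAPPEND 1",
    ]
    return "\n".join(lines) + "\n"


# The address is a placeholder; substitute your FTP server and path.
print(render_pxe_config("ftp://192.0.2.10/ks.cfg"))
```

Write the returned text to a file in the pxelinux.cfg folder, named after the MAC address of the host as described earlier.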

Only one step remains: configuring the DHCP server.

Configuring the DHCP Server for PXE Boot

As I mentioned earlier, I used the Windows DHCP server as a matter of ease and convenience; feel free to use whatever DHCP server best suits your needs. There are only two options that are necessary for PXE boot:

066 Boot Server Host Name (specify the IP address of the TFTP server)
067 Bootfile Name (specify pxelinux.0)

In this particular example, I created reservations for each MAC address. Because the values were the same for all reservations, I used server-wide DHCP options, but you could use reservation-specific DHCP options if you wanted different boot options on a per-MAC address (i.e., per-reservation) basis.
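I used the Windows DHCP server, but if you go the ISC DHCP route instead, a per-host reservation might look roughly like what this Python sketch prints. I did not test this in the PoC. The `next-server` and `filename` statements are the ones commonly used for PXE with ISC dhcpd and play a role similar to options 066 and 067; the host name and all addresses are placeholders.

```python
def dhcpd_host_stanza(name, mac, ip, tftp_server, bootfile="pxelinux.0"):
    """Render an ISC dhcpd host block that points one MAC at the TFTP server."""
    return (
        f"host {name} {{\n"
        f"  hardware ethernet {mac};\n"
        f"  fixed-address {ip};\n"
        f"  next-server {tftp_server};\n"      # TFTP server to boot from
        f"  filename \"{bootfile}\";\n"        # boot file to request
        f"}}\n"
    )


# All values below are placeholders.
print(dhcpd_host_stanza("esx01", "01:02:03:04:05:06", "192.0.2.21", "192.0.2.10"))
```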

The End Result

Recall that this PoC was using Cisco UCS blades. Thus, in this environment, to prepare for a new host coming online we only had to make sure that we had a PXE configuration file and create a matching DHCP reservation. The MAC address would get assigned via the service profile, and when the blade booted, it would automatically proceed with an unattended installation. Combined with Host Profiles in VMware vCenter, this took the process of bringing new ESX/ESXi hosts online down to mere minutes. A definite win for any customer!

Categories: Scott Lowe

Congrats to HyTrust

Scott Lowe's Blog - Fri, 02/26/2010 - 06:05

Congratulations are in order for virtualization security company HyTrust, which has had a flurry of activity within the last few days. I’ve met HyTrust’s CEO several times (he’s a great guy, by the way) and I’ve followed the company since their early days; personally, I think it’s great that they’re seeing some success.

The first big piece of news was that HyTrust hired Jim Gannon, a former VMware executive, to serve as the VP of Sales. The full press release for that announcement is here.

The second big piece of news comes in the form of this press release announcing that HyTrust has secured $10.5 million in Series B financing, including an investment from Cisco Systems.

The third and final piece of news, and the one that I personally find most exciting, is that HyTrust has been named one of ten finalists for the “Most Innovative Company at RSA Conference 2010” award. Congratulations on all three counts!

Categories: Scott Lowe

A Potential Use for the iPad

Scott Lowe's Blog - Tue, 02/23/2010 - 04:49

There’s a lot of hype surrounding the Apple iPad. Some people are proclaiming it’s the end of traditional media like newspapers, magazines, and books. I’m not so sure about that, but I have found one potential use for the iPad that—for me, at least—might be compelling enough to make me go buy one later this year.

One thing I’m finding as a member of EMC’s vSpecialist team is that there is a lot of reading. We’re responsible for reading all sorts of documents. I don’t mind doing this in the evenings, when I’m not writing for one of my upcoming books or studying for a certification exam, but I’d much prefer to do this in a way that makes it possible for me to be with my family. So, having some sort of device that would allow me to review documents while I’m sitting in the den with the kids would be great.

My thought is that I could leverage something like Dropbox to synchronize documents between my MacBook Pro and an iPad. With the documents easily accessible on (or from) the iPad, I could sit on the couch and read or review documents while the kids sit next to me and watch TV or read a book. This would help me stay on top of the document reviewing without pulling me into my office and away from the family.

What do you think? Good idea, or not? Anyone else have any uses for the iPad that you’d like to describe? Speak up in the comments.

Categories: Scott Lowe

On the VCDX Defense

Scott Lowe's Blog - Sat, 02/20/2010 - 04:06

I’ve been thinking about how to write this post since last Friday afternoon after I completed my VCDX defense panel. Even now, a week later, I’m not sure that I have the right words to use.

After many long hours of preparation on the application and the submitted design, after days and weeks of waiting, after hours spent reviewing the design, and after reading numerous tips and tricks from established VCDXs, it all culminated in the defense panel. There, in the defense panel, I would have to stand before three knowledgeable, established design experts and defend the choices made in the design. I’d have to explain why I chose block storage over NFS, why Fibre Channel over iSCSI, why blade servers vs. rack-mount servers. I’d have to explain why I chose the LUN size I chose, and I’d have to defend the zoning that controlled the presentation of those LUNs. Clusters, cluster sizes, features enabled or not enabled, networking layout, VM density on the LUNs, projected IOPS—nothing was safe from their inquisition. And yes, I’d have to explain why the design used NetApp storage instead of EMC storage. (It was a customer requirement, i.e., a design constraint.)

Surprisingly, when the time for the VCDX defense panel arrived I found myself a lot more nervous than I had expected I would be. After all, this was just a friendly conversation with technical peers, right? In many ways it could be viewed that way, but the underlying purpose behind the conversation was ever-present: there was a reason I was there standing before these three people. It wasn’t just “shooting the breeze” with friends; there was a purpose there. It wasn’t just bouncing ideas off co-workers or industry colleagues; there was a reason for the conversation. It’s not that the panelists did anything to cause this feeling; they were completely fine, very courteous and quite friendly. (In case you’re wondering, I’m not going to disclose who was on my panel. They are welcome to disclose if they so choose, but that will be their decision.)

Looking back on it now, I realize that I should have gotten better control of my nerves. I spoke too quickly. I rushed through questions that probably deserved more explanation. I forgot details about my design. I got tripped up by relatively simple questions. I’ve made no secret of the fact that I wasn’t pleased by how well I performed—or didn’t perform—in the VCDX defense panel. I was upset that I had been thrown off and that I wasn’t able to recall all the details from my design. For a few hours after completing the defense panel, I beat myself up over how things had gone. But it didn’t take me too long to make peace with not having passed. I knew that if I had not passed, the experience was still worthwhile as a learning experience. Even if I hadn’t gained the VCDX certification, I’d still gained knowledge and experience. And hey, there was always another chance to defend at VMworld, right?

After returning home from Las Vegas, I spent the week thinking about what I would write after I’d finally gotten my results. I tried to prepare for the questions like “How in the world could Scott not pass?” I thought about explaining that the defense panel was only doing their job; they were preserving the value of the certification. After all, if the bar is not held high, what is the value of VCDX? All the while I secretly hoped that the result would be something other than what I was confident it would be.

And so it was that as I was driving my kids to a church youth group function tonight—after a long and unproductive day working with some rather stubborn equipment in the lab—that I received an e-mail from VMware. The first line of the message was this:

Congratulations! You have achieved the VCDX3 certification. Your VCDX number is: VCDX39

Unbelievable! I’d passed! I was so excited. I’d hoped for this result, but I honestly did not believe that I had managed to pull it off. I immediately called Crystal to tell her the news. I think she might have been even more excited than me.

Having now been through this entire process, what advice do I have for aspiring VCDX candidates?

  • As many others have stated, know your design. If I had only one thing to change about my entire process, this would be it. You should know it forward and backward: every detail, every choice, and every reason behind the design.
  • Don’t be too nervous. I allowed my nerves to get the best of me, there’s no question. I also don’t doubt that I would have done better had I not been so nervous. (As a side note, it’s interesting to me that I can stand up and speak in front of large crowds and not be nervous, but standing in front of those panelists really threw me. Odd.)
  • Understand the impact of your choices. As Duncan pointed out in this recent blog post, it’s really about the impact. Be prepared to discuss the reasons for the decisions in your design and the impact of the decisions in your design.

That’s it from the latest VCDX to join the ranks. I’ll post another update later with more tips and tricks that I learned from the experience, but those are some that jump to my mind immediately.

Have a great weekend!

This article was originally posted on blog.scottlowe.org. Visit the site for more information on virtualization, servers, storage, and other enterprise technologies.

On the VCDX Defense

Categories: Scott Lowe

Links for 2010-02-17 [del.icio.us]

Scott Lowe's Blog - Thu, 02/18/2010 - 09:00
Categories: Scott Lowe

A Collection of UCS Posts

Scott Lowe's Blog - Thu, 02/18/2010 - 08:03

There have been quite a few good Cisco UCS posts published recently; I thought it might be handy to collect a list of some of them (I’m sure that I will miss some). Here are a few that I’ve seen over the past few weeks (in no particular order):

Swapping UCS Blades with Local Boot Policies
Get Spidey Powers with UCS; but with Great Power comes Great Responsibility
25 ways that Cisco UCS frees you to do other things
Cisco UCS - How Many FEX Uplinks Do I Need?
UCS local disk policy + some vBlock
Cisco UCS: different workload, different configuration, same blade. Simple.
Cisco UCS Information for “Server People”
Cisco UCS vs. IBM and HP - Where are the Brains?
UCS Gotchas? and how much time does it take day to day?

Anyone else have any UCS posts that have surfaced recently? Add them in the comments below.

Categories: Scott Lowe

Virtualization Short Take #35

Scott Lowe's Blog - Thu, 02/18/2010 - 07:38

It’s that time again: time for another Virtualization Short Take! Here’s a collection of links, articles, posts, and other tidbits that I’ve found interesting, informative, or useful over the last few weeks. I hope that you find something useful as well!

  • Tom Howarth has been spending some time with Microsoft’s App-V application virtualization solution; he’s written a three-part series (Part 1, Part 2, and Part 3). Part 1 discusses domain and certificate setup, Part 2 centers around policies and GPO settings, and Part 3 covers the client-side setup for App-V. While Tom’s overview is extremely helpful, I don’t recall seeing any thoughts on App-V as a product. Tom, did you like it? Not like it? What was good or bad about it? It would be great to have a post that brings this sort of information together.
  • Interested in getting a better feel for the communications that occur between an ESX/ESXi host and vCenter Server? This post discusses decoding SSL traffic with Wireshark so that you can see what’s happening.
  • Jeremy Waldrop of Varrow has a good “getting started” post on using vCenter Server’s storage alarms. If you’re looking for an introductory piece, this is a good place to start.
  • If you’re using Hyper-V and have VMs that are generating lots of network traffic, this post from the Windows Server Performance Team discussing increasing the VMBus buffer size is probably worth a look for you.
  • And while I’m mentioning Hyper-V, Ben Armstrong aka Virtual PC Guy discusses an RDP ActiveX control that provides RDP connectivity to a VM (not RDP connectivity to the guest OS, which is distinct and separate). I’ve never been a huge fan of ActiveX controls, but this could be useful in certain environments.
  • Is defragmentation of VMs a good thing? Scott Drummonds asks the same question in this blog post. My only comment: avoid defragmentation with thin provisioned disks (array-level or hypervisor-level thin provisioning).
  • Of course, Scott Drummonds also had a flurry of very useful posts over the last few weeks: missing Perfmon counters, inaccuracy of guest performance counters, and Las Vegas taxi rates. (The Las Vegas taxi post actually helped me save some money when headed to the airport after PEX. Your mileage may vary—pun intended.)
  • Eric Sloof’s home-grown tests of running linked clones on an SSD aren’t definitive, but they definitely back up the value that has been seen with the deployment of EFDs (Enterprise Flash Drives) in virtualized environments.
  • This PowerShell script will show you the logged-in user for a given VMware View desktop. Handy!
  • Readers seeking more information on guest OS alignment should read this article by Jeff Muir. While the focus of the article is on VHD and NTFS alignment, the underlying principles are also applicable to VMDK files in VMware environments.
  • Frank Denneman, VCDX 29, has had a few good posts recently. He had a post that discusses the use of local storage for VM swap; this post was then parlayed into a greater discussion on understanding the impact of design decisions. It’s a pretty fitting discussion given the timing around all the VCDX defense panels at Partner Exchange and Frank’s own elevation to the VCDX priesthood. Frank’s article on VM sizing and NUMA was also a great read. Keep up the good work, Frank! (And I’m still waiting to see all the info about memory reservations you promised me…)
  • Jason Boche recently highlighted his adventures in using Round Robin multipathing with his EMC Celerra. One key takeaway is that he had to reboot the ESX/ESXi host after changing the SATP, so keep that in mind. There is also a very specific CLARiiON configuration that needs to be set: the Failover Mode needs to be set to 4.
  • Jonathan Medd provides some great information for users who might be new to vCenter Update Manager in this article.
  • If you are planning on virtualizing any SQL Server systems, be sure to check out this list of best practices for SQL Server, written by Scott Drummonds. The document is a bit old (December 2008), but the recommendations are still valid.
  • It appears that VMware has updated this KB article recommending the use of the LSI Logic vSCSI controller for low I/O environments. I’m glad to see VMware has added more information and clarification; the previous version of the article was a bit spartan, to say the least.
  • I think that Figure 1 on this page on Cisco solutions for VMware View environments would give even Hany Michael a run for the money! While Figure 1 is pretty complex, the information in the article is useful and helps underscore some of the many different ways Cisco products can be put to use in a VMware View environment.
  • Here’s a useful document on integrating Cisco UCS with VMware DPM.
  • This weekly summary of new KB articles is quite useful. OK, I know this isn’t new and many people probably already knew about it but it’s still useful. So get off my case, OK?

There’s more that I could include, but I should probably wrap this up. Here are a few other links worth mentioning:

The Backup Blog: Avamar and VMware Backup Revisited
VMware KB: ESX 4.0 and ESXi 4.0 shutdown and reboot commands
VMware KB: Masking a LUN from ESX and ESXi 4.0 using the MASK_PATH plug-in
Rethinking vNetwork Security
Announcing NVSPBind

That’s it for this time around. Thanks for reading and feel free to submit any interesting links you’ve found in the comments!

Categories: Scott Lowe

Links for 2010-02-15 [del.icio.us]

Scott Lowe's Blog - Tue, 02/16/2010 - 09:00
Categories: Scott Lowe

VMware vSphere Pro Video Training

Scott Lowe's Blog - Fri, 02/12/2010 - 04:18

In case you haven’t already heard, David Davis and the good folks over at Train Signal recently released an advanced VMware vSphere training course called “VMware vSphere Pro Series Training, Vol 1”. You can get more information about this new course from Train Signal’s web site.

The new video course features not only David Davis, but also well-known virtualization figures Hal Rottenberg and Rick Scherer. David Davis takes viewers of the training course through a section on VMware View, VMware’s product for virtual desktops, and ThinApp, VMware’s application virtualization solution. Hal provides coverage of PowerCLI (is anyone surprised?), and Rick discusses the Cisco Nexus 1000V. All in all, the new video course is almost 11 hours in length.

Train Signal also includes multiple digital formats to make it easier for busy administrators to view or listen to the content.

I do have to say that I haven’t yet had the opportunity to actually view any of these videos. However, I do know both David Davis and Rick Scherer personally (sorry Hal, I haven’t met you personally yet). I’m confident that this is a good quality product. If you’re a VMware vSphere administrator looking to expand his or her knowledge of VMware View, ThinApp, PowerCLI, and/or the Nexus 1000V, this new training course is an excellent place to start.

Disclaimer: Train Signal is a paid sponsor of this site.

Categories: Scott Lowe

Partner Exchange 2010 Session TECHDV0721

Scott Lowe's Blog - Fri, 02/12/2010 - 00:10

This is a two-hour session on VMware View security architecture and security benefits titled “VMware View Security Benefits, Architecture, and Best Practices”.

So what is VMware’s security strategy? First, start with core platform security. This encompasses all the various features and functions of the hypervisor like memory protection and isolation, kernel module protections, hypervisor attack surface, etc. Next, continue with operational security. This is about integrating VMware’s products into your organization’s existing operational security policies and includes things like the vSphere Security Hardening Guide that was recently released. Using security virtual appliances is another step that enables broad-based security for all VMs in the environment. Finally, VMware is striving for a “better than physical” model where virtual security is better than physical security. Consider VMsafe as an effort in this area.

The presenter next reviewed the VMware View infrastructure and all the various components that are included in it. Each of these components needs to be reviewed with an eye on security. For example, componentizing the different parts of a View infrastructure (separating access points, user data, applications, data, and operating system) helps to secure each of these different pieces.

A further benefit of this separation is that it allows for the creation of a true “gold master” for VMs. Products like ThinApp and VMware View Composer simplify this process and help maintain a true “gold master” image. This means that all the various security guidelines can be more easily incorporated into this master image, the master image can be patched more easily, configuration drift is reduced, and you can recover more easily and more quickly after an attack.

Using virtual desktops also allows organizations to more easily create “desktop security zones” that help isolate higher-risk PCs from lower-risk PCs, thus containing potential security risk to a limited subset of all desktops. This might also help with meeting compliance requirements (the presenter specifically mentioned PCI).

Thin clients are helpful in reducing complexity at the edge, which can (in some cases) help reduce the attack surface and limit the amount of work that IT organizations need to do to help secure the endpoints.

What about backing up data? Using View to centralize desktops allows organizations to more easily implement full data backups for the various types of data that are being created within the virtual desktop environment.

The presenter next moves on to vSphere security. Because VMware View depends upon vCenter and ESX/ESXi, the security of View is dependent upon the security of vCenter and ESX/ESXi. This led into a discussion of the benefits of virtualization vs. the security impact of virtualization. The topics covered here include all the usual suspects: greater impact of misconfiguration or attack; loss of visibility in the network access layer; loss of separation between network admins and server admins; potential VM sprawl without consistent configurations and properly defined procedures; possible security problems resulting from VM mobility; and unauthorized access to VMs because of VM encapsulation (users copying a VM by copying the VM’s files).

So how does one protect the virtual infrastructure? You use existing techniques such as hardening and lockdown; defense in depth; and authorization, authentication, and accounting.

The same goes for protecting virtual machines. Use anti-virus, IDP/IDS systems, firewalls, etc. VMsafe and the functionality enabled by VMsafe will be very helpful here.

Be sure to isolate the management interfaces using physically separate management networks or by using VLANs. You should also control access to the management network using ACLs, jump boxes, VPNs, or other access controls. Only authorized individuals should have access to the management network and “ordinary end-users” should absolutely not have access.

The separation of duties is also important. Use vCenter Server’s built-in roles to enable the principle of least privilege to help enforce separation of duties. Third-party products like HyTrust might also be helpful.

The presenter argues that moving to a vNetwork Distributed Switch is a security benefit. One big plus is the mitigation of the risk associated with misconfiguration. In addition, there is support for private VLANs (PVLANs), inbound traffic shaping, Network VMotion, and (with the Nexus 1000V) ACLs and a natural separation of duties.

At this point the presenter moves on to a discussion of secure access to virtual desktops.

Authentication is one key area; View supports AD authentication as well as RSA SecurID. View Manager does not store any of the authentication information; this is all offloaded to Active Directory or the RSA Authentication Manager. Smart Card authentication is an alternative to standard username and password authentication. The certificate on the Smart Card contains a Subject Alternative Name (SAN); the SAN is matched against the User Principal Name (UPN) in Active Directory. Smart Card authentication is not supported with PCoIP.
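To make the Smart Card matching step a little more concrete, here is a minimal sketch of comparing the UPN carried in a certificate’s SAN against the UPNs of known accounts. The `directory` mapping is a hypothetical stand-in for Active Directory, and the case-insensitive comparison is an assumption of this sketch; it is not taken from the session or from View’s actual implementation.

```python
from typing import Dict, Optional


def upn_matches(san_upn: str, ad_upn: str) -> bool:
    """Return True when the certificate's UPN equals the directory UPN.

    Ignoring case and surrounding whitespace is an assumption made
    for this sketch.
    """
    return san_upn.strip().lower() == ad_upn.strip().lower()


def find_account(san_upn: str, directory: Dict[str, str]) -> Optional[str]:
    """Return the account whose UPN matches the SAN entry, or None.

    `directory` maps account names to UPNs and merely stands in for
    a real Active Directory lookup.
    """
    for account, upn in directory.items():
        if upn_matches(san_upn, upn):
            return account
    return None


# Hypothetical accounts used only for illustration.
directory = {"jsmith": "jsmith@example.com", "adoe": "adoe@example.com"}
print(find_account("JSmith@Example.com", directory))
```

A certificate whose SAN does not correspond to any account simply yields `None`, which is where a real system would reject the logon.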

View does support a form of single sign-on so that users log on to the View Client and are authenticated all the way down to the virtual desktop.

Future support with regard to authentication will include Kerberos realm authentication; UPN authentication; RADIUS support in the View Connection Server; and improved SSO to virtual desktops.

Moving on to access options, PCoIP requires direct access to the virtual desktop; it won’t work with SSL tunneling. Fortunately, PCoIP is already encrypted (wirespeed encryption using AES 128-bit encryption). For non-PCoIP connections, HTTPS tunneling of RDP is supported by VMware View. This can greatly simplify firewall configuration (only TCP port 443 is required). Secure tunneling also has the benefit of helping to maintain sessions in the event of a dropped connection.

Some advantages of PCoIP are the built-in encryption and support for blocking USB Plug events (to control USB device usage).

The View Security Server enables you to create a DMZ infrastructure that prevents end points from having direct access to virtual desktops or the Connection Server. The use of load balancers is supported with both Security Servers and Connection Servers.

VMware does recommend replacing the self-signed certificates that are supplied with VMware View with valid SSL certificates. Note that the specific SSLv3/TLSv1 ciphers that are used with secure connections can be configured to enable or disable specific ciphers.

The use of a VPN can also help provide a single point of entry and simplify the firewall configuration.

The next topic is VMware View’s entitlements model. View uses Microsoft ADAM on Windows Server 2003 or Microsoft AD LDS on Windows Server 2008. Back-end Active Directory is still leveraged for authentication. View uses the idea of foreign security principals (FSPs), which means that Active Directory doesn’t have to be synchronized with the local LDAP instance. In addition, user authorizations and entitlements don’t have to be stored in Active Directory (which would require schema extensions).

At this point the presenter moves into a discussion of View security best practices:

  • Harden the base OS within the virtual desktops and enforce refresh intervals and OS patching.
  • Choose the proper authentication model and use a Security Server or VPN for secure remote access.
  • Be sure to understand the firewall requirements and configure the firewall accordingly.
  • Be sure to harden the Connection Server and the underlying Windows Server OS upon which it is installed.
  • Replace the default self-signed certificates.
  • Set appropriate entitlements within the Connection Server. Zone users according to use case and risk.
  • Avoid direct remote access to virtual desktops where possible. Don’t allow users to connect without going through the Connection Server.
  • Control USB access, redirection of clipboard, printers, and drives.
  • Leverage Active Directory Group Policy to help with virtual desktop OS lockdown and some View-specific settings. (You might need to use Loopback Policy Processing in this instance.)
  • Know the different ports and the directions that are required when configuring firewalls. Refer to the View Architecture Planning Guide for full details.
  • Install anti-virus, but use a minimal installation to reduce bloat.
  • Use a staggered or randomized scanning policy to avoid overwhelming the infrastructure. Use policies or corporate configuration tools to enforce staggered scanning and signature updates and to configure exclusion lists (only need to scan the user data disk; the base OS is locked down through the use of linked clones).
  • Consider a VMsafe Ready AV product.
  • Include Network Access Control (NAC) management agent in the parent VM prior to cloning.
  • Use ThinApp to gain some security benefits (prevents the OS from getting infected through the actions of a ThinApped application). Consider using ThinApp for browsers.
  • Specific to ThinApp and anti-virus, don’t install AV on the Capture/Build system if at all possible. If AV is installed, exclude the ThinApp project directory from on-demand scanning.
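The staggered scanning recommendation in the list above can be sketched in a few lines. This is only one possible approach, not something shown in the session: each desktop name is hashed to a stable offset inside a scan window, so desktops spread out across the window without any central coordination. The desktop names and the 240-minute window are hypothetical.

```python
import hashlib


def scan_offset_minutes(desktop_name: str, window_minutes: int) -> int:
    """Map a desktop name to a stable start offset inside the scan window."""
    if window_minutes <= 0:
        raise ValueError("window_minutes must be positive")
    digest = hashlib.sha256(desktop_name.encode("utf-8")).digest()
    # The first four bytes give plenty of spread for a window of a few hours.
    return int.from_bytes(digest[:4], "big") % window_minutes


# Hypothetical desktop names; a real pool would come from your inventory.
desktops = ["view-desktop-{:03d}".format(n) for n in range(1, 7)]
for name in desktops:
    print(name, scan_offset_minutes(name, window_minutes=240))
```

Because the offset depends only on the name, a desktop keeps the same slot from day to day, which makes the resulting load easier to reason about than a fresh random delay each night.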

The next topic of the session was a discussion of using VMware vShield Zones. vShield Zones provide virtual firewalls that operate as transparent Layer 2 bridges and allow you to create different security zones. This can provide some technological enforcement of zones for different user environments (different pools for web browsing vs. internal CRM access and these pools cannot communicate with each other because of vShield Zones).

The presenter wrapped up the session with an overview of VMsafe and how VMsafe can help contribute to the security of a VMware View environment. VMsafe enables greater protection of VMs through APIs that allow deepened inspection of CPU/memory, networking, and storage. For example, VMsafe allows knowledge of specific CPU state or inspection of specific memory pages. VMsafe allows networking traffic to be inspected, intercepted, modified, or even replicated (consider vShield Zones integrated with the VMsafe APIs). With regard to storage, VMsafe allows the ability to mount VMDKs, inspect storage I/Os, and do so transparently and inline to the storage stack.

The session wrapped up with a list of VMsafe-integrated solutions from companies like Altor Networks, Trend Micro, McAfee, and Check Point.

Categories: Scott Lowe

Partner Exchange 2010 Session TECHMGT0921

Scott Lowe's Blog - Thu, 02/11/2010 - 18:28

This is a session on vCenter Chargeback deployment, configuration, and best practices. The presenter is Naeem Malik from VMware. Naeem works in VMware’s PS organization and specializes in Chargeback, CapacityIQ, and Capacity Planner assessments.

There are three things to keep in mind when you are thinking about a vCenter Chargeback implementation: hierarchy, cost models, and cost templates. Malik will discuss those in more detail later.

So why chargeback? Chargeback is necessary to handle the new model of shared resources instead of dedicated physical servers. There is no longer a one application-to-one server model; now many applications run on the same server. Why is this new model necessary? Malik quotes Gartner that “the speed and flexibility of virtualization makes some form of chargeback mandatory”. Otherwise, organizations run the risk of VM sprawl. After all, VMs are not free—they require CPU, memory, storage, and network capacity.

vCenter Chargeback is a resource accounting tool that helps users and organizations understand that VMs are not free. It features support for fixed, allocation, and utilization-based costing; provides the ability to charge different amounts for different tiers of infrastructure; and can schedule reports and e-mail results.

In a vCenter Chargeback implementation, Malik believes that a large part of a consultant’s time is taken up helping organizations define the resource costing (if the organization has not already established those costs).

From an architectural perspective, vCenter Chargeback uses a separate database but also pulls information from the vCenter Server database. vCenter Chargeback can run as a VM and integrates with an organization’s existing e-mail systems (via SMTP) and existing Active Directory/LDAP infrastructures. Both SQL Server and Oracle are supported for Chargeback. The Chargeback server will interact with vCenter Server to pull performance and utilization information.

A single Chargeback server supports up to 5 vCenter Server instances and up to 5,000 VMs/entities. An embedded data collector is found on the Chargeback server itself. When an implementation goes beyond 5 vCenter Server instances, an additional data collector is necessary. The data collectors can be easily deployed from the Chargeback server itself. Extremely large implementations (up to 75 vCenter Server instances and up to 20,000 VMs/entities) require multiple Chargeback servers behind a load balancer with multiple data collectors.

Chargeback uses HTTP over TCP port 8080, HTTPS over TCP port 443, load balancing configuration over TCP port 8009, LDAP over TCP port 389, and SMTP over TCP port 25 (the slide had a typo and listed port 24).
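If you want to sanity-check firewall rules against the ports listed above, a small script can help. This is a rough sketch: the port numbers come from the session, but the dictionary keys, the function name, and the host names in the usage note are my own placeholders. It only tests whether a TCP connection can be opened, nothing more.

```python
import socket

# Port numbers as listed in the session; the key names are just labels.
CHARGEBACK_PORTS = {
    "http": 8080,
    "https": 443,
    "load balancing": 8009,
    "ldap": 389,
    "smtp": 25,
}


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

Each port belongs to a different system, so you would call it per target, for example `port_is_open("chargeback.example.com", CHARGEBACK_PORTS["https"])` for the web UI and `port_is_open("mail.example.com", CHARGEBACK_PORTS["smtp"])` for the mail relay (both host names are hypothetical).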

Earlier Malik had mentioned three things to keep in mind. The first of these is hierarchy. The hierarchy controls how reports are created. The second thing to keep in mind is the cost model. There is fixed costing (fixed cost for a VM instance), allocation-based costing (variable costs per VM based on allocated resources), and utilization costing (variable cost per VM based on actual resources utilized). Many customers are using a hybrid model that is somewhere between fixed costing and allocation-based costing. The third thing is cost templates. Cost templates combine cost accounting information with fixed costs.
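The difference between the three cost models is easier to see with numbers. The sketch below is purely illustrative: the rates, the field names, and the sample VM are all made up, and the formulas are a simplification of the models described above, not vCenter Chargeback’s actual calculations.

```python
from dataclasses import dataclass


@dataclass
class VmUsage:
    allocated_cpu_ghz: float
    allocated_ram_gb: float
    used_cpu_ghz: float
    used_ram_gb: float


# Hypothetical monthly rates, chosen only to make the arithmetic easy.
FIXED_COST_PER_VM = 100.0
RATE_PER_GHZ = 20.0
RATE_PER_GB_RAM = 10.0


def fixed_cost(vm: VmUsage) -> float:
    """Fixed costing: every VM instance costs the same."""
    return FIXED_COST_PER_VM


def allocation_cost(vm: VmUsage) -> float:
    """Allocation-based costing: charge for what the VM was given."""
    return (vm.allocated_cpu_ghz * RATE_PER_GHZ
            + vm.allocated_ram_gb * RATE_PER_GB_RAM)


def utilization_cost(vm: VmUsage) -> float:
    """Utilization costing: charge for what the VM actually used."""
    return vm.used_cpu_ghz * RATE_PER_GHZ + vm.used_ram_gb * RATE_PER_GB_RAM


vm = VmUsage(allocated_cpu_ghz=4.0, allocated_ram_gb=8.0,
             used_cpu_ghz=1.5, used_ram_gb=3.0)
for model in (fixed_cost, allocation_cost, utilization_cost):
    print(model.__name__, model(vm))
```

With these sample numbers the same VM costs 100, 160, or 60 depending on the model, which is a good illustration of why the choice of cost model needs stakeholder agreement up front.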

Cost accounting works with the cost model to determine how resources are actually priced. If a customer hasn’t already determined costs for their resources (CPU, RAM, storage, networking), VMware has a tool that can help determine this number. This can be difficult and requires “buy in” from all applicable stakeholders within the environment. Items that require costs assigned to them include CPU, memory, disk, disk I/O, and network I/O.

Fixed costs (not the same as the fixed cost model) include stuff like power/cooling, software licenses, real estate in the data center, labor, etc.

Cost templates combine the cost accounting for the various resources with the fixed costs and allow you to apply a cost multiple to the metering element (GHz of CPU cycles used, GB of RAM used, etc.). Multiple cost templates can be created to help allow for flexible costing of VMs.
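As a rough illustration of how a cost template might combine these pieces, the sketch below applies a multiplier to metered resource charges and then adds fixed costs. The class, field names, rates, and the “gold” tier are hypothetical; this is my reading of the description above, not the product’s actual data model.

```python
from dataclasses import dataclass


@dataclass
class CostTemplate:
    name: str
    rate_per_ghz: float      # base rate per GHz of CPU used
    rate_per_gb_ram: float   # base rate per GB of RAM used
    multiplier: float        # cost multiple applied to metered charges
    fixed_costs: float       # power/cooling, licenses, labor, and so on

    def charge(self, cpu_ghz_used: float, ram_gb_used: float) -> float:
        """Metered charges times the multiplier, plus fixed costs."""
        metered = (cpu_ghz_used * self.rate_per_ghz
                   + ram_gb_used * self.rate_per_gb_ram)
        return metered * self.multiplier + self.fixed_costs


# Hypothetical tier: same base rates as a standard tier, 1.5x multiplier.
gold = CostTemplate("gold", rate_per_ghz=20.0, rate_per_gb_ram=10.0,
                    multiplier=1.5, fixed_costs=50.0)
print(gold.charge(cpu_ghz_used=2.0, ram_gb_used=4.0))
```

Defining several templates with different multipliers is one way to express tiered infrastructure pricing without redefining the base rates each time.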

vCenter Chargeback also has extensive reporting functionality. Scheduled reports are available, and reports can be generated at any point within the hierarchy (datacenter object, cluster object, host object). Reports can be customized for a company-specific look and feel. Reports are available via e-mail or via the Chargeback web UI.

At this point, Malik now moves into a product “demo”, which is essentially a collection of screenshots from vCenter Chargeback that help illustrate the various components and features of Chargeback.

Categories: Scott Lowe

Partner Exchange 2010 Session TECHBC0320

Scott Lowe's Blog - Tue, 02/09/2010 - 20:25

This is a liveblog for VMware Partner Exchange session TECHBC0320, “How VMware Leverages Microsoft Volume Shadow Services for Virtual Machine Snapshots”. The presenter is Paul Vasquez with VMware; he works within the Technical Alliances Organization at VMware with a focus on backups.

The session starts out with an overview of VMware snapshots followed by a quick overview of Microsoft’s Volume Shadow Copy Service (VSS).

Vasquez is careful to distinguish VMware snapshots from array-based snapshots, which is good since that seems to confuse a number of people. VMware snapshots can include the state of memory (optional), settings, and disk. Snapshots are taken at the VM level, and up to 32 snapshots can be taken. Over 20 snapshots can cause performance concerns and, in Vasquez’s words, “can cause undesirable results”.

In general, a snapshot will include all disks although there are ways to exclude disks from a snapshot.

Operations involving VMware snapshots include taking a snapshot (self-explanatory), reverting to a snapshot (reverts the VM to the snapshot state, the delta file remains until the snapshot is deleted), and deleting a snapshot (delta file is removed, VM continues running in the current state).

Some use cases for snapshots include: rollback capability for testing patches or updates; rollback for failed software installation; protection against unwanted results of OS reconfigurations or testing; backups (for creating consistent copies of a VM); and replication.

The delta file grows as-needed; over time, the delta file will grow larger and larger. Vasquez cautions attendees to be sure to plan datastore sizes to account for snapshots for VMs and the delta file growth caused by the changes to those VMs.
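For datastore planning, a back-of-the-envelope estimate can be sketched as follows. This was not part of the session. It assumes a steady daily change rate and caps the estimate at the combined size of the VM’s disks, which is a simplification; the function name and parameters are hypothetical.

```python
def snapshot_headroom_gb(disk_sizes_gb, daily_change_gb, days_kept):
    """Estimate extra datastore space to reserve for one VM's delta files.

    Simplifying assumptions: changes accumulate at a steady daily rate,
    and total delta growth is capped at the combined disk size.
    """
    if daily_change_gb < 0 or days_kept < 0:
        raise ValueError("daily_change_gb and days_kept must be non-negative")
    expected_growth = daily_change_gb * days_kept
    worst_case = sum(disk_sizes_gb)
    return min(expected_growth, worst_case)


# A VM with 40 GB and 100 GB disks, changing about 5 GB per day.
print(snapshot_headroom_gb([40, 100], daily_change_gb=5, days_kept=7))
```

Summing this figure across all the VMs on a datastore gives a starting point for sizing, which you would then pad according to your own comfort level.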

A good question was raised about read I/Os and the impact of snapshots (does

The presentation now moves on to a discussion of VSS. VSS involves three kinds of components: requestors, writers, and providers. The requestor asks a provider for a shadow copy, and the writer tells the requestor how the application’s data should be captured. Providers are included with Windows and are responsible for intercepting I/O requests to create and represent volume shadow copies on the file system. There are also third-party providers. In the context of this discussion (VSS integration with VMware snapshots), VMware Tools is the requestor.

There is a wide range of applications that provide VSS support, including Exchange, SQL, SharePoint, Active Directory, BITS, DHCP, and WINS. The vssadmin list providers command will show all the providers. (Note that you won’t see the VMware Tools when you run this command; it is dynamically loaded only at snapshot time and then unloaded.)

The vssadmin list writers command will show a list of writers.
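If you want to check writer health across several guests, the output of `vssadmin list writers` is easy to parse. The sketch below assumes the usual layout of a “Writer name:” line followed by indented “State:” and “Last error:” lines; the sample text uses placeholder writer names and IDs and is not captured output.

```python
import re

# Placeholder text in the general shape that `vssadmin list writers` prints.
SAMPLE_OUTPUT = """\
Writer name: 'Example Writer A'
   Writer Id: {00000000-0000-0000-0000-000000000001}
   State: [1] Stable
   Last error: No error

Writer name: 'Example Writer B'
   Writer Id: {00000000-0000-0000-0000-000000000002}
   State: [8] Failed
   Last error: Timed out
"""


def parse_writers(text):
    """Return a list of dicts with each writer's name, state, and last error."""
    writers = []
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        name = re.match(r"Writer name: '(.*)'", line)
        if name:
            current = {"name": name.group(1), "state": None, "last_error": None}
            writers.append(current)
        elif current is not None and line.startswith("State:"):
            current["state"] = line.split(":", 1)[1].strip()
        elif current is not None and line.startswith("Last error:"):
            current["last_error"] = line.split(":", 1)[1].strip()
    return writers


def unhealthy_writers(text):
    """Names of writers whose last error is anything other than 'No error'."""
    return [w["name"] for w in parse_writers(text)
            if w["last_error"] != "No error"]


print(unhealthy_writers(SAMPLE_OUTPUT))
```

On a Windows guest you could feed it real output with `subprocess.run(["vssadmin", "list", "writers"], capture_output=True, text=True).stdout`, run from an elevated prompt.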

The general flow of operation with VSS runs like this:

  1. The requestor requests a shadow copy.
  2. The writer is told to freeze all I/O.
  3. The provider creates a shadow copy.
  4. The writer is told to “thaw,” or resume, I/O to the application.
  5. The requestor now has access to the shadow copy.

The writer can support multiple enumerations, or different ways of coordinating the creation of the shadow copy. Exchange, for example, supports Full (backs up databases, logs, and checkpoints; truncates logs), Copy (backs up databases, logs, and checkpoints; does not truncate logs), Incremental (backs up and truncates logs), and Differential (backs up logs but does not truncate). Of these, VMware uses the Copy enumeration when requesting shadow copies. Supposedly, this is to avoid interfering with backup applications that would not know the logs had been truncated. In addition, when VMware calls VSS, all writers are engaged, so it’s not possible to selectively choose which VSS writers should be engaged (can’t engage VSS for Exchange but not SQL within the same VM, for example).

In the future, VMware Tools will offer granular control over which VSS enumeration is used. Granular control over which VSS writers can be engaged is also planned.

Vasquez now moves into a discussion of how VMware snapshots and VSS integrate. VSS integration comes into play when a VMware snapshot is taken. Obviously, for VSS integration the VM must be powered on (the guest OS must be running in order for VSS to be operational).

Some form of quiescing is always used when a snapshot is taken (unless the VM is powered off). The VMware Sync driver provides a crash-consistent copy of the VM but doesn’t interact with applications. This option is available in vSphere 4.0 and can be used when no VSS support from the application is available. Obviously, there is VSS support (hence this session), and there are pre- and post-quiesce scripts that can be used to create homebrew solutions as well. Both VSS and the Sync driver can be enabled using VMware Tools.

VSS support is enabled in VMware ESX 3.5 Update 2 or higher.

Going back to the VSS flow earlier, an additional step is present before the writer resumes I/O to take the VMware snapshot. After the VMware snapshot is taken, the shadow copy created by the provider is discarded because it is no longer needed. Once again, Vasquez reminds attendees that the VMware Tools Requestor only supports the copy enumeration.
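The ordering of these steps is the important part, so here is a toy model of the sequence. It simply returns the steps in order and does not talk to VSS or to VMware Tools. Placing the discard of the shadow copy after the thaw is my assumption; the session only said that it happens after the VMware snapshot is taken.

```python
def quiesced_snapshot_steps(take_vm_snapshot=True):
    """Return the ordered steps of a VSS-quiesced snapshot as plain strings.

    With take_vm_snapshot=False this is the generic VSS flow; with True it
    adds the VMware snapshot before I/O resumes, as described above.
    """
    steps = [
        "requestor requests a shadow copy",
        "writers freeze application I/O",
        "provider creates the shadow copy",
    ]
    if take_vm_snapshot:
        steps.append("VMware snapshot is taken")
    steps.append("writers thaw application I/O")
    if take_vm_snapshot:
        # Assumption: the discard happens after the thaw.
        steps.append("shadow copy is discarded")
    else:
        steps.append("requestor accesses the shadow copy")
    return steps


for number, step in enumerate(quiesced_snapshot_steps(), start=1):
    print(number, step)
```

The key property is that the VMware snapshot lands inside the freeze/thaw window, which is what makes the resulting snapshot application-consistent.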

An attendee asked if any plans were in place to do quiescing at the VMFS layer (supposedly to assist with hardware-based snapshots); Vasquez responds that some form of VMFS quiescing would be helpful, but there are challenges with that arrangement that make it currently very difficult to actually achieve.

(Vasquez also commented on the end-of-life policy for the ESX Service Console, but I’ll hold on mentioning what was said until I verify the confidentiality of the statement.)

Some additional things to remember:

  • VMware Tools build must be 110268 or higher.
  • VMware Tools must be running and VSS must be functioning properly.
  • VSS Service must be set to Manual or Automatic.
  • ESX 3.5 Update 2 is required for VSS support.
  • Be sure VSS support is installed with VMware Tools.
  • Try not to keep VMware snapshots around for a long time. Manage snapshots carefully.
  • Sync driver can be used as a fallback in the event VSS support fails.
  • VSS snapshot has a 10 second timeout. Rare cases could cause a failure of getting the VSS shadow copy.

Most of the information contained in this presentation is found in the current vSphere documents and in Microsoft’s VSS documentation. (I’ll update this post with URLs when possible.)

And that’s it for the session.

Categories: Scott Lowe

Moving Lab Manager Datastores

Scott Lowe's Blog - Thu, 02/04/2010 - 22:14

You might have noted a slight incompatibility between VMware vCenter Lab Manager and one of VMware vSphere’s core features, Storage VMotion, in this earlier post on VMware Lab Manager design considerations by former co-worker Aaron Delp:

Storage VMotion and VMware VCB are not supported with Lab Manager.

Obviously, this could present a problem for users who might need to migrate Lab Manager datastores from one LUN or array to another LUN or array. So what’s a user to do?

Fortunately, an internal discussion on this earlier today turned up some great information on a utility called SSMove. What is SSMove?

SSMove is a utility installed on the Lab Manager server that allows you to move data from one datastore to another. You can move a specific tree of related virtual machines. See “Viewing Virtual Machine Datastore Directories” in the Lab Manager User’s Guide for more information on trees. To move an entire datastore, you must move all its trees individually.

Credit goes to rockstar team member Denis Guyadeen for pointing out this utility. More information is available at these links:

VMware KB: Moving a datastore using SSMove
VMware KB: SSMove does not work if a datastore is disabled (3.0.2 only)
VMware Lab Manager 3 Online Library - Managing Datastores

So, if you need to migrate data in Lab Manager from one datastore to another, this is your tool.

I haven’t yet found any information on whether SSMove is also included in Lab Manager 4. (To be fair, I haven’t really searched too hard.) If anyone knows, please speak up in the comments.

Categories: Scott Lowe

Using IP-Based Storage with VMware vSphere on Cisco UCS

Scott Lowe's Blog - Thu, 02/04/2010 - 15:27

I had a reader contact me with a couple of questions, one of which I felt warranted a blog post. Paraphrased, the question was this: How do I make IP-based storage work with VMware vSphere on Cisco UCS?

At first glance, you might look at this question and scoff. Remember, though, that Cisco UCS does have (at this time) a few limitations that make this a bit more complicated than it first appears. Specifically:

  • Recall that the UCS 6100XP fabric interconnects only have two kinds of ports: server ports and uplink ports.
  • Server ports are southbound, meaning they can only connect to the I/O Modules running in the back of the blade chassis.
  • Uplink ports are northbound, meaning they can only connect to an upstream switch. They cannot be used to connect directly to another end host or directly to storage.

With this in mind, then, how does one connect IP-based storage to a Cisco UCS? In these scenarios, you must have another set of Ethernet switches between the 6100XP fabric interconnects and the target storage array. Further, since the 6100XP fabric interconnects require 10GbE uplinks and do not (at this time) offer any 1GbE uplink functionality, those intermediate switches need 10GbE ports to connect to the fabric interconnects.

Naturally, the Nexus 5000 fits the bill quite nicely. You can use a pair of Nexus 5000 switches between the UCS 6100XP interconnects and the storage array. Dual-connect the 6100XP interconnects to the Nexus 5000 switches for redundancy and active-active data connections, and dual-connect the target storage array to the Nexus 5000 switches for redundancy and (depending upon the array) active-active data connections. It would look something like this:

From the VMware side of the house, since you’re using 10GbE end-to-end, it’s very unlikely that you’ll need to worry about bandwidth; that eliminates any concerns over multiple VMkernel ports on multiple subnets or using multiple NFS targets so as to be able to use link aggregation. (I’m not entirely sure you could use link aggregation with the 6100XP interconnects anyway. Anyone?) However, since this is Cisco UCS, you’ll have only two 10GbE connections (unless you’re using the full-width blade, which is unlikely). This means you’ll need to pay careful attention to the VMware vSwitch (or dvSwitch, or Nexus 1000V) configuration. In general, the recommendation in this sort of configuration is to place Service Console, VMotion, and IP-based storage traffic on one 10GbE uplink, place virtual machine traffic on the second 10GbE uplink, and use whatever mechanisms are available to preferentially specify which uplink should be used in the course of normal operation. This provides redundancy in the uplinks while keeping some level of traffic separation.
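
To make the traffic placement concrete, here is a rough sketch in Python of the layout just described, expressed as a per-traffic-type active/standby uplink order. The uplink names `vmnic0` and `vmnic1` are assumptions for illustration only, and the `failover_order` helper is hypothetical; it models the intent of the design, not the configuration syntax of any particular virtual switch.

```python
# Hypothetical uplink names; substitute whatever your hosts actually present.
UPLINK_A = "vmnic0"
UPLINK_B = "vmnic1"

# Traffic types named above, mapped to the uplink each should prefer
# during normal operation.
PREFERRED_UPLINK = {
    "Service Console": UPLINK_A,
    "VMotion": UPLINK_A,
    "IP storage": UPLINK_A,
    "Virtual machines": UPLINK_B,
}


def failover_order(traffic_type: str) -> dict:
    """Return the active and standby uplink for one traffic type.

    The preferred uplink is active; the other uplink is standby, so every
    traffic type keeps a redundant path if its preferred uplink fails.
    """
    active = PREFERRED_UPLINK[traffic_type]
    standby = UPLINK_B if active == UPLINK_A else UPLINK_A
    return {"active": [active], "standby": [standby]}


if __name__ == "__main__":
    for name in PREFERRED_UPLINK:
        print(name, failover_order(name))
```

Each traffic type ends up with both uplinks available, so losing either 10GbE link collapses all traffic onto the survivor instead of cutting anything off.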

One quick side note: although I’m talking IP-based storage here, block-based storage fans need to remember that Cisco UCS does not—at this time—support northbound FCoE. That means that although you have FCoE support southbound, and FCoE support in the Nexus 5000, and possibly FCoE support in your storage arrays, you still can’t do end-to-end FCoE with Cisco UCS.

For those readers who are very familiar with Cisco UCS and Nexus, this will seem like a pretty simplistic post. However, we need to keep in mind that there are lots of readers out there who have not had the same level of exposure. Hopefully, this will help provide some guidance and food for thought.

(Of course, one could just buy a Vblock and not have to worry about putting all the pieces together…hey, can’t blame me for trying, right?)

Clarifications, questions, or suggestions are welcome in the comments below. Thanks!

Categories: Scott Lowe

EMC Celerra Optimizations for VMware on NFS

Scott Lowe's Blog - Mon, 02/01/2010 - 00:34

Recently Jason Boche posted some storage performance numbers from his EMC Celerra NS-120. Out of those storage performance tests, Jason noted that the NFS performance of the NS-120 seemed a bit off, so he contacted EMC’s vSpecialist team (also known as “Chad’s Army”, but I like vSpecialists better) to see if there was something that could or should be done to improve NFS performance on the NS-120. After collaborating internally, one of our team members (a great guy named Kevin Zawodzinski) responded to Jason with some suggestions. I wanted to reproduce those suggestions here for everyone’s benefit.

Note that some of these recommendations are already found in the multi-vendor NFS post available here on Chad’s site as well as here on Vaughn Stewart’s site.

In addition, most if not all of these recommendations are also found in the VMware on Celerra best practices document available from EMC’s web site here.

Without further ado, then…

  • As has been stated on multiple occasions and by multiple people, be sure that virtual machine disk/application partitions have been properly aligned. We recommend a 1MB boundary. Note that Windows Server 2008 aligns at a 1MB boundary automatically.
  • Use a block size of 8KB unless otherwise recommended or required by the application vendor. Note that the default NTFS block size is 4KB. (Pages 128 through 138 of the Celerra best practices document contain more information on this bullet as well as the previous bullet.)
  • Turn on the uncached write mechanism for NFS file systems used as VMware datastores. This can significantly improve performance for VMDKs on NFS but isn’t the default setting. From the Control Station, you can use this command to turn on the uncached write mechanism:
    server_mount <data mover name> -option <options>,uncached <file system name> <mount point>
    Be sure to review pages 99 through 101 of the VMware on Celerra best practices document for more information on the uncached write mechanism and any considerations for its use.
  • Change the VMware ESX settings NFS.SendBufferSize and NFS.ReceiveBufferSize to a value that is a multiple of 32. The recommended value is 64. See page 73 of the best practices document for more details.
  • If you’ve adjusted the NFS.MaxVolumes parameter in order to have access to more than 8 NFS datastores, you should also adjust Net.TcpIpHeapSize and Net.TcpIpHeapMax parameters. The increase should be proportional; if you increase the maximum volumes to 32 (a common configuration), then you should increase the other parameters by a factor of 4 as well. Page 73 of the best practices document covers this. This VMware KB article and this VMware KB article also have more information.
  • Although not directly related to performance, best practices call for setting NFS.HeartbeatFrequency (or NFS.HeartbeatDelta in VMware vSphere) to 12, NFS.HeartbeatTimeout to 5, and NFS.HeartbeatMaxFailures to 10.
  • Ensure that the LUNs backing the NFS file systems are allocated to the clar_r5_performance pool. This configuration will balance the load across the SPs, LUNs, etc., and help improve performance.
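
The arithmetic behind the ESX advanced settings in the list above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the current heap values must be read from your own host instead of being assumed. The sketch also takes 8 as the baseline for NFS.MaxVolumes, following the wording above.

```python
BASELINE_NFS_MAX_VOLUMES = 8  # the list above treats 8 NFS datastores as the baseline

# Heartbeat values quoted in the list above.
HEARTBEAT_SETTINGS = {
    "NFS.HeartbeatFrequency": 12,  # named NFS.HeartbeatDelta in vSphere, per the list
    "NFS.HeartbeatTimeout": 5,
    "NFS.HeartbeatMaxFailures": 10,
}


def scaled_heap_settings(new_max_volumes: int,
                         current_heap_size: int,
                         current_heap_max: int) -> dict:
    """Scale the TCP/IP heap parameters by the same factor as NFS.MaxVolumes."""
    factor = new_max_volumes / BASELINE_NFS_MAX_VOLUMES
    return {
        "NFS.MaxVolumes": new_max_volumes,
        "Net.TcpIpHeapSize": round(current_heap_size * factor),
        "Net.TcpIpHeapMax": round(current_heap_max * factor),
    }


def is_valid_buffer_size(value: int) -> bool:
    """Buffer sizes should be a positive multiple of 32 (64 is recommended)."""
    return value > 0 and value % 32 == 0
```

Going from 8 to 32 volumes gives a factor of 4, so both heap values passed in are multiplied by 4. Check the results against the limits documented for your ESX version before applying them, since the host may not accept arbitrarily large values.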

Depending upon the other workloads on the system, another NFS performance optimization is to ensure that the maximum amount of write cache on the SPs is configured. However, be aware this may impact other workloads on the array.

As Jason noted in his post, implementing these changes—especially the uncached write mechanism—offered performance benefits for NFS workloads.

Keep these configuration recommendations in mind when setting up your EMC Celerra for VMware on NFS.
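
The 1MB alignment recommendation from the list above also comes down to simple arithmetic. Here is a minimal sketch in Python, assuming 512-byte sectors; the `is_aligned` helper is a hypothetical name, and the starting sector has to come from whatever partitioning tool you use in the guest.

```python
ALIGNMENT_BYTES = 1024 * 1024  # the 1MB boundary recommended above
SECTOR_BYTES = 512             # assumed sector size


def is_aligned(start_sector: int, sector_bytes: int = SECTOR_BYTES) -> bool:
    """Return True if the partition's starting offset falls on a 1MB boundary."""
    offset_bytes = start_sector * sector_bytes
    return offset_bytes % ALIGNMENT_BYTES == 0


is_aligned(2048)  # True: 2048 * 512 = 1,048,576 bytes, exactly 1MB
is_aligned(63)    # False: 63 * 512 = 32,256 bytes
```

A partition starting at sector 63, the traditional default on older operating systems, fails the check, while one starting at sector 2048 passes.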

Categories: Scott Lowe

Links for 2010-01-25 [del.icio.us]

Scott Lowe's Blog - Tue, 01/26/2010 - 09:00
Categories: Scott Lowe