In VMware ESX you can use the vimsh command from the command line to retrieve CDP information. With the release of ESXi 4.1, the vimsh command is no longer included. However, there is still a way to retrieve CDP information via the CLI: instead of vimsh, you simply use vim-cmd. The path to the utility is /bin/vim-cmd.
I have to write this down as I'm sure I'll come across this issue again in the future. Today I had to install ESXi 4.1 on a ProLiant BL465C G7 blade server. This turned out to be problematic, to say the least. It turns out that the native ISO image downloaded from the VMware website does not include all of the drivers required for ProLiant G7 blades. Anyway, as I had a very busy day, I really didn't have time to try and figure this one out for myself. Lucky for me, Steve [Bryen] has done this already and posted the workaround on his blog:
Those of us who use VCB (VMware Consolidated Backup) to back up our SAN-based virtual machines may know how time-consuming and frustrating it can be to find and clean up stale snapshots that were left behind on virtual machines by failed VCB backups. This is even more time-consuming if you have a large-scale virtual environment with hundreds or even thousands of virtual machines that need to be backed up on a daily basis.
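As a rough sketch of what that looks like in practice: the hostsvc/net/query_networkhint sub-command is the one commonly used to dump CDP details on ESXi. The wrapper name show_cdp and the VIM_CMD variable are made up for this example, and vmnic0 in the usage line is only a placeholder NIC name.

```shell
# Sketch: print the CDP details that ESXi has learned for its physical NICs.
# VIM_CMD can be overridden; the default is the path mentioned above.
VIM_CMD="${VIM_CMD:-/bin/vim-cmd}"

show_cdp() {
    # Any extra arguments (for example --pnic-name=vmnic0) are passed through.
    "$VIM_CMD" hostsvc/net/query_networkhint "$@"
}
```

Running show_cdp --pnic-name=vmnic0 should limit the output to a single uplink, assuming the upstream switch actually advertises CDP on that port.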
Let me first give a brief explanation on how VCB goes about backing up virtual machines and why having stale snapshots on virtual machines prior to a VCB backup job will spell problems.
Every time a VCB backup job kicks off, a snapshot is created on the VM that is going to be backed up. Whilst this snapshot is in place, all changes that take place in the VM’s guest OS are written to delta VMDK files, one delta file for every virtual disk on the VM. These files grow in 16MB chunks, and on a busy VM, say a VM that hosts a large database, these 16MB increments may add up to several gigabytes per delta file. Whilst changes are being written to the delta files, VCB can go ahead and mount the main VMDK files on the VCB proxy server in order to make the VMDK files or their contents available to your backup software, e.g. Netbackup. When the backup job completes, VCB removes the snapshot by merging the changes recorded in the delta files into the main VMDK files and deleting the delta files from the SAN.
Now, in theory this sounds very neat, and in reality it is. That is, until it goes wrong. Sometimes when a VCB backup job fails (and they do fail from time to time), the snapshot on the VM doesn’t get removed. In this case, all changes to the guest OS will still continue to write to delta files. And to make things even worse, I’ve seen cases where the snapshot failed to be removed even though the VCB backup job completed successfully. In this case, Netbackup will show a successful backup, yet the snapshot still exists on the VM. You simply can’t assume that all virtual machine snapshots are cleared off just because Netbackup or whatever you use as your backup application reports successful backups.
So why are stale snapshots a problem, you might ask? Well, not only do they grow to huge sizes, which may actually cause the datastore to fill up and crash all other VMs on that datastore, but VCB will probably not be able to perform backup operations on a VM that already has snapshots. So yes, a stale snapshot on a VM will cause your next VCB job to fail. You also run the risk of your snapshot delta files going out of sync with each other, which in the worst case could cause a loss of data. I have first-hand experience of all of these.
My advice is simple. Make sure you don’t have any snapshots on any virtual machines in scope of being backed up before the backup window opens. This is simple, but if you have hundreds of virtual machines, going through each VM to check for snapshots is insane! So, a colleague and I came up with a Perl script that checks for delta files in all datastores seen by the ESX host and returns a list of delta files via email.
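The Perl script itself isn't reproduced here, but the core idea can be sketched in a few lines of shell. This is not the original script, just a minimal stand-in that assumes the usual ESX naming convention for snapshot delta files (VMNAME-00000N-delta.vmdk) and that the datastores are mounted under /vmfs/volumes. The function name find_delta_files is made up for the example.

```shell
# Sketch: list every snapshot delta file below a datastore root.
# Assumes delta files are named <VMNAME>-00000N-delta.vmdk.
find_delta_files() {
    root="${1:-/vmfs/volumes}"
    # find does not follow the datastore-name symlinks by default,
    # so each file is reported once, under its real volume directory.
    find "$root" -type f -name '*-delta.vmdk' 2>/dev/null | sort
}
```

The list can then be piped to whatever mail command is available on the host, for example find_delta_files | mail -s "Stale snapshots" admin@example.com, where the address is of course a placeholder.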
Today I had an issue where an ESX host became unresponsive in vCenter, yet the VMs that were running on the host were fine. The normal remedy for this issue would be to restart the management agent on the ESX host via the Service Console:
service mgmt-vmware restart
However, this did not work. The mgmt-vmware restart command hung while stopping the "VMware ESX Server Host Agent". Ten minutes after executing mgmt-vmware restart, I decided to break out of the process by pressing Ctrl+z.
Clearly, there was a problem with the existing running instance of the management agent, vmware-hostd. The only way to get this working without a host reboot is to find the PID for vmware-hostd and kill it:
To locate the PID for the running vmware-hostd process execute:
ps -auxwww |grep vmware-hostd
You will see output similar to the following (the PID is the second column, 13089 in this example):
root 13089 1.3 2.6 179080 6988 ? S 2008 1695:23 /usr/lib/vmware/hostd/vmware-hostd /etc/vmware/hostd/config.xml -u
To kill the running process, execute:
kill -9 <PID> (I had to run "kill -9 13089")
Once vmware-hostd is no longer running, you can restart the management agent by running:
service mgmt-vmware restart
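If you would rather not pick the PID out by eye, the lookup can be sketched as a small filter. The function name hostd_pids is made up for this example. It reads ps output on standard input and prints the second column of every line that mentions vmware-hostd.

```shell
# Sketch: print the PID (second column of ps output) for every
# line on stdin that mentions vmware-hostd.
hostd_pids() {
    # The [v] bracket keeps the pattern from matching the grep process itself.
    grep '[v]mware-hostd' | awk '{print $2}'
}
```

Used as ps auxwww | hostd_pids, it prints one PID per line. Check the list before passing anything to kill -9, since any other process with vmware-hostd on its command line will show up too.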
I don't know if anyone else has come across this issue, but since upgrading to ESX 3.5 Update 2 we've been having strange problems with VCB snapshots. I've not had much time for troubleshooting in the last 3 or so weeks, but I found a workaround. Before I get ahead of myself, let me just first explain the issue we are having.
When backing up our VMs with VCB, the snapshot delta files created by VCB need to be merged back into the main VMDK. However, for the past few weeks I've seen cases on some of our ESX hosts where the snapshots never get merged back, and the delta files just keep stacking up every time a snapshot is created. In other words, here’s what happens:
1. VCB creates a snapshot of a VM. This creates delta files such as VMNAME-000001.vmdk.
2. When the backup process completes, the snapshot delta VMDK is supposed to merge back into the main or its parent VMDK file but fails to do so. Now, this is not normally a problem, as you can just go and "delete" the snapshot using the snapshot manager in the VI Client.
3. However, when you go to the snapshot manager in the VI client, there are no VCB snapshots but there may be a "consolidate helper" snapshot. Even if I delete this snapshot, the process fails to merge the VMDK files back.
4. If I then create another snapshot manually using the snapshot manager, this creates the second set of delta files such as VMNAME-000002.vmdk.
5. When I then try to delete the snapshot, the VI Client reports the Virtual Machine as having no snapshots, however when browsing the data store, I can still see all the delta files. Also, when I log onto the ESX server where the VM is running, and issue vmware-cmd /vmfs/volumes/<DATASTORE>/<VMNAME>/<VMNAME>.vmx hassnapshot, the ESX server returns no snapshots for that VM.
This is a strange problem. The Virtual Machine clearly has snapshot delta files in its data store; however, the ESX host is unaware of any snapshots for that VM. I did find a workaround, but I've been unable to find the root cause as I've been way too busy the last few weeks to have a good look at it.
The workaround is:
1. Log onto the console of the ESX host on which the VM with snapshot problems is running.
2. Restart the management agent on that server with service mgmt-vmware restart.
3. In the VI Client, go to the snapshot manager and manually create a snapshot for the VM (without a memory snapshot).
4. Now, "Delete" all snapshots. This should merge all delta files back into the main VMDK file.
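For anyone who prefers the service console over the VI Client, the same steps can be sketched with vmware-cmd instead of the snapshot manager. This is a swapped-in approach rather than the procedure I actually ran, so treat it as untested: it assumes the createsnapshot and removesnapshots operations of vmware-cmd on ESX 3.5, and the function name consolidate_vm, the RUN variable and the snapshot name are all made up for the example. By default it only prints the commands.

```shell
# Sketch: the workaround for one VM, driven from the service console.
# RUN defaults to "echo" so the commands are only printed (dry run);
# set RUN="" to execute them for real.
RUN="${RUN-echo}"

consolidate_vm() {
    vmx="$1"   # full path to the VM's .vmx file
    $RUN service mgmt-vmware restart
    # Arguments: name, description, quiesce=0, memory=0 (no memory snapshot)
    $RUN vmware-cmd "$vmx" createsnapshot consolidate "temporary snapshot" 0 0
    # Deleting all snapshots should merge the delta files into the main VMDK
    $RUN vmware-cmd "$vmx" removesnapshots
}
```

When running it for real, the management agent may need a little time to come back up after the restart before vmware-cmd will respond, so pausing between the first and second step is probably wise.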
The workaround suggests that there is a problem with the management agent on the ESX hosts, but if so, it has to be something in the Update 2 release as my cluster has 16 hosts and the problem seems to be popping up on random hosts daily. I am now planning to upgrade to Update 3 to see if that will clear the problem.
If anyone else has come across a similar issue, please drop me an email or a comment.
Just a quick guide to patching a standalone ESX 3.5 Server using esxupdate from the service console:
Download all available patches to your local computer from the following link: (ALSO DOWNLOAD THE FILE CALLED contents.zip)
Decide where to place the patches on the target ESX Server, choosing a partition with enough free space to accommodate them. Using the root ( / ) partition is not recommended. A good strategy is to create a directory called updates under the /var partition.
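A minimal sketch of that preparation step, assuming /var/updates as the staging directory. The function name prepare_updates_dir is made up for the example. It creates the directory and prints the free space, in kilobytes, on the partition that holds it.

```shell
# Sketch: create the patch staging directory and report free space (in KB)
# on the partition that holds it.
prepare_updates_dir() {
    dir="${1:-/var/updates}"
    mkdir -p "$dir" || return 1
    # df -Pk prints one POSIX-format line per filesystem;
    # column 4 of the second line is the available space in KB.
    df -Pk "$dir" | awk 'NR == 2 {print $4}'
}
```

Compare the number it prints with the total size of the downloaded patches before copying anything across.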