When working on a Linux VM via the VMware Remote Console over a WAN or slow link, the keystrokes sent to the console might end up repeating. To avoid this, perform the following steps:
1. Power down the VM
2. Add the following line to the VMX file (can also be done by editing the Advanced VM settings using the vSphere Client):
keyboard.typematicMinDelay = "2000000"
3. Save the VMX file
4. Power on the VM.
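If you'd rather script step 2 than edit the file by hand, here is a minimal sketch. The function name set_typematic_delay and the backup suffix are my own choices, not anything VMware ships, and it assumes the VM is already powered off and that you pass it the path to the VM's .vmx file.

```shell
#!/bin/sh
# Add (or replace) keyboard.typematicMinDelay in a VMX file.
# Assumption: the VM is powered off and $1 is the path to its .vmx file.
set_typematic_delay() {
    vmx_file=$1
    delay=${2:-2000000}   # defaults to the value used in the steps above

    [ -f "$vmx_file" ] || { echo "no such file: $vmx_file" >&2; return 1; }

    # Keep a backup copy before touching the configuration.
    cp "$vmx_file" "$vmx_file.bak" || return 1

    # Drop any existing setting, then append the new one, so the key
    # never ends up in the file twice.
    sed '/^keyboard\.typematicMinDelay/d' "$vmx_file.bak" > "$vmx_file"
    printf 'keyboard.typematicMinDelay = "%s"\n' "$delay" >> "$vmx_file"
}
```

Call it as set_typematic_delay followed by the path to the .vmx file, then power the VM back on.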
I decided to post this as I keep on forgetting what the fix is. Now I'll know where to find it in the future without having to go to Google ;-)
Last week, a colleague of mine was informed by VMware of an issue with VMware ESX 4.0 Update 1 hosting 32-bit Windows 2003 Virtual Machines with more than 3GB of RAM. The issue causes VMs with the configuration specified below to crash upon reboot.
In ESX 3.5, when trying to create a new snapshot, the following error is reported in the VI Client:
An invalid snapshot configuration was detected
Also, when reading the Virtual Machine log file (vmware.log), located in the same directory as the VM, you may find references to delta files such as "vm_name-000001.vmdk", but when browsing the datastore, the delta file does not exist. Also, when running vmware-cmd hassnapshot, "hassnapshot ()= " is returned, which means that the VM has no snapshots in place.
If this is truly the case, why is "An invalid snapshot configuration was detected" returned when trying to create a new snapshot?
VMware records snapshot information about the current VM in a .vmsd file. This file is located with the rest of the virtual machine configuration and VMDK files on the datastore, and is normally named after the VM (for example vm_name.vmsd). The file contains information even if your VM has no snapshots in place.
It is possible that in some cases, ESX fails to properly clean up after previous snapshots were removed. The information for previous snapshots may still be recorded in the .vmsd file. The file may indicate that you still have snapshots in place, although all previous snapshots were removed and the delta files have been merged. When you then try to create a snapshot, the .vmsd file informs the ESX host that there is a delta file in place and that it has to create a second or third delta file. When the ESX host interrogates the VMFS file system, it is unable to find the snapshot delta files specified in the .vmsd file and therefore errors with "An invalid snapshot configuration was detected".
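One way to confirm that this is what has happened is to compare the files named in the .vmsd with what is actually in the VM's directory. The sketch below is only that, a sketch: the function name missing_vmsd_files is made up, and it assumes the .vmsd stores its references as key = "value" lines whose key ends in fileName or filename. Check a real file before relying on it.

```shell
#!/bin/sh
# List files referenced in a .vmsd that do not exist next to it.
# Assumption: references look like   something.fileName = "file"
# (key ending in fileName/filename); real .vmsd files may differ.
missing_vmsd_files() {
    vmsd_file=$1
    vm_dir=$(dirname "$vmsd_file")

    # Pull the quoted value out of every *fileName = "..." line,
    # then report each one that is not present in the VM directory.
    sed -n 's/^[^=]*[fF]ile[nN]ame *= *"\(.*\)" *$/\1/p' "$vmsd_file" |
    while IFS= read -r name; do
        [ -e "$vm_dir/$name" ] || echo "$name"
    done
}
```

If it prints delta files that are not on the datastore while vmware-cmd reports no snapshots, the .vmsd is most likely holding stale entries.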
This is by no means a new issue. However, I still get support calls regarding tasks that get stuck on VMs. What do I mean by "stuck tasks"? Well, I've seen cases where a snapshot task initiated by VCB got stuck in the state of "Creating Virtual Machine Snapshot". The VM then goes down and cannot be accessed via the console, does not respond to pings, and the status of VMtools turns to "Unknown". You cannot "Power On" the VM either, as the "Creating Virtual Machine Snapshot" task is still showing as an active task. You can wait, but after 30 minutes, chances are that it won't sort itself out, so user intervention is required!
This is normally the approach I take to sort this out:
2. Browse the datastore where the VM is located (best to do this via the CLI on the service console with "ls -lh") and check the time stamps of the files to see how long the snapshots, if any, have been sitting there.
3. In VirtualCenter (or "vCenter"), the VM will probably still be showing as powered on. Check which of your ESX hosts it is running on.
4. Log onto the service console of the ESX host that is running the VM. Elevate your privileges to root.
5. Now, as the VM has an active task, you won't be able to send any other commands to the VM. You won't be able to use vmware-cmd to change the state of the VM either. Until the task that's stuck in progress has completed, the ESX host will not be able to send any power commands to the VM. The only way to release the VM from its sorry state and get rid of the "Active task" is to kill the VM's running process from the service console. In order to do so, you need to find the PID for the "running" VM. To get the PID do:
The Syntax is:
ps -auxwww |grep <VM-NAME>
Suppose you have a VM called WKSTNL01. The command will be:
ps -auxwww |grep WKSTNL01
This should return something like this:
root 12322 0.0 0.4 3140 1320 ? S<s 13:32 0:03 /usr/lib/vmware/bin/vmkload_app --sched.group=host/user/pool1 /usr/lib/vmware/bin/vmware-vmx -ssched.group=host/user/pool1 -# name=VMware ESX;version=4.0.0;buildnumber=164009;licensename=VMware ESX Server;licenseversion=4.0 build-164009; -@ pipe=/tmp/vmhsdaemon-0/vmx673aca8b7403868b; /vmfs/volumes/489a1228-2bfd25b5-6a2c-000e0cc41e52/WKSTNL01/WKSTNL01.vmx
The PID in this instance is 12322. This is what we need to kill.
6. Kill the process ID with kill -9:
kill -9 12322
7. Delete any snapshots created
8. Power On the VM.
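Picking the PID out of that long ps line by eye is easy to get wrong, so here is a small sketch that does it for you. The helper name vm_pids is hypothetical. It reads ps output on stdin, takes the PID from the second column, and only matches lines that mention both vmware-vmx and the VM's own .vmx file, so the grep line is never returned.

```shell
#!/bin/sh
# Print the PID(s) of vmware-vmx processes started for a given VM.
# Reads `ps auxwww` style output on stdin (PID is the second column).
vm_pids() {
    vm_name=$1
    awk -v vm="$vm_name" '
        # Require both the vmware-vmx binary and the path to the
        # .vmx file of this VM on the same line.
        index($0, "vmware-vmx") && index($0, "/" vm ".vmx") { print $2 }
    '
}

# Example, on the service console as root:
#   ps auxwww | vm_pids WKSTNL01
```

Check what it prints before handing the result to kill -9. If it returns more than one PID, stop and look at the full ps output first.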
I’ve found myself asking this very annoying question just last week again. Which one of the servers is holding a lock on a virtual machine log file that was last modified 3 months ago?
Last week I came across a problem where VCB failed a job while trying to perform a full backup of one of the VMs. This was because one of the log files for the Virtual Machine was locked on the SAN. VCB was therefore unable to copy the log file to the backup server and therefore failed the entire job.
Normally, a simple VMotion of the Virtual Machine to another host will solve this issue, but I wasn’t as lucky this time. So I thought powering off the VM would do it... Didn’t work! No matter what I did, I just couldn’t get the lock released on that file. One of the ESX hosts in the cluster was holding on to the log file, but how do I go about finding out which one of the 20 ESX hosts it was? To me, this sounded like a job for vmkfstools, and indeed it was. Well, sort of. Using vmkfstools, I was able to retrieve the MAC address of the ESX host in the cluster that was holding on to the 3 month old log file.
The command is:
vmkfstools -D /filename
In my case this was:
vmkfstools -D /vmfs/volumes/iscsi-002-vmfs/WKSTN01/vmware.log
The output is then written to /var/log/vmkernel.
To get the output, simply do:
Jun 20 15:35:33 esx1 vmkernel: 23:02:22:35.020 cpu0:4174)FS3: 142:
Jun 20 15:35:33 esx1 vmkernel: 23:02:22:35.020 cpu0:4174)Lock [type 10c00001 offset 29190144 v 7, hb offset 4083712
Jun 20 15:35:33 esx1 vmkernel: gen 1881, mode 1, owner 4a2128d2-86a81c3a-ce30-000e0cc41e98 mtime 893]
Jun 20 15:35:33 esx1 vmkernel: 23:02:22:35.020 cpu0:4174)Addr , gen 6, links 1, type reg, flags 0x0, uid 0, gid 0, mode 644
Jun 20 15:35:33 esx1 vmkernel: 23:02:22:35.021 cpu0:4174)len 312433, nb 1 tbz 0, cow 0, zla 1, bs 1048576
Jun 20 15:35:33 esx1 vmkernel: 23:02:22:35.021 cpu0:4174)FS3: 144:
The MAC address of the host locking the file is reported in the third line of the output, as the last segment of the owner field (000e0cc41e98 in this example).
Now, this is the bit where I can’t make it any easier for you. Unless you write a script (and I don’t have that much time at the moment), the only way to find the host with that MAC is to log onto each host via SSH and run:
esxcfg-info |grep -i 'system uuid'
This will then return the UUID for the host you are on. If it matches the MAC retrieved using vmkfstools, then you know the process that’s keeping the lock is on that server.
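For anyone who does have the time, here is a rough sketch of the parsing such a script would need. Both function names are made up for illustration. lock_owner_mac pulls the last segment of the owner field out of the vmkernel lock lines, and hosts_matching_mac reads lines that start with a host name followed by whatever esxcfg-info returned on that host, and prints the hosts whose line contains the MAC. How you collect those lines is up to you; the SSH loop in the comment uses placeholder host names.

```shell
#!/bin/sh
# Print the last segment of the "owner" field (the MAC) for every
# lock line read from stdin, e.g. lines taken from /var/log/vmkernel.
lock_owner_mac() {
    sed -n 's/.*owner [0-9a-f]*-[0-9a-f]*-[0-9a-f]*-\([0-9a-f]\{12\}\).*/\1/p'
}

# Read lines of the form "<hostname> <esxcfg-info output line>" on
# stdin and print the hostname of each line that contains the MAC.
# The comparison ignores upper/lower case.
hosts_matching_mac() {
    mac=$1
    awk -v mac="$mac" 'index(tolower($0), tolower(mac)) { print $1 }'
}

# Example (esx1..esx3 are placeholder host names):
#   mac=$(grep owner /var/log/vmkernel | lock_owner_mac | tail -n 1)
#   for h in esx1 esx2 esx3; do
#       echo "$h $(ssh "$h" "esxcfg-info | grep -i 'system uuid'")"
#   done | hosts_matching_mac "$mac"
```

It only narrows down which host to log onto. Finding and killing the process that holds the lock is still a manual job.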
So what process is locking the file? That I can’t tell you. I can only give you some tips as to how to find it.
1. Power off the VM in vCenter;
2. Log onto the service console of the host that’s locking the file;
3. Try to move or delete the locked file from the service console of the locking host. This worked for me. If it works for you, then good. If not, go to step 4;
4. Try and see if there’s a process running with the filename that is locked;
ps -auxwww |grep
If it returns a line (other than the grep line), kill the process with “kill -9” followed by its PID;
5. If it doesn’t return any processes under that filename, then try and search for a PID with the VM name that has a locked file:
If it returns a PID, kill the PID, as your VM was already powered off in step one and should therefore not have a PID on any host;
6. If it still doesn’t work, leave a comment and we'll have a look at it ;-)
After patching some test ESX hosts with ESX 3.5 Update 4, the problem with the VMware tools being shown as "Not running" after a VCB backup operation seems to have been solved. This has cured some backup woes at least.
I will now run ESX 3.5 Update 4 in a test cluster (with virtual machines that will be backed up with VCB) for a few weeks before updating production ESX hosts to Update 4.