Stuck task on VM
Troubleshooting Tips - VMware VI3: Virtual Machine Operations
Written by Rynardt Spies   
Monday, 29 June 2009

This is by no means a new issue. However, I still get support calls regarding tasks that get stuck on VMs. What do I mean by "stuck tasks"? Well, I've seen cases where a snapshot task initiated by VCB got stuck in the state of "Creating Virtual Machine Snapshot". The VM then goes down: it cannot be accessed via the console, does not respond to pings, and the VMware Tools status turns to "Unknown". You cannot power on the VM either, as the "Creating Virtual Machine Snapshot" task is still showing as an active task. You can wait, but if it hasn't sorted itself out after 30 minutes, chances are that it won't, so user intervention is required!

This is normally the approach I take to sort this out:

1. Make sure that the VM is inaccessible to everyone and that it really is down.

2. Browse the datastore where the VM is located (best to do this via the CLI on the service console with "ls -lh") and check the time stamps of the files to see how long the snapshots, if any, have been sitting there.
3. In VirtualCenter (or "vCenter"), the VM will probably still be showing as powered on. Check which of your ESX hosts it is running on.
4. Log onto the service console of the ESX host that is running the VM. Elevate your privileges to root.
5. Now, as the VM has an active task, you won't be able to send any other commands to the VM. You won't be able to use vmware-cmd to change the state of the VM either. Until the stuck task has completed, the ESX host will not be able to send any power commands to the VM. The only way to release the VM from its sorry state and get rid of the "Active task" is to kill the VM's running process from the service console. To do so, you need to find the PID of the "running" VM. To get the PID, do:

The Syntax is:
ps -auxwww |grep <VM-NAME>

Example:
Suppose you have a VM called WKSTNL01. The command will be:
ps -auxwww |grep WKSTNL01


This should return something like this:

root     12322  0.0  0.4   3140  1320 ?        S<s  13:32   0:03 /usr/lib/vmware/bin/vmkload_app --sched.group=host/user/pool1 /usr/lib/vmware/bin/vmware-vmx -ssched.group=host/user/pool1 -# name=VMware ESX;version=4.0.0;buildnumber=164009;licensename=VMware ESX Server;licenseversion=4.0 build-164009; -@ pipe=/tmp/vmhsdaemon-0/vmx673aca8b7403868b; /vmfs/volumes/489a1228-2bfd25b5-6a2c-000e0cc41e52/WKSTNL01/WKSTNL01.vmx

The PID in this instance is 12322. This is what we need to kill.

6. Kill the process ID with kill -9:

kill -9 12322


7. Delete any snapshots that were created.

8. Power On the VM.
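
Steps 5 and 6 can be condensed into a small helper. What follows is a minimal, untested-on-ESX sketch rather than a definitive procedure: the function name vmx_pid_from_ps is made up for illustration, and it assumes the VM's process line contains "vmware-vmx" and a path containing "/<VM-NAME>.vmx", as in the example output above. It reads ps output on standard input and prints only the PID, so the grep line never gets in the way.

```shell
#!/bin/sh
# Sketch only: vmx_pid_from_ps is a hypothetical helper, not a VMware tool.
# Reads "ps auxwww"-style output on stdin and prints the PID (column 2)
# of every vmware-vmx process whose command line contains /<VM-NAME>.vmx.
vmx_pid_from_ps() {
    vm_name="$1"
    awk -v vm="$vm_name" '
        # index() is a literal substring match, so dots in the name are safe
        index($0, "vmware-vmx") && index($0, "/" vm ".vmx") { print $2 }
    '
}

# Usage on the service console (commented out so nothing is killed by accident):
# pid=$(ps auxwww | vmx_pid_from_ps WKSTNL01)
# [ -n "$pid" ] && kill -9 "$pid"
```

Check that exactly one PID comes back before passing it to kill -9, as a VM name that is a suffix of another VM's name could match more than one process.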

 
Which ESX host is locking my files?
Troubleshooting Tips - VMware VI3: Virtual Machine Operations
Written by Rynardt Spies   
Monday, 22 June 2009

I found myself asking this very annoying question again just last week: which one of the servers is holding a lock on a virtual machine log file that was last modified 3 months ago?

Last week I came across a problem where VCB failed a job while trying to perform a full backup of one of the VMs. This was because one of the log files for the Virtual Machine was locked on the SAN. VCB was therefore unable to copy the log file to the backup server and therefore failed the entire job.

Normally, a simple VMotion of the Virtual Machine to another host will solve this issue, but I wasn’t as lucky this time. So I thought powering off the VM would do it... Didn’t work! No matter what I did, I just couldn’t get the lock released on that file. One of the ESX hosts in the cluster was holding on to the log file, but how do I go about finding out which one of the 20 ESX hosts it was? To me, this sounded like a job for vmkfstools, and indeed it was. Well, sort of. Using vmkfstools, I was able to retrieve the MAC address of the ESX host in the cluster that was holding on to the 3 month old log file.

The command is:

vmkfstools -D <path-to-vm>/filename

In my case this was:

vmkfstools -D /vmfs/volumes/iscsi-002-vmfs/WKSTN01/vmware.log

The output is then written to /var/log/vmkernel.

To get the output, simply do:

tail /var/log/vmkernel

This returned:

Jun 20 15:35:33 esx1 vmkernel: 23:02:22:35.020 cpu0:4174)FS3: 142: <START vmware.log>
Jun 20 15:35:33 esx1 vmkernel: 23:02:22:35.020 cpu0:4174)Lock [type 10c00001 offset 29190144 v 7, hb offset 4083712
Jun 20 15:35:33 esx1 vmkernel: gen 1881, mode 1, owner 4a2128d2-86a81c3a-ce30-000e0cc41e98 mtime 893]
Jun 20 15:35:33 esx1 vmkernel: 23:02:22:35.020 cpu0:4174)Addr <4, 53, 5>, gen 6, links 1, type reg, flags 0x0, uid 0, gid 0, mode 644
Jun 20 15:35:33 esx1 vmkernel: 23:02:22:35.021 cpu0:4174)len 312433, nb 1 tbz 0, cow 0, zla 1, bs 1048576
Jun 20 15:35:33 esx1 vmkernel: 23:02:22:35.021 cpu0:4174)FS3: 144: <END vmware.log> 

The MAC address of the host locking the file is reported in line 3, as the last segment of the owner field:

000e0cc41e98
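
Picking the MAC out of the log by eye is error-prone, so here is a hedged sketch that does it with sed. The function name lock_owner_mac is hypothetical, and it assumes the owner field looks like the one in the output above: a hyphen-separated ID whose last segment is 12 hex digits, followed by a space.

```shell
#!/bin/sh
# Sketch only: lock_owner_mac is a hypothetical helper.
# Reads vmkernel log lines on stdin and, for each line with an
# "owner <id>" field, prints the last 12-hex-digit segment of the id.
lock_owner_mac() {
    sed -n 's/.*owner [0-9a-fA-F-]*-\([0-9a-fA-F]\{12\}\) .*/\1/p'
}

# Usage on the service console:
# tail /var/log/vmkernel | lock_owner_mac
```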

Now, this is the bit where I can’t make it any easier for you. Unless you write a script (and I don’t have that much time at the moment), the only way to find the host with that MAC is to log onto each host via SSH and run:

esxcfg-info |grep -i 'system uuid'

This will then return the UUID for the host you are on. If it contains the MAC retrieved using vmkfstools, then you know the process that’s keeping the lock is on that server.
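
For anyone who does want to script the check, it could look something like the following. This is a sketch under assumptions, not a tested tool: the helper names uuid_has_mac and find_locking_host are invented for illustration, the host names are placeholders, and it assumes key-based SSH as a user allowed to run esxcfg-info (root here), so that the loop is not interrupted by password prompts.

```shell
#!/bin/sh
# Sketch only: uuid_has_mac and find_locking_host are hypothetical helpers.

# Reads esxcfg-info output on stdin; succeeds if a "system uuid" line
# contains the given MAC (case-insensitive).
uuid_has_mac() {
    grep -i 'system uuid' | grep -qi -- "$1"
}

# Checks each host in turn and prints the ones whose system UUID matches.
# Usage: find_locking_host <mac> <host> [<host> ...]
find_locking_host() {
    mac="$1"
    shift
    for host in "$@"; do
        # -n stops ssh from reading the script's own stdin
        if ssh -n "root@$host" esxcfg-info 2>/dev/null | uuid_has_mac "$mac"; then
            echo "$host"
        fi
    done
}

# Example (host names are placeholders):
# find_locking_host 000e0cc41e98 esx1 esx2 esx3
```

A host that cannot be reached simply produces no match, so an empty result does not prove that no host holds the lock.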

So what process is locking the file? That I can’t tell you. I can only give you some tips as to how to find it.
1.       Power off the VM in vCenter;
2.       Log onto the service console of the host that’s locking the file;
3.       Try to move or delete the locked file from the service console of the locking host. This worked for me. If it works for you, then good. If not, go to step 4;
4.       Try and see if there’s a process running with the filename that is locked;

ps -auxwww |grep <locked filename>

If it returns a line (other than the grep line itself), kill the process with "kill -9 <PID>";

5.       If it doesn’t return any processes under that filename, then try and search for a PID with the VM name that has a locked file:

ps -auxwww |grep <vm name>

If it returns a PID, kill the PID, as your VM was already powered off in step one and should therefore not have a PID on any host;

6.       If it still doesn’t work, leave a comment and we'll have a look at it ;-) 

 
Last chance to win a trip to VMworld 2009!
The News - Latest News
Written by Rynardt Spies   
Wednesday, 10 June 2009

Here's a last chance to win a trip to VMworld 2009:

http://www.veeam.com/go/video/

 

 
New Release: Veeam Monitor 4.0
The News - Latest News
Written by Rynardt Spies   
Tuesday, 02 June 2009

Veeam Software has released Veeam Monitor 4.0 and Veeam Monitor Free Edition 4.0. The new version of Veeam Monitor includes:

  • Support for VMware vSphere 4;
  • Storage monitoring;
  • Hardware monitoring;
  • More new features requested by customers.

 

I've not had time to install and play with Veeam Monitor 4.0, but if Veeam Monitor 3.0 is anything to go by, I'm sure Veeam Monitor 4.0 will be good. I'll install it and do a review sometime this week.

 

You can download Veeam Monitor 4.0 from: http://www.veeam.com/vmware-esx-monitoring.html 

You can download Veeam Monitor Free Edition 4.0 from: http://www.veeam.com/esxi-monitoring-free.html

 
