If you’ve worked with XenServer for any length of time, you have no doubt had a VM turn “orange” or “amber” or otherwise become unmanageable. Here are a couple of similar problem scenarios and solutions that might help.
Problem Scenario #1:
You notice that a VM has numerous lifecycle events on a XenServer. It has repeatedly attempted to shut down but remains in the green/on state. The VM does not display a console or POST information. Manual shutdowns in XenCenter (Shutdown or Force Shutdown) do not work.
Solution:
You may have success trying some of these ideas, or it may take a combination of these to obtain control of the stuck VM.
- Start by running ‘xe-toolstack-restart’ on the pool master server. This is the easiest fix and works most of the time. You will lose the connection to your pool momentarily. If this doesn’t work, go on to the next steps listed below.
- If this is a XenDesktop-hosted VM and you cannot force a Start/Shutdown from the DDC, put the VM in maintenance mode.
- From the XenServer console, try the following command to force a shutdown: ‘xe vm-shutdown --force vm=VMNAME’. If the VM does not shut down with this command, proceed to the next step.
- In XenCenter, once the steps above are done, attempt to “Reboot” the VM. It may restart now.
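The command-line part of this escalation can be sketched as a small shell script. This is a minimal sketch rather than a tested recovery tool: the `run` helper, the `recover_stuck_vm` function and the `DRY_RUN` variable are hypothetical names introduced here, and by default the script only prints the commands so you can review them and run them one at a time on the pool master.

```shell
#!/bin/sh
# Sketch of the Scenario #1 escalation. run(), recover_stuck_vm() and
# DRY_RUN are illustrative names, not part of XenServer.
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

recover_stuck_vm() {
    vm_name="$1"
    # Step 1: restart the toolstack on the pool master.
    run xe-toolstack-restart
    # Step 2: if the VM is still stuck, force a shutdown by name.
    run xe vm-shutdown --force vm="$vm_name"
}

# VMNAME is a placeholder, as in the steps above.
recover_stuck_vm "${1:-VMNAME}"
```

With `DRY_RUN=0` both commands run back to back, so it is safer to keep the default and check the VM in XenCenter between steps.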
Related Issue:
I experienced a similar issue where all members were down in a XenServer pool, but the pool Master remained up and functional. The ‘toolstack’ processes were not running on pool members. An ‘xe-toolstack-restart’ was required on each pool member XenServer before the server would appear functional and participate in the XenServer Pool.
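If several members need the restart, a loop over their addresses saves some typing. The sketch below assumes root SSH access from the machine you run it on; the function name, the `DRY_RUN` variable and the example addresses are hypothetical, and by default it only prints what it would run.

```shell
#!/bin/sh
# Sketch: restart the toolstack on each pool member over SSH.
# restart_member_toolstacks() and DRY_RUN are illustrative names.
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

restart_member_toolstacks() {
    # $1: comma-separated host addresses, the format printed by
    #     `xe host-list params=address --minimal` (which also lists the master).
    addresses="$1"
    for addr in $(echo "$addresses" | tr ',' ' '); do
        if [ "$DRY_RUN" = "1" ]; then
            echo "+ ssh root@$addr xe-toolstack-restart"
        else
            ssh "root@$addr" xe-toolstack-restart
        fi
    done
}

# Example addresses only (documentation range), not real hosts.
restart_member_toolstacks "192.0.2.11,192.0.2.12"
```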
Problem Scenario #2:
This is the most common scenario you will see. A virtual machine goes into an “amber” or “orange” state and you are unable to shut down, reboot, or even forcefully reset the VM.
Solution:
- Find the UUID of the hung VM.
You can do this via the command line with ‘xe vm-list’ or via XenCenter.
- Find the Domain ID of the hung VM.
Run ‘list_domains’ from the command line, and match the UUID with the ID number.
id | uuid | state
0 | 2fe455fe-3185-4abc-bff6-a3e9a04680b0 | R
47 | 267227f3-a59e-dafe-b183-82210cf51ec4 | B
59 | 298817fb-8a3e-7501-11e0-045a8aa860ff | B
60 | 46e3d5aa-2f02-dfdc-b053-9a8ac56ec5d1 | B
61 | 16cf3204-eb17-5a12-e8d0-c72087bda690 | B
62 | 1f9053b5-c6ca-40bb-504e-3017c37e7281 | H
63 | ddaec491-097a-e271-362b-f2f985e26e4a | R
65 | 55f3b225-4f65-d1ea-aa19-add44c5acce7 | B
66 | 7adef6fd-9171-5426-b333-6fb1b57b8e60 | B H
67 | 6046dc13-f70b-8398-56fb-069c22440a7c | B
68 | f201cd94-a501-00c2-d21e-8c2f03ea167b | B H
- Run destroy_domain on the Domain ID.
# /opt/xensource/debug/destroy_domain -domid 62
- The VM will still show itself as running, so now we need to reboot it.
# xe vm-reboot name-label='name of the VM' --force
- The VM is now rebooted, and you can bring it up as if you had just pulled the plug. That is, check for some disk corruption, etc.
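Matching the UUID to a domain ID by eye is error-prone on a busy host, so the lookup can be scripted. The following is a minimal sketch, assuming ‘list_domains’ prints the pipe-separated "id | uuid | state" layout shown above; `domid_for_uuid` is a hypothetical helper name, and the script only prints the destroy_domain command instead of running it.

```shell
#!/bin/sh
# Print the domain ID whose UUID matches $1, reading list_domains-style
# output ("id | uuid | state") on standard input.
# domid_for_uuid is an illustrative name, not a XenServer tool.
domid_for_uuid() {
    awk -F'|' -v uuid="$1" '
        { id = $1; u = $2; gsub(/[ \t]/, "", id); gsub(/[ \t]/, "", u) }
        u == uuid { print id }
    '
}

# Sample rows copied from the listing above.
sample='id | uuid | state
61 | 16cf3204-eb17-5a12-e8d0-c72087bda690 | B
62 | 1f9053b5-c6ca-40bb-504e-3017c37e7281 | H
63 | ddaec491-097a-e271-362b-f2f985e26e4a | R'

domid="$(printf '%s\n' "$sample" | domid_for_uuid 1f9053b5-c6ca-40bb-504e-3017c37e7281)"

# Print the command for review rather than executing it.
echo "/opt/xensource/debug/destroy_domain -domid $domid"
```

On a real host, the sample text would be replaced by piping the output of ‘list_domains’ into `domid_for_uuid`, with the UUID taken from ‘xe vm-list’. Check the printed domain ID against the listing before running destroy_domain.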
Resource:
https://support.citrix.com/article/CTX131421