Archive for the 'Hyper-V' Category

Hyper-V Replica DC not working

I used the capabilities of Hyper-V 2012 to create a test environment that mirrors production systems (including a Domain Controller).  I couldn’t get the other computers in the environment to use the replica for authentication.  After a little more testing, I figured out that I couldn’t open AD DS on the DC. 

Lots of searching on various errors eventually led me to this blog post, where a simple registry edit solved my problem:

http://exchangeonline.in/windows-server-2012-naming-information-located-because-domain-exist-contacted/

Failed to power on with error the process cannot access the file

I have been moving some of my Hyper-V installs from 2008 R2 to 2012.  Because of the large number of VHDs attached to a particular VM, I elected to disconnect the disks from the old host, connect them to the new host, and then do an import.

I did this on a few smaller instances and it worked fine, but on this one I kept getting an error every time I tried to start the VM.  The error was “Failed to power on with error The process cannot access the file…” and the path to one of the many VHDs.

In the end, I removed the antivirus.  I believe the antivirus was slowing down access to all of the VHDs enough to cause a timeout when the VM tried to start.

Error (415) adding a host to SCVMM 2012 SP1

I kept having errors adding hosts to a VMM server, even though all of the prereqs were met.  

I received the following errors every time I tried to add the hosts:

Error (415)
Agent installation failed copying C:\Program Files\Microsoft System Center 2012\Virtual Machine Manager\agents\I386\3.1.6011.0\msiInstaller.exe to \\<hostname>\ADMIN$\msiInstaller.exe.
The specified network name is no longer available

Recommended Action
1. Ensure <Hostname.FQDN> is online and not blocked by a firewall.
2. Ensure that file and printer sharing is enabled on <Hostname.FQDN> and it not blocked by a firewall.
3. Ensure that there is sufficient free space on the system volume.
4. Verify that the ADMIN$ share on <Hostname.FQDN>exists. If the ADMIN$ share does not exist, reboot <Hostname.FQDN> and then try the operation again.

Warning (10444)
The VMM management server was unable to impersonate the supplied credentials.

Recommended Action
To add a host in a disjointed domain namespace, ensure that the credentials are valid and of a domain account. In addition, the SCVMMService must run as the local system account or a domain account with sufficient privileges to be able to impersonate other users.

This took me much longer than the 5 minutes it should have taken to figure out. 

Basically, we have two links to the remote hosts.  Traffic to that remote site is routed differently depending on which subnet it comes from.  Also, we have a VLAN that is specifically set aside for switch management.  Once I moved the VMM server to a VLAN that was NOT restricted, the hosts added just fine.

If that isn’t your issue, but you get the Error (415) above, there is a knowledge base article that says you may have to enable the File Server role first on a 2012 host.
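Since the agent install copies a file to the ADMIN$ share, a quick way to check the first two recommended actions is to test from the VMM server whether the host answers on TCP port 445 (SMB).  The following is only a minimal sketch: the function name is made up for illustration, and the host name in the comment is hypothetical.  A failure here would point toward routing, VLAN, or firewall trouble rather than anything on the host itself.

```python
import socket


def can_reach_smb(host, port=445, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


# Example call (hypothetical host name):
#   can_reach_smb("hyperv-host01.example.local")
```

Run it from the VMM server itself, because the whole point is to test the path that the agent installer will use.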

Hyper-V host blank black screen

I recently had a problem with a couple of IBM Blades that I was trying to deploy as Hyper-V hosts.  I use a replay volume from our Compellent storage to create an image for my blades.  Basically, I install one, sysprep it, and then copy the volume and mount the copy as the boot volume for each of my blades.  The most recent hosts that I tried this technique with would boot, but once they made it into Windows, the screen would go black and there would be no way to interact with the machine other than turning it off.

This happened on two blades in two chassis so I assumed that it must be the image.  I made a new image, and it worked just fine, until I installed the Hyper-V role.  Once I installed the Hyper-V role, the machine exhibited the same behavior as above. 

With a short amount of searching, I came across this: http://social.technet.microsoft.com/Forums/en-US/winserverhyperv/thread/61fd5b0d-9d15-4f74-a970-7aafe491ef67

I have actually seen this mentioned before for something else (I just don’t recall what), but basically the problem was that the two processors in each blade were different revisions.  The simple solution we employed was to swap a processor between the blades so that the two processors in each one matched.  Now all is well in the land of Hyper-V.  At least for the moment…

SCVMM and P2V Adventures

Where I work, we have been using Microsoft Virtualization since Virtual Server was in Beta.  Of course, we don’t necessarily use all of the functions and features of all the software we have, but one feature that I have used a good bit is the “Convert physical server” action in System Center Virtual Machine Manager.  Until recently, I have used this with great success.  We run IBM xSeries servers and I have converted something like 50 of them to virtual machines running on Hyper-V over the past several years. 

In late 2007, we bought our first IBM Blade Center (which I am very happy with) and with that move we also decided to do “boot from SAN” for all of our blades.  Just seemed to make sense that we wouldn’t put moving parts in a device that was designed to run so well without moving parts. 

At the time, we were implementing a new ERP system and several “hanger on” type applications, and Hyper-V (virtualization in general) wasn’t something that was supported by a lot of the software we were deploying.  So we have a lot of powerful blade servers, running a lot of low use applications.  I have managed to eradicate several of those wasteful installations, but there are a set that I am only now getting buy-in to virtualize. 

And today’s adventure begins with a Windows Server 2003 SP2 machine installed Boot from SAN on an IBM HS21-XM Blade server.

First attempt:

1.  Convert physical server

2.  Virtual machine name

3.  Scan System

Looks good..

4. Conversion options

we can try the defaults..

5.  Specify the processor and memory… 

6.  Select the host, path, network, start options, etc..

7.  The job starts, the machine gets copied over, and …

That try resulted in a blue screen loop.. 

Ok… time to try the Offline conversion:

1. Proceed as above but select the Offline conversion option at step 4.

2.  hmm..  conversion warnings… must correct to proceed..

Warning (13246)
No compatible drivers were identified for the device: Broadcom BCM5708S NetXtreme II GigE (NDIS VBD Client). The offline physical-to-virtual conversion requires a driver for this device.

Device Type: network adapter
Device Description: Broadcom BCM5708S NetXtreme II GigE (NDIS VBD Client)
Device Manufacturer: Broadcom Corporation
Hardware IDs (listed in order of preference):
B06BDRV\L2ND&PCI_16AC14E4&SUBSYS_03271014&REV_12

Compatible IDs (listed in order of preference):
B06BDRV\L2ND&PCI_16AC14E4&SUBSYS_03271014
B06BDRV\L2ND&PCI_16AC14E4
B06BDRV\L2ND

Recommended Action
Create a new folder under C:\Program Files\Microsoft System Center Virtual Machine Manager 2008 R2\Driver Import on the Virtual Machine Manager server and then copy the necessary 32-bit Windows Vista driver package files for this device to the new folder. The driver package files include the driver (.sys) and installation (.inf and .cat) files. Check the device manufacturer’s website for the necessary drivers.

We don’t really need to do that, right…

Had some trouble with that part…  finally figured out that the drivers that need to be placed in that folder are the “RIS” drivers. 

Try number 3 (or 30, I lost count)…

1. Proceed as try number 2, ignore warning because we did put the driver in there, and

Blue screen loop…

Hmm… maybe this is just not meant to be.  Did some more searching and found this article:

http://blogs.msdn.com/b/robertvi/archive/2009/10/07/after-installing-hyper-v-integration-services-on-the-next-reboot-the-vm-displays-bsod-0x0000007b.aspx 

Basically, there are some people seeing the exact same blue screen that I was seeing, except this was after the install of updated integration components.  But I wasn’t installing integration components yet… or was I?

Ok, so maybe it was getting that far and just “blowing up” after the install of the components.  The good thing about this being a P2V is that I can go back to the source machine pretty easily and check the registry.

Looks like we may have an answer here.  Change the HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Wdf01000\Group entry to be WdfLoadGroup instead of base.
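If you would rather import the change than edit it by hand, the registry edit can be written out as a .reg file.  This is just a sketch: the helper name is my own, and it assumes Group is a plain string (REG_SZ) value.  Back up the key before importing anything.

```python
def reg_sz_patch(key_path, value_name, data):
    """Render the text of a .reg file that sets a single string (REG_SZ) value."""
    def esc(text):
        # Inside quoted .reg strings, backslashes and double quotes are escaped.
        return text.replace("\\", "\\\\").replace('"', '\\"')

    return "\n".join([
        "Windows Registry Editor Version 5.00",
        "",
        "[" + key_path + "]",
        '"' + esc(value_name) + '"="' + esc(data) + '"',
        "",
    ])


WDF_KEY = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Wdf01000"
print(reg_sz_patch(WDF_KEY, "Group", "WdfLoadGroup"))
```

Save the printed text as a .reg file and import it on the source machine before running the conversion again.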

My guess is that this would have worked even with the online conversion option.

“netvsc” error in Hyper-V guest

We use Citrix Presentation Server for a number of applications, and lately we have had a significant increase in issues with one set of our Citrix servers.  We have 3 main sets of Citrix servers and the problems have only been happening on one set. 

One of the sets doesn’t have this error, and wouldn’t, because its servers are physical.  They have been in production a long time, and we have plans to virtualize them.

The second set doesn’t get the errors, but it is fewer servers and fewer users.

The third set:

    • is virtual
    • runs on 2008 R2 Hyper-V
    • has more servers (6 as opposed to 4 or 5 for the other two)
    • supports more users and more users per server (averages around 20 users per server during business hours)

Around November, we started upgrading our hosts from 2008 to 2008 R2.  The problems have been getting progressively worse, peaking in the last 2 months.  Our last 2008 host was converted in March.

After some event log review, we were able to correlate some of the issues to the following error in the event log:

Event Type:    Warning
Event Source:    netvsc
Event Category:    None
Event ID:    5
Date:        4/19/2010
Time:        3:49:53 PM
User:        N/A
Computer:    <ServerNameChangedToProtectTheGuilty>
Description:
The miniport ‘Microsoft Virtual Machine Bus Network Adapter #4’ hung.

For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.
Data:
0000: 00 00 00 00 02 00 52 00   ......R.
0008: 00 00 00 00 05 00 00 80   .......€
0010: 00 00 00 00 00 00 00 00   ........
0018: 00 00 00 00 00 00 00 00   ........
0020: 00 00 00 00 00 00 00 00   ........

and right behind that would be this message:

Event Type:    Information
Event Source:    netvsc
Event Category:    None
Event ID:    4
Date:        4/19/2010
Time:        3:49:53 PM
User:        N/A
Computer:    <ServerNameChangedToProtectTheGuilty>
Description:
The miniport ‘Microsoft Virtual Machine Bus Network Adapter #4’ reset.

For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.
Data:
0000: 00 00 00 00 02 00 52 00   ......R.
0008: 00 00 00 00 04 00 00 40   .......@
0010: 00 00 00 00 00 00 00 00   ........
0018: 00 00 00 00 00 00 00 00   ........
0020: 00 00 00 00 00 00 00 00   ........

After doing a bit of searching and getting a lot of nothing, and doing some on site troubleshooting without much luck, I finally broke down and called Microsoft.  I spent a day e-mailing back and forth with someone who was suggesting that I try all the things that I had already tried, so I contacted our TAM and had the case escalated. 

The technician then informed me that there was an internal hotfix that had not been fully tested yet, that related to my issue.  It seems that in 2008 R2 Hyper-V guests running Server 2003, the network adapter will hang and then reset under heavy load.  The hotfix has to be applied to the host and then the integration services on the guest have to be updated.  In my environment, when I updated, I had to remove the integration services from the guest before the updated NIC driver would install.  I reported this behavior to the technician I was working with, but he said that he couldn’t reproduce that particular problem and that he had no issues updating his test environment.

It is my understanding that the hotfix will be released under KB981836.  When you install this, it changes the integration services version from 6.1.7600.16385 to 6.1.7600.20683.  You can see this if you look at the driver version on the guest NIC.
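If you have a lot of guests to check, comparing the driver version against the hotfix version is easy to script.  Here is a minimal sketch; how you collect the version strings from each guest is up to you, and the function names are just for illustration.

```python
def parse_version(text):
    """Turn a dotted version such as '6.1.7600.20683' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


# Integration services version reported after the hotfix, per the text above.
HOTFIX_VERSION = parse_version("6.1.7600.20683")


def has_hotfix(driver_version):
    """Return True if the guest NIC driver is at or above the hotfix version."""
    return parse_version(driver_version) >= HOTFIX_VERSION


print(has_hotfix("6.1.7600.16385"))  # original 2008 R2 version
print(has_hotfix("6.1.7600.20683"))  # version after the hotfix
```

Comparing tuples of integers avoids the trap of comparing the versions as strings, which can sort incorrectly when the parts have different numbers of digits.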

Error installing Integration Services on Hyper-V VM

I was trying to change from a “Legacy Network Adapter” to a “Network Adapter” on one of my Hyper-V VMs.   I added the “Network Adapter” and removed the “Legacy Network Adapter”, and started the machine up.  When the machine came up, it wouldn’t connect to the network.  Having seen this before I knew exactly how to fix it.  Run the Integration Services install again.  I did that, and got:

An error has occurred: One of the update processes returned error code  61658

So, being the smart guy that I am, I rebooted and tried again.  Same result.  The hits I got on Google suggested that I was still running a CTP or Beta, but I am running the RTM version (2008 Hyper-V, but not R2 on this one).

I was logging into this machine with a Domain account, but it wasn’t on the network.  We use some restrictions on the server desktops and redirect application settings and such, so I thought that might be related.

Here is what worked:

I added back the Legacy Network Adapter (leaving the non-Legacy adapter as well)

I installed integration services again (and it worked just fine)

I removed the Legacy Network Adapter

Now the machine is working just as expected.

Hyper-V and DPM – Some issues that you may see

We have several (15 or so) Hyper-V hosts running a number (126 or so) of guests.  We use DPM to back up our servers, but only a few of our VMs are backed up at the host level.  Most are backed up as regular clients.  I have been having trouble with a couple of the ones that we do back up at the host level and just got around to looking for the answer to what is going on.  Lucky for me, I waited long enough for the Core Team to come up with some suggestions:

Ask the Core Team : DPM 2007 – Troubleshooting protection for Hyper-V

This post is about Windows Server 2008 with the Hyper-V role installed, that are being protected by System Center Data Protection Manager 2007.  There may be one or many Virtual Machines on each Host/Parent Partition, and they may be running Windows 2003 and/or Windows 2008. 

Ask the Core Team : DPM 2007 – Troubleshooting protection for Hyper-V

Time Travel?

Evidently, Windows Server 2003, running on Hyper-V, is confused and thinks it must be time traveling.  Read below for the details…

(Event ID 1054)
Windows cannot obtain the domain controller name for your computer network. (An unexpected network error occurred.). Group Policy processing aborted.

The customer explained to us that if he removes one of the Hyper-V virtual processors from his Windows Server 2003 Guest, the issue goes away. Based on this statement we asked the customer to gather a userenv log while forcing a group policy refresh with the additional virtual processor enabled, and this what we saw during the initial ping test before we process group policy:

USERENV(15c.858) 15:55:09:080 PingComputer: First time: 2069
USERENV(15c.858) 15:55:09:080 PingComputer: Second time: 2069
USERENV(15c.858) 15:55:09:080 PingComputer: First and second times match.
USERENV(15c.858) 15:55:09:080 PingComputer: First time: 2069
USERENV(15c.858) 15:55:09:080 PingComputer: Second time: -2069
USERENV(15c.858) 15:55:09:080 PingComputer: First time: -2069
USERENV(15c.858) 15:55:09:080 PingComputer: Second time: 0
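To check your own userenv log for the same symptom, you can scan it for negative PingComputer readings.  This is only a sketch that assumes the log lines look like the ones quoted above; the sample lines are copied from that excerpt, and the function name is made up.

```python
import re

# Matches lines such as "... PingComputer: Second time: -2069".
PING_RE = re.compile(r"PingComputer: (First|Second) time: (-?\d+)")


def negative_ping_times(log_lines):
    """Return every negative PingComputer reading found in the log lines."""
    values = []
    for line in log_lines:
        match = PING_RE.search(line)
        if match and int(match.group(2)) < 0:
            values.append(int(match.group(2)))
    return values


SAMPLE_LOG = [
    "USERENV(15c.858) 15:55:09:080 PingComputer: First time: 2069",
    "USERENV(15c.858) 15:55:09:080 PingComputer: Second time: 2069",
    "USERENV(15c.858) 15:55:09:080 PingComputer: First and second times match.",
    "USERENV(15c.858) 15:55:09:080 PingComputer: First time: 2069",
    "USERENV(15c.858) 15:55:09:080 PingComputer: Second time: -2069",
    "USERENV(15c.858) 15:55:09:080 PingComputer: First time: -2069",
    "USERENV(15c.858) 15:55:09:080 PingComputer: Second time: 0",
]

print(negative_ping_times(SAMPLE_LOG))  # prints [-2069, -2069]
```

Any negative value in that list means the guest measured a ping that appeared to finish before it started.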

We have a knowledgebase article that pertains to this issue on servers that uses dual-core or multiprocessor AMD Opteron processors:

938448 A Windows Server 2003-based server may experience time-stamp counter drift if the server uses dual-core AMD Opteron processors or multiprocessor AMD Opteron processors

http://support.microsoft.com/default.aspx?scid=kb;EN-US;938448

Now in the case of our customer they were not running an AMD Processor server so they felt this resolution did not apply to them. Even though the article did not apply to the type of processor in their servers, the behavior was identical so we applied the resolution outlined in the knowledgebase article and this resolved the customer’s issue. I am in the process of having a knowledgebase article created to specifically address this issue with Windows Server 2003 virtual machines running in Hyper-V.

So we did a little digging and found the following blog post from Tony Voellm, who is a Principal Software Test Engineer in the Windows Kernel development team:

Negative ping times in Windows VM’s – whats up?

http://blogs.msdn.com/tvoellm/archive/2008/06/05/negative-ping-times-in-windows-vm-s-whats-up.aspx

The following is from the above blog post:

If you see negative ping times in multiprocessor W2k3 guest OSes you might consider setting the /usepmtimer in the boot.ini file.

The root issue comes about from the Win32 QueryPerformanceCounter function.  By default it uses a time source called the TSC.  This is a CPU time source that essentially counts CPU cycles.  The TSC for each (virtual) processor can be different so there is no guarantee that reading TSC on one processor has anything to do with reading TSC on another processor.  This means back to back reads of TSC on different VP’s can actually go backwards. Hyper-V guarantees that TSC will not go backwards on a single VP.

So here the problem with negative ping times is the time source is using QueryPerformanceCounter which is using TSC.  By using the /usepmtimer boot.ini flag you change the time source for QueryPerformanceCounter from TSC to the PM timer which is a global time source.
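For anyone with more than a handful of guests to fix, here is a rough sketch of adding the flag to each entry under [operating systems] in a boot.ini.  The sample file contents are a typical layout that I made up for illustration, not taken from a real server, and the function only transforms text.  Reading and writing the real file (which is normally hidden and read-only) is left out on purpose.

```python
def add_usepmtimer(boot_ini):
    """Append /usepmtimer to each OS entry that does not already have it."""
    result = []
    in_os_section = False
    for line in boot_ini.splitlines():
        stripped = line.strip()
        if stripped.startswith("["):
            # Track whether we are inside the [operating systems] section.
            in_os_section = stripped.lower() == "[operating systems]"
        elif in_os_section and "=" in stripped and "/usepmtimer" not in stripped.lower():
            line = line.rstrip() + " /usepmtimer"
        result.append(line)
    return "\n".join(result)


# Hypothetical boot.ini contents for a Windows Server 2003 guest.
SAMPLE_BOOT_INI = "\n".join([
    "[boot loader]",
    "timeout=30",
    r"default=multi(0)disk(0)rdisk(0)partition(1)\WINDOWS",
    "[operating systems]",
    r'multi(0)disk(0)rdisk(0)partition(1)\WINDOWS="Windows Server 2003" /noexecute=optout /fastdetect',
])

print(add_usepmtimer(SAMPLE_BOOT_INI))
```

The change only takes effect after the guest is rebooted, and running the function a second time leaves the text unchanged.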

Ask the Directory Services Team : Userenv 1054 events as a result of time-stamp counter drift on Windows Server 2003 guests running in Hyper-V

P2V fails at Copy Hard Disk

I have been trying to get a P2V of a production system to use in our DR plan.  I have limited opportunity to do this, because I am not allowed to impact performance during production hours for this system, and the definition of production hours is fairly broad.  I have been trying for a couple of months to get this figured out.

We have our regularly scheduled maintenance once a month on the third Thursday of the month.  This is pretty awesome in that we are at liberty (most months) to take everything down from 6PM until 6AM.  I look at it as giving the company an evening off. 🙂

So, that being tonight, I had it in my mind that I was going to beat the OAS boxes.  (Oracle Application Servers, part of our new JD Edwards ERP system.)  They are an interesting setup, because they are using Apache, which as great as it may be, isn’t something I have much experience with.  They have a loopback adapter for use with the load balancing setup that they are in.  The load balancing is performed using our Cisco switches, which as great as they are, I don’t know very much about.  All in all, they are pretty complicated to troubleshoot in this case, because there are so many pieces that I am not completely familiar with. 

Such is life…

Anyway,  after a lot of hunting and a lot of posting in forums, I found an event that actually led to a solution. I probably should have found this before, and maybe I did, but didn’t pay enough attention… 

These are the exact symptoms that I had, and the errors in the event log were there, but the machine that I am trying to convert is a Windows 2003 Server, not Windows XP:

The P2V process fails at 40% when you try to run the P2V process by using Microsoft System Center Virtual Machine Manager 2008 on a source computer that is running Windows XP

You use Microsoft System Center Virtual Machine Manager 2008 to run the Physical-to-Virtual (P2V) process on a source computer that is running Windows XP. However, the process fails at 40% complete, and the following error is logged in the event log on the computer that has System Center Virtual Machine Manager (SCVMM) 2008 installed:

Type:		Warning
Date:		<Date>
Time:		<Time>
Event:		1706
Source:		Virtual Machine Manager
Category:	None
Computer:	<Computer Name>
Event Msg:	Job 7bfcd14a-884e-4a71-9984-3274622adeb7 (Physical-to-virtual conversion) failed to complete. 7bfcd14a-884e-4a71-9984-3274622adeb7 Physical-to-virtual conversion TaskFailed    

Additionally, you will find the following error logged in the event log on the source computer:

Type:		Error
Date:		<Date>
Time:		<Time>
Event:		15005
Source:		HTTP
Category:	None
Computer:	<Computer Name>
Event Msg:	Unable to bind to the underlying transport for 0.0.0.0:443. The IP Listen-Only list may contain a reference to an interface which may not exist on this machine.  The data field contains the error number.
Data:
 00 00 04 00 02 00 52 00 00 00 00 00 9D 3A 00 C0		 . . . . . . R . . . . . . . . À
 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00		 . . . . . . . . . . . . . . . .
 00 00 00 00 00 00 00 00 43 00 00 C0				 . . . . . . . . C . . À
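Since the message says the data field contains the error number, it helps to read the dump as little-endian 32-bit values.  The sketch below does only that and splits each value into its top two severity bits and its low 16 bits.  Which position holds which field is my own reading of this one dump, not something the article states: the fourth value carries 15005 (the event ID) in its low 16 bits, and the last value comes out as 0xC0000043, which I read as STATUS_SHARING_VIOLATION (worth confirming against an NTSTATUS reference).

```python
import struct

SEVERITY = {0: "success", 1: "informational", 2: "warning", 3: "error"}


def dwords_le(hex_bytes):
    """Decode a space-separated hex dump into little-endian 32-bit values."""
    raw = bytes.fromhex(hex_bytes)
    count = len(raw) // 4
    return list(struct.unpack("<%dI" % count, raw[:count * 4]))


def describe(code):
    """Show a status-style value with its severity bits and low 16 bits."""
    return "0x%08X severity=%s low16=%d" % (code, SEVERITY[code >> 30], code & 0xFFFF)


# Data bytes copied from the HTTP event 15005 quoted above.
DATA = (
    "00 00 04 00 02 00 52 00 00 00 00 00 9D 3A 00 C0 "
    "00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "
    "00 00 00 00 00 00 00 00 43 00 00 C0"
)

words = dwords_le(DATA)
print(describe(words[3]))   # 0xC0003A9D severity=error low16=15005
print(describe(words[-1]))  # 0xC0000043 severity=error low16=67
```

The same helper can be pointed at the Data section of other driver-logged events to pull out the code without doing the byte swapping by hand.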

The P2V process fails at 40% when you try to run the P2V process by using Microsoft System Center Virtual Machine Manager 2008 on a source computer that is running Windows XP