November 28, 2010

Moving an ESX host from VC 2.5 to VC 4.1 fails with "Failed to install the VirtualCenter Agent Service"

I was preparing to test an upgrade from ESX 3.0 to ESX 4. I had three hosts (ESX 2.0 to ESX 3.0) to remove from the existing VC 2.5 and join to our new VC 4.1. Two of them completed successfully, but one failed partway through, and VC showed the error message "Failed to install the VirtualCenter Agent Service".

It turned out that an important directory needed for the agent installation was missing. Once it was created, the host could be added to VC 4.1 successfully. An article on VMwolf describes this.

In case you are wondering, I tried to upgrade one of the 3.0 hosts using the vSphere Host Update Utility, but it failed. The console showed errors while trying to find some device UUIDs, which I could not trace on the host. Initially I thought it could be due to phantom SAN connections. Luckily I could still boot up the old ESX 3.0 OS, so I removed all the SAN-connected datastores and even unplugged the FC connections, but the upgrade still failed.

Problems accessing a shared folder using UNC path with alias name (Part 2)

This is a follow-up to my earlier post on issues accessing a server share through a UNC path that uses an alias.

Windows XP SP2 and Windows 2003 SP1 introduced a new loopback check security feature that is meant to prevent reflection attacks on a computer. What this means is that an authentication attempt made to the computer fails if the name used to reach it does not match the server's hostname.

While the intention is good, it also caused share access through an alias to fail with "Access denied" or "No network provider accepted the given network path" errors. On a few separate occasions, application teams or the DBA team reported such errors to me.

In the end we disabled this feature through registry settings. This is not as secure as registering each alias that references the server, but doing that across all the servers in our domain would pose a big administrative challenge.
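The trade-off between the two approaches can be sketched in Python. The post does not name the registry values involved; the two used below, DisableLoopbackCheck and BackConnectionHostNames, are the ones Microsoft documents for the loopback check, and it is an assumption that they match the exact edits made here. The function name and the sample aliases are hypothetical, and the sketch only describes the change rather than writing to the registry.

```python
# Assumed registry location for loopback check settings (not quoted in the post).
LSA_KEY = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Lsa"


def loopback_check_change(aliases=None):
    """Describe the registry value to set for a server.

    With a list of aliases, return the per-alias option (more secure,
    but it has to be maintained on every server). Without aliases,
    return the blanket option that turns the check off.
    """
    if aliases:
        return {
            "key": LSA_KEY + r"\MSV1_0",
            "name": "BackConnectionHostNames",
            "type": "REG_MULTI_SZ",
            # One entry per alias, de-duplicated case-insensitively.
            "data": sorted({alias.lower() for alias in aliases}),
        }
    return {
        "key": LSA_KEY,
        "name": "DisableLoopbackCheck",
        "type": "REG_DWORD",
        "data": 1,
    }


if __name__ == "__main__":
    # Hypothetical aliases, for illustration only.
    print(loopback_check_change(["filealias", "sqlalias.example.local"]))
    print(loopback_check_change())
```

The per-alias form needs the full list of names for each server, which is where the administrative cost mentioned above comes from.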

Problems accessing a shared folder using UNC path with alias name (Part 1)

This is a two-part problem: it was originally resolved by Microsoft but later reintroduced as a security feature.

Previously I encountered some issues where users were unable to connect to a Windows 2003 server share using \\alias\share. The error was "A duplicate name exists on the network". However, the normal \\servername\share worked fine. After some digging, I found that the server was not listening on the alias name and hence was not accepting any connections to that name, even though the alias pointed to the server itself. The issue is resolved with some registry edits specified below.

This issue only occurs on servers using the SMB 1.0 protocol (2000/XP/2003). The SMB 2.0 protocol has been rolled out on Vista/2008/7, and hence the registry edits are not required there.

Why can't I see the unconnected devices (e.g. network adapters) in Device Manager?

I bumped into this issue while I was using VMConverter to move a VM from my ESX 3.5 server to another ESX 4 host (there were some other issues along the way, which I will probably cover in a separate post on upgrading VMs).

I had upgraded the VM to Virtual Machine version 7, and the old virtual NIC no longer showed up in Network Connections. I added a new virtual NIC and tried to assign the old IP, but kept receiving an error that the same IP was already assigned to an existing NIC. I could have forced it to accept the address, because I knew the old NIC was already disabled and redundant as a result of the VM upgrade, but to make a clean upgrade I still wanted to remove the old NIC completely. Also, we previously had some issues with lingering old NIC drivers that caused instability on the server.

Removal was not possible because I could not see the old NIC in Device Manager, even after turning on hidden devices. It turns out that this is by design: Microsoft prevents "phantom" or disconnected devices from showing up in Device Manager.
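The excerpt stops before the workaround. A commonly documented approach, which may or may not be the one the full post describes, is to set the environment variable devmgr_show_nonpresent_devices to 1 before starting Device Manager, and then enable View > Show hidden devices. The Python sketch below assumes that approach; the function names are hypothetical, and the launch step only works on Windows.

```python
import os
import subprocess
import sys


def env_showing_nonpresent_devices(base_env=None):
    """Return a copy of the environment with the Device Manager flag set."""
    env = dict(os.environ if base_env is None else base_env)
    # Assumed setting: tells Device Manager to list devices that are
    # installed but no longer present (e.g. a removed virtual NIC).
    env["devmgr_show_nonpresent_devices"] = "1"
    return env


def open_device_manager():
    """Start Device Manager with the flag set (Windows only)."""
    if sys.platform != "win32":
        raise RuntimeError("Device Manager is only available on Windows")
    # The console inherits the modified environment and passes it on.
    subprocess.Popen(
        ["cmd", "/c", "start", "devmgmt.msc"],
        env=env_showing_nonpresent_devices(),
    )


if __name__ == "__main__":
    if sys.platform == "win32":
        open_device_manager()
    else:
        print(env_showing_nonpresent_devices({})["devmgr_show_nonpresent_devices"])
```

With the flag set, the old NIC should appear greyed out under Network adapters once hidden devices are shown, and it can then be uninstalled from there.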

November 27, 2010

Windows Update on 2008 failed for KB967723 with error code 80070490

We have been rolling out Windows 2008 servers in our domain since last year. Patching has generally been smooth, apart from the new Windows Update GUI, which sometimes requires a few manually initiated scans before new pending updates from Microsoft are detected (we patch all new servers with the latest Microsoft updates before adding them to our WSUS).

A few months back, while setting up a new 2008 server, I encountered one update, KB967723, which kept failing to install even after a few reboots and retries. The error code was 80070490.

As this update is listed as critical and fixes vulnerabilities in TCP/IP processing, it is important to apply it to avoid leaving any security loopholes. After some research, it seems that the update fails because of the way Windows Automatic Update in Windows 2008 runs this patch. Downloading the patch and running it manually fixes the issue, though Microsoft did not list any known problems in Windows Automatic Update for this patch.

My console screen is black!!

This is the second time this issue has occurred, so I thought I should document it. Both times, when I tried to RDP into a server, the screen was almost black except for small parts where I could still make out that I was at the login screen. Checking the console showed the same black screen. However, if you had keyed in your ID and password before initiating the logon, or you key in your password on the black screen, you can still press "Enter" and log on successfully.

Checking online, I found that others had similar issues but no good solutions, until we stumbled across a post where someone suggested checking some registry settings. That proved to be the cause, and after reverting the settings, everything was OK.

It seems that if the OS drive gets too full, the registry settings for the logon screen colour somehow get corrupted. That is what happened in my case: I was in fact trying to RDP into the server to clean up disk space when I ran into the black screen.

VMware discussion

Today I had the chance to join a discussion with some VMware engineers. The topics were monitoring, capacity planning and backup. It was a fruitful discussion and I learned quite a lot of new stuff, particularly about new offerings from VMware.

VMware Alive - This tool is supposed to monitor the health and capacity of the ESX environment and flag any issues before host or VM performance is impacted. We were told this tool is a new offering from VMware. I had never heard of it before, but a quick Google search shows it is a product of the company Integrien, which was recently acquired by VMware. Looks like a cool tool, but too bad it is not free.

VMware Application Discovery Manager (ADM) - This tool provides real-time monitoring of applications. I was told it monitors processes at the OS level and seems to be able to map out the dependencies of any application across the virtual environment. Yet another paid product.

November 23, 2010

Where did my gpmc.msc go to in Windows 2008?

Today I had an issue which required me to run the Group Policy Management Console on a Windows 2008 server. Unlike Windows 2003, where it has to be installed separately, Windows 2008 comes with this tool, but it is a "feature" that needs to be added.

Symptoms
  • Running the gpmc.msc command in Windows 2008 does not open the Group Policy Management Console

What does "Access this computer from the network" really do?

When I first started out, the different local security policies confused me, and the lengthy but useless Microsoft descriptions did not help much either.

One of the rights I frequently have to deal with is "Access this computer from the network". Sometimes users have issues accessing their servers. When told this right is needed, they ask what it is for. This is when the dreaded moment comes to explain it to users in layman's terms.

November 22, 2010

Why are my VMs not reporting to WSUS?

We have been deploying VMs in our ESX environment using VM templates without any issues. One day, however, we discovered that quite a few 2000 and 2003 servers were not reporting to the WSUS server. No matter how often we restarted the Windows Update services or forced a detection from the command line, the servers would still not be detected in WSUS.

After some troubleshooting, we discovered that only our VMs were having this problem. The physical servers were reporting properly. Furthermore, we discovered that the affected VMs all had the same SUSClientId value under this registry key:
[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\WindowsUpdate]
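A check for this condition can be sketched in Python. This is a minimal sketch and not the method used in the post: it assumes the SUSClientId value of each machine has already been collected into a dictionary (how it is collected is left out), and the hostnames and IDs in the sample are made up for illustration.

```python
from collections import defaultdict


def find_duplicate_client_ids(client_ids):
    """Return the SUSClientId values that are shared by more than one host.

    client_ids maps hostname -> SUSClientId string. The result maps each
    duplicated ID (normalised to lower case) to a sorted list of hosts.
    """
    hosts_by_id = defaultdict(list)
    for host, sus_id in client_ids.items():
        # Normalise so the same ID in a different case still matches.
        hosts_by_id[sus_id.strip().lower()].append(host)
    return {
        sus_id: sorted(hosts)
        for sus_id, hosts in hosts_by_id.items()
        if len(hosts) > 1
    }


if __name__ == "__main__":
    # Hypothetical inventory: two VMs cloned from one template, one physical box.
    sample = {
        "vm-app01": "11111111-aaaa-bbbb-cccc-000000000001",
        "vm-app02": "11111111-aaaa-bbbb-cccc-000000000001",
        "phys-db01": "22222222-aaaa-bbbb-cccc-000000000002",
    }
    for sus_id, hosts in find_duplicate_client_ids(sample).items():
        print(sus_id, "is shared by", ", ".join(hosts))
```

Machines that share an ID would look like a single client to WSUS, which is consistent with only the template-deployed VMs going missing.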

Yeah, it's UP!

Welcome to my new blog!

As a relatively new tech guy, I must say I had a great opportunity in landing in a big, hectic customer environment for my first job.

There, I get to come across many interesting or tough issues that always require time to troubleshoot and resolve. It is always satisfying to figure out what happened and the steps needed to resolve the issues, but more importantly, I have always wanted to set up a little techlog to jot down all these little experiences. The bookmarks and printed pages of documentation are helpful but are just not as portable as an online techlog. It took me some time to get going, but I am glad to have finally got this up.

Hence this (not so) technical blog will form my repository to store and maybe share my random experiences with close friends and colleagues.

Together with this blog, I have also bought a domain name (my name!), which I have set up to redirect to this page. I am still waiting to see the results; hopefully it will work.

While still waiting for the invisible www hand to propagate my tiny website name across the global DNS servers, I am as yet undecided whether to have this blog indexed and searchable on the big www. It may seem counter-intuitive to have a blog but not want it exposed on the www, but I really intend this for personal use, sort of like an online tech diary, and I am not one for attention, so this is something I will need to decide in the next few days.

But in the meantime, I shall get started on my first real tech post!