My wife has the amazing ability to pack huge quantities of "essential items" into every kitchen cupboard, nook and cranny. This is usually fine, even if things are sometimes hard to find, but occasionally it results in a minor avalanche, as upper shelves deluge their contents onto the unsuspecting cupboard user - usually me.
Last night my wife left a high cupboard open and the avalanche began a short time later. A pair of plastic freezer boxes slid off some cookbooks and onto the top of the microwave, where they caught the edge of the egg basket. The egg basket flipped over and catapulted 3 eggs into the air to land in the middle of the floor.
Because the freezer boxes landed on something soft and the noises in the house masked the rest, all we heard was the slightest clatter - which engendered none of the usual urgency we all feel when something in the kitchen falls. We returned to the kitchen to see a cupboard door open, the egg basket upside down, the freezer boxes on the unit, and 3 eggs in the middle of the floor.
I was reminded of that children's game Mouse Trap, where the players start by constructing a complicated set of interconnecting levers, rails and ratchets that form an unlikely machine for trapping the mouse. Our scenario seemed equally unlikely, but there was no other possible explanation for the flying eggs.
Thursday, 2 Jun 2005
Wednesday, 15 Jun 2005
Well, the in-place upgrade went fine, the biggest delay on the day being the closure of the end of the M40 (I was avoiding the A4 as Chelsea had a match and parade on that day). We eventually started the upgrade at 10am.
During the day and night we worked at a relaxed pace, having planned the whole task in some detail. By about 11pm we had four new Windows 2003 SP1 Domain Controllers in place and replicating perfectly. DNS was flawless and with the infrastructure behaving we started on configuration of the GPOs.
Once the GPOs were complete, the AD was ready for testing. We logged on to a few machines and found extremely slow logon performance - sometimes minutes before the desktop would appear. This was bad.
After about 30 minutes of testing, Tim spotted an entry in DNS that shouldn't have been there. Earlier on, I had moved the FSMO roles off the server we used to do the in-place upgrade and ran DCPROMO to remove AD from it. The decommissioning process had gone perfectly, but it had left LDAP and Kerberos SRV records for the old server all over DNS, and these were causing the slow logons: clients were being pointed at a domain controller that no longer existed. After 10 minutes of cleaning up the DNS database we retested and logon performance was back to expectations.
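For anyone facing the same clean-up, the check we did by eye can be sketched in a few lines of Python. This is only an illustration, not what we ran on the night: the record list, the host names and the `find_stale_srv` helper are all made up for the example, and it assumes you have already exported the SRV records from DNS by some other means.

```python
from typing import Iterable, List, NamedTuple


class SrvRecord(NamedTuple):
    name: str    # SRV owner name, e.g. "_ldap._tcp.example.local"
    target: str  # host name the record points at


def find_stale_srv(records: Iterable[SrvRecord],
                   live_dcs: Iterable[str]) -> List[SrvRecord]:
    """Return LDAP/Kerberos SRV records whose target is not a live DC."""
    # Compare host names case-insensitively and ignore a trailing dot.
    live = {host.rstrip(".").lower() for host in live_dcs}
    return [
        rec for rec in records
        if rec.name.lower().startswith(("_ldap.", "_kerberos."))
        and rec.target.rstrip(".").lower() not in live
    ]


# Hypothetical data: "olddc" stands in for the demoted server.
records = [
    SrvRecord("_ldap._tcp.example.local", "dc1.example.local."),
    SrvRecord("_ldap._tcp.example.local", "olddc.example.local."),
    SrvRecord("_kerberos._tcp.example.local", "OLDDC.example.local."),
    SrvRecord("_kerberos._tcp.example.local", "dc2.example.local."),
]
stale = find_stale_srv(records, ["dc1.example.local", "dc2.example.local"])
for rec in stale:
    print(f"stale: {rec.name} -> {rec.target}")
```

Anything the function returns is a candidate for deletion, but it is worth checking each one by hand before removing it.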
Back to testing and we discovered that although my machine was able to administer the AD with the adminpak tools, almost none of the others could. The error mentioned the server not being operational, which is usually a sign that the PC cannot see the server on the relevant ports. I disabled the proxy client on the PC in question and voilà! Why the proxy client should choose to redirect LDAP traffic to the proxy server instead of allowing it to pass to the internal network is beyond me - especially as all the IP ranges were correctly configured in the LAT on the proxy server. The client was running Proxy 2 though, so who knows how it might react to AD. The newly installed DNS services allow DNS lookups to the internet, so the proxy client was no longer necessary and was disabled. Another little problem sorted.
The upgrade and testing were finally complete at about 5am and had taken around 19 hours all in. We settled back to wait for the helpdesk to get in and do the support handover before we went home for some much-needed sleep. By 8am I was the only one of the team left in and so was rather worried when reports started coming in that the network was down and, worse, people thought the upgrade was responsible. This was the nightmare scenario and was inexplicable - it had been working perfectly since about 11pm the previous night.
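A quick way to test the "cannot see the server on the relevant ports" theory from a client is to try a plain TCP connection to each port. The sketch below is a rough illustration rather than the tool we used: the `port_open` and `check_dc` helpers are my own names, and the port list (Kerberos 88, LDAP 389, global catalog 3268) is just the usual well-known set, which may not match every environment.

```python
import socket
from typing import Dict

# Well-known TCP ports a client typically needs to reach on a domain controller.
AD_PORTS = {"kerberos": 88, "ldap": 389, "global catalog": 3268}


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


def check_dc(host: str,
             ports: Dict[str, int] = AD_PORTS,
             timeout: float = 2.0) -> Dict[str, bool]:
    """Probe each named port on host and report which ones are reachable."""
    return {name: port_open(host, port, timeout) for name, port in ports.items()}
```

Calling `check_dc("dc1.example.local")` (a made-up host name) from a working PC and a failing one and comparing the results should show quickly whether something on the client, such as a proxy client, is getting in the way. A successful connection only proves the port is reachable, not that the service behind it is healthy.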
The situation quickly worsened and it was plain I wasn't going to be able to go home for a while. Every part of the network was down and it was as if all the switches had been stolen. It was 9am before one of the network team started to investigate and reported duplicate IP addresses and massive amounts of broadcast chatter being logged at the switches. More investigation, and the network supplier was called in. By the end of the day the network had been up and down 3 times and we were still none the wiser. A few overzealous techies even tried switching off the new Domain Controllers to no avail. The problem went away overnight and did not return, and we still do not know why it happened. Even after extensive research and reading about a recent similar issue at the New York Stock Exchange, the only theory I can come up with is misconfigured port grouping on one of the switches, causing a packet storm of such intensity that all bandwidth was used on all switches. Frightening.
Since then, the AD has continued to perform correctly and the new services are certainly appreciated by the users. The next step is to deal with the aftermath of the upgrade and start working on integration with other systems to finally get the business benefits AD promises.

