Wednesday, 16 January 2008
Imagine the scenario. You are called on your mobile in the middle of the night: apparently the system is down. You ping the server remotely and it responds fine, but when you try to connect to it you get nothing. You could reboot the server remotely through the remote power distribution unit, but you’re not sure what the problem is, so a drive out to site is required. When you get to site the server is happily blinking away but the console is frozen. By frozen I mean it does nothing; the screen is simply blank. The keyboard doesn’t work and neither does the mouse. You desperately try different ways of connecting to the server, but nothing happens. No web interface, no remote computer management, nothing.
Now you have to make a decision. That decision is most likely to be pressing the power button and holding it down to force the server off immediately. When the server comes back up it goes back to running quite happily. However, there is nothing in the event logs and no clue on the machine as to what the problem was. The machine hasn’t overheated, there is plenty of disk space and you have the latest patches and fixes installed. The users don’t have Internet access and their local machines are thin clients, so nothing untoward has gotten onto the machine. The only thing left is to wait until it happens again.
Over the years I have experienced a number of server freezes and have never satisfactorily got to the bottom of them, as the event logs have never yielded sufficient or consistent information about this intermittent problem. I have resorted to the old “switch it off and on again” routine, which is really frustrating but necessary, as users often don’t appreciate the server being down for hours while you investigate. Searching the Internet for clues seems to indicate that the cause may be memory faults or, even worse, memory slot faults. I have found that it is sometimes worth checking the positioning of the memory modules, then stripping them out and trying different ones, or fitting them one at a time. It is also worth trying the memory in different slots in case the fault is in a slot rather than a module. I do hope that one day I will get to the bottom of this irritating problem.
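The symptom described at the start (the server answers a ping but nothing will connect) can be checked from another machine by attempting a TCP connection to the services you care about. The following is only a minimal sketch of that idea, not something I have used to diagnose these freezes. The host name `server.example.local` and the choice of ports (3389 for Remote Desktop, 445 for file sharing, 80 for a web interface) are assumptions for illustration and would need changing to suit your own server.

```python
import socket
from datetime import datetime


def probe_tcp(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port completes within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def probe_ports(host, ports, timeout=3.0):
    """Probe each named port and return a dict of name -> reachable (bool)."""
    return {name: probe_tcp(host, port, timeout) for name, port in ports.items()}


if __name__ == "__main__":
    # Hypothetical host name and port list, for illustration only.
    HOST = "server.example.local"
    PORTS = {"rdp": 3389, "smb": 445, "http": 80}

    stamp = datetime.now().isoformat(timespec="seconds")
    for name, ok in probe_ports(HOST, PORTS).items():
        print(f"{stamp} {name}: {'open' if ok else 'NO RESPONSE'}")
```

Run on a schedule from a second machine, the timestamped output would at least record when the services stopped answering, which is more than the event logs gave me. Bear in mind that a successful connection only shows the port is accepting connections; a hung service may still accept a connection and then do nothing.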

Ah ha, we are having a similar problem with one of our client’s servers. We have started going down the route of switching the memory around too.
Jamie
Hi,
I have the same problem. Did you find a solution?
Ori
Ori, the only solution that seems to work consistently (but not all the time!) is swapping out the memory.
Hi,
Well, it’s not the hardware: we got a test server from HP and it happened there as well.
We think it is something to do with the mail server software.
Ori
I agree with you, because it’s so difficult to find the root problem. Time runs out if we search the Internet while the problem is occurring, so the smart thing to do is a hard shutdown; then, once the server is back up, we try to analyze the problem.