Help, please?

Wolfgang Schulze-Zachau wolfgangs at manticoreit.com
Mon Dec 28 17:20:07 CET 2009


Hi all,

I am a bit stuck with a problem, and I am wondering whether I could call
on the group for some help?

I have an HP NetServer 2000R, with 2 x 1133MHz Intel P3 CPUs, 2.5GB of RAM
and 3 x 36GB SCSI drives in RAID5, run by the internal HP NetRAID card
of the server. I bought the box about 3 years ago and it has given me
excellent service over the years. In fact, the last reboot was almost
1000 days ago. I use this server mainly to host a few websites, mostly
Drupal or WordPress based, plus the associated MySQL databases. It
also runs my live LATRIX site and a demo site.

A few days ago it started giving me headaches. I first noticed that the
web sites seemed to be down. So I ssh'd into the box and tried to restart
Apache. That wouldn't work. I had to kill all sorts of seemingly weird
other processes before I could get it to restart.
The next day I noticed that my remote shells all took terribly long to
respond to any command, or rather, the command output was produced as
usual, but the prompt would only reappear minutes later.
So I thought, OK, over Christmas I'll do a major upgrade and cleanup.

Well, Fortuna caught me out. On 22/12, the sites were down again, and I
decided to reboot the server (remotely, it is hosted in a proper hosting
center) and it did not come back up. I have since taken the box home,
and now the really weird stuff starts:

I put a Knoppix CD in (5.0.1), the box boots, KDE comes up and I can
do all sorts of things. I have network access, I can browse the web. All
fine. So I start working on the recovery of my data, and as soon as I do
anything with any of those partitions, sooner or later the server hangs.
Completely, totally. No keyboard, no mouse, no network, dead. Reboot.
Memory test, comes up 100% fine. Hard drive consistency check, 100%
fine. Reboot. Same again.
So far I figured out that some of the superblocks in the partitions are
damaged, but that's not a big issue, I can rebuild them from the backup
superblocks. And my /etc/ folder in the root partition is now a file (I
wonder how that happened, really), but a) I can probably fix it and b)
there wasn't anything in /etc/ that couldn't be rebuilt from scratch.
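
For reference, here is a rough sketch of where the backup superblocks would
sit. It assumes the partitions are ext2/ext3 with the sparse_super feature
and the default number of blocks per group (8 x block size); none of that is
confirmed above, and the function names are purely illustrative.

```python
def backup_superblock_groups(group_count):
    """Block groups that carry a superblock copy under sparse_super:
    groups 0 and 1, plus every power of 3, 5 and 7 below group_count."""
    groups = {0, 1}
    for base in (3, 5, 7):
        g = base
        while g < group_count:
            groups.add(g)
            g *= base
    return sorted(g for g in groups if g < group_count)


def backup_superblock_blocks(total_blocks, block_size=4096):
    """Block numbers of the backup superblocks (primary in group 0 excluded).

    Assumption: default layout, i.e. blocks_per_group = 8 * block_size,
    and first data block 1 for 1 KiB blocks, 0 otherwise.
    """
    blocks_per_group = 8 * block_size
    first_data_block = 1 if block_size == 1024 else 0
    # Ceiling division to count a partial last group as well.
    group_count = -(-(total_blocks - first_data_block) // blocks_per_group)
    return [first_data_block + g * blocks_per_group
            for g in backup_superblock_groups(group_count)
            if g != 0]


if __name__ == "__main__":
    # Hypothetical partition of 1,000,000 blocks with 4 KiB blocks.
    print(backup_superblock_blocks(1_000_000, 4096))
```

One of the resulting block numbers could then be passed to e2fsck with -b
(and the block size with -B). Running mke2fs with -n against the device
should print the same locations without writing anything, provided the
filesystem parameters match those used when it was created.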

I've got backups of most of the stuff on the box. However, I would a)
really like to figure out what's wrong here and b) try and recover some
of the stuff that wasn't included in the backups. Would save me a lot of
time.

So, if anyone could venture any guesses or point me in any useful
direction, I would really appreciate any help I can get right now. 

cheers
Wolfgang




