Eventually all computer systems fail, even very stable ones such as network servers. Since it is particularly important to isolate the cause of a server crash, Red Hat has added what it calls its Network Crash Dump utility to Red Hat Advanced Server 2.1. NETDUMP provides what Red Hat calls first fault analysis: the ability to correct a problem without having to recreate it or suffer a second crash. Linux provides a signature of a crash by storing the processor state, a stack trace, part of the instruction trace, and any OOPS, BUG, or PANIC messages. This signature will often provide the clues needed to ascertain the cause of the problem.
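As a rough illustration of how a crash signature can be picked out of a captured kernel log, the following Python sketch scans log lines for OOPS, BUG, and PANIC markers. It is not part of NETDUMP. The function name `find_crash_markers` and the marker patterns are assumptions made for this example, and the exact wording of these messages varies between kernel versions.

```python
import re

# Assumed marker patterns for illustration only; the exact message wording
# differs between kernel versions and should be adjusted to match real logs.
CRASH_MARKERS = {
    "OOPS": re.compile(r"\bOops\b"),
    "BUG": re.compile(r"\bkernel BUG at\b"),
    "PANIC": re.compile(r"\bKernel panic\b"),
}


def find_crash_markers(lines):
    """Return (line_number, kind, text) for each line that matches a marker."""
    hits = []
    for number, line in enumerate(lines, start=1):
        for kind, pattern in CRASH_MARKERS.items():
            if pattern.search(line):
                hits.append((number, kind, line.rstrip()))
                break  # report each line once, under the first matching kind
    return hits
```

Passing an open log file to `find_crash_markers` yields the line numbers at which to start reading the stack trace and register state that surround each marker.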
The network console utility sends a log of all kernel messages and crash signature messages to a network syslog server, which can run on any Linux server. The network console also adds a memory dump of the kernel image to those messages. For problems that are difficult to diagnose, such as hardware errors, this memory dump can be valuable in ascertaining the cause of the crash. The rationale for storing the dump on the network instead of on the more traditional Unix swap volume is that a local write can in some cases overwrite important file data, or can fail to write the memory dump at all because of a hardware error. Thus, the argument goes, a network dump is safer and more effective on Linux.
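To make the receiving side concrete, here is a minimal Python sketch of a UDP listener that accepts syslog-style datagrams and appends them to a file. It does not implement the NETDUMP protocol or its memory dump transfer. It assumes BSD-syslog framing, in which a message starts with a `<PRI>` prefix whose value is facility * 8 + severity, and the conventional syslog UDP port 514. The names `parse_syslog` and `serve` are hypothetical.

```python
import re
import socket

# BSD-syslog style prefix: "<PRI>" followed by the message text.
PRI_RE = re.compile(r"^<(\d{1,3})>(.*)$", re.DOTALL)


def parse_syslog(datagram):
    """Split a syslog-style datagram into (facility, severity, message)."""
    text = datagram.decode("utf-8", errors="replace").rstrip("\n")
    match = PRI_RE.match(text)
    if match is None:
        # No priority prefix: keep the raw text rather than dropping it.
        return None, None, text
    pri = int(match.group(1))
    return pri // 8, pri % 8, match.group(2)


def serve(log_path, host="0.0.0.0", port=514, max_messages=None):
    """Append each received datagram to log_path, one line per message."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    received = 0
    try:
        with open(log_path, "a", encoding="utf-8") as log:
            while max_messages is None or received < max_messages:
                data, (sender, _port) = sock.recvfrom(65535)
                facility, severity, message = parse_syslog(data)
                log.write(
                    f"{sender} facility={facility} "
                    f"severity={severity} {message}\n"
                )
                log.flush()  # flush at once; the sender may be about to die
                received += 1
    finally:
        sock.close()
```

Binding to port 514 normally requires root privileges, so an unprivileged test run would pass a higher port number to `serve`. Flushing after every message reflects the same concern the article raises: a crashing machine may send only a few lines before it goes silent, and each one should reach disk on the receiving host.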
Some Unix versions store the memory dump to a data file and then have a second, known-good kernel load and use the memory dump after rebooting. Since PC hardware almost always clears memory during a reboot, this is not considered an effective method for Linux. Some chipsets can preserve a memory dump on a PC, but they aren't widely used and you can't generally rely on them.
Barrie Sosinsky is president of consulting company Sosinsky and Associates (Medfield, MA). He has written extensively on a variety of computer topics. His company specializes in custom software (database- and Web-related), training, and technical documentation.