I often see questions on this forum where it's hard to tell whether the problem is a software issue or a hardware issue, so I thought I'd share some thoughts on what makes a hardware problem more likely. While rare, hardware problems do occur on modern machines, partly because more and more transistors are packed into smaller and smaller chips, and partly because multi-gigahertz processors with advanced features generate an enormous amount of heat.
Here is a small checklist which will help you determine if an issue is hardware related or not.
- Random lockups -- almost always a hardware problem. If you dual boot and the machine locks up randomly on both operating systems, you can be nearly certain that it's hardware related. In these cases it might be a processor or motherboard problem.
- Rebooting issues -- many modern motherboards trigger a hardware reset or shutdown when the processor or motherboard overheats. Very rarely a reboot might point to a RAM problem, but RAM problems generally don't cause the system to shut down completely. A faulty PSU (power supply unit) might also occasionally lead to random reboots or shutdowns.
- Applications crash without any particular sequence of actions -- many applications crash because of software problems: memory leaks, faulty program logic, or simply the wrong version of a library. But when apps crash in no particular pattern and more than one application is affected, it might point to a physical problem in your RAM, though this is rare.
- Stress related problems -- if stressing your CPU to the max leads to shutdowns or reboots, check that the heat is being dissipated properly (fan, heatsink seating, airflow). If the CPU itself turns out to be working properly, it might be a motherboard problem, though this is rare.
- Video card related problems -- video card problems mostly show up as an unbootable system or as occasional random freezes when playing heavy duty 3D games. It might be an AGP related issue, so you might want to try reducing the AGP capabilities of your system (for example, lowering the AGP speed in the BIOS, if yours offers that option) and seeing if that removes the issue.
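For the overheating checks in the list above, here is a rough Python sketch for Linux that prints whatever temperatures the kernel exposes under /sys/class/thermal. It assumes your machine has thermal_zone* entries there (not every board does) and that the values are in millidegrees Celsius, which is the usual sysfs convention. The function names and the 85 C limit are my own picks for illustration, not a recommended threshold for any particular CPU.

```python
import glob
import os


def parse_millidegrees(text):
    """Convert a sysfs reading such as '45000' to degrees Celsius."""
    return int(text.strip()) / 1000.0


def read_zone_temps(base="/sys/class/thermal"):
    """Return {zone_name: temperature_in_C} for every readable thermal zone."""
    temps = {}
    for path in sorted(glob.glob(os.path.join(base, "thermal_zone*", "temp"))):
        zone = os.path.basename(os.path.dirname(path))
        try:
            with open(path) as f:
                temps[zone] = parse_millidegrees(f.read())
        except (OSError, ValueError):
            # Some zones refuse reads or return non-numeric data; skip them.
            continue
    return temps


def zones_over(temps, limit_c=85.0):
    """List the zones at or above limit_c (85 is only an example limit)."""
    return sorted(zone for zone, t in temps.items() if t >= limit_c)


if __name__ == "__main__":
    temps = read_zone_temps()
    if not temps:
        print("No thermal zones found under /sys/class/thermal")
    for zone, t in sorted(temps.items()):
        print(f"{zone}: {t:.1f} C")
    hot = zones_over(temps)
    if hot:
        print("Running hot:", ", ".join(hot))
```

Run it once while the machine is idle and again while it's under load. If the numbers climb steeply just before a shutdown or reboot, cooling is the first thing to look at.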
Many hardware problems are hard to diagnose. In my experience, Linux and other *nix systems tend to be more sensitive to faulty hardware than Windows, so a system that has run perfectly under Windows for ages is not necessarily free of hardware problems.
The bottom line is that if a problem is not easily reproducible, or occurs seemingly at random with no apparent underlying cause, you should start suspecting a hardware problem.
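One crude way to put this rule of thumb to work is to repeat a computation that must give the same answer every time and see whether it actually does. The Python sketch below is only an illustration with names I made up (make_buffer, digest_rounds, looks_consistent). It hashes the same byte pattern over and over, and on healthy hardware every round should produce an identical digest. It touches only a small part of your RAM, so a clean run proves very little, and it is not a substitute for a dedicated memory tester such as memtest86+.

```python
import hashlib


def make_buffer(size_bytes):
    """Build a deterministic byte pattern of the requested size."""
    block = bytes(range(256))
    repeats = size_bytes // len(block) + 1
    return (block * repeats)[:size_bytes]


def digest_rounds(rounds=20, size_bytes=16 * 1024 * 1024):
    """Hash a fresh copy of the same pattern each round; return the distinct digests."""
    digests = set()
    for _ in range(rounds):
        data = bytearray(make_buffer(size_bytes))  # new allocation every round
        digests.add(hashlib.sha256(data).hexdigest())
    return digests


def looks_consistent(rounds=20, size_bytes=16 * 1024 * 1024):
    """True when every round produced the same digest."""
    return len(digest_rounds(rounds, size_bytes)) == 1


if __name__ == "__main__":
    if looks_consistent():
        print("All rounds agreed (this alone does not prove the hardware is fine)")
    else:
        print("Digests differed between rounds: suspect RAM, CPU or overheating")
```

If identical input ever produces different digests, software is a very unlikely culprit, and it's time to run a proper memory test and check temperatures.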