Sunday, February 5, 2017

Jaeger - Postmortem

Symptoms

My gaming rig of more than three years (Jaeger - i7-4770K - GTX780) started restarting unexpectedly under light loads (web surfing or playing Stardew Valley). I monitored the system using MSI Afterburner but did not notice anything out of the ordinary, such as overheating or memory usage spikes. Windows Memory Diagnostic reported no issues with the RAM. Windows Event Viewer proved useless, and I was absolutely clueless as to what was causing the issue.

The restarts became more frequent until, eventually, the system came to a standstill: it was stuck in a never-ending loop of turning on and off. When it turned on, all the fans and the pump started spinning, only to shut off immediately after. It didn't even get to POST (Power-On Self-Test).


Troubleshooting

Time to put on the old PC building hat! I took the whole tower upstairs to my room, brushed the dust off my PC building kit and prepared to hunker down until it was fixed. All my components were conveniently out of warranty and I did not have any spare parts to swap in. As per usual, my goal was to identify which part or parts were causing the issue.

  • Software and the operating system were out of the picture, as the system would not even get past POST, let alone boot.
  • Any peripherals were also disconnected.
  • I unplugged each remaining component one by one: the video card, then the optical drive, then my SSD and HDD storage. The only components left were the processor (and its cooler), the memory, the power supply, the motherboard and the case itself.
  • I had heard that the case itself can sometimes short something out if bare metal touches the wrong components, so I took the system out of the case as well.
  • I swapped the two memory sticks around and tried plugging them into the two unused DIMM slots. I cleaned the contacts on both sticks using the pencil eraser trick. I tried removing one. Then I removed both.
By this time, I had a bare-bones system with only the motherboard, the processor (and its cooler) and the power supply. The issue was still happening!

  • I removed the CMOS battery. I went out and bought a brand new one.
  • The motherboard had dual BIOS (Basic Input/Output System). I forced it to use the backup BIOS, then back to the main BIOS.
  • To test the power supply, I unplugged it from the system and jump-started it on its own using the paperclip trick. It was able to supply a constant 12 volts to one of the case fans I attached to it. I then went out and bought a multimeter and tested each pin for the 12, 5, 3.3 and -12 volt readings.
I was almost certain that the power supply was not the culprit. But of the three remaining components, the power supply was the most likely to have failed, so I needed to make extra sure. I bought a cheap low-wattage power supply (EVGA 400W ATX Power Supply) and swapped it in. Sure enough, the original power supply was innocent. Now I have a spare power supply for future endeavors.
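
For anyone repeating the multimeter test, here is a minimal Python sketch of how the readings could be checked. It assumes the usual ATX tolerances (within 5% on the +12, +5 and +3.3 volt rails, within 10% on the -12 volt rail). The function names and the sample readings are hypothetical; they are not the values I measured on Jaeger.

```python
# Nominal ATX rail voltages and the allowed deviation as a fraction of nominal.
# Assumption: 5% on the positive rails and 10% on the -12 V rail.
ATX_RAILS = {
    "+12V": (12.0, 0.05),
    "+5V": (5.0, 0.05),
    "+3.3V": (3.3, 0.05),
    "-12V": (-12.0, 0.10),
}


def rail_in_spec(rail, measured):
    """Return True if a measured voltage is within tolerance for the rail."""
    nominal, tolerance = ATX_RAILS[rail]
    return abs(measured - nominal) <= abs(nominal) * tolerance


def check_readings(readings):
    """Map each rail name to True (in spec) or False (out of spec)."""
    return {rail: rail_in_spec(rail, volts) for rail, volts in readings.items()}


if __name__ == "__main__":
    # Hypothetical multimeter readings, for illustration only.
    sample = {"+12V": 12.1, "+5V": 5.05, "+3.3V": 3.0, "-12V": -11.4}
    for rail, ok in check_readings(sample).items():
        print(rail, "OK" if ok else "OUT OF SPEC")
```

Keep in mind that a paperclip test with only a fan attached puts almost no load on the supply, so in-spec readings here do not fully clear it. That is part of why I still swapped in a second power supply.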

It was down to the motherboard or the processor. I didn't know of a way to further test the processor and, of the two, the motherboard was the more likely culprit. I finally decided to buy a replacement motherboard (MSI Z97-GAMING 5 ATX LGA1150 Motherboard). It arrived and with it, finally, the system POSTs!


Conclusion

Looking back, I'm still not certain what exactly was wrong with the motherboard. The motherboard itself is composed of multiple components like chipsets, capacitors, etc. But I have a hunch that the BIOS may have gotten corrupted, both the main and the backup. I've read that the BIOS chip can be replaced with a little solder and a fresh chip from the manufacturer or from a seller on eBay, but it's too late to find out now. It took two weeks, but Jaeger is now back to life and recovering (link to motherboard replacement).
