2-year-old 5700 rig crashes and gets stuck rebooting

I’ve had this 6x5700 rig running since I bought the cards new. I’ve always had minor issues with it: random reboots, TRM crashing, PCI I/O errors. However, flashing 0.6-203 resolved most of those issues.

All of a sudden, this rig has started freezing up, and it isn’t even able to restart by itself. I have to power cycle it probably 3 times a day. Not sure what could be the issue. I replaced all the risers and USB cables last year and cleaned the cards. Weirdly enough, running a single card in a desktop PC results in zero issues. I can play games for hours at a time.

Anyways, here is the error message on HiveOS when it freezes. The shutdown/restart sequence never initiates, even if I leave it overnight. It’s a major problem since I’m not able to be at my rig at all times. I have a 120V smart plug that I can control through an app, but I don’t believe there is a cost-efficient equivalent for the 240V outlets that my rigs are all on.

I’d really appreciate some insight from some of the more knowledgeable members on here.

Set the OCs for that card (10:00) to very conservative values and see if it helps.

Have you tried flashing the latest stable image? What clocks and voltages are you running?

Hi, I’ve tried every single image from 0.6-204 all the way to 0.6-218, with a fresh SSD flash every time. Anything newer than 0.6-203 won’t even boot to the HiveOS splash screen; it throws some sort of PCI I/O error within the first minute of booting up, which is why I’ve been using 0.6-203.

None of the 6 cards are OC’d at the moment; they are all running stock clocks and voltages.

Currently I’m back on 0.6-203, and the crashing/freezing isn’t as bad. So far I’ve only had to hard reboot it once in the past 24 hours, but the error screen in my first post still appears often… Voltages range from 755-775 VDD, 800 VDDCI, 1350 MVDD, with mem clocks ranging from 905-920.

Is it only that one card that’s causing issues?

Yes, it seems that the card at 0000:03:00.0 is causing all the issues.

I will take that card out of the rig and see how it goes. I may test it in my desktop PC as a gaming card and stress test it a bit as well.
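
Before pulling the card, it may be worth confirming that the errors really do point at one PCI address and not several. Below is a rough sketch, not anything HiveOS ships: it tallies error-looking lines in kernel log text by PCI address. The function name `count_errors_by_device` and the keyword list are my own assumptions, so adjust the keywords to match whatever your error screen actually shows.

```python
import re
from collections import Counter

# PCI addresses in the usual domain:bus:device.function form, e.g. 0000:03:00.0
PCI_ADDR = re.compile(r"\b[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7]\b")

# Assumed keywords (hypothetical); tune these to the messages you actually see
ERROR_WORDS = ("error", "timeout", "fail", "hung")


def count_errors_by_device(log_text):
    """Count log lines that look like errors, grouped by PCI address."""
    counts = Counter()
    for line in log_text.splitlines():
        lowered = line.lower()
        # Skip lines that contain none of the error keywords
        if not any(word in lowered for word in ERROR_WORDS):
            continue
        # Count each address at most once per line
        for addr in set(PCI_ADDR.findall(line)):
            counts[addr.lower()] += 1
    return counts
```

To use it, save the output of `dmesg` to a text file, pass the file contents to `count_errors_by_device`, and look at `.most_common()` on the result. If one address dominates, that backs up the single-card theory; if the errors are spread across several addresses, the problem is more likely something shared between the cards.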

