Hello guys and girls,
I have a very big problem. I own 7 rigs, all set up with T-Rex mining ERGO. Out of the 7 rigs, only 3 crash, and they crash at the SAME second.
They go offline on HiveOS, the fans make a weird spin-up noise for 1 second, and then the GPUs flash some color (I have ZOTACs). The rig stays powered on, but there is no picture on the monitor and it shows as offline in HiveOS. And no, it is not still mining: on 2miners I see it at 0 MH/s.
I have the rigs setup like this:
2 rigs = 6 x 3080Ti Zotac on 2000w Dell Server PSU Platinum Edition (pulls 1680w from the wall)
1 rig = 4 x 3070 and 3 x 3060Ti on 1600w HP Server PSU Platinum Edition (pulls 980w from the wall)
- Wall draw on each rig is WELL within the usual 20%-25% safety margin below the PSU's rated wattage.
- All 7 rigs are flashed onto Kingston USB 3.2 drives (same model), and all run the same kernel, NVIDIA driver, and T-Rex miner version.
- The 3 rigs are on Chinese 8-slot motherboards, which I power via the 2 x 6-pin connectors on the board.
- The CPU temp on all mobos is between 45°C and 55°C.
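For anyone who wants to double-check the headroom claim from the wall figures above, here is a minimal Python sketch. The `headroom` helper is just an illustrative name, and the 0.94 efficiency is an assumed ballpark for a Platinum-class unit, not something measured on these PSUs.

```python
# Headroom check using the wall-draw figures quoted in this post.
# ASSUMPTION: efficiency=0.94 is an illustrative figure for a Platinum-class
# PSU, not a measured value for these specific units.

def headroom(rated_w, wall_w, efficiency=1.0):
    """Return the unused fraction of the PSU's rated capacity.

    wall_w is AC draw at the wall; multiplying by efficiency gives a rough
    estimate of the DC load the PSU actually delivers to the rig.
    """
    dc_load_w = wall_w * efficiency
    return 1.0 - dc_load_w / rated_w

# (rated watts, watts pulled from the wall), taken from the rig list above
rigs = {
    "6 x 3080Ti on 2000w Dell": (2000, 1680),
    "4 x 3070 + 3 x 3060Ti on 1600w HP": (1600, 980),
}

for name, (rated, wall) in rigs.items():
    raw = headroom(rated, wall)                    # wall draw vs rating
    est = headroom(rated, wall, efficiency=0.94)   # estimated DC load vs rating
    print(f"{name}: raw {raw:.3f}, estimated DC {est:.3f}")
```

On the raw wall numbers the 2000w rigs come out at 16% headroom, so whether they clear the 20% mark depends on how much of the wall draw is PSU loss.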
What I have tried:
- New RISERS on all GPUs on the 3 rigs
- Reset the CMOS via the battery; the BIOS has 4G decoding enabled and is set to always power on after a power loss
- PCIe lanes are left at default, not set manually
- All GPUs are powered via separate 6-pin -> dual 8-pin cables (rated for 288w, 18 AWG)
This does NOT happen at a fixed time. It can happen once or twice a day, or not at all for 2-3 days or even longer… What’s SUPER weird is that ONLY these 3 rigs go offline, and at the exact same second.
They have in common:
- Same motherboard: AFHM65-ETH8EX (with 4GB RAM)
- Same risers (on all 7 rigs)
- Same PSU (2000w Dell Server on 4 rigs)
They are on different power breakers (3 different houses/locations) and different internet providers.
No, OC is not the problem here. The OC settings are LOW, and I run the same OC on other rigs that have NOT had a problem even once. If OC were the problem, it would display an error or a crash log…
PLEASE HELP ME TROUBLESHOOT THIS, IT IS DRIVING ME NUTZ.