Hello, guys. I have a problem that has been giving me headaches for several days.
H110 Pro BTC+ ASRock (P1.60 03/23/2018), Intel® Celeron® CPU G3930 @ 2.90GHz AES, ATA KINGSTON SA400S3 240GB
5.4.0-hiveos #108, T-Rex miner
4x RTX 3070 Gigabyte OC, 1x RTX 3070 Aorus Master, 2x EVGA, 1x ROG Strix, 1x TUF Gaming, 1x MSI Gaming Trio Z
Power supply: 2700 W DELL Platinum (3 PCIe cables per card: 2 go to the video card and 1 to the riser)
OC: CORE 1060, MEM 2300, PL 130
P.S. All rigs stay in a cooled room, and the card temps are between 35 and 45 degrees Celsius.
So, I built this rig a few days ago with a Biostar TB360 BTC PRO+ instead of the ASRock H110, and that is where the problems started. Every time I powered on the system I got the following errors:
- The system started but did not recognize the cards as RTX 3070. The cards were shown as GA104.
- The system never once started with all of the cards. It started with a random number of cards: 5, 3, 7, 9. If, by chance, it started with all 10 video cards, it took about 5 minutes until they started to drop out one by one.
- The following error appeared when the rig started: “[nvidia_drm] [GPU ID X] Failed to allocate NvKmsKapiDevice drm:nv_drm_load ERROR”
- If I ran nvidia-info in the console, I got “Nvidia driver is disabled (use --force)”
So, after 24 hours of changing components, settings, OC, etc., I replaced a few risers and switched to the ASRock H110 Pro BTC+, and the system started to work just fine. I left the system running at my home for 2 days and it worked without any problem. During those 2 days I restarted it once every 3-4 hours to make sure that the GA104 error and the "Nvidia driver is disabled" error did not come back, and everything went well.
I think part of the problem came from the risers: one or two risers were faulty and brought down the whole rig. I chose a much more popular motherboard, the ASRock H110 Pro BTC+, because the TB360 is new and I thought it might have BIOS problems.
But as bad things don’t end easily, this morning I woke up with half of the cards offline.
5 video cards were shown with an X, and the other 5 video cards were online but not mining. I did not receive any errors in the system; the cards were simply offline. I rebooted the rig and it started to mine without any problems. I think the GA104 and driver errors are gone, but now my video cards are dropping out for no apparent reason.
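To pin down which cards drop and when, a small watcher like the rough sketch below could be left running on the rig. It assumes nvidia-smi is installed and accepts the query fields index, name and pci.bus_id; the function names, the 60-second interval and the idea of comparing against the first reading are my own placeholders, not anything provided by HiveOS. It compares cards by PCI bus ID instead of index, because the indices can shift once a card disappears.

```python
import subprocess
import time

POLL_SECONDS = 60  # assumed polling interval, adjust as needed


def parse_gpu_list(csv_text):
    """Parse 'index, name, pci.bus_id' CSV lines into a list of dicts.

    Lines that do not have exactly three fields (for example error
    messages printed for a lost GPU) are skipped.
    """
    gpus = []
    for line in csv_text.strip().splitlines():
        parts = [p.strip() for p in line.split(",")]
        if len(parts) != 3 or not parts[0].isdigit():
            continue
        gpus.append({"index": int(parts[0]), "name": parts[1], "bus_id": parts[2]})
    return gpus


def missing_bus_ids(baseline, current):
    """Return the bus IDs present in the baseline but absent now, sorted."""
    now = {g["bus_id"] for g in current}
    return sorted(g["bus_id"] for g in baseline if g["bus_id"] not in now)


def query_gpus():
    """Ask nvidia-smi for the list of GPUs it can currently see."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=index,name,pci.bus_id", "--format=csv,noheader"],
        capture_output=True, text=True, check=False,
    )
    return parse_gpu_list(result.stdout)


if __name__ == "__main__":
    baseline = query_gpus()  # taken right after a clean boot with all cards up
    print(f"baseline: {len(baseline)} cards")
    while True:
        time.sleep(POLL_SECONDS)
        lost = missing_bus_ids(baseline, query_gpus())
        if lost:
            print(time.strftime("%Y-%m-%d %H:%M:%S"), "missing:", ", ".join(lost))
```

If the same bus IDs go missing every time, that would point at a specific riser, cable or slot; if the missing cards change from run to run, power delivery or the overclock would be my next suspects.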
Does anybody have any idea why?!