I decided to dive into mining a couple of months ago, and since Vega 56 cards were the only cheap ones I could find, I tried building a rig around those.
Hardware so far:
Motherboard - MSI Z390-A Pro (from my research it can hold up to 7 cards using an M.2 adapter, and it is fairly cheap)
PSU - Be Quiet Straight Power 11 1200W Full Modular 80 Plus Platinum
SSD - Kingston A400 120GB
CPU - Intel Core i5-9600KF 3.7GHz
RAM - Mushkin Essential 8GB DDR4
3x MSI Vega 56
1x ASRock Vega 56
Along with some risers and splitters I got from AliExpress.
When I got my first 2 MSI cards (they came together), I installed them directly on the motherboard, since it has 2 PCIe x16 slots and I hadn’t built my rack yet, and everything seemed to work fine with HiveOS. When my 3rd card came, the ASRock one, I put the mining rig together and connected all 3 cards to risers. At that point I only had issues with the ASRock card: regardless of my OC settings, I would often get the teamredminer error “autofan gpu temperature 511 is unreal driver error”. I believed this was a card issue, since the other 2 cards had been running smoothly while the ASRock one would often “die” and restart the rig. When my 4th card arrived, which is the same model as the first two, I installed it on a riser and plugged it into the board alongside the other 3, and this is where the crazy stuff started happening.
What I noticed is that once my 4th card was plugged in, 1 of my cards was again not working. However, it was not the new card or the ASRock one; it was one of the first 2 MSI cards that had been running smoothly since the beginning. To explain what I mean by “not working”: all of the Vega 56 cards have 7 (or 8?) red LEDs next to their power plugs. When I turn on the rig, the card in slot 1 (an MSI one) has one of those LEDs lit, and the other 2 working cards (the ASRock and the latest MSI) have 1 LED lit as well. Once they start mining, all cards that work fine have 3 of those LEDs lit. After some research I found that the LEDs show how much “load” the card is under, but I may have misunderstood; I’m not quite sure.
Ever since I got my 4th card, I have checked all of my cards individually, along with all my risers, all my splitters and my PSU’s stock cables, and all of them seem to work perfectly. However, I cannot get the 4th card to run steadily on the rig. In some tests the card did boot, and in one test it stayed up for 9 hours, but then, without my changing anything, I get the “GPU detected DEAD” error in HiveOS and the system reboots. Sometimes the card is seen by the system after the reboot, but most commonly it is not, and the rig just keeps mining with the remaining 3 cards. One paradox I’ve run into is that when I change a riser connection, for example moving the riser that was in the 2nd PCIe slot to the 5th PCIe slot, I can get a different outcome. I will expand on that below.
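One way to log whether the missing card is still visible on the PCIe bus after each reboot is to count the AMD display controllers that `lspci` reports. The Python sketch below only parses text. The sample output is illustrative (it is not taken from this rig), and the function name `count_amd_gpus` is made up for the example.

```python
def count_amd_gpus(lspci_output: str) -> int:
    """Count lines in lspci text that look like AMD display controllers."""
    count = 0
    for line in lspci_output.splitlines():
        # GPUs normally show up as "VGA compatible controller" or "Display controller".
        is_display = ("VGA compatible controller" in line
                      or "Display controller" in line)
        is_amd = "Advanced Micro Devices" in line or "AMD" in line
        if is_display and is_amd:
            count += 1
    return count


# Illustrative sample only: three GPUs plus one HDMI audio function,
# which must not be counted as a GPU.
SAMPLE = """\
03:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Vega 10 XL/XT [Radeon RX Vega 56/64]
03:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Vega 10 HDMI Audio [Radeon Vega 56/64]
06:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Vega 10 XL/XT [Radeon RX Vega 56/64]
09:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Vega 10 XL/XT [Radeon RX Vega 56/64]
"""

EXPECTED_GPUS = 4  # number of cards physically installed in this rig

found = count_amd_gpus(SAMPLE)
if found < EXPECTED_GPUS:
    print(f"Only {found} of {EXPECTED_GPUS} GPUs visible on the PCIe bus")
```

On the rig itself the text could come from `subprocess.run(["lspci"], capture_output=True, text=True).stdout`. If the count is below 4 right after boot, that would suggest the card never enumerated on the bus (pointing at the slot, riser or power side), rather than failing later inside the miner.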
Let’s say I’ve got 4 cards in this specific order:
1. MSI - 2. MSI (the one that stopped working when I got my 4th card) - 3. MSI - 4. ASRock
When I connect their risers to the motherboard in this exact same order:
Card 1 on 1st PCIe slot, Card 2 on 2nd PCIe slot, and so on
Only the 1st and the 3rd cards seem to run. After trying a bunch of different combinations, I’ve managed to run steadily on 3 cards in these 2 different orders:
Card 1 on 1st PCIe slot, Card 2 on 3rd PCIe slot (the x16 one), Card 3 on 6th PCIe slot and Card 4 on 5th PCIe slot.
And the other combination was:
Card 4 on 1st PCIe slot, Card 2 on 2nd PCIe slot, Card 1 on 3rd PCIe slot and Card 3 on 4th PCIe slot.
In both of those setups, all 4 cards would initially start mining, but after a while, maybe overnight, the 2nd card is “detected dead” and stops mining, and it is impossible to bring it back up. Maybe after 10 or 15 reboots the card is detected again and mines until it is detected dead again.
I have tried everything: different OC settings, enabling and disabling specific BIOS settings as described in numerous guides, and replacing my splitters and risers with new ones. Nothing seems to work. I am 95% sure that there is no problem with the card itself, first because it was running perfectly until I got my 4th card, and second because it runs without a problem when I plug it in place of my 1st card. It’s as if my motherboard refuses to run 4 GPUs. I do realize that Vega 56s are very power hungry, but I doubt that I’m short on watts with only 4 cards. HiveOS reports a power consumption of no more than 600 W with all 4 cards running, and I’m using a 1200 W PSU as mentioned before.
One mistake I may be making is splitting each of my PSU cables twice. Each Vega 56 needs 2 8-pin power connectors, plus one more connector for its riser, so 3 per card, and my PSU has only 6 8-pin cables. I am therefore forced to use 2 splitters on each cable (turning 1 8-pin into 3 8-pins) so that a single cable powers one GPU along with its riser. I know I’m not supposed to double-split cables, but I don’t see another way to do it, since each card requires 3 connectors. I’m hesitant to get a 2nd PSU, however, since I doubt that it will fix the problem I’m having with the 2nd GPU. That also makes me hesitant to get more GPUs, since I don’t know whether they will work properly either.
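To put rough numbers on the double-split concern, here is a small Python sketch of the per-cable load. It assumes the 600 W figure from HiveOS is spread evenly over the 4 cards and that one PSU cable carries everything for its card (both 8-pin plugs plus the riser). The 150 W value is the PCIe rating for an 8-pin connector at the card end and is used here only as a rough reference, not as the rating of this PSU’s cables. The function names are made up for the example.

```python
PCIE_8PIN_RATED_W = 150.0  # PCIe 8-pin connector rating, used as a rough reference


def cable_load_w(total_reported_w: float, num_cards: int) -> float:
    """Load on one PSU cable if it feeds a whole card (2x 8-pin + riser)."""
    return total_reported_w / num_cards


def load_ratio(load_w: float, rated_w: float = PCIE_8PIN_RATED_W) -> float:
    """Fraction of the reference rating that a single cable would carry."""
    return load_w / rated_w


# Figures from the post: about 600 W reported for 4 cards.
load = cable_load_w(600.0, 4)
ratio = load_ratio(load)
print(f"Per-cable load: {load:.0f} W, {ratio:.0%} of the 8-pin reference rating")
```

Under these assumptions each cable sits right at the reference figure with no margin, even though the PSU as a whole is only about half loaded. Software readings can also differ from the real draw at the wall, so a wall power meter would give a better input than the HiveOS number.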
If anybody has any ideas about why I’m having this issue, I would love to hear from you, and tips or advice are always welcome. I’m fairly new to this and this is my first rig, so if you can point out mistakes I would appreciate it. I can provide any further details you need to understand the issue better. I apologize in advance for any mistakes in my writing or my use of terms (both grammatical and technical); English is not my native language, after all. Can’t wait to hear your thoughts.