I have had an issue with one rig since yesterday. One of the cards (RX 580 Nitro+ 4 GB) hung. TRM restarted, but it is now mining with only 5 cards instead of 6.
I rebooted the rig, and after that I got these errors in dmesg:
failed to send message 254 ret is 0
[powerplay] SMU load firmware failed
[powerplay] fw load failed
smu firmware loading failed
amdgpu: amdgpu_device_ip_init failed
amdgpu: Fatal error during GPU init
amdgpu: probe of 0000:06:00.0 failed with error -22
The card itself is detected by HiveOS and by the lspci command, but HiveOS does not show any temperature, memory size, or fan speed for it, so it cannot be used to mine. I tried the card in another riser and the issue is the same. Is the card dead? I did not do any updates, upgrades, or BIOS flashing.
I would be grateful for any help.