Hive OS Freezing "RT throttling activated"

Hi, I recently started having this issue after upgrading to [email protected]
Unfortunately, I do not remember what version I was on prior to updating.
I will get a notification that my rig went offline but it will not come back online.
Each time this happens, I go check on the rig and everything is still running.
I connected a monitor to see what was happening, and HiveOS just completely freezes.

What have I done:

  1. Reset bios and redid the settings.
  2. Updated bios.
  3. Swapped to a different Motherboard / CPU. (X470 Taichi and 5800X)
  4. Swapped to a brand new SSD.
  5. Swapped to a brand new set of memory. (2x8GB kit)
  6. Flashed a new hive image multiple times.
  7. Updated to the newest hive version.
  8. Swapped riser with known working risers.
  9. Reduced and removed overclocks.

My system:

  1. Motherboard (B450 Gaming K4 ASRock P4.80)
  2. CPU (AMD Ryzen 7 2700X)
  3. Memory (Crucial Ballistix Sport 2x8GB)
  4. SSD (PNY CS900 120GB)
  5. Kernel (5.10.0-hiveos #72)

Cards:

  1. 3080 (Evga)
  2. 3080 (Zotac)
  3. 3070 (Evga)
  4. 3070 (MSI)
  5. 3070 (Zotac)
  6. 3070 (Zotac)
  7. 6700 (AMD)
  8. 6800 (AMD)

I am running T-Rex and PhoenixMiner and have tried NBMiner as well.
When I run the commands to check for errors in the syslog there are none.
I was able to see what happened just before the rig froze by using the command ‘motd watchdog’.

kernel: [72580.057791][ T1171] NVRM: GPU at PCI:0000:04:00: GPU-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
kernel: [72580.057793][ T1171] NVRM: Xid (PCI:0000:04:00): 62, pid=1171, 0000(0000) 00000000 00000000
kernel: [72580.093233][ T1171] NVRM: Xid (PCI:0000:04:00): 45, pid=4560, Ch 00000010
kernel: [72581.503222][   C10] sched: RT throttling activated
kernel: [72585.094636][ T1171] NVRM: Xid (PCI:0000:04:00): 45, pid=4560, Ch 00000011
kernel: [72585.098915][ T1171] NVRM: Xid (PCI:0000:04:00): 45, pid=4560, Ch 00000012
kernel: [72585.103196][ T1171] NVRM: Xid (PCI:0000:04:00): 45, pid=4560, Ch 00000013
kernel: [72585.107478][ T1171] NVRM: Xid (PCI:0000:04:00): 45, pid=4560, Ch 00000014
kernel: [72585.111761][ T1171] NVRM: Xid (PCI:0000:04:00): 45, pid=4560, Ch 00000015
kernel: [72585.116045][ T1171] NVRM: Xid (PCI:0000:04:00): 45, pid=4560, Ch 00000016
kernel: [72585.120329][ T1171] NVRM: Xid (PCI:0000:04:00): 45, pid=4560, Ch 00000017
realloc(): invalid pointer

I decided to look up the error “sched: RT throttling activated” because I have no idea what it means.
All I could find was the following:

This issue does involve my 3080 (Zotac) on bus (PCI:0000:04:00) but I’m not sure what the problem is.
I thought the card was thermal throttling, but I have monitored the temps and they are normal (<100 °C).
The system freezes right after getting the “invalid pointer” error.
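
For anyone who wants to check which card is throwing the errors, here is a minimal Python sketch that tallies Xid codes per PCI address from log lines shaped like the ones above. The regex is only an assumption based on the line format in this excerpt, and the function name `count_xids` is made up for illustration. It does not interpret the codes; their meaning should be looked up in NVIDIA's Xid documentation.

```python
import re
from collections import Counter, defaultdict

# Assumed line shape, taken from the log excerpt in this post:
# kernel: [72580.093233][ T1171] NVRM: Xid (PCI:0000:04:00): 45, pid=4560, Ch 00000010
XID_RE = re.compile(r"NVRM: Xid \(PCI:([0-9a-fA-F:.]+)\): (\d+),")

def count_xids(lines):
    """Return {pci_address: Counter({xid_code: occurrences})}."""
    counts = defaultdict(Counter)
    for line in lines:
        match = XID_RE.search(line)
        if match:
            pci_address = match.group(1)
            xid_code = int(match.group(2))
            counts[pci_address][xid_code] += 1
    return dict(counts)

# A few lines copied from the excerpt above, used as sample input.
sample = [
    "kernel: [72580.057793][ T1171] NVRM: Xid (PCI:0000:04:00): 62, pid=1171, 0000(0000) 00000000 00000000",
    "kernel: [72580.093233][ T1171] NVRM: Xid (PCI:0000:04:00): 45, pid=4560, Ch 00000010",
    "kernel: [72581.503222][   C10] sched: RT throttling activated",
    "kernel: [72585.094636][ T1171] NVRM: Xid (PCI:0000:04:00): 45, pid=4560, Ch 00000011",
]

for address, codes in count_xids(sample).items():
    for code, occurrences in sorted(codes.items()):
        print(f"{address} Xid {code}: {occurrences}")
```

On a rig with several NVIDIA cards, feeding the whole kernel log through this should show whether the Xid errors all come from one bus address, as they do here (0000:04:00), or from several.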

If you have any ideas on what can be causing the issue please let me know.
I can provide all my logs if you would like to take a look at them.

Same problem for me, with T-Rex and a 3080 Ti.

My syslog:

I tried this card on Windows in my living room (+23°): core was at 57°, mem at 98°.

Currently the card is on HiveOS in my garage (+11°) and the core is at 42°, so it's impossible that the mem temp is higher than 98°.

What could be causing this crash?

The problem went away after replacing thermal pads and paste on the Zotac 3080.

Damn, it must be the same for me, my card is really a piece of shit.
Thanks.