Hello,
I want to use my 5090, but I see that Linux does not support monitoring RAM temperature. How do you know that the memory is not overheating and that the card will not burn out?
5090s have all of the memory on the heatsink side of the PCB, so overheating memory isn’t nearly as much of an issue as it was with something like the 3090. They will also throttle themselves if temps become an issue, though I haven’t heard of anyone having issues with the 50 series.
Although it doesn’t show in the HiveOS GUI, if you use Rigel Miner, Nvidia 5000 series GPUs will have a VRAM temperature reported in the miner’s graphical output.
I can confirm this works on a 5080 as well as a 5090. I had to add arguments to the miner config to set the refresh interval of the miner’s graphical output to match the 15s (IIRC) of the HiveOS/xserver that orchestrates the console output.
I just hook an old LCD or portable monitor to the rig.
The VRAM temperature values exist somewhere. So maybe the watchdog can’t monitor them, but there is some cause for confidence that the GPUs themselves will throttle if the VRAM gets too hot.
I’ve no clue what the operating temps before throttling are for GDDR7, but the VRAM temps reported on my 5000 series Nvidia GPUs indicate that new cards with ample airflow operate much cooler, with VRAM temps tracking core temps and sitting only 10-15°C above them.
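If you want a rough number without watching the miner output, here is a minimal Python sketch that reads the core temperature from nvidia-smi and adds the 10-15°C offset described above. This is only an estimate based on that one observation, not a real VRAM sensor reading, and the offset may differ on your cards. The function names and the offset constants are made up for illustration.

```python
import subprocess

# Assumption: VRAM sits roughly 10-15 °C above core temp, per the
# observation in this thread. This is NOT a measured VRAM value.
OFFSET_LOW_C = 10
OFFSET_HIGH_C = 15


def parse_core_temps(csv_text):
    """Parse 'index, temperature' lines into {gpu_index: core_temp_c}."""
    temps = {}
    for line in csv_text.strip().splitlines():
        if not line.strip():
            continue
        index, temp = (field.strip() for field in line.split(","))
        temps[int(index)] = int(temp)
    return temps


def estimate_vram_range(core_temp_c, low=OFFSET_LOW_C, high=OFFSET_HIGH_C):
    """Return a (low, high) VRAM temperature estimate in °C."""
    return (core_temp_c + low, core_temp_c + high)


def read_core_temps():
    """Query core temps for every GPU via nvidia-smi (must be on PATH)."""
    result = subprocess.run(
        [
            "nvidia-smi",
            "--query-gpu=index,temperature.gpu",
            "--format=csv,noheader,nounits",
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_core_temps(result.stdout)
```

On the rig you would call `read_core_temps()` and pass each value to `estimate_vram_range()`. Treat the result as a sanity check only; the VRAM temperature the miner reports is the real reading.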