Recently I have been seeing some cards crash even though they still seem to be working. The only indication of a problem is that the cards report 0 for the memory and core clocks and 0% fan speed.
The 0 clock would be the more reliable sign that a card has crashed, but I'm not sure Hive monitors the current clocks. The other option would be to track the fan and reboot the rig once it has stayed at 0% for a set period. The timer has to reset whenever the fan runs, even for a short time, because I have some cool-running cards that only spin the fan briefly, especially when mining at low power.
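Something like this is what I have in mind for the fan timer. It is only a rough Python sketch: the `FanWatchdog` name, the 600-second timeout and the way readings get passed in are all my own assumptions, not anything Hive provides. It only shows the timer logic, so the fan readings would have to come from whatever the rig already reports.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class FanWatchdog:
    """Tracks how long each card's fan has continuously reported 0%."""
    timeout_s: float  # hypothetical threshold, pick what suits the rig
    _zero_since: Dict[int, float] = field(default_factory=dict)

    def update(self, card: int, fan_pct: int, now: float) -> bool:
        """Record one reading; return True if the card warrants a reboot."""
        if fan_pct > 0:
            # Fan ran, even briefly: reset this card's timer.
            self._zero_since.pop(card, None)
            return False
        # Start the timer on the first 0% reading, otherwise keep the old start.
        started = self._zero_since.setdefault(card, now)
        return now - started >= self.timeout_s


wd = FanWatchdog(timeout_s=600)
wd.update(0, 0, now=0)       # fan at 0%, timer starts
wd.update(0, 30, now=300)    # fan spun up briefly, timer resets
wd.update(0, 0, now=400)     # timer starts again
reboot = wd.update(0, 0, now=1000)  # 600 s at 0% since the reset
```

Keeping one timer per card means a cool card that only spins its fan now and then keeps resetting its own timer without hiding a crashed card next to it.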
This reminds me of another possible trigger for rebooting the rig: a card drawing more power than what was set for its power limit (PL).
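A rough Python sketch of that power check, just to show the idea. The function names, the 5 W margin and the "three readings in a row" rule are my own assumptions, added so a single noisy reading does not reboot the rig. The wattage readings and the PL value would have to come from whatever the rig already reports.

```python
def over_power_limit(draw_w: float, limit_w: float, margin_w: float = 5.0) -> bool:
    """True if one reading exceeds the set PL by more than margin_w watts."""
    return draw_w > limit_w + margin_w


def should_reboot(readings_w, limit_w, margin_w=5.0, needed=3):
    """True if the last `needed` readings were all over the limit."""
    recent = list(readings_w)[-needed:]
    return len(recent) == needed and all(
        over_power_limit(w, limit_w, margin_w) for w in recent
    )


# PL set to 120 W; the last three readings are all above 125 W.
trigger = should_reboot([120, 131, 132, 133], limit_w=120)
```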