I believe p8icer is correct. The rig doesn’t just stop mining or go offline; the OS freezes completely. A couple of days ago I tried plugging the frozen rig into a display, and the screen just flickered on and off. I’ve spent some time writing a script that is supposed to reboot the rig if it’s offline or if there are issues with the GPUs, mostly to troubleshoot what’s happening when the rig freezes. It works sometimes, but not always. I think the only reliable way is probably to have an external system poll the HiveOS API to see if the rig is online and, if it’s not, send a command to the smart plug to power-cycle the rig.
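For anyone who wants to try the external approach, here is a rough sketch of the watchdog logic in Python, meant to run on a separate machine since nothing on the rig can run once the OS has frozen. It is not the script I linked, and everything named here is an assumption: `is_online` and `power_cycle` are hypothetical callables you would write yourself to wrap the HiveOS API check and your smart plug's on/off command, and the thresholds and delays are just placeholder values.

```python
import time
from typing import Callable, Tuple


def watchdog_step(
    is_online: Callable[[], bool],   # hypothetical: wraps your HiveOS API check
    power_cycle: Callable[[], None], # hypothetical: wraps your smart plug off/on
    failures: int,
    max_failures: int = 3,           # assumed threshold, tune to taste
) -> Tuple[int, bool]:
    """Run one poll. Returns (new consecutive-failure count, whether we power-cycled)."""
    if is_online():
        return 0, False              # rig is fine, reset the counter
    failures += 1
    if failures >= max_failures:
        power_cycle()                # several misses in a row: cut and restore power
        return 0, True
    return failures, False           # a single miss may just be a network blip


def run_watchdog(
    is_online: Callable[[], bool],
    power_cycle: Callable[[], None],
    interval_s: float = 60,          # assumed poll interval
    boot_grace_s: float = 300,       # assumed time to let the rig boot before polling again
    max_failures: int = 3,
) -> None:
    """Poll forever; after a power cycle, wait for the rig to come back up."""
    failures = 0
    while True:
        failures, cycled = watchdog_step(is_online, power_cycle, failures, max_failures)
        time.sleep(boot_grace_s if cycled else interval_s)
```

Requiring a few consecutive failed polls before cutting power is meant to avoid restarting the rig over a brief network or API hiccup, and the grace period keeps the watchdog from power-cycling again while the rig is still booting.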
The script is at the link below if anyone’s interested in testing it out. It runs as a systemd service, first 5 min after boot and then every 7 min after that.