All rigs keep going offline and back online, and GPU 0 is missing

Hi all, I need some serious help!
I have about 8 rigs. Most of them are 8 x 3060 Ti, full hash rate.
All were running perfectly until two weeks ago.
Now they keep going offline and coming back online after about 5 minutes.
More interestingly, if I connect a monitor, or use remote access and Hive Shell, GPU 0 is missing. I don’t know whether or not it’s related.
It happens on all the versions I have tried, from 0.6-212@211130 to 0.6-213@220320.
You can also see that the pool only reports 60% of the hash rate for the whole rig.
I have replaced almost all my network gear (new modem, router and switch), and the internet connectivity is very good.
I have tried T-Rex 25.8 and the latest NBMiner. The result is the same. Does anyone have any idea?

First, I would start by using locked core clocks instead of offsets. There is no need for power limits with locked clocks either.

It wouldn’t hurt to try a fresh install of the latest Hive image to rule out any software issues, then add one card at a time and see when the problems start.

Thanks for your advice. I have tried a fresh install with the latest image. It ran fine for one day and then started going offline and online again. I will try a locked core clock of 1100 with no power limit.

1100 is going to be too low. You need to find the lowest core clock that maintains full hashrate.
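
The "lowest core clock that maintains full hashrate" search can be sketched roughly as below. This is only an illustration of the idea: `measure_hashrate` is a hypothetical callback (you would apply the locked clock, let the miner settle, and read the reported hashrate yourself), and the 1% tolerance and the demo curve are made-up assumptions, not measurements from a 3060 Ti.

```python
from typing import Callable, Iterable, Optional


def lowest_full_hashrate_clock(
    measure_hashrate: Callable[[int], float],
    candidate_clocks_mhz: Iterable[int],
    full_hashrate: float,
    tolerance: float = 0.01,
) -> Optional[int]:
    """Return the lowest candidate clock whose hashrate stays within
    `tolerance` of `full_hashrate`, or None if no candidate qualifies."""
    threshold = full_hashrate * (1.0 - tolerance)
    best = None
    # Walk from the highest clock downwards and stop at the first
    # clock where the hashrate falls below the threshold.
    for clock in sorted(candidate_clocks_mhz, reverse=True):
        if measure_hashrate(clock) >= threshold:
            best = clock
        else:
            break
    return best


def fake_measure(clock_mhz: int) -> float:
    # Made-up curve for illustration only, NOT measured data:
    # hashrate rises with core clock and then flattens out at 60 MH/s.
    return min(60.0, clock_mhz * 0.04)


result = lowest_full_hashrate_clock(fake_measure, range(1100, 1801, 50), 60.0)
print("lowest clock with full hashrate:", result)
```

Walking down from a clock that is known to be good, and stopping at the first drop, keeps the number of test runs small, since each real measurement means waiting for the miner to stabilise.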

OK, I will try. Can you explain why GPU #0 is missing? Almost all my rigs have the same issue. Some of them are running fine and don’t go up and down, but GPU #0 is still missing on the display.

Not sure why, but that’s why I’m saying to troubleshoot in order of easiest to hardest
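
One of the easier checks for the missing GPU would be to see how many cards the NVIDIA driver itself reports and on which PCI bus IDs. A minimal sketch follows, assuming `nvidia-smi` is installed and on the PATH (not confirmed in this thread). `parse_gpu_list`, `count_missing` and `list_gpus` are illustrative helper names. Note that `nvidia-smi` numbers the detected GPUs from 0, so a card the driver cannot see shows up as a shorter list and an absent bus ID, not as a gap in the index.

```python
import subprocess

# Query flags supported by nvidia-smi: one CSV line per detected GPU.
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,name,pci.bus_id",
    "--format=csv,noheader",
]


def parse_gpu_list(csv_text):
    """Turn the CSV output of QUERY into (index, name, bus_id) tuples."""
    gpus = []
    for line in csv_text.splitlines():
        parts = [field.strip() for field in line.split(",")]
        if len(parts) < 3:
            continue  # skip blank or unexpected lines
        index = int(parts[0])
        bus_id = parts[-1]
        name = ", ".join(parts[1:-1])
        gpus.append((index, name, bus_id))
    return gpus


def count_missing(gpus, expected_count):
    """How many of the physically installed cards the driver does not report."""
    return max(0, expected_count - len(gpus))


def list_gpus():
    """Run nvidia-smi on the rig and return the parsed GPU list."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_gpu_list(result.stdout)
```

On a rig you would call `list_gpus()` from Hive Shell and compare the result with the number of cards physically installed, for example `count_missing(list_gpus(), 8)`. Writing down the bus IDs while all cards are visible makes it easier to tell later which slot or riser has dropped out.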

I tried removing all the video cards except one. I tried lowering the overclock, and even no overclock at all. It still goes offline. The funny thing is that I have one rig that ran very well at the other location for a few days. After I moved it back, within one day the hash rate shown on the pool slowly got lower and lower, from 495 MH/s down to 460 after 12 hours. GPU #0 is missing on that good rig too. It seems that location has a virus or a hacker. Another rig that was running badly at that location was moved to another, good location and given a fresh OS, and it has been running well for about 12 hours now. It’s definitely not an overclocking issue or a video card issue. Something is related to that location. As an IT guy for 15 years, I have never seen anything like this.
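
To put the 495 to 460 MH/s drop in perspective, here is a quick bit of arithmetic. It assumes the rig has 8 cards that each contribute an equal share (the thread says most rigs are 8 x 3060 Ti, but does not confirm it for this one). The function name is just for illustration.

```python
def hashrate_drop(before_mhs, after_mhs, gpu_count):
    """Return (lost MH/s, lost percent, loss expressed in cards)."""
    lost = before_mhs - after_mhs
    percent = 100.0 * lost / before_mhs
    # Assumption: every card contributes an equal share of the total.
    per_gpu = before_mhs / gpu_count
    return lost, percent, lost / per_gpu


lost, percent, cards = hashrate_drop(495.0, 460.0, 8)
print(f"lost {lost:.0f} MH/s = {percent:.1f}% = {cards:.2f} of one card")
# prints: lost 35 MH/s = 7.1% = 0.57 of one card
```

Under those assumptions the drop is smaller than one whole card (which would be 12.5% of the rig), so it does not look like a single GPU being gone for the full 12 hours. Keep in mind that a pool estimates hash rate from submitted shares, so short offline periods or rejected shares would also pull the number down.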

