
Can't overclock GPU - nvidia-info showing GFXCore 0 MHz, MEMCore 0 MHz

Hey. I've been mining Flux with this card without problems for 6 months. Then I started getting this error, with the GPU crashing. GMiner and lolMiner were stopping the whole rig, so I switched to miniZ, which lets me keep mining with the rest of the rig while this card has the problem.

The error shows:

[FATAL ] GPU[3]: CUDA error 77 ‘an illegal memory access was encountered’ in line 300


It is also sometimes followed by this error message:

/hive/miners/miniz/h-run.sh: line 20: 29791 Segmentation fault (core dumped) ./miniZ $(< ${MINER_NAME}.conf) --logfile $MINER_LOG_BASENAME.log --telemetry 127.0.0.1:${MINER_API_PORT} --gpu-line

(It actually says core dumped plus some kind of telemetry error, but I have no clue what that means.)
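To see which card the miner is blaming and how often, lines like the CUDA error quoted above can be counted per GPU index. This is only a rough sketch: the regular expression is based on that single quoted line, so other miners or miniZ versions may need a different pattern, and the function name `count_cuda_errors` and the sample list are made up for illustration.

```python
import re
from collections import Counter

# Pattern derived from the one log line quoted in this post (an assumption,
# not an official miniZ log format).
FATAL_RE = re.compile(r"\[FATAL\s*\]\s*GPU\[(\d+)\]:\s*CUDA error (\d+)")

def count_cuda_errors(lines):
    """Count fatal CUDA errors, keyed by (gpu_index, cuda_error_code)."""
    counts = Counter()
    for line in lines:
        match = FATAL_RE.search(line)
        if match:
            counts[(int(match.group(1)), int(match.group(2)))] += 1
    return counts

# Hypothetical sample: the quoted error twice, plus a line that should be ignored.
sample_log = [
    "[FATAL ] GPU[3]: CUDA error 77 ‘an illegal memory access was encountered’ in line 300",
    "some unrelated line",
    "[FATAL ] GPU[3]: CUDA error 77 ‘an illegal memory access was encountered’ in line 300",
]

for (gpu, code), n in count_cuda_errors(sample_log).items():
    print(f"GPU {gpu}: CUDA error {code} seen {n} time(s)")
```

On a real rig the list would be replaced by the lines of the miner log file.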

I spent like 5 whole days on Google trying to find answers for this error; the basic answers were “OC settings too high” or “virtual memory errors”.
So I tried lowering my OC, and saw that absolutely nothing changed. I tried lowering the core clock, and it stayed the same every time; I tried lowering the memory clock, and it stayed exactly the same every time; I tried lowering the power limit to 100 W, and it did not change at all. I can’t overclock my GPU.
The GPU is still mining, but it throws the error every 20 seconds, which shuts down and resets the whole rig every 20 seconds.
I guess my GPU is stuck on stock settings, then.

(clk=1665MHz mclk=7000MHz)

Also, HiveOS does not show %temp and Watts for this GPU

So I ran “nvidia-info” and this showed up:

=== GPU 0, 07:00.0 GeForce RTX 2080 Ti 11264 MB ===
Bios 90.02.0B.00.4A, PCIE Link Gen 3, PCIE Link Width 1x
UUID GPU-bfc6ba91-b24f-88b0-0f00-b22ac8462383, ROM flash: Supported
Power Limit: Min 100.0 W, Default 260.0 W, Max 300.0 W, Current 260.0 W
Frequency: GFXCore 0 MHz, MEMCore 0 MHz
Memory: Total 11264 MB, Used 4246 MB, Free 7017 MB, Micron GDDR6
Utilization: GPU 0 %, MEM 0 %, Throttle:
PSTATE P2, PWR 0.0 W, Temp 68 °C, Fan 0 %

Everything is showing “0”, as if the GPU were dead.
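For anyone comparing their own output, the zero readings can be picked out of the nvidia-info text with a few regular expressions. This is a minimal sketch that assumes the field labels look exactly like the output pasted above; the key names (`core_mhz`, `power_w`, and so on) and both helper functions are made up for illustration.

```python
import re

# Trimmed copy of the nvidia-info output pasted above.
SAMPLE = """\
=== GPU 0, 07:00.0 GeForce RTX 2080 Ti 11264 MB ===
Power Limit: Min 100.0 W, Default 260.0 W, Max 300.0 W, Current 260.0 W
Frequency: GFXCore 0 MHz, MEMCore 0 MHz
Utilization: GPU 0 %, MEM 0 %, Throttle:
PSTATE P2, PWR 0.0 W, Temp 68 °C, Fan 0 %
"""

# Hypothetical key names mapped to patterns for the fields shown above.
PATTERNS = {
    "core_mhz": r"GFXCore\s+([\d.]+)\s*MHz",
    "mem_mhz": r"MEMCore\s+([\d.]+)\s*MHz",
    "power_w": r"PWR\s+([\d.]+)\s*W",
    "temp_c": r"Temp\s+([\d.]+)\s*°C",
    "fan_pct": r"Fan\s+([\d.]+)\s*%",
}

def parse_readings(text):
    """Extract the numeric sensor readings that match PATTERNS."""
    readings = {}
    for name, pattern in PATTERNS.items():
        match = re.search(pattern, text)
        if match:
            readings[name] = float(match.group(1))
    return readings

def zero_readings(readings):
    """Return the names of readings that report exactly zero, sorted."""
    return sorted(name for name, value in readings.items() if value == 0)

readings = parse_readings(SAMPLE)
print(zero_readings(readings))  # ['core_mhz', 'fan_pct', 'mem_mhz', 'power_w']
```

Here the temperature still reports a value (68 °C) while the clocks, power draw and fan all read zero.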
I can’t change the overclock settings.
I tried changing the riser, swapping riser positions with other GPUs, unplugging and replugging every cable, a manual reboot, and creating a new flight sheet.

Looking for help here, or any similar experience with solutions.
If you plan on answering “gpu OC too high”, you just didn’t read the post.

Thanks.

Test that the card is working as intended. Does the display output work? No lines/artifacts?

Rule out all variables: riser, power cable, PSU, USB cable, USB PCIe adapter, test on another rig, etc.

If you've done all that and it behaves the same, then it's likely a faulty card.