
CPU soft lockup

On almost every boot now, I get what looks like a CPU soft lockup. Does anyone know what this means and how to fix it? Pasted below is the log from when the issue starts until the system locks up and crashes.

Mar 28 01:42:03 Worker hive-watchdog[1043]: OK ethash 641372 kHs >= 0.625 kHs
Mar 28 01:42:07 Worker kernel: [ 1164.246585][ T939] NVRM: GPU at PCI:0000:0c:00: GPU-1d633e82-f3eb-a2a7-7442-8ce523d729cf
Mar 28 01:42:07 Worker kernel: [ 1164.246592][ T939] NVRM: Xid (PCI:0000:0c:00): 45, pid=3256, Ch 00000010
Mar 28 01:42:08 Worker kernel: [ 1164.594332][ T939] NVRM: Xid (PCI:0000:0c:00): 62, pid=3256, 0000(0000) 00000000 00000000
Mar 28 01:42:08 Worker kernel: [ 1164.594522][ T939] NVRM: Xid (PCI:0000:0c:00): 45, pid=3256, Ch 00000010
Mar 28 01:42:08 Worker kernel: [ 1164.595349][ T939] NVRM: Xid (PCI:0000:0c:00): 45, pid=3256, Ch 00000011
Mar 28 01:42:08 Worker kernel: [ 1164.596145][ T939] NVRM: Xid (PCI:0000:0c:00): 45, pid=3256, Ch 00000012
Mar 28 01:42:08 Worker kernel: [ 1164.596937][ T939] NVRM: Xid (PCI:0000:0c:00): 45, pid=3256, Ch 00000013
Mar 28 01:42:08 Worker kernel: [ 1164.597727][ T939] NVRM: Xid (PCI:0000:0c:00): 45, pid=3256, Ch 00000014
Mar 28 01:42:08 Worker kernel: [ 1164.598526][ T939] NVRM: Xid (PCI:0000:0c:00): 45, pid=3256, Ch 00000015
Mar 28 01:42:08 Worker kernel: [ 1164.599317][ T939] NVRM: Xid (PCI:0000:0c:00): 45, pid=3256, Ch 00000016
Mar 28 01:42:08 Worker kernel: [ 1164.600107][ T939] NVRM: Xid (PCI:0000:0c:00): 45, pid=3256, Ch 00000017
Mar 28 01:42:08 Worker kernel: [ 1164.722364][ T939] NVRM: Xid (PCI:0000:0c:00): 45, pid=3256, Ch 00000011
Mar 28 01:42:08 Worker kernel: [ 1164.723205][ T939] NVRM: Xid (PCI:0000:0c:00): 45, pid=3256, Ch 00000012
Mar 28 01:42:08 Worker kernel: [ 1164.724026][ T939] NVRM: Xid (PCI:0000:0c:00): 45, pid=3256, Ch 00000013
Mar 28 01:42:08 Worker kernel: [ 1164.724842][ T939] NVRM: Xid (PCI:0000:0c:00): 45, pid=3256, Ch 00000014
Mar 28 01:42:08 Worker kernel: [ 1164.725657][ T939] NVRM: Xid (PCI:0000:0c:00): 45, pid=3256, Ch 00000015
Mar 28 01:42:08 Worker kernel: [ 1164.726478][ T939] NVRM: Xid (PCI:0000:0c:00): 45, pid=3256, Ch 00000016
Mar 28 01:42:08 Worker kernel: [ 1164.727296][ T939] NVRM: Xid (PCI:0000:0c:00): 45, pid=3256, Ch 00000017
Mar 28 01:42:09 Worker kernel: [ 1166.038342][ C0] sched: RT throttling activated
Mar 28 01:42:13 Worker hive-watchdog[1043]: OK LA(5m): 0.54 < 10.0, LA(1m): 1.14 < 20.0
Mar 28 01:42:22 Worker avg_khs[2227]: {"params":{"avg_khs":{"ethash":[640433,195617]}}}
Mar 28 01:42:35 Worker kernel: [ 1192.043167][ C0] watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [irq/151-nvidia:939]
Mar 28 01:42:35 Worker kernel: [ 1192.043170][ C0] Modules linked in: nvidia_uvm(POE) nvidia_drm(POE) nvidia_modeset(POE) nvidia(POE) drm_kms_helper cec drm drm_panel_orientation_quirks cfbfillrect cfbimgblt cfbcopyarea fb_sys_fops syscopyarea sysfillrect sysimgblt fb fbdev intel_rapl_msr intel_rapl_common x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel crypto_simd cryptd glue_helper rapl intel_cstate eeepc_wmi asus_wmi sparse_keymap wmi_bmof efi_pstore mei_me mei intel_lpss_pci intel_lpss idma64 virt_dma input_leds acpi_pad mac_hid sch_fq_codel sunrpc droptcpsock(OE) ip_tables x_tables autofs4 hid_generic usbhid hid nvme e1000e ptp pps_core nvme_core ahci libahci video wmi
Mar 28 01:42:35 Worker kernel: [ 1192.043198][ C0] CPU: 0 PID: 939 Comm: irq/151-nvidia Tainted: P OE 5.10.0-hiveos #83.hiveos.211201
Mar 28 01:42:35 Worker kernel: [ 1192.043198][ C0] Hardware name: ASUS System Product Name/PRIME B560-PLUS, BIOS 1410 01/28/2022
Mar 28 01:42:35 Worker kernel: [ 1192.043442][ C0] RIP: 0010:_nv019874rm+0x233/0x240 [nvidia]
Mar 28 01:42:35 Worker kernel: [ 1192.043443][ C0] Code: 24 b0 03 00 00 48 89 df 48 8b 86 60 05 00 00 e8 c3 bd 84 ed be 00 00 9c 02 bf 95 df 5e 0e 31 c0 e8 22 72 c7 ff e8 1d d9 3b 00 fe 66 2e 0f 1f 84 00 00 00 00 00 90 41 56 41 55 41 54 53 49 89
Mar 28 01:42:35 Worker kernel: [ 1192.043444][ C0] RSP: 0018:ffffb699c5913d88 EFLAGS: 00000286
Mar 28 01:42:35 Worker kernel: [ 1192.043445][ C0] RAX: 0000000000000000 RBX: ffff9a55c7b98008 RCX: 0000000000000020
Mar 28 01:42:35 Worker kernel: [ 1192.043446][ C0] RDX: 0000000000000001 RSI: ffff9a55c73d2cf4 RDI: 00000000000007ff
Mar 28 01:42:35 Worker kernel: [ 1192.043446][ C0] RBP: ffff9a55c73d2d00 R08: 0000000000000020 R09: ffff9a55c73d2ce8
Mar 28 01:42:35 Worker kernel: [ 1192.043447][ C0] R10: 0000000000000000 R11: 000000000000006d R12: ffff9a55c8188008
Mar 28 01:42:35 Worker kernel: [ 1192.043447][ C0] R13: ffff9a55c80a1808 R14: ffff9a55c6899008 R15: 000000000001ffdf
Mar 28 01:42:35 Worker kernel: [ 1192.043448][ C0] FS: 0000000000000000(0000) GS:ffff9a5713c00000(0000) knlGS:0000000000000000
Mar 28 01:42:35 Worker kernel: [ 1192.043449][ C0] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Mar 28 01:42:35 Worker kernel: [ 1192.043449][ C0] CR2: 00005653b702beb0 CR3: 000000021640a003 CR4: 00000000003706f0
Mar 28 01:42:35 Worker kernel: [ 1192.043450][ C0] Call Trace:
Mar 28 01:42:35 Worker kernel: [ 1192.043686][ C0] ? _nv030804rm+0x101/0x140 [nvidia]
Mar 28 01:42:35 Worker kernel: [ 1192.043919][ C0] ? _nv029378rm+0xaf5/0xd90 [nvidia]
Mar 28 01:42:35 Worker kernel: [ 1192.044148][ C0] ? _nv029386rm+0x161/0x440 [nvidia]
Mar 28 01:42:35 Worker kernel: [ 1192.044271][ C0] ? _nv000726rm+0xb1/0x250 [nvidia]
Mar 28 01:42:35 Worker kernel: [ 1192.044274][ C0] ? irq_finalize_oneshot.part.49+0xf0/0xf0
Mar 28 01:42:35 Worker kernel: [ 1192.044396][ C0] ? rm_isr_bh+0x1c/0x60 [nvidia]
Mar 28 01:42:35 Worker kernel: [ 1192.044460][ C0] ? nvidia_isr_kthread_bh+0x1b/0x40 [nvidia]
Mar 28 01:42:35 Worker kernel: [ 1192.044461][ C0] ? irq_thread_fn+0x21/0x60
Mar 28 01:42:35 Worker kernel: [ 1192.044463][ C0] ? irq_thread+0xe7/0x170
Mar 28 01:42:35 Worker kernel: [ 1192.044464][ C0] ? irq_forced_thread_fn+0x80/0x80
Mar 28 01:42:35 Worker kernel: [ 1192.044465][ C0] ? irq_thread_check_affinity+0xe0/0xe0
Mar 28 01:42:35 Worker kernel: [ 1192.044467][ C0] ? kthread+0x117/0x130
Mar 28 01:42:35 Worker kernel: [ 1192.044467][ C0] ? kthread_park+0x90/0x90
Mar 28 01:42:35 Worker kernel: [ 1192.044469][ C0] ? ret_from_fork+0x1f/0x30
Mar 28 01:42:44 Worker hive-watchdog[1043]: BARK ethash 0 kHs < 0.625 kHs for 43 seconds
Mar 28 01:42:54 Worker hive-watchdog[1043]: BARK ethash 0 kHs < 0.625 kHs for 53 seconds
Mar 28 01:43:03 Worker kernel: [ 1220.043987][ C0] watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [irq/151-nvidia:939]
Mar 28 01:43:03 Worker kernel: [ 1220.043990][ C0] Modules linked in: nvidia_uvm(POE) nvidia_drm(POE) nvidia_modeset(POE) nvidia(POE) drm_kms_helper cec drm drm_panel_orientation_quirks cfbfillrect cfbimgblt cfbcopyarea fb_sys_fops syscopyarea sysfillrect sysimgblt fb fbdev intel_rapl_msr intel_rapl_common x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel crypto_simd cryptd glue_helper rapl intel_cstate eeepc_wmi asus_wmi sparse_keymap wmi_bmof efi_pstore mei_me mei intel_lpss_pci intel_lpss idma64 virt_dma input_leds acpi_pad mac_hid sch_fq_codel sunrpc droptcpsock(OE) ip_tables x_tables autofs4 hid_generic usbhid hid nvme e1000e ptp pps_core nvme_core ahci libahci video wmi
Mar 28 01:43:03 Worker kernel: [ 1220.044017][ C0] CPU: 0 PID: 939 Comm: irq/151-nvidia Tainted: P OEL 5.10.0-hiveos #83.hiveos.211201
Mar 28 01:43:03 Worker kernel: [ 1220.044018][ C0] Hardware name: ASUS System Product Name/PRIME B560-PLUS, BIOS 1410 01/28/2022
Mar 28 01:43:03 Worker kernel: [ 1220.044263][ C0] RIP: 0010:_nv019874rm+0x233/0x240 [nvidia]
Mar 28 01:43:03 Worker kernel: [ 1220.044265][ C0] Code: 24 b0 03 00 00 48 89 df 48 8b 86 60 05 00 00 e8 c3 bd 84 ed be 00 00 9c 02 bf 95 df 5e 0e 31 c0 e8 22 72 c7 ff e8 1d d9 3b 00 fe 66 2e 0f 1f 84 00 00 00 00 00 90 41 56 41 55 41 54 53 49 89
Mar 28 01:43:03 Worker kernel: [ 1220.044266][ C0] RSP: 0018:ffffb699c5913d88 EFLAGS: 00000286
Mar 28 01:43:03 Worker kernel: [ 1220.044267][ C0] RAX: 0000000000000000 RBX: ffff9a55c7b98008 RCX: 0000000000000020
Mar 28 01:43:03 Worker kernel: [ 1220.044267][ C0] RDX: 0000000000000001 RSI: ffff9a55c73d2cf4 RDI: 00000000000007ff
Mar 28 01:43:03 Worker kernel: [ 1220.044268][ C0] RBP: ffff9a55c73d2d00 R08: 0000000000000020 R09: ffff9a55c73d2ce8
Mar 28 01:43:03 Worker kernel: [ 1220.044268][ C0] R10: 0000000000000000 R11: 000000000000006d R12: ffff9a55c8188008
Mar 28 01:43:03 Worker kernel: [ 1220.044269][ C0] R13: ffff9a55c80a1808 R14: ffff9a55c6899008 R15: 000000000001ffdf
Mar 28 01:43:03 Worker kernel: [ 1220.044269][ C0] FS: 0000000000000000(0000) GS:ffff9a5713c00000(0000) knlGS:0000000000000000
Mar 28 01:43:03 Worker kernel: [ 1220.044270][ C0] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Mar 28 01:43:03 Worker kernel: [ 1220.044271][ C0] CR2: 00005653b702beb0 CR3: 000000021640a003 CR4: 00000000003706f0
Mar 28 01:43:03 Worker kernel: [ 1220.044271][ C0] Call Trace:
Mar 28 01:43:03 Worker kernel: [ 1220.044502][ C0] ? _nv030804rm+0x101/0x140 [nvidia]
Mar 28 01:43:03 Worker kernel: [ 1220.044732][ C0] ? _nv029378rm+0xaf5/0xd90 [nvidia]
Mar 28 01:43:03 Worker kernel: [ 1220.044961][ C0] ? _nv029386rm+0x161/0x440 [nvidia]
Mar 28 01:43:03 Worker kernel: [ 1220.045082][ C0] ? _nv000726rm+0xb1/0x250 [nvidia]
Mar 28 01:43:03 Worker kernel: [ 1220.045084][ C0] ? irq_finalize_oneshot.part.49+0xf0/0xf0
Mar 28 01:43:03 Worker kernel: [ 1220.045206][ C0] ? rm_isr_bh+0x1c/0x60 [nvidia]
Mar 28 01:43:03 Worker kernel: [ 1220.045270][ C0] ? nvidia_isr_kthread_bh+0x1b/0x40 [nvidia]
Mar 28 01:43:03 Worker kernel: [ 1220.045271][ C0] ? irq_thread_fn+0x21/0x60
Mar 28 01:43:03 Worker kernel: [ 1220.045272][ C0] ? irq_thread+0xe7/0x170
Mar 28 01:43:03 Worker kernel: [ 1220.045274][ C0] ? irq_forced_thread_fn+0x80/0x80
Mar 28 01:43:03 Worker kernel: [ 1220.045275][ C0] ? irq_thread_check_affinity+0xe0/0xe0
Mar 28 01:43:03 Worker kernel: [ 1220.045276][ C0] ? kthread+0x117/0x130
Mar 28 01:43:03 Worker kernel: [ 1220.045277][ C0] ? kthread_park+0x90/0x90
Mar 28 01:43:03 Worker kernel: [ 1220.045279][ C0] ? ret_from_fork+0x1f/0x30
Mar 28 01:43:04 Worker hive-watchdog[1043]: BARK ethash 0 kHs < 0.625 kHs for 63 seconds
Mar 28 01:43:08 Worker kernel: [ 1224.724118][ C0] rcu: INFO: rcu_sched self-detected stall on CPU
Mar 28 01:43:08 Worker kernel: [ 1224.724122][ C0] rcu: 0-…: (14999 ticks this GP) idle=b22/1/0x4000000000000000 softirq=80171/80171 fqs=7498
Mar 28 01:43:08 Worker kernel: [ 1224.724123][ C0] (t=15000 jiffies g=100101 q=13932)
Mar 28 01:43:08 Worker kernel: [ 1224.724124][ C0] NMI backtrace for cpu 0
Mar 28 01:43:08 Worker kernel: [ 1224.724126][ C0] CPU: 0 PID: 939 Comm: irq/151-nvidia Tainted: P OEL 5.10.0-hiveos #83.hiveos.211201
Mar 28 01:43:08 Worker kernel: [ 1224.724126][ C0] Hardware name: ASUS System Product Name/PRIME B560-PLUS, BIOS 1410 01/28/2022
Mar 28 01:43:08 Worker kernel: [ 1224.724127][ C0] Call Trace:
Mar 28 01:43:08 Worker kernel: [ 1224.724128][ C0]
Mar 28 01:43:08 Worker kernel: [ 1224.724131][ C0] dump_stack+0x6d/0x88
Mar 28 01:43:08 Worker kernel: [ 1224.724134][ C0] nmi_cpu_backtrace+0x99/0xb0
Mar 28 01:43:08 Worker kernel: [ 1224.724136][ C0] ? lapic_can_unplug_cpu+0xa0/0xa0
Mar 28 01:43:08 Worker kernel: [ 1224.724137][ C0] nmi_trigger_cpumask_backtrace+0xd2/0x100
Mar 28 01:43:08 Worker kernel: [ 1224.724138][ C0] rcu_dump_cpu_stacks+0xab/0xd9
Mar 28 01:43:08 Worker kernel: [ 1224.724140][ C0] rcu_sched_clock_irq+0x5c6/0x810
Mar 28 01:43:08 Worker kernel: [ 1224.724142][ C0] ? trigger_load_balance+0x52/0x220
Mar 28 01:43:08 Worker kernel: [ 1224.724144][ C0] ? tick_sched_handle.isra.19+0x60/0x60
Mar 28 01:43:08 Worker kernel: [ 1224.724145][ C0] update_process_times+0x55/0x80
Mar 28 01:43:08 Worker kernel: [ 1224.724146][ C0] tick_sched_handle.isra.19+0x1d/0x60
Mar 28 01:43:08 Worker kernel: [ 1224.724147][ C0] ? tick_sched_do_timer+0x53/0x60
Mar 28 01:43:08 Worker kernel: [ 1224.724149][ C0] tick_sched_timer+0x65/0x80
Mar 28 01:43:08 Worker kernel: [ 1224.724150][ C0] __hrtimer_run_queues+0x10d/0x250
Mar 28 01:43:08 Worker kernel: [ 1224.724151][ C0] hrtimer_interrupt+0xe5/0x240
Mar 28 01:43:08 Worker kernel: [ 1224.724153][ C0] __sysvec_apic_timer_interrupt+0x5d/0xf0
Mar 28 01:43:08 Worker kernel: [ 1224.724155][ C0] asm_call_irq_on_stack+0xf/0x20
Mar 28 01:43:08 Worker kernel: [ 1224.724156][ C0]
Mar 28 01:43:08 Worker kernel: [ 1224.724157][ C0] sysvec_apic_timer_interrupt+0x5a/0x80
Mar 28 01:43:08 Worker kernel: [ 1224.724159][ C0] asm_sysvec_apic_timer_interrupt+0x12/0x20
Mar 28 01:43:08 Worker kernel: [ 1224.724397][ C0] RIP: 0010:_nv019874rm+0x233/0x240 [nvidia]
Mar 28 01:43:08 Worker kernel: [ 1224.724398][ C0] Code: 24 b0 03 00 00 48 89 df 48 8b 86 60 05 00 00 e8 c3 bd 84 ed be 00 00 9c 02 bf 95 df 5e 0e 31 c0 e8 22 72 c7 ff e8 1d d9 3b 00 fe 66 2e 0f 1f 84 00 00 00 00 00 90 41 56 41 55 41 54 53 49 89
Mar 28 01:43:08 Worker kernel: [ 1224.724399][ C0] RSP: 0018:ffffb699c5913d88 EFLAGS: 00000286
Mar 28 01:43:08 Worker kernel: [ 1224.724400][ C0] RAX: 0000000000000000 RBX: ffff9a55c7b98008 RCX: 0000000000000020
Mar 28 01:43:08 Worker kernel: [ 1224.724400][ C0] RDX: 0000000000000001 RSI: ffff9a55c73d2cf4 RDI: 00000000000007ff
Mar 28 01:43:08 Worker kernel: [ 1224.724401][ C0] RBP: ffff9a55c73d2d00 R08: 0000000000000020 R09: ffff9a55c73d2ce8
Mar 28 01:43:08 Worker kernel: [ 1224.724401][ C0] R10: 0000000000000000 R11: 000000000000006d R12: ffff9a55c8188008
Mar 28 01:43:08 Worker kernel: [ 1224.724402][ C0] R13: ffff9a55c80a1808 R14: ffff9a55c6899008 R15: 000000000001ffdf
Mar 28 01:43:08 Worker kernel: [ 1224.724630][ C0] ? _nv019874rm+0x233/0x240 [nvidia]
Mar 28 01:43:08 Worker kernel: [ 1224.724840][ C0] ? _nv030804rm+0x101/0x140 [nvidia]
Mar 28 01:43:08 Worker kernel: [ 1224.725067][ C0] ? _nv029378rm+0xaf5/0xd90 [nvidia]
Mar 28 01:43:08 Worker kernel: [ 1224.725295][ C0] ? _nv029386rm+0x161/0x440 [nvidia]
Mar 28 01:43:08 Worker kernel: [ 1224.725419][ C0] ? _nv000726rm+0xb1/0x250 [nvidia]
Mar 28 01:43:08 Worker kernel: [ 1224.725421][ C0] ? irq_finalize_oneshot.part.49+0xf0/0xf0
Mar 28 01:43:08 Worker kernel: [ 1224.725537][ C0] ? rm_isr_bh+0x1c/0x60 [nvidia]
Mar 28 01:43:08 Worker kernel: [ 1224.725597][ C0] ? nvidia_isr_kthread_bh+0x1b/0x40 [nvidia]
Mar 28 01:43:08 Worker kernel: [ 1224.725598][ C0] ? irq_thread_fn+0x21/0x60
Mar 28 01:43:08 Worker kernel: [ 1224.725599][ C0] ? irq_thread+0xe7/0x170
Mar 28 01:43:08 Worker kernel: [ 1224.725601][ C0] ? irq_forced_thread_fn+0x80/0x80
Mar 28 01:43:08 Worker kernel: [ 1224.725602][ C0] ? irq_thread_check_affinity+0xe0/0xe0
Mar 28 01:43:08 Worker kernel: [ 1224.725603][ C0] ? kthread+0x117/0x130
Mar 28 01:43:08 Worker kernel: [ 1224.725604][ C0] ? kthread_park+0x90/0x90
Mar 28 01:43:08 Worker kernel: [ 1224.725606][ C0] ? ret_from_fork+0x1f/0x30
Mar 28 01:43:14 Worker hive-watchdog[1043]: OK LA(5m): 1.72 < 10.0, LA(1m): 4.85 < 20.0
Mar 28 01:43:14 Worker hive-watchdog[1043]: GPU are lost, waiting
Mar 28 01:43:14 Worker hive-watchdog[1043]: BARK ethash 0 kHs < 0.625 kHs for 73 seconds
Mar 28 01:43:22 Worker avg_khs[2227]: Preparing algorithm statistics for upload
Mar 28 01:43:22 Worker avg_khs[2227]: Uploading ethash statistic saved 2022-03-28 01:42:32.655169987
Mar 28 01:43:24 Worker hive-watchdog[1043]: BARK ethash 0 kHs < 0.625 kHs for 83 seconds
Mar 28 01:43:29 Worker avg_khs[2227]: Uploading algorithm statistics completed
Mar 28 01:43:29 Worker avg_khs[2227]: {"params":{"avg_khs": null}}
Mar 28 01:43:34 Worker hive-watchdog[1043]: GPU are lost, waiting
Mar 28 01:43:34 Worker hive-watchdog[1043]: BARK ethash 0 kHs < 0.625 kHs for 93 seconds
Mar 28 01:43:35 Worker kernel: [ 1252.044850][ C0] watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [irq/151-nvidia:939]
Mar 28 01:43:35 Worker kernel: [ 1252.044853][ C0] Modules linked in: nvidia_uvm(POE) nvidia_drm(POE) nvidia_modeset(POE) nvidia(POE) drm_kms_helper cec drm drm_panel_orientation_quirks cfbfillrect cfbimgblt cfbcopyarea fb_sys_fops syscopyarea sysfillrect sysimgblt fb fbdev intel_rapl_msr intel_rapl_common x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel crypto_simd cryptd glue_helper rapl intel_cstate eeepc_wmi asus_wmi sparse_keymap wmi_bmof efi_pstore mei_me mei intel_lpss_pci intel_lpss idma64 virt_dma input_leds acpi_pad mac_hid sch_fq_codel sunrpc droptcpsock(OE) ip_tables x_tables autofs4 hid_generic usbhid hid nvme e1000e ptp pps_core nvme_core ahci libahci video wmi
Mar 28 01:43:35 Worker kernel: [ 1252.044878][ C0] CPU: 0 PID: 939 Comm: irq/151-nvidia Tainted: P OEL 5.10.0-hiveos #83.hiveos.211201
Mar 28 01:43:35 Worker kernel: [ 1252.044878][ C0] Hardware name: ASUS System Product Name/PRIME B560-PLUS, BIOS 1410 01/28/2022
Mar 28 01:43:35 Worker kernel: [ 1252.045120][ C0] RIP: 0010:_nv019874rm+0x233/0x240 [nvidia]
Mar 28 01:43:35 Worker kernel: [ 1252.045121][ C0] Code: 24 b0 03 00 00 48 89 df 48 8b 86 60 05 00 00 e8 c3 bd 84 ed be 00 00 9c 02 bf 95 df 5e 0e 31 c0 e8 22 72 c7 ff e8 1d d9 3b 00 fe 66 2e 0f 1f 84 00 00 00 00 00 90 41 56 41 55 41 54 53 49 89
Mar 28 01:43:35 Worker kernel: [ 1252.045122][ C0] RSP: 0018:ffffb699c5913d88 EFLAGS: 00000286
Mar 28 01:43:35 Worker kernel: [ 1252.045123][ C0] RAX: 0000000000000000 RBX: ffff9a55c7b98008 RCX: 0000000000000020
Mar 28 01:43:35 Worker kernel: [ 1252.045123][ C0] RDX: 0000000000000001 RSI: ffff9a55c73d2cf4 RDI: 00000000000007ff
Mar 28 01:43:35 Worker kernel: [ 1252.045124][ C0] RBP: ffff9a55c73d2d00 R08: 0000000000000020 R09: ffff9a55c73d2ce8
Mar 28 01:43:35 Worker kernel: [ 1252.045124][ C0] R10: 0000000000000000 R11: 000000000000006d R12: ffff9a55c8188008
Mar 28 01:43:35 Worker kernel: [ 1252.045125][ C0] R13: ffff9a55c80a1808 R14: ffff9a55c6899008 R15: 000000000001ffdf
Mar 28 01:43:35 Worker kernel: [ 1252.045126][ C0] FS: 0000000000000000(0000) GS:ffff9a5713c00000(0000) knlGS:0000000000000000
Mar 28 01:43:35 Worker kernel: [ 1252.045126][ C0] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Mar 28 01:43:35 Worker kernel: [ 1252.045127][ C0] CR2: 00005653b702beb0 CR3: 000000021640a003 CR4: 00000000003706f0
Mar 28 01:43:35 Worker kernel: [ 1252.045127][ C0] Call Trace:
Mar 28 01:43:35 Worker kernel: [ 1252.045356][ C0] ? _nv030804rm+0x101/0x140 [nvidia]
Mar 28 01:43:35 Worker kernel: [ 1252.045587][ C0] ? _nv029378rm+0xaf5/0xd90 [nvidia]
Mar 28 01:43:35 Worker kernel: [ 1252.045814][ C0] ? _nv029386rm+0x161/0x440 [nvidia]
Mar 28 01:43:35 Worker kernel: [ 1252.045937][ C0] ? _nv000726rm+0xb1/0x250 [nvidia]
Mar 28 01:43:35 Worker kernel: [ 1252.045940][ C0] ? irq_finalize_oneshot.part.49+0xf0/0xf0
Mar 28 01:43:35 Worker kernel: [ 1252.046062][ C0] ? rm_isr_bh+0x1c/0x60 [nvidia]
Mar 28 01:43:35 Worker kernel: [ 1252.046122][ C0] ? nvidia_isr_kthread_bh+0x1b/0x40 [nvidia]
Mar 28 01:43:35 Worker kernel: [ 1252.046123][ C0] ? irq_thread_fn+0x21/0x60
Mar 28 01:43:35 Worker kernel: [ 1252.046124][ C0] ? irq_thread+0xe7/0x170
Mar 28 01:43:35 Worker kernel: [ 1252.046125][ C0] ? irq_forced_thread_fn+0x80/0x80
Mar 28 01:43:35 Worker kernel: [ 1252.046127][ C0] ? irq_thread_check_affinity+0xe0/0xe0
Mar 28 01:43:35 Worker kernel: [ 1252.046128][ C0] ? kthread+0x117/0x130
Mar 28 01:43:35 Worker kernel: [ 1252.046129][ C0] ? kthread_park+0x90/0x90
Mar 28 01:43:35 Worker kernel: [ 1252.046130][ C0] ? ret_from_fork+0x1f/0x30
Mar 28 01:43:44 Worker hive-watchdog[1043]: BARK ethash 0 kHs < 0.625 kHs for 103 seconds
Mar 28 01:43:54 Worker hive-watchdog[1043]: GPU are lost, waiting
Mar 28 01:43:54 Worker hive-watchdog[1043]: BARK ethash 0 kHs < 0.625 kHs for 113 seconds
Mar 28 01:44:03 Worker kernel: [ 1280.045546][ C0] watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [irq/151-nvidia:939]
Mar 28 01:44:03 Worker kernel: [ 1280.045548][ C0] Modules linked in: nvidia_uvm(POE) nvidia_drm(POE) nvidia_modeset(POE) nvidia(POE) drm_kms_helper cec drm drm_panel_orientation_quirks cfbfillrect cfbimgblt cfbcopyarea fb_sys_fops syscopyarea sysfillrect sysimgblt fb fbdev intel_rapl_msr intel_rapl_common x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel crypto_simd cryptd glue_helper rapl intel_cstate eeepc_wmi asus_wmi sparse_keymap wmi_bmof efi_pstore mei_me mei intel_lpss_pci intel_lpss idma64 virt_dma input_leds acpi_pad mac_hid sch_fq_codel sunrpc droptcpsock(OE) ip_tables x_tables autofs4 hid_generic usbhid hid nvme e1000e ptp pps_core nvme_core ahci libahci video wmi
Mar 28 01:44:03 Worker kernel: [ 1280.045573][ C0] CPU: 0 PID: 939 Comm: irq/151-nvidia Tainted: P OEL 5.10.0-hiveos #83.hiveos.211201
Mar 28 01:44:03 Worker kernel: [ 1280.045574][ C0] Hardware name: ASUS System Product Name/PRIME B560-PLUS, BIOS 1410 01/28/2022
Mar 28 01:44:03 Worker kernel: [ 1280.045817][ C0] RIP: 0010:_nv019874rm+0x233/0x240 [nvidia]
Mar 28 01:44:03 Worker kernel: [ 1280.045818][ C0] Code: 24 b0 03 00 00 48 89 df 48 8b 86 60 05 00 00 e8 c3 bd 84 ed be 00 00 9c 02 bf 95 df 5e 0e 31 c0 e8 22 72 c7 ff e8 1d d9 3b 00 fe 66 2e 0f 1f 84 00 00 00 00 00 90 41 56 41 55 41 54 53 49 89
Mar 28 01:44:03 Worker kernel: [ 1280.045819][ C0] RSP: 0018:ffffb699c5913d88 EFLAGS: 00000286
Mar 28 01:44:03 Worker kernel: [ 1280.045820][ C0] RAX: 0000000000000000 RBX: ffff9a55c7b98008 RCX: 0000000000000020
Mar 28 01:44:03 Worker kernel: [ 1280.045820][ C0] RDX: 0000000000000001 RSI: ffff9a55c73d2cf4 RDI: 00000000000007ff
Mar 28 01:44:03 Worker kernel: [ 1280.045821][ C0] RBP: ffff9a55c73d2d00 R08: 0000000000000020 R09: ffff9a55c73d2ce8
Mar 28 01:44:03 Worker kernel: [ 1280.045821][ C0] R10: 0000000000000000 R11: 000000000000006d R12: ffff9a55c8188008
Mar 28 01:44:03 Worker kernel: [ 1280.045822][ C0] R13: ffff9a55c80a1808 R14: ffff9a55c6899008 R15: 000000000001ffdf
Mar 28 01:44:03 Worker kernel: [ 1280.045823][ C0] FS: 0000000000000000(0000) GS:ffff9a5713c00000(0000) knlGS:0000000000000000
Mar 28 01:44:03 Worker kernel: [ 1280.045823][ C0] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Mar 28 01:44:03 Worker kernel: [ 1280.045824][ C0] CR2: 00005653b702beb0 CR3: 000000021640a003 CR4: 00000000003706f0
Mar 28 01:44:03 Worker kernel: [ 1280.045824][ C0] Call Trace:
Mar 28 01:44:03 Worker kernel: [ 1280.046054][ C0] ? _nv030804rm+0x101/0x140 [nvidia]
Mar 28 01:44:03 Worker kernel: [ 1280.046286][ C0] ? _nv029378rm+0xaf5/0xd90 [nvidia]
Mar 28 01:44:03 Worker kernel: [ 1280.046514][ C0] ? _nv029386rm+0x161/0x440 [nvidia]
Mar 28 01:44:03 Worker kernel: [ 1280.046637][ C0] ? _nv000726rm+0xb1/0x250 [nvidia]
Mar 28 01:44:03 Worker kernel: [ 1280.046639][ C0] ? irq_finalize_oneshot.part.49+0xf0/0xf0
Mar 28 01:44:03 Worker kernel: [ 1280.046762][ C0] ? rm_isr_bh+0x1c/0x60 [nvidia]
Mar 28 01:44:03 Worker kernel: [ 1280.046821][ C0] ? nvidia_isr_kthread_bh+0x1b/0x40 [nvidia]
Mar 28 01:44:03 Worker kernel: [ 1280.046822][ C0] ? irq_thread_fn+0x21/0x60
Mar 28 01:44:03 Worker kernel: [ 1280.046823][ C0] ? irq_thread+0xe7/0x170
Mar 28 01:44:03 Worker kernel: [ 1280.046825][ C0] ? irq_forced_thread_fn+0x80/0x80
Mar 28 01:44:03 Worker kernel: [ 1280.046826][ C0] ? irq_thread_check_affinity+0xe0/0xe0
Mar 28 01:44:03 Worker kernel: [ 1280.046827][ C0] ? kthread+0x117/0x130
Mar 28 01:44:03 Worker kernel: [ 1280.046828][ C0] ? kthread_park+0x90/0x90
Mar 28 01:44:03 Worker kernel: [ 1280.046830][ C0] ? ret_from_fork+0x1f/0x30
Mar 28 01:44:04 Worker hive-watchdog[1043]: BARK ethash 0 kHs < 0.625 kHs for 123 seconds
Mar 28 01:44:04 Worker hive-watchdog[1043]: /hive/bin/message: line 61: echo: write error: Broken pipe
Mar 28 01:44:04 Worker hive-watchdog[1043]: #033[0;36m> Sending #033[1;37mwarning#033[0;36m with payload to #033[1;36mhttp://api.hiveos.farm#033[0m
Mar 28 01:44:11 Worker hive-watchdog[1043]: #033[0;31mError: #033[1;31mCURLE_COULDNT_CONNECT (7) Failed to connect() to host or proxy.#033[0m
Mar 28 01:44:13 Worker hive-watchdog[1043]: #033[0;36m> Sending #033[1;37mwarning#033[0;36m with payload to #033[1;36mhttp://api.hiveos.farm#033[0m
Mar 28 01:44:20 Worker hive-watchdog[1043]: #033[0;31mError: #033[1;31mCURLE_COULDNT_CONNECT (7) Failed to connect() to host or proxy.#033[0m
Mar 28 01:44:22 Worker hive-watchdog[1043]: #033[0;36m> Sending #033[1;37mwarning#033[0;36m with payload to #033[1;36mhttp://api.hiveos.farm#033[0m
Mar 28 01:44:29 Worker hive-watchdog[1043]: #033[0;31mError: #033[1;31mCURLE_COULDNT_CONNECT (7) Failed to connect() to host or proxy.#033[0m
Mar 28 01:44:29 Worker hive-watchdog[1043]: —
Mar 28 01:44:29 Worker hive-watchdog[1043]: Restarting ethash after 2 minutes
Mar 28 01:44:29 Worker hive-watchdog[1043]: —
Mar 28 01:44:29 Worker hive-watchdog[1043]: #033[0;33mRestarting miner#033[0m
Mar 28 01:44:29 Worker hive-watchdog[1043]: Sending Ctrl+C to screen session 3171
Mar 28 01:44:29 Worker avg_khs[2227]: {"params":{"avg_khs": null}}
Mar 28 01:44:31 Worker kernel: [ 1308.046194][ C0] watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [irq/151-nvidia:939]
Mar 28 01:44:31 Worker kernel: [ 1308.046196][ C0] Modules linked in: nvidia_uvm(POE) nvidia_drm(POE) nvidia_modeset(POE) nvidia(POE) drm_kms_helper cec drm drm_panel_orientation_quirks cfbfillrect cfbimgblt cfbcopyarea fb_sys_fops syscopyarea sysfillrect sysimgblt fb fbdev intel_rapl_msr intel_rapl_common x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel crypto_simd cryptd glue_helper rapl intel_cstate eeepc_wmi asus_wmi sparse_keymap wmi_bmof efi_pstore mei_me mei intel_lpss_pci intel_lpss idma64 virt_dma input_leds acpi_pad mac_hid sch_fq_codel sunrpc droptcpsock(OE) ip_tables x_tables autofs4 hid_generic usbhid hid nvme e1000e ptp pps_core nvme_core ahci libahci video wmi
Mar 28 01:44:31 Worker kernel: [ 1308.046223][ C0] CPU: 0 PID: 939 Comm: irq/151-nvidia Tainted: P OEL 5.10.0-hiveos #83.hiveos.211201
Mar 28 01:44:31 Worker kernel: [ 1308.046223][ C0] Hardware name: ASUS System Product Name/PRIME B560-PLUS, BIOS 1410 01/28/2022
Mar 28 01:44:31 Worker kernel: [ 1308.046466][ C0] RIP: 0010:_nv019874rm+0x233/0x240 [nvidia]
Mar 28 01:44:31 Worker kernel: [ 1308.046468][ C0] Code: 24 b0 03 00 00 48 89 df 48 8b 86 60 05 00 00 e8 c3 bd 84 ed be 00 00 9c 02 bf 95 df 5e 0e 31 c0 e8 22 72 c7 ff e8 1d d9 3b 00 fe 66 2e 0f 1f 84 00 00 00 00 00 90 41 56 41 55 41 54 53 49 89
Mar 28 01:44:31 Worker kernel: [ 1308.046468][ C0] RSP: 0018:ffffb699c5913d88 EFLAGS: 00000286
Mar 28 01:44:31 Worker kernel: [ 1308.046469][ C0] RAX: 0000000000000000 RBX: ffff9a55c7b98008 RCX: 0000000000000020
Mar 28 01:44:31 Worker kernel: [ 1308.046470][ C0] RDX: 0000000000000001 RSI: ffff9a55c73d2cf4 RDI: 00000000000007ff
Mar 28 01:44:31 Worker kernel: [ 1308.046470][ C0] RBP: ffff9a55c73d2d00 R08: 0000000000000020 R09: ffff9a55c73d2ce8
Mar 28 01:44:31 Worker kernel: [ 1308.046471][ C0] R10: 0000000000000000 R11: 000000000000006d R12: ffff9a55c8188008
Mar 28 01:44:31 Worker kernel: [ 1308.046471][ C0] R13: ffff9a55c80a1808 R14: ffff9a55c6899008 R15: 000000000001ffdf
Mar 28 01:44:31 Worker kernel: [ 1308.046472][ C0] FS: 0000000000000000(0000) GS:ffff9a5713c00000(0000) knlGS:0000000000000000
Mar 28 01:44:31 Worker kernel: [ 1308.046473][ C0] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Mar 28 01:44:31 Worker kernel: [ 1308.046473][ C0] CR2: 00005653b702beb0 CR3: 000000021640a003 CR4: 00000000003706f0
Mar 28 01:44:31 Worker kernel: [ 1308.046474][ C0] Call Trace:
Mar 28 01:44:31 Worker kernel: [ 1308.046704][ C0] ? _nv030804rm+0x101/0x140 [nvidia]
Mar 28 01:44:31 Worker kernel: [ 1308.046937][ C0] ? _nv029378rm+0xaf5/0xd90 [nvidia]
Mar 28 01:44:31 Worker kernel: [ 1308.047167][ C0] ? _nv029386rm+0x161/0x440 [nvidia]
Mar 28 01:44:31 Worker kernel: [ 1308.047287][ C0] ? _nv000726rm+0xb1/0x250 [nvidia]
Mar 28 01:44:31 Worker kernel: [ 1308.047289][ C0] ? irq_finalize_oneshot.part.49+0xf0/0xf0
Mar 28 01:44:31 Worker kernel: [ 1308.047413][ C0] ? rm_isr_bh+0x1c/0x60 [nvidia]
Mar 28 01:44:31 Worker kernel: [ 1308.047478][ C0] ? nvidia_isr_kthread_bh+0x1b/0x40 [nvidia]
Mar 28 01:44:31 Worker kernel: [ 1308.047480][ C0] ? irq_thread_fn+0x21/0x60
Mar 28 01:44:31 Worker kernel: [ 1308.047481][ C0] ? irq_thread+0xe7/0x170
Mar 28 01:44:31 Worker kernel: [ 1308.047482][ C0] ? irq_forced_thread_fn+0x80/0x80
Mar 28 01:44:31 Worker kernel: [ 1308.047483][ C0] ? irq_thread_check_affinity+0xe0/0xe0
Mar 28 01:44:31 Worker kernel: [ 1308.047484][ C0] ? kthread+0x117/0x130
Mar 28 01:44:31 Worker kernel: [ 1308.047485][ C0] ? kthread_park+0x90/0x90
Mar 28 01:44:31 Worker kernel: [ 1308.047487][ C0] ? ret_from_fork+0x1f/0x30
Mar 28 01:44:45 Worker hive-watchdog[1043]: Waiting 15s for miners to exit. . . . . . . . . . . . . . .
Mar 28 01:44:45 Worker hive-watchdog[1043]: Stopping screen session 3171
Mar 28 01:44:46 Worker hive-watchdog[1043]: Starting #033[0;36mt-rex#033[0m
Mar 28 01:44:46 Worker kernel: [ 1323.026780][ T5005] DTS: killing sk:000000009c3c17cb (127.0.0.1:43828 -> 127.0.0.1:4059) state 6
Mar 28 01:44:46 Worker kernel: [ 1323.026783][ T5005] DTS: killing sk:000000005e94ba6b (127.0.0.1:43832 -> 127.0.0.1:4059) state 6
Mar 28 01:44:46 Worker kernel: [ 1323.026784][ T5005] DTS: killing sk:00000000df8d1d54 (127.0.0.1:43836 -> 127.0.0.1:4059) state 6
Mar 28 01:44:46 Worker kernel: [ 1323.026786][ T5005] DTS: killing sk:000000002751c2ac (127.0.0.1:4059 -> 127.0.0.1:43844) state 8
Mar 28 01:44:57 Worker hive-watchdog[1043]: OK LA(5m): 3.76 < 10.0, LA(1m): 8.14 < 20.0
Mar 28 01:44:57 Worker hive-watchdog[1043]: GPU are lost, rebooting
Mar 28 01:44:57 Worker hive-watchdog[1043]: /hive/bin/message: line 61: echo: write error: Broken pipe
Mar 28 01:44:59 Worker kernel: [ 1336.046797][ C0] watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [irq/151-nvidia:939]
Mar 28 01:44:59 Worker kernel: [ 1336.046799][ C0] Modules linked in: nvidia_uvm(POE) nvidia_drm(POE) nvidia_modeset(POE) nvidia(POE) drm_kms_helper cec drm drm_panel_orientation_quirks cfbfillrect cfbimgblt cfbcopyarea fb_sys_fops syscopyarea sysfillrect sysimgblt fb fbdev intel_rapl_msr intel_rapl_common x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel crypto_simd cryptd glue_helper rapl intel_cstate eeepc_wmi asus_wmi sparse_keymap wmi_bmof efi_pstore mei_me mei intel_lpss_pci intel_lpss idma64 virt_dma input_leds acpi_pad mac_hid sch_fq_codel sunrpc droptcpsock(OE) ip_tables x_tables autofs4 hid_generic usbhid hid nvme e1000e ptp pps_core nvme_core ahci libahci video wmi
Mar 28 01:44:59 Worker kernel: [ 1336.046826][ C0] CPU: 0 PID: 939 Comm: irq/151-nvidia Tainted: P OEL 5.10.0-hiveos #83.hiveos.211201
Mar 28 01:44:59 Worker kernel: [ 1336.046827][ C0] Hardware name: ASUS System Product Name/PRIME B560-PLUS, BIOS 1410 01/28/2022
Mar 28 01:44:59 Worker kernel: [ 1336.047074][ C0] RIP: 0010:_nv019874rm+0x233/0x240 [nvidia]
Mar 28 01:44:59 Worker kernel: [ 1336.047075][ C0] Code: 24 b0 03 00 00 48 89 df 48 8b 86 60 05 00 00 e8 c3 bd 84 ed be 00 00 9c 02 bf 95 df 5e 0e 31 c0 e8 22 72 c7 ff e8 1d d9 3b 00 fe 66 2e 0f 1f 84 00 00 00 00 00 90 41 56 41 55 41 54 53 49 89
Mar 28 01:44:59 Worker kernel: [ 1336.047076][ C0] RSP: 0018:ffffb699c5913d88 EFLAGS: 00000286
Mar 28 01:44:59 Worker kernel: [ 1336.047077][ C0] RAX: 0000000000000000 RBX: ffff9a55c7b98008 RCX: 0000000000000020
Mar 28 01:44:59 Worker kernel: [ 1336.047077][ C0] RDX: 0000000000000001 RSI: ffff9a55c73d2cf4 RDI: 00000000000007ff
Mar 28 01:44:59 Worker kernel: [ 1336.047078][ C0] RBP: ffff9a55c73d2d00 R08: 0000000000000020 R09: ffff9a55c73d2ce8
Mar 28 01:44:59 Worker kernel: [ 1336.047078][ C0] R10: 0000000000000000 R11: 000000000000006d R12: ffff9a55c8188008
Mar 28 01:44:59 Worker kernel: [ 1336.047079][ C0] R13: ffff9a55c80a1808 R14: ffff9a55c6899008 R15: 000000000001ffdf
Mar 28 01:44:59 Worker kernel: [ 1336.047079][ C0] FS: 0000000000000000(0000) GS:ffff9a5713c00000(0000) knlGS:0000000000000000
Mar 28 01:44:59 Worker kernel: [ 1336.047080][ C0] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Mar 28 01:44:59 Worker kernel: [ 1336.047080][ C0] CR2: 00005653b702beb0 CR3: 000000021640a003 CR4: 00000000003706f0
Mar 28 01:44:59 Worker kernel: [ 1336.047081][ C0] Call Trace:
Mar 28 01:44:59 Worker kernel: [ 1336.047311][ C0] ? _nv030804rm+0x101/0x140 [nvidia]
Mar 28 01:44:59 Worker kernel: [ 1336.047546][ C0] ? _nv029378rm+0xaf5/0xd90 [nvidia]
Mar 28 01:44:59 Worker kernel: [ 1336.047773][ C0] ? _nv029386rm+0x161/0x440 [nvidia]
Mar 28 01:44:59 Worker kernel: [ 1336.047893][ C0] ? _nv000726rm+0xb1/0x250 [nvidia]
Mar 28 01:44:59 Worker kernel: [ 1336.047896][ C0] ? irq_finalize_oneshot.part.49+0xf0/0xf0
Mar 28 01:44:59 Worker kernel: [ 1336.048015][ C0] ? rm_isr_bh+0x1c/0x60 [nvidia]
Mar 28 01:44:59 Worker kernel: [ 1336.048074][ C0] ? nvidia_isr_kthread_bh+0x1b/0x40 [nvidia]
Mar 28 01:44:59 Worker kernel: [ 1336.048075][ C0] ? irq_thread_fn+0x21/0x60
Mar 28 01:44:59 Worker kernel: [ 1336.048076][ C0] ? irq_thread+0xe7/0x170
Mar 28 01:44:59 Worker kernel: [ 1336.048078][ C0] ? irq_forced_thread_fn+0x80/0x80
Mar 28 01:44:59 Worker kernel: [ 1336.048079][ C0] ? irq_thread_check_affinity+0xe0/0xe0
Mar 28 01:44:59 Worker kernel: [ 1336.048080][ C0] ? kthread+0x117/0x130
Mar 28 01:44:59 Worker kernel: [ 1336.048081][ C0] ? kthread_park+0x90/0x90
Mar 28 01:44:59 Worker kernel: [ 1336.048083][ C0] ? ret_from_fork+0x1f/0x30
Mar 28 01:45:02 Worker CRON[5291]: (root) CMD (/hive/bin/miner logtruncateall)
Mar 28 01:45:02 Worker CRON[5292]: (root) CMD (/hive/sbin/logrotate)
Mar 28 01:45:02 Worker CRON[5300]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)
Mar 28 01:45:22 Worker hive-watchdog[1043]: > Preparing for reboot
Mar 28 01:45:22 Worker hive-watchdog[1043]: > Unmounting disks

…Mar 28 10:38:36 Worker systemd-modules-load[233]: Inserted module 'droptcpsock'
Mar 28 10:38:36 Worker systemd-sysctl[261]: Couldn't write '2' to 'net/ipv6/conf/all/use_tempaddr', ignoring: No such file or directory
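
For anyone reading a log like this one, a quick way to see which GPU is involved is to count the NVRM Xid events per PCI address alongside the soft lockup reports. The following is only a rough sketch: the regular expressions are written against the line shapes in the paste above, and the function name `summarize` and the `SAMPLE` text are made up for illustration.

```python
import re
from collections import Counter

# Patterns modelled on the lines in the log above (an assumption, not a
# general syslog parser).
XID_RE = re.compile(r"NVRM: Xid \(PCI:([0-9a-fA-F:.]+)\): (\d+),")
LOCKUP_RE = re.compile(
    r"soft lockup - CPU#(\d+) stuck for (\d+)s! \[([^\]:]+):(\d+)\]"
)


def summarize(log_text):
    """Count Xid events per (PCI address, Xid code) and soft lockups per (CPU, task)."""
    xids = Counter()
    lockups = Counter()
    for line in log_text.splitlines():
        match = XID_RE.search(line)
        if match:
            xids[(match.group(1), int(match.group(2)))] += 1
            continue
        match = LOCKUP_RE.search(line)
        if match:
            lockups[(int(match.group(1)), match.group(3))] += 1
    return xids, lockups


# Three lines copied from the log above, used as a small sample.
SAMPLE = """\
Mar 28 01:42:07 Worker kernel: [ 1164.246592][ T939] NVRM: Xid (PCI:0000:0c:00): 45, pid=3256, Ch 00000010
Mar 28 01:42:08 Worker kernel: [ 1164.594332][ T939] NVRM: Xid (PCI:0000:0c:00): 62, pid=3256, 0000(0000) 00000000 00000000
Mar 28 01:42:35 Worker kernel: [ 1192.043167][ C0] watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [irq/151-nvidia:939]
"""

if __name__ == "__main__":
    xid_counts, lockup_counts = summarize(SAMPLE)
    for (pci, code), count in sorted(xid_counts.items()):
        print(f"GPU at PCI:{pci} reported Xid {code} x{count}")
    for (cpu, task), count in sorted(lockup_counts.items()):
        print(f"CPU#{cpu} soft lockup in {task} x{count}")
```

In this log every Xid comes from the same address (PCI:0000:0c:00) and every lockup is in the `irq/151-nvidia` thread, which points at one GPU or its connection rather than at the CPU itself.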

Thanks to Grea for posting in another thread, I tried this command:

nvidia-smi dmon -s et -d 10 -o DT

It turns out I had two bad risers. I swapped them out, and I’ve been running for almost 7 hours now.

Thanks Grea!
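
If you leave that command running and save its output to a file, a small script can flag the samples where the PCIe replay error counter is non-zero, which is what gave the bad risers away here. This is a sketch under assumptions: it expects a `#`-prefixed header row that names the columns and looks for ones called `gpu` and `pci`, which may differ between driver versions, and the `SAMPLE` text is invented for illustration rather than captured from a real rig.

```python
def find_replay_errors(dmon_text):
    """Return (gpu_index, timestamp, count) for rows with a non-zero 'pci' column.

    Assumes a '#'-prefixed header row naming the columns, including 'gpu'
    and 'pci' (assumed names; check them against your own output).
    """
    columns = None
    hits = []
    for raw in dmon_text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#"):
            fields = line.lstrip("#").split()
            # Only the row that names the columns is kept; the units row is skipped.
            if "gpu" in fields and "pci" in fields:
                columns = fields
            continue
        if columns is None:
            continue
        values = line.split()
        if len(values) != len(columns):
            continue  # unexpected row shape, ignore it
        row = dict(zip(columns, values))
        count = row["pci"]
        if count.isdigit() and int(count) > 0:
            stamp = f"{row.get('Date', '')} {row.get('Time', '')}".strip()
            hits.append((int(row["gpu"]), stamp, int(count)))
    return hits


# Invented sample in the assumed layout, not real output.
SAMPLE = """\
#Date       Time        gpu sbecc dbecc   pci rxpci txpci
#YYYYMMDD   HH:MM:SS    Idx  errs  errs  errs  MB/s  MB/s
 20220328   12:00:00      0     -     -     0    12     3
 20220328   12:00:00      1     -     -    37    11     2
"""

if __name__ == "__main__":
    for gpu, stamp, count in find_replay_errors(SAMPLE):
        print(f"GPU {gpu} at {stamp}: {count} PCIe replay errors")
```

A GPU that keeps showing replay errors while the others stay at zero is the one whose riser, adapter, or USB cable is worth swapping first.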


Glad it worked for you! It is a handy command for troubleshooting the riser path in NVIDIA systems.

Keep an old NVIDIA GPU (post-Maxwell) around to test risers, adapters, and USB cables. Don’t be shocked when brand-new ones fail.

:thinking:
