lost connection to sbc due to remote timeout error
-
We recently got this error message:
lost connection to sbc due to remote timeout
on a Duet 3 Mini 5+ running in SBC mode. I have never seen this error before and found nothing about it. Can anyone (@dc42) point me in the right direction?
-
@benecito Send M122 after the timeout has occurred and check which device caused it (near the end of the diagnostics). If it is the SBC, slow IO probably caused the main service to hang, and you'll see a timeout by the SBC. That particular error message cannot come from DSF, though.
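
The check described above can be scripted if M122 output is captured regularly. This is only a rough sketch: it assumes the SBC interface section contains a line of the form `State: 0, disconnects: 1, timeouts: 1 total, 1 by SBC, ...`, and the names `SbcTimeouts` and `parse_sbc_timeouts` are made up for illustration, not part of RRF or DSF.

```python
import re
from typing import NamedTuple, Optional


class SbcTimeouts(NamedTuple):
    """Counters taken from the 'SBC interface' section of an M122 report."""
    disconnects: int
    total: int
    by_sbc: int


# Assumed line format: "disconnects: N, timeouts: N total, N by SBC"
_PATTERN = re.compile(
    r"disconnects:\s*(\d+),\s*timeouts:\s*(\d+)\s+total,\s*(\d+)\s+by SBC"
)


def parse_sbc_timeouts(m122_text: str) -> Optional[SbcTimeouts]:
    """Return the timeout counters, or None if the line is not present."""
    match = _PATTERN.search(m122_text)
    if match is None:
        return None
    return SbcTimeouts(*(int(group) for group in match.groups()))


sample = (
    "Recv:16:47:15.920: State: 0, disconnects: 1, "
    "timeouts: 1 total, 1 by SBC, IAP RAM available 0x0dadc"
)
result = parse_sbc_timeouts(sample)
if result is not None and result.by_sbc > 0:
    print(f"{result.by_sbc} of {result.total} timeout(s) attributed to the SBC")
```

If `by_sbc` equals `total`, every recorded timeout was attributed to the SBC side.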
-
@chrishamm Here is the M122 output. DWC shows as disconnected, so I ran it over USB. I assume that means the error is also sent by the firmware and not by DSF.
What do you mean by "slow IO"?
Recv:16:47:15.911: === Diagnostics ===
Recv:16:47:15.911: RepRapFirmware for Duet 3 Mini 5+ version 3.5.0-rc.1 (2023-08-31 16:16:56) running on Duet 3 Mini5plus WiFi (SBC mode)
Recv:16:47:15.911: Board ID: 4YP38-PR6KL-K65J0-409ND-JSW1Z-RWVQQ
Recv:16:47:15.911: Used output buffers: 1 of 40 (34 max)
Recv:16:47:15.911: === RTOS ===
Recv:16:47:15.911: Static ram: 102836
Recv:16:47:15.912: Dynamic ram: 106008 of which 0 recycled
Recv:16:47:15.912: Never used RAM 28348, free system stack 66 words
Recv:16:47:15.912: Tasks: SBC(2,nWait,7.3%,434) HEAT(3,nWait,0.3%,323) Move(4,nWait,26.9%,255) CanReceiv(6,nWait,0.0%,939) CanSender(5,nWait,0.0%,337) CanClock(7,delaying,0.1%,342) TMC(4,nWait,3.4%,74) MAIN(1,running,58.1%,822) IDLE(0,ready,0.0%,29) AIN(4,delaying,3.9%,264), total 100.0%
Recv:16:47:15.912: Owned mutexes: USB(MAIN)
Recv:16:47:15.912: === Platform ===
Recv:16:47:15.912: Last reset 06:04:11 ago, cause: software
Recv:16:47:15.912: Last software reset at 2023-11-05 10:43, reason: User, Platform spinning, available RAM 28460, slot 0
Recv:16:47:15.912: Software reset code 0x2000 HFSR 0x00000000 CFSR 0x00000000 ICSR 0x00000000 BFAR 0xe000ed38 SP 0x00000000 Task SBC Freestk 0 n/a
Recv:16:47:15.913: Error status: 0x00
Recv:16:47:15.913: MCU revision 3, ADC conversions started 21852847, completed 21852845, timed out 0, errs 0
Recv:16:47:15.913: MCU temperature: min 35.6, current 61.1, max 65.0
Recv:16:47:15.913: Supply voltage: min 23.4, current 23.8, max 24.2, under voltage events: 0, over voltage events: 0, power good: yes
Recv:16:47:15.913: Heap OK, handles allocated/used 99/8, heap memory allocated/used/recyclable 2048/120/0, gc cycles 0
Recv:16:47:15.913: Events: 0 queued, 0 completed
Recv:16:47:15.913: Driver 0: standstill, SG min 0, read errors 0, write errors 1, ifcnt 42, reads 33239, writes 19, timeouts 0, DMA errors 0, CC errors 0
Recv:16:47:15.914: Driver 1: ok, SG min 0, read errors 0, write errors 1, ifcnt 42, reads 33237, writes 21, timeouts 0, DMA errors 0, CC errors 0
Recv:16:47:15.914: Driver 2: ok, SG min 0, read errors 0, write errors 1, ifcnt 42, reads 33236, writes 21, timeouts 0, DMA errors 0, CC errors 0
Recv:16:47:15.914: Driver 3: ok, SG min 0, read errors 0, write errors 1, ifcnt 42, reads 33236, writes 21, timeouts 0, DMA errors 0, CC errors 0
Recv:16:47:15.914: Driver 4: standstill, SG min 0, read errors 0, write errors 1, ifcnt 36, reads 33239, writes 19, timeouts 0, DMA errors 0, CC errors 0
Recv:16:47:15.914: Driver 5: not present
Recv:16:47:15.914: Driver 6: not present
Recv:16:47:15.915: Date/time: 2023-11-05 16:47:14
Recv:16:47:15.915: Cache data hit count 4294967295
Recv:16:47:15.915: Slowest loop: 67.66ms; fastest: 0.11ms
Recv:16:47:15.915: === Storage ===
Recv:16:47:15.915: Free file entries: 20
Recv:16:47:15.915: SD card 0 not detected, interface speed: 0.0MBytes/sec
Recv:16:47:15.915: SD card longest read time 0.0ms, write time 0.0ms, max retries 0
Recv:16:47:15.915: === Move ===
Recv:16:47:15.916: DMs created 83, segments created 57, maxWait 1480432ms, bed compensation in use: none, height map offset 0.000, ebfmin -1.00, ebfmax 1.00
Recv:16:47:15.916: next step interrupt due in 1059 ticks, enabled
Recv:16:47:15.916: Moves shaped first try 3252, on retry 15564, too short 426412, wrong shape 1142318, maybepossible 389406
Recv:16:47:15.916: === DDARing 0 ===
Recv:16:47:15.916: Scheduled moves 2936143, completed 2936103, hiccups 0, stepErrors 0, LaErrors 0, Underruns [340521, 0, 1], CDDA state 3
Recv:16:47:15.916: === DDARing 1 ===
Recv:16:47:15.916: Scheduled moves 0, completed 0, hiccups 0, stepErrors 0, LaErrors 0, Underruns [0, 0, 0], CDDA state -1
Recv:16:47:15.916: === Heat ===
Recv:16:47:15.917: Bed heaters 0 -1 -1 -1, chamber heaters -1 -1 -1 -1, ordering errs 0
Recv:16:47:15.917: === GCodes ===
Recv:16:47:15.917: Movement locks held by null, null
Recv:16:47:15.917: HTTP* is idle in state(s) 0
Recv:16:47:15.917: Telnet is idle in state(s) 0
Recv:16:47:15.917: File is idle in state(s) 0
Recv:16:47:15.917: USB is ready with "M122" in state(s) 0
Recv:16:47:15.917: Aux is idle in state(s) 0
Recv:16:47:15.917: Trigger* is idle in state(s) 0
Recv:16:47:15.917: Queue* is idle in state(s) 0
Recv:16:47:15.918: LCD is idle in state(s) 0
Recv:16:47:15.918: SBC is idle in state(s) 0
Recv:16:47:15.918: Daemon is idle in state(s) 0
Recv:16:47:15.918: Aux2 is idle in state(s) 0
Recv:16:47:15.918: Autopause is idle in state(s) 0
Recv:16:47:15.918: File2 is idle in state(s) 0
Recv:16:47:15.918: Queue2 is idle in state(s) 0
Recv:16:47:15.918: Q0 segments left 1, axes/extruders owned 0x0000807
Recv:16:47:15.918: Code queue 0 is empty
Recv:16:47:15.919: Q1 segments left 0, axes/extruders owned 0x0000000
Recv:16:47:15.919: Code queue 1 is empty
Recv:16:47:15.919: === CAN ===
Recv:16:47:15.919: Messages queued 196649, received 0, lost 0, boc 0
Recv:16:47:15.919: Longest wait 0ms for reply type 0, peak Tx sync delay 0, free buffers 26 (min 26), ts 109257/0/0
Recv:16:47:15.919: Tx timeouts 0,0,109256,0,0,87391 last cancelled message type 30 dest 127
Recv:16:47:15.919: === SBC interface ===
Recv:16:47:15.919: Transfer state: 0, failed transfers: 0, checksum errors: 0
Recv:16:47:15.919: RX/TX seq numbers: 0/1
Recv:16:47:15.920: SPI underruns 0, overruns 0
Recv:16:47:15.920: State: 0, disconnects: 1, timeouts: 1 total, 1 by SBC, IAP RAM available 0x0dadc
Recv:16:47:15.920: Buffer RX/TX: 0/0-0, open files: 0
-
@benecito Thanks, that confirms my assumption:
State: 0, disconnects: 1, timeouts: 1 total, 1 by SBC, IAP RAM available 0x0dadc
RRF lost connection to the SBC because the SBC didn't respond in time, which can happen when the SBC is temporarily overloaded. Consider replacing the microSD card with an A1- or A2-rated card if you haven't done so already.
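
To get a rough feel for whether storage on the SBC is slow, something like the following could be run on the SBC itself. It is a minimal sketch and not how DSF measures anything: `measure_fsync_latency` is a hypothetical helper that times small synced writes, and the directory passed in should be one that actually lives on the microSD card (the system temp directory used here may be a RAM disk on some setups).

```python
import os
import tempfile
import time
from typing import List


def measure_fsync_latency(directory: str, rounds: int = 20, size: int = 4096) -> List[float]:
    """Time `rounds` small write+fsync cycles in `directory`; returns milliseconds."""
    payload = os.urandom(size)
    latencies_ms: List[float] = []
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        for _ in range(rounds):
            start = time.perf_counter()
            os.write(fd, payload)
            os.fsync(fd)  # force the data out to the storage device
            latencies_ms.append((time.perf_counter() - start) * 1000.0)
    finally:
        os.close(fd)
        os.remove(path)
    return latencies_ms


# Assumption: replace this with a directory on the microSD card if needed.
samples = measure_fsync_latency(tempfile.gettempdir())
print(f"max {max(samples):.1f} ms, mean {sum(samples) / len(samples):.1f} ms")
```

Occasional very long writes compared with the mean would be consistent with the card stalling under load, although this test alone does not prove that the card caused the timeout.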