Random Hard Fault resets on Duet 3 Mini WiFi
-
@Exerqtor With both debug builds I logged a memory difference message within a couple of hours. The first debug build was under an hour, the second took maybe 1-2 hours to show up.
-
For comparison, my system has logged anything from 0 to 4 memory differences in 24 hours. I have two instances of DWC connected (from Firefox and Chrome running on the same PC) and I reduced the DWC polling interval to 100ms.
-
@omtek said in Random Hard Fault resets on Duet 3 Mini WiFi:
@Exerqtor With both debug builds I logged a memory difference message within a couple of hours. The first debug build was under an hour, the second took maybe 1-2 hours to show up.
@dc42 said in Random Hard Fault resets on Duet 3 Mini WiFi:
For comparison, my system has logged anything from 0 to 4 memory differences in 24 hours. I have two instances of DWC connected (from Firefox and Chrome running on the same PC) and I reduced the DWC polling interval to 100ms.
Hmm OK, in that case my machine is either really well behaved or I've set something up wrong.
- Installed the 3.6-debug fw.
- Hooked up a computer via USB.
- Installed the newest YAT from SourceForge.
- Connected it to the printer (tested by sending G28 in the terminal and the printer homed).
- After testing the connection I sent M111 P8 S1 through YAT.
- Opened three instances (not three tabs, if that makes any difference) of DWC in Chrome plus one instance of OrcaSlicer on one PC, and I check in on my phone every once in a while too.
It's now been sitting for roughly 22 hours without anything happening. No output in YAT, and no reboot/reset.
-
@Exerqtor Did you...
Leave it running with at least one DWC session connected
I think DC42 said he had two DWC instances running (in different browsers).
-
@gloomyandy Added that info to the previous post
🙈
-
@Exerqtor have you ever had one of these resets since you upgraded to 3.6 beta 1?
-
@dc42 Not really. After I stopped using Chrome (other than the occasional check-in with Chrome on my phone) I haven't really had any resets at all that I've noticed.
The laptop restarted (forced Windows update 🤦♂️) within the last two hours, so I don't know if anything got output in that timespan (no resets though). I've got it back up logging now and plan to have it running until the weekend. I see the debug log has reached 3.21 GB by now though, so I don't know if I should have disabled that.
It's been about 22 hours since the laptop rebooted, and there's still no output on my end.
-
Another memory difference message this afternoon. No resets to speak of, either.
*** Memory difference at line 2228 offset 12: original 0a0d392e copy 20032128, original changed, copy ok, fix=yes
and another
*** Memory difference at line 2228 offset 52: original 20036658 copy 0d0a0d6d, original ok, copy changed, fix=no
edit #3 - busy day...
*** Memory difference at line 2228 offset 60: original 2001882c copy 0a0d392e, original ok, copy changed, fix=no
-
Two more memory difference messages logged overnight. Printer was idle but has been printing nicely. Still no resets to speak of.
*** Memory difference at line 2228 offset 60: original 2001882c copy 0a0d392e, original ok, copy changed, fix=no
*** Memory difference at line 2228 offset 56: original 2002c5d8 copy 0a0d656e, original ok, copy changed, fix=no
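For anyone trying to read these messages: each one reports two 32-bit words, and it can help to look at the bytes behind them. The sketch below is not part of the debug firmware. It assumes the words are printed as hex values read on a little-endian MCU, and `word_to_text` is a name made up for illustration.

```python
import struct

def word_to_text(hex_word: str) -> str:
    """Show the four bytes a 32-bit word occupies in little-endian memory.

    Printable ASCII is shown as-is; other bytes are shown as \\xNN escapes.
    """
    # Assumption: the firmware prints the word as a plain 32-bit hex value,
    # so packing it little-endian recovers the byte order in RAM.
    raw = struct.pack("<I", int(hex_word, 16))
    return "".join(chr(b) if 32 <= b < 127 else f"\\x{b:02x}" for b in raw)

# Words copied from the messages above.
for word in ("0a0d392e", "0d0a0d6d", "0a0d656e", "2001882c"):
    print(word, "->", word_to_text(word))
```

Under that assumption, 0a0d392e is the characters `.9` followed by CR LF, which looks more like a fragment of text than a pointer, while values such as 2001882c look like they could be RAM addresses. That is only an observation from the bytes, not a diagnosis.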
-
No output on my end whatsoever. Has anyone else had output with the 3.6 build?
🤔
-
@Exerqtor I had no output or resets using 3.6 either. I reverted to 3.5.3 and got two memory difference reports within a few hours.
-
@dc42 OK, should I do that too? At least until tomorrow, or do you have enough data points on 3.5?
Reverted to 3.5.3 debug and have it running now. Straight off the bat I noticed a difference: when I sent G28Z in YAT to confirm the connection I got
ok<LF>
in return, but I didn't get anything back when I entered something with the 3.6 debug build.
-
@Exerqtor do your machines run in Marlin compat mode?
-
@Exerqtor bear in mind that recent versions of YAT do not reconnect automatically when the Duet restarts. I always have to disconnect and reconnect YAT manually using the icons in the menu bar.
-
@dc42
Let me point out one more thing here: the printer on which I saw this problem has not shown it for some weeks now. But all of the prints that printer performed were below 100mm in Z. I'm not sure whether it is time or height related. I just wanted to let you know my observation.
-
@oliof Uuuuhm, nope:
25.10.2024, 06:51:35 M555 Output mode: RepRapFirmware
I didn't even know it was a thing
😆
@dc42 said in Random Hard Fault resets on Duet 3 Mini WiFi:
@Exerqtor bear in mind that recent versions of YAT do not reconnect automatically when the Duet restarts. I always have to disconnect and reconnect YAT manually using the icons in the menu bar.
Yeah, I've made sure to home the printer, and it's still homed. So there haven't been any resets
😅
-
Started a long print last night and woke up to two messages in the console:
*** Memory difference at line 2228 offset 4: original 392e303d copy 00000168, original changed, copy ok, fix=yes
*** Memory difference at line 2228 offset 4: original 392e303d copy 00000168, original changed, copy ok, fix=yes
(the same message occurs twice; potential crash caught?) Printer has been stable otherwise.
-
Mine had a reset today while I was @work, but the only output in YAT is:
ok<LF>
HTTP is enabled on port 80<LF>
FTP is enabled on port 21<LF>
TELNET is disabled<LF>
CAN response timeout: board 121, req type 6054, RID 8<LF>
Warning: Heater 1 predicted maximum temperature at full power is 543°C<LF>
Log level is : debug<LF>
Done!<LF>
RepRapFirmware for Duet 3 Mini 5+ is up and running.<LF>
WiFi module started<LF>
WiFi module is connected to access point IoT, IP address 192.168.30.x<LF>
HTTP is enabled on port 80<LF>
FTP is enabled on port 21<LF>
TELNET is disabled<LF>
CAN response timeout: board 121, req type 6054, RID 8<LF>
Warning: Heater 1 predicted maximum temperature at full power is 543°C<LF>
Log level is : debug<LF>
Done!<LF>
RepRapFirmware for Duet 3 Mini 5+ is up and running.<LF>
WiFi module started<LF>
WiFi module is connected to access point IoT, IP address 192.168.30.x<LF>
It was printed on one line in YAT; I just split it up for readability.
This is the M122 report:
2024-10-25 15:56:08 [debug] === Diagnostics ===
2024-10-25 15:56:08 [debug] RepRapFirmware for Duet 3 Mini 5+ version 3.5.3+1dbg (2024-10-20 14:13:46) running on Duet 3 Mini5plus WiFi (standalone mode)
2024-10-25 15:56:08 [debug] Board ID: XNHXF-HR6KL-K65J0-409N2-K9W1Z-RV2MZ
2024-10-25 15:56:08 [debug] Used output buffers: 1 of 40 (39 max)
2024-10-25 15:56:08 [debug] === RTOS ===
2024-10-25 15:56:08 [debug] Static ram: 103368
2024-10-25 15:56:08 [debug] Dynamic ram: 123956 of which 0 recycled
2024-10-25 15:56:08 [debug] Never used RAM 11572, free system stack 142 words
2024-10-25 15:56:08 [debug] Tasks:
2024-10-25 15:56:08 [debug] NETWORK(1,ready,18.9%,209)
2024-10-25 15:56:08 [debug] HEAT(3,nWait 6,0.0%,325)
2024-10-25 15:56:08 [debug] Move(4,nWait 6,0.0%,341)
2024-10-25 15:56:08 [debug] CanReceiv(6,nWait 1,0.1%,773)
2024-10-25 15:56:08 [debug] CanSender(5,nWait 7,0.0%,336)
2024-10-25 15:56:08 [debug] CanClock(7,delaying,0.0%,348)
2024-10-25 15:56:08 [debug] TMC(4,nWait 6,0.8%,101)
2024-10-25 15:56:08 [debug] MAIN(1,running,78.5%,665)
2024-10-25 15:56:08 [debug] IDLE(0,ready,0.8%,29)
2024-10-25 15:56:08 [debug] AIN(4,delaying,0.8%,259)
2024-10-25 15:56:08 [debug] , total 100.0% Owned mutexes:
2024-10-25 15:56:08 [debug] WiFi(NETWORK)
2024-10-25 15:56:08 [debug] USB(MAIN)
2024-10-25 15:56:08 [debug] === Platform ===
2024-10-25 15:56:08 [debug] Last reset 05:58:26 ago, cause: software
2024-10-25 15:56:08 [debug] Last software reset at 2024-10-25 09:57, reason: HardFault ibus, Gcodes spinning, available RAM 11596, slot 0
2024-10-25 15:56:08 [debug] Software reset code 0x0063 HFSR 0x40000000 CFSR 0x00000100 ICSR 0x00487803 BFAR 0xe000ed38 SP 0x20012038 Task NETW Freestk 494 ok
2024-10-25 15:56:08 [debug] Stack: 2002c620 200324a0 200014e4 00000000 20033689 000304b5 0d0a0d30 61010000 0d0a0d31 00000000 00000000 00000000 200120b8 00000014 000a0ea3 00000102 eeb20050 380aa8c0 080001ad 00000003 00034db1 2002c3b8 2002c3b8 00000001 0002fbf1 20011800 200114f8
2024-10-25 15:56:08 [debug] Error status: 0x00
2024-10-25 15:56:08 [debug] Aux0 errors 0,0,0
2024-10-25 15:56:08 [debug] MCU revision 3, ADC conversions started 16129917, completed 16129915, timed out 0, errs 0
2024-10-25 15:56:08 [debug] MCU temperature: min 35.5, current 35.9, max 38.8
2024-10-25 15:56:08 [debug] Supply voltage: min 3.8, current 24.2, max 24.3, under voltage events: 0, over voltage events: 0, power good: yes
2024-10-25 15:56:08 [debug] Heap OK, handles allocated/used 99/33, heap memory allocated/used/recyclable 2048/580/152, gc cycles 4679
2024-10-25 15:56:08 [debug] Events: 1 queued, 1 completed
2024-10-25 15:56:08 [debug] Driver 0: standstill, SG min 0, read errors 0, write errors 0, ifcnt 13, reads 17785, writes 13, timeouts 0, DMA errors 0, CC errors 0
2024-10-25 15:56:08 [debug] Driver 1: standstill, SG min 0, read errors 0, write errors 0, ifcnt 13, reads 17784, writes 13, timeouts 1, DMA errors 0, CC errors 0, failedOp 0x6f
2024-10-25 15:56:08 [debug] Driver 2: standstill, SG min 2, read errors 0, write errors 1, ifcnt 50, reads 17783, writes 13, timeouts 0, DMA errors 0, CC errors 0
2024-10-25 15:56:08 [debug] Driver 3: standstill, SG min 2, read errors 0, write errors 1, ifcnt 51, reads 17783, writes 13, timeouts 0, DMA errors 0, CC errors 0
2024-10-25 15:56:08 [debug] Driver 4: standstill, SG min 0, read errors 0, write errors 0, ifcnt 13, reads 17785, writes 13, timeouts 0, DMA errors 0, CC errors 0
2024-10-25 15:56:08 [debug] Driver 5: not present
2024-10-25 15:56:08 [debug] Driver 6: not present
2024-10-25 15:56:08 [debug] Date/time:
2024-10-25 15:56:08 [debug] 2024-10-25 15:56:08
2024-10-25 15:56:08 [debug] Cache data hit count 4294967295
2024-10-25 15:56:08 [debug] Slowest loop: 13.34ms; fastest: 0.16ms
2024-10-25 15:56:08 [debug] === Storage === Free file entries: 18
2024-10-25 15:56:08 [debug] SD card 0 detected, interface speed: 22.5MBytes/sec
2024-10-25 15:56:08 [debug] SD card longest read time 8.0ms, write time 4.4ms, max retries 0
2024-10-25 15:56:08 [debug] === Move === DMs created 83, segments created 0, maxWait 0ms, bed compensation in use: none, height map offset 0.000, max steps late 0, min interval 0, bad calcs 0, ebfmin 0.00, ebfmax 0.00
2024-10-25 15:56:08 [debug] no step interrupt scheduled
2024-10-25 15:56:08 [debug] Moves shaped first try 0, on retry 0, too short 0, wrong shape 0, maybepossible 0
2024-10-25 15:56:08 [debug] === DDARing 0 === Scheduled moves 0, completed 0, hiccups 0, stepErrors 0, LaErrors 0, Underruns [0, 0, 0], CDDA state -1
2024-10-25 15:56:08 [debug] === DDARing 1 === Scheduled moves 0, completed 0, hiccups 0, stepErrors 0, LaErrors 0, Underruns [0, 0, 0], CDDA state -1
2024-10-25 15:56:08 [debug] === Heat ===
2024-10-25 15:56:08 [debug] Bed heaters 0 -1 -1 -1, chamber heaters -1 -1 -1 -1, ordering errs 0
2024-10-25 15:56:08 [debug] Heater 1 is on, I-accum = 0.0
2024-10-25 15:56:08 [debug] === GCodes ===
2024-10-25 15:56:08 [debug] Movement locks held by null, null
2024-10-25 15:56:08 [debug] HTTP is idle in state(s) 0
2024-10-25 15:56:08 [debug] Telnet is idle in state(s) 0
2024-10-25 15:56:08 [debug] File is idle in state(s) 0
2024-10-25 15:56:08 [debug] USB is ready with "m122" in state(s) 0
2024-10-25 15:56:08 [debug] Aux is idle in state(s) 0
2024-10-25 15:56:08 [debug] Trigger is idle in state(s) 0
2024-10-25 15:56:08 [debug] Queue is idle in state(s) 0
2024-10-25 15:56:08 [debug] LCD is idle in state(s) 0
2024-10-25 15:56:08 [debug] SBC is idle in state(s) 0
2024-10-25 15:56:08 [debug] Daemon is doing "G4 P250" in state(s) 0 0, running macro
2024-10-25 15:56:08 [debug] Aux2 is idle in state(s) 0
2024-10-25 15:56:08 [debug] Autopause is idle in state(s) 0
2024-10-25 15:56:08 [debug] File2 is idle in state(s) 0
2024-10-25 15:56:08 [debug] Queue2 is idle in state(s) 0
2024-10-25 15:56:08 [debug] Q0 segments left 0, axes/extruders owned 0x0000803
2024-10-25 15:56:08 [debug] Code queue 0 is empty
2024-10-25 15:56:08 [debug] Q1 segments left 0, axes/extruders owned 0x0000000
2024-10-25 15:56:08 [debug] Code queue 1 is empty
2024-10-25 15:56:08 [debug] === Filament sensors === check 0 clear 0
2024-10-25 15:56:08 [debug] Extruder 0 sensor: no filament
2024-10-25 15:56:08 [debug] === CAN ===
2024-10-25 15:56:08 [debug] Messages queued 193567, received 440879, lost 0, errs 383, boc 0
2024-10-25 15:56:08 [debug] Longest wait 2ms for reply type 6053, peak Tx sync delay 10353, free buffers 26 (min 25), ts 107533/107532/0
2024-10-25 15:56:08 [debug] Tx timeouts 0,0,0,0,0,0
2024-10-25 15:56:08 [debug] === Network ===
2024-10-25 15:56:08 [debug] Slowest loop: 13.86ms; fastest: 0.00ms
2024-10-25 15:56:08 [debug] Responder states:
2024-10-25 15:56:08 [debug] MQTT(0)
2024-10-25 15:56:08 [debug] HTTP(2)
2024-10-25 15:56:08 [debug] HTTP(0)
2024-10-25 15:56:08 [debug] HTTP(0)
2024-10-25 15:56:08 [debug] HTTP(0)
2024-10-25 15:56:08 [debug] FTP(0)
2024-10-25 15:56:08 [debug] Telnet(0)
2024-10-25 15:56:08 [debug] HTTP sessions: 6 of 8
2024-10-25 15:56:08 [debug] === WiFi === Interface state: active Module is connected to access point Failed messages: pending 0, notrdy 0, noresp 0
2024-10-25 15:56:08 [debug] Firmware version 2.1.0
2024-10-25 15:56:08 [debug] MAC address c4:5b:be:ce:91:93
2024-10-25 15:56:08 [debug] Module reset reason: Power up, Vcc 3.38, flash size 2097152, free heap 39616
2024-10-25 15:56:08 [debug] WiFi IP address 192.168.30.x
2024-10-25 15:56:08 [debug] Signal strength -51dBm, channel 6, mode 802.11n, reconnections 0
2024-10-25 15:56:08 [debug] Clock register 00002001
2024-10-25 15:56:08 [debug] Socket states:
2024-10-25 15:56:08 [debug] 0
2024-10-25 15:56:08 [debug] 0
2024-10-25 15:56:08 [debug] 0
2024-10-25 15:56:08 [debug] 0
2024-10-25 15:56:08 [debug] 0
2024-10-25 15:56:08 [debug] 0
2024-10-25 15:56:08 [debug] 0
2024-10-25 15:56:08 [debug] 0
2024-10-25 15:56:08 [debug] ok
-
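The "HardFault ibus" reason in the M122 report above lines up with the HFSR and CFSR values on its "Software reset code" line. Here is a minimal sketch of decoding them, assuming the standard ARMv7-M (Cortex-M4) bit assignments for these two registers. The table and function names are made up for illustration and are not taken from RepRapFirmware.

```python
# Bit positions from the ARMv7-M fault status registers (assumed to apply here).
HFSR_BITS = {1: "VECTTBL", 30: "FORCED", 31: "DEBUGEVT"}

CFSR_BITS = {
    # MemManage fault status (bits 0-7)
    0: "IACCVIOL", 1: "DACCVIOL", 3: "MUNSTKERR", 4: "MSTKERR",
    5: "MLSPERR", 7: "MMARVALID",
    # BusFault status (bits 8-15)
    8: "IBUSERR", 9: "PRECISERR", 10: "IMPRECISERR", 11: "UNSTKERR",
    12: "STKERR", 13: "LSPERR", 15: "BFARVALID",
    # UsageFault status (bits 16-31)
    16: "UNDEFINSTR", 17: "INVSTATE", 18: "INVPC", 19: "NOCP",
    24: "UNALIGNED", 25: "DIVBYZERO",
}

def decode(value: int, table: dict) -> list:
    """Return the names of all flag bits that are set in value."""
    return [name for bit, name in sorted(table.items()) if value & (1 << bit)]

# Values copied from the "Software reset code" line of the report.
print("HFSR:", decode(0x40000000, HFSR_BITS))
print("CFSR:", decode(0x00000100, CFSR_BITS))
```

With those tables the report's values decode to FORCED and IBUSERR, i.e. an instruction bus error escalated to a hard fault. BFARVALID is not set, so the BFAR value in the report should probably not be read as a fault address.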
@omtek yes, potential crash caught twice.
-
Three more messages overnight and this morning. They occurred after the printer was cooling down. Looks like another two potential crashes caught and averted. Still no resets to speak of.
*** Memory difference at line 2228 offset 4: original 392e303d copy 00000168, original changed, copy ok, fix=yes
*** Memory difference at line 2228 offset 52: original 20036658 copy 0d0a0d73, original ok, copy changed, fix=no
*** Memory difference at line 2228 offset 4: original 392e303d copy 00000168, original changed, copy ok, fix=yes
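With several people now pasting these messages, a small script can tally them by offset and by which side changed. This is only a sketch: the regular expression is written against the message format as it appears in this thread, and `tally` and `SAMPLE` are hypothetical names.

```python
import re
from collections import Counter

# Pattern matched against the message text as posted in this thread.
PATTERN = re.compile(
    r"Memory difference at line (?P<line>\d+) offset (?P<offset>\d+): "
    r"original (?P<original>[0-9a-f]{8}) copy (?P<copy>[0-9a-f]{8}), "
    r"original (?P<orig_state>ok|changed), copy (?P<copy_state>ok|changed), "
    r"fix=(?P<fix>yes|no)"
)

def tally(log_text: str) -> Counter:
    """Count messages per (offset, side that changed, fix flag)."""
    counts = Counter()
    for m in PATTERN.finditer(log_text):
        side = "original" if m["orig_state"] == "changed" else "copy"
        counts[(int(m["offset"]), side, m["fix"])] += 1
    return counts

# The three messages from the post above.
SAMPLE = """\
*** Memory difference at line 2228 offset 4: original 392e303d copy 00000168, original changed, copy ok, fix=yes
*** Memory difference at line 2228 offset 52: original 20036658 copy 0d0a0d73, original ok, copy changed, fix=no
*** Memory difference at line 2228 offset 4: original 392e303d copy 00000168, original changed, copy ok, fix=yes
"""

for key, count in sorted(tally(SAMPLE).items()):
    print(key, count)
```

Pasting a longer YAT capture into `SAMPLE` (or reading it from a file) would show whether the same offsets keep recurring across machines.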