Toolboard keeps losing heater 1
-
@wdenker It looks like you have a power issue, as the toolboard reset 1 min 42 s before the M122 was taken.
In RRF 3.4, the firmware doesn't know if a toolboard or expansion board has been lost. This functionality was added in 3.5b4. You could update to that and keep an eye on things (3.5b4 prints OK for me). -
@jay_s_uk Sadly, the customer likes to pull the power to reset, so that is not the issue.
-
@wdenker They pull the power on the toolboard but not the mainboard?
Anyway, if they do that, then my comment about the mainboard not knowing still stands. -
@jay_s_uk The heater loss doesn't persist after the power is pulled. The customer knows that if he pulls the power, the heater comes back and he can start over. He then grabs the M122 reports separately, one before the power loss and one after.
-
@wdenker so why does the M122 for mainboard show an uptime of 27 hours and the M122 for the toolboard shows 1 minute?
-
@jay_s_uk Because one M122 was taken before the power was pulled and the other after.
-
@wdenker It's probably a communication/cable issue anyway, but you won't know with 3.4.5 as it doesn't tell you.
-
@jay_s_uk The CAN communication light never blinks erratically. We originally had an issue with the CAN connection. We have since resolved it, and now we get this error rather than everything just stopping as if the connection had been lost briefly, which is how a bad CAN cable presents itself.
-
@wdenker update to 3.5b4 so you can see if there are any reports of a CAN board not communicating.
-
@jay_s_uk Found a good indicator: "I2C bus errors 1" is what shows when it is having the issue, and "I2C bus errors 0" when it is not. So am I safe to assume it is the CAN connection again, even though nothing else indicates CAN connection issues? I also updated to 3.5.0-beta.4 and am not getting any notifications about CAN timeouts or connection issues, which I would have expected.
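To track that indicator across several saved reports, the counters can be pulled out of the pasted M122 text. This is only a sketch: the helper name `i2c_counters` is hypothetical, and the pattern assumes the line format seen in the tool board reports in this thread ("I2C bus errors 0, naks 3, other errors 0").

```python
import re

def i2c_counters(report: str):
    """Return the I2C counters from M122 report text, or None if the line is absent.

    Hypothetical helper; the regular expression is based only on the line
    format quoted in this thread.
    """
    m = re.search(r"I2C bus errors (\d+), naks (\d+), other errors (\d+)", report)
    if m is None:
        return None
    bus, naks, other = (int(g) for g in m.groups())
    return {"bus_errors": bus, "naks": naks, "other_errors": other}
```

Running this on a report saved while the fault is present and on one saved while it is absent makes it easy to check whether the bus error count really differs between the two.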
-
-
@wdenker Do you have the opportunity to swap the toolboard in case it's a hardware issue? The I2C bus error is not directly related to the heater (it's used for the LIS3DH only); however, it's interesting that they happen at the same time.
How hot are you running the toolboard?
-
@T3P3Tony We swapped to a new board and are still getting the same issue, just not as frequently. So we then decided to switch the thermistor connection to a Phoenix connector instead of the JST. This seems to have resolved the issue. What else does that error indicate, or what does the I2C bus work with, other than the accelerometer?
-
@wdenker the message "Board 121 does not have heater 1" implies that the tool board has reset. The M122 B121 report indicates that it was not a firmware crash or other software reset, because the Last Software Reset Data is shown as "not available". So it must have been a watchdog reset, hardware reset, or power failure. Therefore, we need to see M122 reports for both boards before pulling the power, to determine the reason for the tool board reset. Please ask your customer to provide this next time the fault occurs.
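The check described above can be sketched as a small script that reads the reset lines out of a saved M122 report. This is an illustration only: `classify_reset` is a hypothetical helper, and the patterns assume the "Last reset ... ago, cause: ..." and "Last software reset at ..., reason: ..." line formats that appear in the reports posted in this thread.

```python
import re

def classify_reset(report: str) -> dict:
    """Extract reset information from M122 report text (hypothetical helper)."""
    info = {"uptime": None, "cause": None, "software_reason": None}

    # e.g. "Last reset 00:10:46 ago, cause: power up"
    m = re.search(r"Last reset (\d+:\d+:\d+) ago, cause: ([^\n]+)", report)
    if m:
        info["uptime"] = m.group(1)
        info["cause"] = m.group(2).strip()

    # e.g. "Last software reset at 2023-06-21 10:39, reason: OutOfMemory, ..."
    # A report that says "Last software reset data not available" leaves this None.
    m = re.search(r"Last software reset at [^,]+, reason: ([^,\n]+)", report)
    if m:
        info["software_reason"] = m.group(1).strip()
    return info
```

If `software_reason` comes back as `None`, no software reset was recorded, which points toward a watchdog reset, hardware reset, or power failure as described above.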
-
@dc42 or @T3P3Tony Here are the M122 reports from each board prior to the power restart. It looks like the toolboard ran out of memory: "2023-06-21 10:39, reason: OutOfMemory". How do I resolve that?
M122
=== Diagnostics ===
RepRapFirmware for Duet 3 MB6HC version 3.5.0-beta.4 (2023-06-08 23:41:30) running on Duet 3 MB6HC v1.02 or later (standalone mode)
Board ID: 08DJM-9P63L-DJ3S0-7J9D4-3SN6J-9UMZA
Used output buffers: 9 of 40 (40 max)
=== RTOS ===
Static ram: 155012
Dynamic ram: 121824 of which 52 recycled
Never used RAM 67352, free system stack 122 words
Tasks: NETWORK(1,ready,33.0%,139) ETHERNET(5,nWait,2.1%,317) HEAT(3,nWait,1.3%,323) Move(4,nWait,88.6%,214) CanReceiv(6,nWait,1.7%,642) CanSender(5,nWait,2.0%,326) CanClock(7,delaying,0.3%,349) TMC(4,nWait,131.2%,59) MAIN(1,running,137.7%,137) IDLE(0,ready,0.5%,30), total 398.4%
Owned mutexes:
=== Platform ===
Last reset 50:22:42 ago, cause: power up
Last software reset at 2023-06-17 02:19, reason: User, Gcodes spinning, available RAM 68440, slot 1
Software reset code 0x0003 HFSR 0x00000000 CFSR 0x00000000 ICSR 0x0043c000 BFAR 0x00000000 SP 0x00000000 Task MAIN Freestk 0 n/a
Error status: 0x14
Aux0 errors 0,6,0
MCU temperature: min 30.3, current 40.2, max 43.8
Supply voltage: min 23.8, current 24.1, max 24.3, under voltage events: 0, over voltage events: 0, power good: yes
12V rail voltage: min 11.9, current 12.2, max 12.6, under voltage events: 0
Heap OK, handles allocated/used 0/0, heap memory allocated/used/recyclable 0/0/0, gc cycles 0
Events: 13 queued, 13 completed
Driver 0: standstill, SG min 0, mspos 976, reads 2699, writes 208 timeouts 0
Driver 1: standstill, SG min n/a, mspos 8, reads 2896, writes 11 timeouts 0
Driver 2: standstill, SG min 0, mspos 80, reads 2707, writes 200 timeouts 0
Driver 3: standstill, SG min 0, mspos 112, reads 2738, writes 169 timeouts 0
Driver 4: standstill, SG min 0, mspos 880, reads 2738, writes 169 timeouts 0
Driver 5: standstill, SG min 0, mspos 464, reads 2700, writes 208 timeouts 0
Date/time: 2023-06-21 11:11:07
Slowest loop: 1000.15ms; fastest: 0.05ms
=== Storage ===
Free file entries: 18
SD card 0 detected, interface speed: 25.0MBytes/sec
SD card longest read time 4.5ms, write time 380.9ms, max retries 0
=== Move ===
DMs created 125, segments created 73, maxWait 5933869ms, bed compensation in use: mesh, height map offset 0.000, ebfmin 0.00, ebfmax 0.00
no step interrupt scheduled
=== DDARing 0 ===
Scheduled moves 102436, completed 102436, hiccups 0, stepErrors 0, LaErrors 0, Underruns [341, 0, 24], CDDA state -1
=== DDARing 1 ===
Scheduled moves 0, completed 0, hiccups 0, stepErrors 0, LaErrors 0, Underruns [0, 0, 0], CDDA state -1
=== Heat ===
Bed heaters 0 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1, chamber heaters 2 -1 -1 -1, ordering errs 0
Heater 0 is on, I-accum = 0.0
Heater 2 is on, I-accum = 0.0
=== GCodes ===
Movement locks held by null, null
HTTP is idle in state(s) 0
Telnet is idle in state(s) 0
File is idle in state(s) 0
USB is idle in state(s) 0
Aux is idle in state(s) 0
Trigger is idle in state(s) 0
Queue is idle in state(s) 0
LCD is idle in state(s) 0
SBC is idle in state(s) 0
Daemon is idle in state(s) 0
Aux2 is idle in state(s) 0
Autopause is idle in state(s) 0
File2 is idle in state(s) 0
Queue2 is idle in state(s) 0
Q0 segments left 0, axes/extruders owned 0x80000007
Code queue 0 is empty
Q1 segments left 0, axes/extruders owned 0x0000000
Code queue 1 is empty
=== CAN ===
Messages queued 7095317, received 5098900, lost 0, boc 56
Longest wait 5ms for reply type 6029, peak Tx sync delay 623, free buffers 50 (min 47), ts 906815/906786/0
Tx timeouts 0,0,0,0,0,0
=== Network ===
Slowest loop: 8427.39ms; fastest: 0.03ms
Responder states: MQTT(0) HTTP(0) HTTP(0) HTTP(0) HTTP(0) HTTP(0) HTTP(0) FTP(0) Telnet(0) Telnet(0)
HTTP sessions: 1 of 8
= Ethernet =
Interface state: active
Error counts: 0 0 0 1 0 0
Socket states: 5 2 2 2 2 2 0 2
= WiFi =
Interface state: disabled
Module is disabled
Failed messages: pending 0, notready 0, noresp 0
Socket states: 0 0 0 0 0 0 0 0
=== Multicast handler ===
Responder is inactive, messages received 0, responses 0
M122 B121
Diagnostics for board 121:
Duet TOOL1LC rev 1.1 or later firmware version 3.5.0-beta.4 (2023-06-08 16:22:30)
Bootloader ID: SAMC21 bootloader version 2.4 (2021-12-10)
All averaging filters OK
Never used RAM 1812, free system stack 88 words
Tasks: Move(3,nWait,0.7%,111) HEAT(2,nWait,0.1%,101) CanAsync(5,nWait,0.0%,54) CanRecv(3,nWait,0.1%,75) CanClock(5,nWait,0.0%,66) ACCEL(3,nWait,0.0%,53) TMC(2,delaying,3.1%,57) MAIN(1,running,90.8%,444) IDLE(0,ready,0.0%,27) AIN(2,delaying,5.1%,142), total 100.0%
Last reset 00:10:46 ago, cause: power up
Last software reset at 2023-06-21 10:39, reason: OutOfMemory, available RAM 12, slot 0
Software reset code 0x01c0 ICSR 0x00000000 SP 0x200054a8 Task Move Freestk 137 ok
Stack: 20005600 000062df 00000000 2000554c a5a5a5a5 00009f99 00000000 00008a67 0001f1c6 00000000 477e8400 477e8400 0000005b 007e8400 a625a5a5 a5a5a5a5 0001ee68 477e8400 477fa700 2e57b417 2e57b417 2000554c 20007100 20005108 3627cde7 00008c77 bdd53594
Driver 0: pos 0, 80.0 steps/mm, standstill, SG min 0, read errors 0, write errors 0, ifcnt 9, reads 59686, writes 9, timeouts 3, DMA errors 0, CC errors 0, failedOp 0x72, steps req 0 done 1157273
Moves scheduled 13624, completed 13624, in progress 0, hiccups 468, step errors 0, maxPrep 602, maxOverdue 2022819261, maxInc 2022780880, mcErrs 0, gcmErrs 0, ebfmin 0.00, ebfmax 1.00
Peak sync jitter 0/5, peak Rx sync delay 261, resyncs 0/0, no timer interrupt scheduled
VIN voltage: min 24.6, current 24.7, max 24.8
MCU temperature: min 58.7C, current 60.5C, max 61.1C
Last sensors broadcast 0x00000000 found 0 30 ticks ago, 0 ordering errs, loop time 0
CAN messages queued 5200, send timeouts 0, received 19467, lost 0, free buffers 18, min 16, error reg 0
dup 0, oos 0/0/0/0, bm 0, wbm 0, rxMotionDelay 837, adv -2022819048/74672
Accelerometer: LIS3DH, status: 00
I2C bus errors 0, naks 3, other errors 0
Diagnostics for board 1:
Duet EXP3HC rev 1.02 or later firmware version 3.5.0-beta.4 (2023-06-08 16:24:05)
Bootloader ID: SAME5x bootloader version 2.3 (2021-01-26b1)
All averaging filters OK
Never used RAM 156016, free system stack 172 words
Tasks: Move(3,nWait,1.1%,104) HEAT(2,nWait,1.2%,82) CanAsync(5,nWait,0.0%,67) CanRecv(3,nWait,1.4%,79) CanClock(5,nWait,0.4%,70) TMC(2,nWait,135.7%,69) MAIN(1,running,46.7%,456) IDLE(0,ready,0.0%,40) AIN(2,delaying,61.7%,265), total 248.2%
Last reset 50:23:10 ago, cause: power up
Last software reset data not available
Driver 0: pos -6382700, 320.0 steps/mm, standstill, SG min 0, mspos 464, reads 3139, writes 165 timeouts 0, steps req 6 done 4042678
Driver 1: pos -6382777, 320.0 steps/mm, standstill, SG min 0, mspos 880, reads 3140, writes 165 timeouts 0, steps req 6 done 4042745
Driver 2: pos 0, 80.0 steps/mm, standstill, SG min 0, mspos 8, reads 3295, writes 11 timeouts 0, steps req 0 done 0
Moves scheduled 1377083, completed 1377083, in progress 0, hiccups 0, step errors 0, maxPrep 52, maxOverdue 41, maxInc 11, mcErrs 0, gcmErrs 0, ebfmin 0.00, ebfmax 0.00
Peak sync jitter -8/9, peak Rx sync delay 188, resyncs 0/0, no timer interrupt scheduled
VIN voltage: min 24.2, current 24.3, max 24.4
V12 voltage: min 12.3, current 12.3, max 12.4
MCU temperature: min 43.1C, current 47.2C, max 51.2C
Last sensors broadcast 0x00000000 found 0 30 ticks ago, 0 ordering errs, loop time 0
CAN messages queued 1451192, send timeouts 0, received 3716073, lost 0, free buffers 38, min 38, error reg ff0000
dup 0, oos 0/0/0/0, bm 0, wbm 0, rxMotionDelay 420, adv 8041/74573 -
-
@jay_s_uk I have already tried that.
-
@wdenker the out of memory issue is new in firmware 3.5.0-beta.4. If you have already tried using a less demanding input shaping method then I suggest you revert to firmware 3.4.5 on the main board (and 3.4.4 on the tool board, which is the version included in the 3.4.5 release). Then when the problem happens again, get a M122 B121 report again before powering down.
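When comparing tool board reports across firmware versions, it may help to pull the memory figures out of each M122 B121 report so they can be set side by side. A minimal sketch follows; `memory_figures` is a hypothetical helper, and the patterns assume the "Never used RAM ..." and "reason: OutOfMemory, available RAM ..." wording seen in the reports in this thread.

```python
import re

def memory_figures(report: str) -> dict:
    """Collect memory-related numbers from M122 report text (hypothetical helper)."""
    figures = {}

    # e.g. "Never used RAM 1812, free system stack 88 words"
    m = re.search(r"Never used RAM (\d+)", report)
    if m:
        figures["never_used_ram"] = int(m.group(1))

    # e.g. "... reason: OutOfMemory, available RAM 12, slot 0"
    m = re.search(r"reason: OutOfMemory, available RAM (\d+)", report)
    if m:
        figures["ram_at_oom_reset"] = int(m.group(1))
    return figures
```

The `ram_at_oom_reset` key is only present when the report records an OutOfMemory software reset, so its absence is itself a useful signal.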
-
@dc42 Reverted, as it just happened again after trying yet another input shaper. I will get those M122 reports as soon as it happens again.
-
@dc42 @T3P3Tony Here are the latest M122's
M122
=== Diagnostics ===
RepRapFirmware for Duet 3 MB6HC version 3.4.5 (2022-11-30 19:35:23) running on Duet 3 MB6HC v1.02 or later (standalone mode)
Board ID: 08DJM-9P63L-DJ3S0-7J9D4-3SN6J-9UMZA
Used output buffers: 3 of 40 (40 max)
=== RTOS ===
Static ram: 152760
Dynamic ram: 99104 of which 0 recycled
Never used RAM 97800, free system stack 112 words
Tasks: NETWORK(ready,160.6%,230) ETHERNET(notifyWait,0.3%,566) HEAT(notifyWait,0.1%,322) Move(notifyWait,7.6%,245) CanReceiv(notifyWait,0.1%,774) CanSender(notifyWait,0.1%,328) CanClock(delaying,0.0%,339) TMC(notifyWait,42.8%,57) MAIN(running,270.6%,925) IDLE(ready,0.0%,30), total 482.3%
Owned mutexes:
=== Platform ===
Last reset 02:00:49 ago, cause: software
Last software reset at 2023-06-21 12:35, reason: User, GCodes spinning, available RAM 66660, slot 2
Software reset code 0x0003 HFSR 0x00000000 CFSR 0x00000000 ICSR 0x00400000 BFAR 0x00000000 SP 0x00000000 Task MAIN Freestk 0 n/a
Error status: 0x04
Aux0 errors 0,1,0
Step timer max interval 202
MCU temperature: min 40.4, current 44.4, max 44.9
Supply voltage: min 23.9, current 24.0, max 24.2, under voltage events: 0, over voltage events: 0, power good: yes
12V rail voltage: min 11.9, current 12.2, max 12.5, under voltage events: 0
Heap OK, handles allocated/used 0/0, heap memory allocated/used/recyclable 0/0/0, gc cycles 0
Events: 0 queued, 0 completed
Driver 0: standstill, SG min 0, mspos 976, reads 64534, writes 29 timeouts 0
Driver 1: standstill, SG min n/a, mspos 8, reads 64563, writes 0 timeouts 0
Driver 2: standstill, SG min 0, mspos 112, reads 64534, writes 29 timeouts 0
Driver 3: standstill, SG min 0, mspos 784, reads 64540, writes 23 timeouts 0
Driver 4: standstill, SG min 0, mspos 880, reads 64540, writes 23 timeouts 0
Driver 5: standstill, SG min 0, mspos 464, reads 64534, writes 29 timeouts 0
Date/time: 2023-06-21 17:16:04
Slowest loop: 209.40ms; fastest: 0.05ms
=== Storage ===
Free file entries: 9
SD card 0 detected, interface speed: 25.0MBytes/sec
SD card longest read time 3.2ms, write time 2.6ms, max retries 0
=== Move ===
DMs created 125, segments created 42, maxWait 164204ms, bed compensation in use: mesh, comp offset 0.000
=== MainDDARing ===
Scheduled moves 155342, completed 155342, hiccups 0, stepErrors 0, LaErrors 0, Underruns [0, 0, 2], CDDA state -1
=== AuxDDARing ===
Scheduled moves 0, completed 0, hiccups 0, stepErrors 0, LaErrors 0, Underruns [0, 0, 0], CDDA state -1
=== Heat ===
Bed heaters 0 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1, chamber heaters 2 -1 -1 -1, ordering errs 0
Heater 0 is on, I-accum = 0.0
Heater 2 is on, I-accum = 0.0
=== GCodes ===
Segments left: 0
Movement lock held by null
HTTP is idle in state(s) 0
Telnet is idle in state(s) 0
File is idle in state(s) 0
USB is idle in state(s) 0
Aux is idle in state(s) 0
Trigger is idle in state(s) 0
Queue is idle in state(s) 0
LCD is idle in state(s) 0
SBC is idle in state(s) 0
Daemon is idle in state(s) 0
Aux2 is idle in state(s) 0
Autopause is idle in state(s) 0
Code queue is empty
=== CAN ===
Messages queued 218281, received 163468, lost 0, boc 0
Longest wait 5ms for reply type 6024, peak Tx sync delay 398, free buffers 50 (min 38), ts 36122/36122/0
Tx timeouts 0,0,0,0,0,0
=== Network ===
Slowest loop: 97.78ms; fastest: 0.03ms
Responder states: HTTP(0) HTTP(0) HTTP(0) HTTP(0) HTTP(0) HTTP(0) FTP(0) Telnet(0) Telnet(0)
HTTP sessions: 1 of 8
= Ethernet =
State: active
Error counts: 0 0 0 0 0 0
Socket states: 5 2 2 2 2 2 0 2
= WiFi =
Network state is disabled
WiFi module is disabled
Failed messages: pending 2779096485, notready 2779096485, noresp 2779096485
Socket states: 0 0 0 0 0 0 0 0
=== Multicast handler ===
Responder is inactive, messages received 0, responses 0
M122 B1
Diagnostics for board 1:
Duet EXP3HC rev 1.02 or later firmware version 3.4.4 (2022-10-14 11:45:56)
Bootloader ID: SAME5x bootloader version 2.3 (2021-01-26b1)
All averaging filters OK
Never used RAM 158896, free system stack 173 words
Tasks: Move(notifyWait,0.1%,100) HEAT(notifyWait,0.1%,88) CanAsync(notifyWait,0.0%,69) CanRecv(notifyWait,0.1%,80) CanClock(notifyWait,0.0%,71) TMC(notifyWait,36.7%,65) MAIN(running,56.6%,441) IDLE(ready,0.0%,40) AIN(delaying,6.3%,263), total 100.0%
Last reset 02:01:37 ago, cause: software
Last software reset data not available
Driver 0: pos -1065745, 320.0 steps/mm,standstill, SG min 0, mspos 176, reads 16761, writes 21 timeouts 0, steps req 1235155 done 169673
Driver 1: pos -1065716, 320.0 steps/mm,standstill, SG min 0, mspos 272, reads 16760, writes 21 timeouts 0, steps req 1235126 done 169644
Driver 2: pos 0, 80.0 steps/mm,standstill, SG min n/a, mspos 8, reads 16782, writes 0 timeouts 0, steps req 0 done 0
Moves scheduled 42040, completed 42040, in progress 0, hiccups 0, step errors 0, maxPrep 66, maxOverdue 35, maxInc 35, mcErrs 0, gcmErrs 0
Peak sync jitter -7/7, peak Rx sync delay 181, resyncs 0/0, no step interrupt scheduled
VIN voltage: min 24.2, current 24.2, max 24.3
V12 voltage: min 12.3, current 12.3, max 12.4
MCU temperature: min 47.0C, current 50.8C, max 51.4C
Last sensors broadcast 0x00000000 found 0 97 ticks ago, 0 ordering errs, loop time 0
CAN messages queued 58031, send timeouts 0, received 123113, lost 0, free buffers 37, min 37, error reg 0
dup 0, oos 0/0/0/0, bm 0, wbm 0, rxMotionDelay 418, adv 26976/74562
M122 B121
Diagnostics for board 121:
Duet TOOL1LC rev 1.1 or later firmware version 3.4.4 (2022-10-14 11:46:33)
Bootloader ID: SAMC21 bootloader version 2.4 (2021-12-10)
All averaging filters OK
Never used RAM 3080, free system stack 45 words
Tasks: Move(notifyWait,0.8%,91) HEAT(notifyWait,0.1%,115) CanAsync(notifyWait,0.0%,65) CanRecv(notifyWait,0.2%,74) CanClock(notifyWait,0.0%,65) ACCEL(notifyWait,0.0%,61) TMC(notifyWait,3.1%,57) MAIN(running,90.8%,441) IDLE(ready,0.0%,26) AIN(delaying,5.0%,142), total 100.0%
Last reset 00:26:20 ago, cause: power up
Last software reset at 2023-06-21 14:12, reason: OutOfMemory, available RAM 8, slot 1
Software reset code 0x01c0 ICSR 0x00000000 SP 0x20005280 Task Move Freestk 155 ok
Stack: 20005390 000062df 200053a8 200074c0 200053a8 00009f99 200052dc 00008d73 be2b253d 47c27f00 47c27f00 be2b253d 000184fe 000184fe 00000000 000184fe 00046115 200074c0 20007700 200052dc 00000001 0000939f 00000001 3dd96ede 3f64d224 47c27f00 488c22a0
Driver 0: pos 2363156, 80.0 steps/mm,standstill, SG min 0, read errors 1, write errors 0, ifcnt 10, reads 64693, writes 10, timeouts 1, DMA errors 0, CC errors 0, failedOp 0x41, steps req 2938952 done 2938952
Moves scheduled 34177, completed 34177, in progress 0, hiccups 0, step errors 0, maxPrep 674, maxOverdue 0, maxInc 0, mcErrs 0, gcmErrs 0
Peak sync jitter 0/5, peak Rx sync delay 222, resyncs 0/0, no step interrupt scheduled
VIN voltage: min 24.6, current 24.6, max 24.8
MCU temperature: min 60.9C, current 62.3C, max 63.0C
Last sensors broadcast 0x00000000 found 0 24 ticks ago, 0 ordering errs, loop time 0
CAN messages queued 12812, send timeouts 0, received 48548, lost 0, free buffers 37, min 36, error reg 0
dup 0, oos 0/0/0/0, bm 0, wbm 0, rxMotionDelay 695, adv 35663/74657
Accelerometer: LIS3DH, status: 00
I2C bus errors 0, naks 3, other errors 0