3.5.0-B4+ 1LC Stack Overflow causing reset
-
@engikeneer said in 3.5.0-B4+ 1LC Stack Overflow causing reset:
Thanks for that, is there anything different about your two tools? Extra fans, heaters? One has Neopixels or anything like that?
I can't see anything obvious when looking at the config.g and the m122 outputs, but the "after" case for T1 does show a smaller amount of free memory and also the following:
CAN messages queued 2954, send timeouts 0, received 2937, lost 1983, free buffers 18, min 0, error reg 0
dup 0, oos 0/0/0/0, bm 0, wbm 0, rxMotionDelay 47454, adv -180786252/188917586
It is hard to be sure if this is part of the problem or simply the result of the board crashing and restarting. Perhaps @dc42 will have some thoughts?
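To compare the CAN counters between reports, a small script can pull them out of the M122 text. This is only a sketch: the field layout is taken from the two CAN lines quoted in this thread, the function and key names are made up for illustration, and reading the `min` value as "minimum free buffers" is an assumption.

```python
import re

# Hypothetical helper: the pattern mirrors the "CAN messages queued ..." line
# as it appears in the M122 reports quoted in this thread.
CAN_PATTERN = re.compile(
    r"CAN messages queued (\d+), send timeouts (\d+), received (\d+), "
    r"lost (\d+), free buffers (\d+), min (\d+)"
)
KEYS = ("queued", "send_timeouts", "received", "lost",
        "free_buffers", "min_free_buffers")


def parse_can_stats(report: str) -> dict:
    """Return the CAN counters from an M122 report, or {} if not found."""
    match = CAN_PATTERN.search(report)
    if match is None:
        return {}
    return dict(zip(KEYS, map(int, match.groups())))


# CAN line from the T1 "after" report quoted above.
after_crash = ("CAN messages queued 2954, send timeouts 0, received 2937, "
               "lost 1983, free buffers 18, min 0, error reg 0")
# CAN line from the later tool board report in this thread.
later_report = ("CAN messages queued 55026, send timeouts 0, received 60157, "
                "lost 0, free buffers 18, min 17, error reg 0")

for label, text in (("after crash", after_crash), ("later", later_report)):
    stats = parse_can_stats(text)
    print(label, "lost:", stats["lost"], "min:", stats["min_free_buffers"])
```

A non-zero `lost` count together with `min 0` is what stands out in the first report; whether that is a cause or a symptom of the reset cannot be told from the numbers alone.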
-
@gloomyandy no, T0 and T1 are the same (it's an E3D TC with Hemeras, 0.4mm V6 etc). The only difference I can think of is that T0 has a hotend grounding resistor wired in, which T1 doesn't, but that has not been an issue in the past...
-
@engikeneer I think what may be happening is that a debug message is being generated, and owing to a change in the way that debug messages are processed the Heat task stack is now too small when this happens. Please try the new tool board firmware at https://www.dropbox.com/scl/fo/sj9kuloenbp6e70asnxue/h?rlkey=upykfuquc574l61deqwo94xr3&dl=0.
-
@dc42 Thanks for providing this. I have tested and it does solve the crashing issue, and explains why there was an issue with the heat task stack... I was getting a heater fault occurring on T1. It may also not have helped that I didn't have a heater-fault.g file. Now I just need to work out why the fault was happening...
-
@engikeneer thanks. A debug message was being generated for some types of heater fault. In previous builds they were suppressed if the build configuration didn't specify a serial debug port, but now they are relayed over CAN to the main board instead.
The amount of free RAM will have decreased because of the increased stack size, so please monitor "Never used RAM" in the M122 report for the tool board.
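One way to keep an eye on these figures is to extract them from saved M122 output. The sketch below assumes the report layout seen in the tool board report quoted in this thread, and assumes that the last number in each `Tasks:` entry is that task's free stack in words. The function name is hypothetical.

```python
import re


def parse_ram_and_stacks(report: str):
    """Return (never_used_ram, {task_name: free_stack}) from M122 text."""
    ram_match = re.search(r"Never used RAM (\d+)", report)
    never_used = int(ram_match.group(1)) if ram_match else None
    # Each task entry is assumed to look like NAME(priority,state,cpu%,free).
    tasks = {
        name: int(free)
        for name, free in re.findall(
            r"(\w+)\(\d+,\w+,[\d.]+%,(\d+)\)", report
        )
    }
    return never_used, tasks


# Excerpt shortened from the tool board report quoted in this thread.
sample = (
    "Never used RAM 2728, free system stack 88 words "
    "Tasks: Move(3,nWait,0.0%,71) HEAT(2,nWait,0.3%,87) "
    "MAIN(1,running,91.8%,324) IDLE(0,ready,0.0%,27), total 100.0%"
)

never_used, stacks = parse_ram_and_stacks(sample)
print("Never used RAM:", never_used)
print("HEAT free stack:", stacks["HEAT"])
```

Running this against reports taken at intervals gives a simple record of whether "Never used RAM" stays steady or keeps shrinking.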
-
@dc42 Good news is that I don't seem to be having any RAM issues. Never used RAM seems to be steady at 2728:
M122 B22
Diagnostics for board 22:
Duet TOOL1LC rev 1.1 or later firmware version 3.5.0-beta.4+ (2023-07-26 12:19:38)
Bootloader ID: SAMC21 bootloader version 2.3 (2021-01-26b1)
All averaging filters OK
Never used RAM 2728, free system stack 88 words
Tasks: Move(3,nWait,0.0%,71) HEAT(2,nWait,0.3%,87) CanAsync(5,nWait,0.0%,54) CanRecv(3,nWait,0.1%,75) CanClock(5,nWait,0.0%,66) ACCEL(3,nWait,0.0%,53) TMC(2,delaying,3.0%,57) MAIN(1,running,91.8%,324) IDLE(0,ready,0.0%,27) AIN(2,delaying,4.8%,114), total 100.0%
Last reset 00:45:46 ago, cause: power up
Last software reset at 2023-07-21 22:45, reason: StackOverflow, available RAM 3864, slot 2
Software reset code 0x0100 ICSR 0x0000000e SP 0x20007f44 Task HEAT Freestk 3644 bad marker
Stack: 200045f8 2000462c 0001c55b 00000000 43520000 000000ab 0001b701 20003bc4 fffffffd f6bd5ddf 00000000 00000002 00000000 00000002 0001c1af 00000000 200019b4 20001958 20001a80 000226ec 20001958 200019b4 00000032 20001ad4 00005f95 00000001 20001b30
Driver 0: pos 0, 410.0 steps/mm, ok, SG min 0, read errors 0, write errors 0, ifcnt 25, reads 61617, writes 25, timeouts 4, DMA errors 0, CC errors 0, failedOp 0x6a, steps req 0 done 205224
Moves scheduled 2427, completed 2427, in progress 0, hiccups 45, segs 41, step errors 0, maxPrep 1361, maxOverdue 60251, maxInc 13299, mcErrs 0, gcmErrs 0, ebfmin -0.72 max 1.00
Peak sync jitter -5/8, peak Rx sync delay 300, resyncs 0/0, no timer interrupt scheduled
VIN voltage: min 24.3, current 24.3, max 24.5
MCU temperature: min 23.2C, current 46.6C, max 46.6C
Last sensors broadcast 0x00000008 found 1 241 ticks ago, 0 ordering errs, loop time 1
CAN messages queued 55026, send timeouts 0, received 60157, lost 0, free buffers 18, min 17, error reg 0
dup 0, oos 0/0/0/0, bm 0, wbm 0, rxMotionDelay 483, adv 35679/74655
Accelerometer: LIS3DH, status: 00
Inductive sensor: not found
I2C bus errors 0, naks 6, other errors 0
One thing I have noticed with the 3.5.0-B4+ build you provided is that I get a false error appearing when homing my printer during a print. I think this was present in the previous B4+ build you provided but can't be sure.
Error: Failed to home axes
I created a print file with only G28 in it, and I get that error (despite all axes homing fine). If I run a G28 normally, then I get no error.
This isn't causing any particular issue, it's just annoying getting the false error messages. My homing files are below. Happy to provide the sub-macros if needed.
Thanks
homeall
; homeall.g
; called to home all axes
;
; generated by RepRapFirmware Configuration Tool v3.2.3 on Sat May 29 2021 21:13:21 GMT+0100 (British Summer Time)
M98 P"homec.g" ; Home C (ToolHead)
M98 P"homea.g" ; Home A (Pebble Wiper)
M98 P"homey.g" ; Home Y
M98 P"homex.g" ; Home X
M98 P"homez.g" ; Home Z
homec
; homec.g
; called to home the C axis
;
;from old
G92 C260
M913 C40 ; C MOTOR TO 40% CURRENT
G1 C-260 F2400 ; drive the C-axis to the stop
M913 C100 ; C MOTOR TO 100% CURRENT
G1 C1 F50000
G92 C0
;Open Coupler
M98 P"/macros/Tool - Unlock"
homea
;called to home axis A
G91 ; relative positioning
G1 H1 A50 F1500 ; move quickly to A axis endstop and stop there (first pass)
G1 A-2 F2000 ; go back a few mm
G1 H1 A50 F360 ; move slowly to A axis endstop once more (second pass)
G90 ; absolute positioning
homex
; homex.g
; called to home the X axis
;
G91 ; use relative positioning
M98 P"/macros/Set Machine Limits"
M400 ; make sure everything has stopped before we make changes
M569 P0.0 D3 V10
M569 P0.1 D3 V10
G4 P100 ; wait 100ms
M17 X Y ; Energize Motors X Y
M913 X50 Y50 ; drop motor currents to 50%
M915 H200 X Y S0 R0 F0 ; set X and Y to sensitivity 0, do nothing when stall, unfiltered
G4 P100
G1 H2 Z3 F5000 ; lift Z 3mm
G1 H2 X1 Y1 F3000 ; energise to avoid false triggers during sensorless homing
M400
G4 P100
G1 H1 X-400 F6000 ; move left 400mm, stopping at the endstop
G1 H1 X2 F2000 ; move away from endstop
G1 H2 Z-3 F1200 ; lower Z
G90 ; back to absolute positioning
M400 ; make sure everything has stopped before we reset the motor currents
M913 X100 Y100 ; motor currents back to 100%
M569 P0.0 D2
M569 P0.1 D2
homey
; homey.g
; called to home the Y axis
;
G91 ; use relative positioning
M98 P"/macros/Set Machine Limits"
M400 ; make sure everything has stopped before we make changes
M569 P0.0 D3 V10
M569 P0.1 D3 V10
G4 P100 ; wait 100ms
M17 X Y ; Energize Motors X Y
M913 X50 Y50 ; drop motor currents to 50%
M915 H200 X Y S0 R0 F0 ; set X and Y to sensitivity 0, do nothing when stall, unfiltered
G4 P100
G1 H2 Z3 F5000 ; lift Z 3mm
G1 H2 X1 Y1 F3000 ; energise to avoid false triggers during sensorless homing
M400
G4 P100
G1 H1 Y-400 F6000 ; move left 400mm, stopping at the endstop
G1 H1 Y2 F2000 ; move away from end
G1 H2 Z-3 F1200 ; lower Z
G90 ; back to absolute positioning
M400 ; make sure everything has stopped before we reset the motor currents
M913 X100 Y100 ; motor currents back to 100%
M569 P0.0 D2
M569 P0.1 D2
homez
; homez.g
; called to home the Z axis
T-1 ;just in case there is a tool coupled, go try to drop it at the dock
M116 H0 S1 ; Wait for bed to reach temp;
G29 S2 ; Disable mesh compensation
G91 ; Relative mode
G1 H2 Z5 F5000 ; Lower the bed
G90 ; back to absolute positioning
G1 X150 Y100 F50000 ; Position the endstop above the bed centre
G91 ; Relative mode
G4 P1000 ; wait 1000msec
G30 ; probe
G1 Z20 F5000 ; Drop the Bed
G90 ; Back to absolute positioning
-
@engikeneer I think it would help if you try manually homing each axis in turn to identify which one is producing that error message.
-
@gloomyandy good thinking - I can confirm it is the Z homing that causes the issue. I homed all axes, then one by one ran print jobs containing just G28 X, G28 Y etc, and it was G28 Z that triggered it. I also ran it after commenting out a few lines in my homez, and I think the one causing the issue is the M116:
; homez.g
; called to home the Z axis
T-1 ;just in case there is a tool coupled, go try to drop it at the dock
M116 H0 S1 ; Wait for bed to reach temp; <<<< THIS ONE
G29 S2 ; Disable mesh compensation
G91 ; Relative mode
G1 H2 Z5 F5000 ; Lower the bed
G90 ; back to absolute positioning
G1 X150 Y100 F50000 ; Position the endstop above the bed centre
G91 ; Relative mode
G4 P1000 ; wait 1000msec
G30 ; probe
G1 Z20 F5000 ; Drop the Bed
G90 ; Back to absolute positioning
-
@engikeneer when you get that error message, is the bed heating up?
Do you also get that error message if you command the bed to heat manually and then command Z homing manually?
-
@dc42 I get the error message whether the bed is off, heating up, or already at temp. I also changed it to a simple M116 (i.e. no H0 S1) and same result.
If I do any of this not in a print (i.e. just manually by hitting homez in DWC, or putting G28 Z in the console), I do not get the error.
-
@engikeneer I believe this is fixed in 3.5.0-rc.1.