Resuming after power failure - strange behavior arrays

modl

Hi everyone

So I had a power failure in the middle of a print today. I was a bit anxious because I had recently changed some of the code that handles that. But everything went extremely well at first: it did the homing (with Z max homing) and the extruder cleaning moves, retraction, unretraction, then a few print moves in the right place for a couple of seconds, at the right height, with the right extruder, etc. Then all of a sudden it started a homing routine, like X, Y, Z (min). I didn't realize what was happening fast enough, so it crashed into the print...

Running in standalone mode, recently updated to RRF 3.5 RC3 in order to be able to use arrays.

Edit: I checked the resurrect file and it looks good. I went to the point in the job gcode where it restarted, and there is no G28 or homeall.g being called. It was right in the middle of a long layer.
Edit 2: The print was actually paused and parked when the power was cut.
Edit 3:
Here is one thing running in daemon.g. The fact that the problem happened a few seconds into resuming the print tells me the cause might be here:

if state.status == "processing"
	if !exists(global.fileTimeLeft)
		global fileTimeLeft = 1
	if !exists(global.fileTimeLeftarray)
		global fileTimeLeftarray = {0,0,0,0}
	if !exists(global.counter1)
		global counter1 = 0
	if !exists(global.counter2)
		global counter2 = 0
	if global.counter2 == 0           
		if job.timesLeft.file != null
			;if job.timesLeft.file < global.fileTimeLeftarray[global.counter1] || global.fileTimeLeftarray[global.counter1] == 0                  ; keep the smallest estimate (remember to indent the next line)
			set global.fileTimeLeftarray[global.counter1] = job.timesLeft.file
			if global.fileTimeLeftarray[0] != 0 & global.fileTimeLeftarray[1] != 0 & global.fileTimeLeftarray[2] != 0 & global.fileTimeLeftarray[3] != 0
				set global.fileTimeLeft = (global.fileTimeLeftarray[0] + global.fileTimeLeftarray[1] + global.fileTimeLeftarray[2] + global.fileTimeLeftarray[3]) / 4
			else 
				set global.fileTimeLeft = job.timesLeft.file
			set global.counter1 = global.counter1 + 1
			if global.counter1 > 3			 ; cycle through array
				set global.counter1 = 0
	set global.counter2 = global.counter2 + 1
	if global.counter2 > 2 			 ; run only about every 30 seconds (3 daemon iterations); the larger the value, the wider the averaging window
		set global.counter2 = 0
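
For readers following the logic, the daemon.g block above amounts to a 4-slot ring buffer sampled on every third pass. The Python below is only a model of that posted logic, not RRF code and not a diagnosis of the fault; the class name `TimeLeftSmoother` and its parameters are hypothetical names chosen for illustration.

```python
class TimeLeftSmoother:
    """Model of the posted daemon.g logic: a small ring buffer of estimates."""

    def __init__(self, slots=4, every=3):
        self.samples = [0] * slots  # plays the role of global.fileTimeLeftarray
        self.index = 0              # plays the role of global.counter1
        self.tick = 0               # plays the role of global.counter2
        self.every = every          # sample on every `every`-th call
        self.value = 1              # plays the role of global.fileTimeLeft

    def update(self, time_left):
        """Call once per daemon pass; time_left may be None (null in RRF)."""
        if self.tick == 0 and time_left is not None:
            self.samples[self.index] = time_left
            if all(s != 0 for s in self.samples):
                # Every slot is filled: report the mean of the stored estimates.
                self.value = sum(self.samples) / len(self.samples)
            else:
                # Buffer not full yet: report the latest estimate as-is.
                self.value = time_left
            self.index = (self.index + 1) % len(self.samples)
        self.tick = (self.tick + 1) % self.every
        return self.value
```

With the default values, the reported figure only becomes an average once four samples have been stored, which takes at least ten passes of the loop.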

;resurrect-prologue.g
if !exists(global.endTime)
	M98 P"0:/macros/resetglobals.g"
if !exists(global.layerNumber)
	global layerNumber = 0
if !exists(global.totalLayerCount)
	global totalLayerCount = 10000
M98 P"0:/macros/layernumberglobal.g"
M98 P"0:/macros/totallayercountglobal.g"


M116 ; wait for temperatures
G28 X Y ; home X and Y
M98 P"0:/sys/homezmax.g"    ; home Z using max endstop

G91
G1 H2 Z-10

T{abs(state.currentTool - 1)}       ; select inactive tool for priming
M83 ; relative extrusion
G90 ; absolute positioning
G1 X300 
G1 X350 E20 F600 ; undo the retraction that was done in the M911 power fail script
G1 Y10.5
G1 E-10         ; retract
G1 X300

T{abs(state.currentTool - 1)}  ; reselect active tool
G1 X350 E20 F600
G1 Y11
G1 E-10      ; retract
G1 X300
G1 E10        ; unretracting here for now because can't do it at the end of resurrect

Edit 4: I tried keeping only a small part of daemon.g, but got the same outcome. "test processing" was echoed in the console:

if state.status == "processing"
    echo "test processing"
    if !exists(global.fileTimeLeft)
        global fileTimeLeft = 1

Edit 5: If I assume G28 is being called for whatever reason, should I try something like this: https://forum.duet3d.com/topic/27054/method-to-disable-homing-during-print ?

Edit 6: So I added a global to resurrect-prologue.g after homing:
global alreadyHomed = 1
and, at the beginning of homeall.g:

echo "Homing XYZ..."
if global.alreadyHomed = 1
    M99

This time homing didn't occur, but the print still stopped after a couple of seconds of printing. Strangely, the echo "Homing XYZ..." didn't show anything in the console, but the M99 did the trick.

Any idea what could have happened?

Thank you in advance

modl

Update: I tried different things, like setting aside my main macros and clearing daemon.g. Nothing changed; I keep getting kicked out of the print after a couple of seconds.

I'm trying to think this through, and maybe my main question is: during a print, what would cause it to be cancelled AND execute G28?
I've had prints cancelled because of errors in macros, but that didn't cause the printer to home all axes, and it was logged in eventlog.txt.

modl

Hello everyone. I did some new testing, and what is actually happening is that the job restarts at line 0 after a few moves.
I added some instructions at the beginning of the gcode file:

G4 S1
echo "Print job (re)started..."
G4 S1

and edited M26 S... accordingly, finding the new offset with Notepad++.

And indeed, the console echoed that the print job had restarted.
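
For anyone repeating this test: since the M26 S value is a byte position in the file, prepending lines should shift it by exactly the number of bytes added. A minimal Python sketch of that arithmetic follows, assuming UTF-8 text and a known line ending. The helper names and the example offset are hypothetical, not values from my files.

```python
def prepended_byte_count(lines, newline="\n"):
    """Bytes added to a file when `lines` are inserted at the very top."""
    return sum(len((line + newline).encode("utf-8")) for line in lines)


def shifted_offset(old_offset, lines, newline="\n"):
    """Byte offset of the same content after `lines` have been prepended."""
    return old_offset + prepended_byte_count(lines, newline)


if __name__ == "__main__":
    added = ["G4 S1", 'echo "Print job (re)started..."', "G4 S1"]
    # 123456 is a made-up offset used purely for illustration.
    print(shifted_offset(123456, added))
    # Pass newline="\r\n" if the file was saved with Windows line endings.
    print(shifted_offset(123456, added, newline="\r\n"))
```

The line-ending choice matters: each CRLF line adds one more byte than an LF line, so a wrong assumption leaves the offset off by a few bytes.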

N.B. I had forgotten to update DWC to version 3.5.0-rc.3 and was still on 3.4.6. The update is done now, but the situation hasn't changed.

N.B. 2: The gcode file is 1.1 GB (I know...). Could this be the issue (RAM not being happy with that, or something)?

modl

M122 output, sorry I'm not used to using it

M122
=== Diagnostics ===
RepRapFirmware for Duet 3 MB6HC version 3.5.0-rc.3 (2024-01-24 17:58:49) running on Duet 3 MB6HC v1.01 (standalone mode)
Board ID: 08DJM-9P63L-DJ3S0-7JTD2-3S86S-KAJ78
Used output buffers: 3 of 40 (29 max)
=== RTOS ===
Static ram: 155184
Dynamic ram: 123092 of which 208 recycled
Never used RAM 64388, free system stack 136 words
Tasks: NETWORK(1,ready,34.0%,158) ETHERNET(5,nWait 7,0.9%,117) HEAT(3,nWait 6,0.0%,323) Move(4,nWait 6,0.0%,239) CanReceiv(6,nWait 1,0.0%,940) CanSender(5,nWait 7,0.0%,334) CanClock(7,delaying,0.0%,334) TMC(4,nWait 6,8.8%,54) MAIN(1,running,56.2%,444) IDLE(0,ready,0.0%,30), total 100.0%
Owned mutexes:
=== Platform ===
Last reset 00:59:08 ago, cause: software
Last software reset at 2024-02-11 15:58, reason: User, Gcodes spinning, available RAM 64484, slot 1
Software reset code 0x0003 HFSR 0x00000000 CFSR 0x00000000 ICSR 0x0044a000 BFAR 0x00000000 SP 0x00000000 Task MAIN Freestk 0 n/a
Error status: 0x00
MCU temperature: min 26.6, current 29.3, max 30.2
Supply voltage: min 23.5, current 23.8, max 24.0, under voltage events: 0, over voltage events: 0, power good: yes
12V rail voltage: min 12.0, current 12.1, max 12.2, under voltage events: 0
Heap OK, handles allocated/used 99/7, heap memory allocated/used/recyclable 2048/1304/1196, gc cycles 0
Events: 0 queued, 0 completed
Driver 0: standstill, SG min 0, mspos 728, reads 25278, writes 31 timeouts 0
Driver 1: standstill, SG min 0, mspos 344, reads 25286, writes 23 timeouts 0
Driver 2: standstill, SG min 0, mspos 120, reads 25270, writes 39 timeouts 0
Driver 3: standstill, SG min 0, mspos 936, reads 25270, writes 39 timeouts 0
Driver 4: standstill, SG min 0, mspos 152, reads 25270, writes 39 timeouts 0
Driver 5: standstill, SG min 0, mspos 904, reads 25278, writes 31 timeouts 0
Date/time: 2024-02-11 16:57:35
Slowest loop: 537.01ms; fastest: 0.06ms
=== Storage ===
Free file entries: 19
SD card 0 detected, interface speed: 25.0MBytes/sec
SD card longest read time 2.3ms, write time 42.3ms, max retries 0
=== Move ===
DMs created 125, segments created 11, maxWait 1408958ms, bed compensation in use: mesh, height map offset 0.000, max steps late 1, ebfmin -0.74, ebfmax 1.00
no step interrupt scheduled
Moves shaped first try 0, on retry 0, too short 0, wrong shape 0, maybepossible 0
=== DDARing 0 ===
Scheduled moves 61, completed 61, hiccups 0, stepErrors 0, LaErrors 0, Underruns [0, 0, 0], CDDA state -1
=== DDARing 1 ===
Scheduled moves 0, completed 0, hiccups 0, stepErrors 0, LaErrors 0, Underruns [0, 0, 0], CDDA state -1
=== Heat ===
Bed heaters 0 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1, chamber heaters -1 -1 -1 -1, ordering errs 0
=== GCodes ===
Movement locks held by null, null
HTTP is idle in state(s) 0
Telnet is idle in state(s) 0
File is idle in state(s) 0
USB is idle in state(s) 0
Aux is idle in state(s) 0
Trigger is idle in state(s) 0
Queue is idle in state(s) 0
LCD is idle in state(s) 0
SBC is idle in state(s) 0
Daemon is idle in state(s) 0
Aux2 is idle in state(s) 0
Autopause is idle in state(s) 0
File2 is idle in state(s) 0
Queue2 is idle in state(s) 0
Q0 segments left 0, axes/extruders owned 0x80000007
Code queue 0 is empty
Q1 segments left 0, axes/extruders owned 0x0000000
Code queue 1 is empty
=== Filament sensors ===
check 98988 clear 37816885
Extruder 0: pos 2168.44, errs: frame 0 parity 0 ovrun 0 pol 0 ovdue 0
Extruder 1: pos 2304.49, errs: frame 0 parity 0 ovrun 0 pol 0 ovdue 0
=== CAN ===
Messages queued 31934, received 0, lost 0, errs 16816853, boc 0
Longest wait 0ms for reply type 0, peak Tx sync delay 0, free buffers 50 (min 50), ts 17742/0/0
Tx timeouts 0,0,17741,0,0,14191 last cancelled message type 30 dest 127
=== Network ===
Slowest loop: 555.96ms; fastest: 0.03ms
Responder states: MQTT(0) HTTP(0) HTTP(0) HTTP(0) HTTP(0) HTTP(0) HTTP(0) FTP(1) Telnet(0) Telnet(0)
HTTP sessions: 1 of 8
= Ethernet =
Interface state: active
Error counts: 0 0 0 1 0 0
Socket states: 2 2 2 2 2 3 0 0
=== Multicast handler ===
Responder is inactive, messages received 0, responses 0

The last software reset was me updating config.g

modl

UPDATE: I SORT OF GOT IT TO WORK!
So what I did was split my gcode into 2 halves of about 550 MB each. At the end of the first part I call a macro, printpart2.g,
which contains:

M23 0:/gcodes/foo_part2.gcode
M24

Will it work? For example, will the extruders stay heated, and will any other macros be called when M24 starts a new file?

It's not ideal but I'll go with that for now
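
In case it helps someone with a similarly large file, here is a rough Python sketch of one way to do the split without loading the file into memory. It cuts at the end of the line that contains the midpoint byte. The function name and file paths are hypothetical, and it knows nothing about gcode, so whether the cut lands at a safe place in the print still has to be checked by hand.

```python
import os


def split_at_line_boundary(src, part1, part2, chunk_size=1024 * 1024):
    """Copy src into part1 and part2, cutting after the line that holds the midpoint byte.

    Returns the byte position of the cut.
    """
    half = os.path.getsize(src) // 2
    with open(src, "rb") as f:
        f.seek(half)
        f.readline()          # move to the end of the line containing the midpoint
        cut = f.tell()
        f.seek(0)
        with open(part1, "wb") as out:
            remaining = cut
            while remaining > 0:
                block = f.read(min(chunk_size, remaining))
                if not block:
                    break
                out.write(block)
                remaining -= len(block)
        with open(part2, "wb") as out:
            while True:
                block = f.read(chunk_size)
                if not block:
                    break
                out.write(block)
    return cut
```

Called as `split_at_line_boundary("foo.gcode", "foo_part1.gcode", "foo_part2.gcode")`, it leaves the original untouched; the call to the second-part macro would then still need to be appended to the first part.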

Thank you in advance

modl

UPDATE 2: It failed again after 10 minutes of printing, and now DWC is going crazy and keeps disconnecting, to the point that I can't interact with the machine. Also, the print moves were very slow: the requested speed was 50 mm/s but the actual speed was in the 20s. Speed factor was set to 100%. I don't have physical access to the machine right now, so I'll leave it for now and see if shutting it down does something. Then I'll think about downgrading to 3.4.6 and leaving arrays aside for now.

modl

Hello again,

Reverting to 3.4.6 fixed the issue. I can now resume the print normally, even with the largest file.

T3P3Tony

@modl thanks for the report, this is logged as a bug here:
https://github.com/Duet3D/RepRapFirmware/issues/957

dc42

@modl please test this again using 3.5.1 release firmware. I think a change we made at 3.5 rc4 will have fixed this.

T3P3Tony

@T3P3Tony closing this issue, please comment once tested.