Bug: Gcode within trigger is interleaved with later build gcode?
-
@DocTrucker said in Ways to ensure gcode within trigger is uninturupted?:
@tenaja @fcwilt I think I have a gift for saying things in the most confusing way!
What I mean is if I trigger one of the trigger macros with a switch input mid build the Duet (2, on RRF3.3beta2) seems to alternate between running bits of the trigger file and bits of the active build/print process.
Does that make more sense?
What you are asking for is not unusual in the realm of programming in general.
However @dc42 would have to address your issue since he knows what is possible in the world of RRF programming.
I'm going to put forth a guess here that with the current object model and conditional code you might be able to test, in your trigger, if you need to invoke M25/M24.
Frederick
-
@fcwilt yes, in python for example a similar function is performed by the global interpreter lock, or acquiring lock objects in multiprocess systems to for example try and protect from writing to the same file at the same time.
What I'm unsure of at the moment of is if people are thinking there is such a lock already in place, as it's not clear in the gcode documentation that the machine would carry on with a build at the same time as running a trigger, and you could get some very usual effects.
Rather than mods to the M24/M25, perhaps there should be a mod to the creation of the triggers to indicate whether they should run locked or not.
Edit: link to gcode documentation. No mention of machine carrys during a trigger: https://duet3d.dozuki.com/Wiki/Gcode#Section_M581_RepRapFirmware_3_01RC2_and_later
-
@DocTrucker said in Ways to ensure gcode within trigger is uninturupted?:
What I'm unsure of at the moment of is if people are thinking there is such a lock already in place, as it's not clear in the gcode documentation that the machine would carry on with a build at the same time as running a trigger, and you could get some very usual effects.
I have always assumed (dangerous I know) that a trigger could not be interrupted - that wouldn't seem to be wise.
However I have not tried to verify that.
I took a closer look at the object model and I think you can determine if a job is underway.
Frederick
-
@fcwilt said in Ways to ensure gcode within trigger is uninturupted?:
I have always assumed (dangerous I know) that a trigger could not be interrupted - that wouldn't seem to be wise.
To be fair the lock knowledge is something you can't expect Joe Bloggs to have, even if they are competent engineers who simply haven't programmed before. I think this behaviour may be an oversight or it should be clearly warned of in the gcode docs referencing the triggers.
Edit: I've added a warning note to the duet gcode wiki. This will need to be amended if it is indeed a bug.
-
Is this a confirmed bug in 3.3beta2?
-
@dc42 @droftarts @T3P3Tony : As per last message please can one of you please confirm this bug. Check out the video and files for test 4 to show the worst case of interleaving contents of the trigger file and build file from after the trigger event.
Just done a sequence of tests this morning and stored all the relevant config and macro files into zip bundles.
Build file used for these tests: test.g
Test 1:
Zip bundle: https://www.dropbox.com/s/x9g0fun3j2peru9/01_Test_Default.zip?dl=0
Video: https://youtu.be/toY8ejvBPfk
The trigger file is bookended with M300 peep commands with the second command being a M25 pause and penultimate command being a M24 resume. The contents of the pause and resume files have been commente4d out to stop excess machine movements which would cause print issues, but may still be required on a user pause.
Notes:- Only the final M300 gets played.
- Warnings about pausing when process isn't running.
Test 2:
Zip bundle: https://www.dropbox.com/s/oukt6y3jcxoxrbx/02_Test_NoM300.zip?dl=0
Video: https://youtu.be/NNXh6h0w56o
As above with no M300 commands outside of the M25&M24.
Notes:- As above.
Test 3
Zip bundle: https://www.dropbox.com/s/uce3y0yq4tvwdoj/03_Test_ConditionalGcodeMetaVariables.zip?dl=0
Video: https://youtu.be/38RlGprdzHA
This time I wrap the pause and resume commands up in conditional gcode to ensure a pause is only issued if the process is running, and a resume only issues if the trigger issued a pause command.
Notes:- Errors raised from triggering the macros out side of a build process don't occur.
- A trigger occured very near the instant the build finished. This caused the conditional gcode to request a pause, but the system to be idle by the time it was issued. This caused a warning about issuing pauses while the system is idle, but further demonstrated the way the firmware seems to process the contents of trigger files in parallel to the main gcode stream.
Test 4
Zip bundle: https://www.dropbox.com/s/uhifjyx8qfaht03/04_Test_NoPause.zip?dl=0
Video: https://youtu.be/0OzrCobDU3k
No pauses at all, and no conditional gcode. This one was bound to be messy and doesn't disappoint!
Notes:- Commands from trigger file clearly interleaved with the main gcode stream.
Test 5
Zip bundle: https://www.dropbox.com/s/3kvnlt7zvywc80d/05_Test_M600.zip?dl=0
Video: https://youtu.be/Zp0tcaPKgi4
Using M600 instead of M25. I liked the potential to avoid running the pause.g, but there is no corresponding command to resume without processing resume.g, and so it is a little redundant in this use case. Additionally the line in the GCode wiki that reads something like; "execution of all prior gcode commands in the same input stream is completed first" was concerning, as it implied the action of the trigger file would be delayed. For this test the contents of the pause and resume macros were re-instated, and a filament-change.g file added containing no commands.
Command terminal output on DWC:25/03/2021, 09:58:55 Finished printing file 0:/gcodes/test.g, print time was 0h 1m 25/03/2021, 09:58:46 Printing resumed 25/03/2021, 09:58:31 Resume state saved Error: Bad command: 7.473 E1.000 F250 Printing paused for filament change at X56.1 Y-23.9 Z0.0 B0.0 25/03/2021, 09:58:17 Printing resumed 25/03/2021, 09:58:15 Resume state saved Error: Bad command: 1 X15.000 Y-60.000 E1.000 F250 Printing paused for filament change at X19.0 Y-60.0 Z0.0 B0.0 25/03/2021, 09:58:11 Printing resumed 25/03/2021, 09:57:56 Resume state saved Printing paused for filament change at X13.0 Y-60.0 Z0.0 B0.0 25/03/2021, 09:57:36 Printing resumed 25/03/2021, 09:57:22 Resume state saved Error: Bad command: E1.000 F250 Printing paused for filament change at X-47.3 Y-32.7 Z0.0 B0.0 25/03/2021, 09:57:02 Printing resumed 25/03/2021, 09:56:47 Resume state saved Error: Bad command: 1 X-46.163 Y33.837 E1.000 F250 Printing paused for filament change at X-49.0 Y31.0 Z0.0 B0.0 25/03/2021, 09:56:28 Printing resumed 25/03/2021, 09:56:13 Resume state saved Error: Bad command: 00 Y60.000 E1.000 F250 Printing paused for filament change at X11.0 Y60.0 Z0.0 B0.0 25/03/2021, 09:55:52 M32 "0:/gcodes/test.g" File 0:/gcodes/test.g selected for printing
Notes:
- Resume file clearly executed.
- Resume file executed after the trigger macro has resumed the main build process (with M24) AND after some additional commands from the build file.
- Something weird going on that causes reading the main build input stream to be broken. with many bad command errors comming up on the command line despite having the same build file as before.
I have edited this post many times to add in video links, links to the zip files (not allowed to upload zip) and correct many typos!
-
@DocTrucker Thanks for the info, I'm sure @dc42 will get to it. Can you be clear what firmware version you are using for all testing? You mention 3.3b2, but it's not clear that's what you are using. If you are, this thread would be better in 'Beta firmware' rather than 'firmware wishlist'.
Ian
-
@droftarts The thread was moved after I created it before the case for this being a bug became stronger.
Yes 3.3beta2 on a duet 2.
-
@DocTrucker Moved.
Ian
-
@droftarts Thanks Ian. Not expecting these issues to get fixed ASAP, but being able to reference the reporting of the bug in the report to my client will maintain a strong case for continuing to use RRF over other options which may not be initially as user friendly. It also makes sure I haven't just completely miss understood something like missing the 'var.' in a recent post of mine!
I can't think of a strong case for the triggers being processed in parrallel to the gcode stream like this, it is just too counter intuitive for non-coders. I think default should be contents of a trigger run in one lump (excluding actions from commands like M600 that allow the machine to carry on a little further before pausing), but perhaps a argument could be used in the formation of the trigger to allow it to be processed in parallel.
Edit: Just trying some similar tests with swapping tools rather than dynamic reassignment of drives to investigate if it is dynamic use of M584 that is causing the problem. Long story short it is still misbehaving. I've run out of test time now but will try to share the results, files, videos of that too sometime next week.
-
Yesterday I tried re-arranging the code so that the triggers executed a tool change rather than a dynamic tool change. This isn't my preferred way of doing this as I can't seem to command independent extrusion axis moves when one 'extruder' is on T0, and the other on T1. Another barrier against this method is the lack of being able to drive an extrusion axis to a limit switch. While I appreciate this wouldn't be used by many it seems odd that it's not there as this either suggests the axes aren't sharing a base module of functions, or that the home function on the E axis has been coded out. For me the solution is conditional gcode and defining what was limit switches as inputs 2 & 3.
So using the conditional gcode to run a home routine looks like:
T0 var attempts = 0 while var.attempts < 1000 if sensors.gpIn[2].value != 1 G1 E-0.1 F400 set var.attempts = var.attempts + 1 G1 E5 F800
But this has drawbacks:
- Sounds rough. This suggests that the motion controller is executing a start/stop on each G1 move and running the check when stopped.
- Slower to respond. Not sure why this is but you can see the syringes driving had onto the limit switch. How can this be queuing gcode and still checking after each 0.1mm? You can hear the switch click in the videos.
So the main objective of running these tests was to demonstrate that the use of dynamic drive re-assignment was not upsetting an interlock that normally protects the contents of triggers.
Spoiler: tool change behaves in a similarly chaotic manner. This does however have implications for conditional gcode. If this overlapping of build and macros occurs in other areas to then whatever condition you are checking for - in for example an if statement - could have changed before the indented statement code executes. I think this is what is occurring when the conditional gcode homing routine is executed that fails at the end of test 2.
Test 1:
Using tool changes instead of dynamic drive re-assignment. Using M24/M25 and M300 peeps to indicate when the system is processing the triggers.
Largely behaved as before aside from one glitch which may be a false trigger on the switch input.
Video: https://youtu.be/dTjPfLtTX9I
System directory zip: https://www.dropbox.com/s/fd2imu1so6vzcbp/210329_01_Pauses.zip?dl=0Test 2:
As above but without the pauses. Things go wonky very quickly·
Video: https://youtu.be/FUZJCz39Iro
System directory zip: https://www.dropbox.com/s/ynokr33cyow68fb/210329_02_NoPauses.zip?dl=0 -
@fcwilt said in Bug: Gcode within trigger is interleaved with later build gcode?:
@tenaja said in Ways to ensure gcode within trigger is uninturupted?:
You will note that at the "top", I disable the triggers, then run another program, then re-enable the triggers.
From the M581 docs:
Polling for further trigger conditions is suspended until the trigger macro file has been completed.
Did you verify that your trigger could be re-triggered while is was executing?
Frederick
I am not sure if you got your answer to this, but the word "suspended" in the sentence above could be rewritten "queued for later execution". That is why I disabled them--if you press the button several times during a macro execution, it stores up the number of button presses, and runs them upon completion. This was catastrophic in my application.
-
@tenaja interesting. I hadn't read it like that. The errors I saw were definately commands from the main build file getting interleaved with the gcode contained in the trigger file and any associated pause or resume scripts.
As of yet no one from duet3D has even confirmed if this is a bug yet...
-
@tenaja said in Bug: Gcode within trigger is interleaved with later build gcode?:
if you press the button several times during a macro execution, it stores up the number of button presses, and runs them upon completion.
I asked @dc42 lately, if the triggers needed to be 'debounced' but he said, that the first trigger would lock the input until the macro is finished. Hard to believe, they count up while the macro is running? (another bug?)
-
@DocTrucker
I call such locking mechanism 'semaphores'. It could be as simple as a boolean variable like 'extruder(0).available'. If your trigger macro needs exclusive access to extruder(0) it would set the variable to false and the print queue would stop (unless it contains commands not related to the locked resources) until the macro is finished and 'available' is true again.
M116 is a good example for an existing 'blocker'. Maybe it can be used for other things than temperatures?
(eg. M116 E0.endstop)
I'm not familiar with all the object variables, I made up the variable name above. But that way, it could be implemented, -
@o_lampe yeah, semaphores, interlocks, interpretor locks. Different languages have their pet names for them. I've gone so far as manually coding them using the existance of files in the file system to sycronise multiple processes. Multithread and multiprocess issues easily gave me the biggest headaches when I was doing lots of code.
-
@o_lampe funny that, I was thinking about de-bouncing. I use makerbot style mechanical limits for simplicity, but strip all the surface mounts off to be able to use the normally closed part if the switch. I had considered taking a closer look to see if there was an electronic implementation of debounce with the cap/resistor and if I could be more selective in thw removal of parts and still have that.
That said I'm not sure if that us an issue or not. I will run one of the tests with disabling the interlock as the first, and resuming it as the last action at some point.
-
@dc42 @droftarts @T3P3Tony @Phaedrux Can one of you confirm if the interleaving of trigger/build code is a bug please? Not had any official confirmation in that yet.
-
@DocTrucker, it's normal for commands from multiple GCode streams to be interleaved. However, if the movement system is locked because of a command in a macro file (including a trigger), it will remain locked until either the macro completes, or user input is requested via M291. So if you include M400 near the start of your macro, then the machine will compete any moves already queued and then run the rest of the macro without any movement commands from the file being printed.
I will look into adding a parameter to M24 and M25 to allow the pause and resume macro files to be skipped.
-
@dc42 Thanks for the conformation. Appears I burnt a weeks work there needlessly. Look forward to the skip pause/resume gcode flags.
Tried M400. It is not suitable as it causes a large overrun on the limit switch placing too much reliance on the gcode pre-parse which I like to keep largely machine independent where possible.