Poor print quality with RRF3 - especially 3.2.2.
-
@deckingman, I'm very sorry to hear that your machine was damaged. I'm glad the damage isn't too severe.
You reported that the expansion board firmware version didn't appear to have changed, so I checked that first. When I attempted to upload those same firmware binaries to my test system, M115 reported a different firmware version from yours. So I checked what version it should have reported, and that was different again: "Duet EXP3HC firmware version 3.3beta1+1 (2021-03-02 15:56:53)".
It turned out that although DWC uploads firmware binaries to /firmware, RRF was still fetching binaries from /sys when upgrading expansion board firmware. So the 3.3beta1 expansion board firmware binaries in /sys were being re-installed instead of the newly-uploaded ones. That explains the version number not changing. I will of course fix this in the next beta.
However, the changes between 3.3beta1 and the later expansion board firmware are minor. I have just gone through the commit logs to verify that there have been no critical changes. In particular, the CAN protocols have not changed. So this doesn't explain why your machine crashed.
I next turned to the M122 logs that you posted. Thanks for having the presence of mind to pause the print and take a set of M122 readings before resetting. I was rather expecting to find that board 3 had reset, which would explain some missing X moves. However, in the "Console dump after pause" log, the last reset times of the boards read as follows (ignoring the second M122 B3 after you cancelled the print):
0: Last reset 01:19:09 ago, cause: power up/Last software reset at 2021-03-07 11:22, reason: User
1: Last reset 01:19:18 ago, cause: power up
2: Last reset 01:19:25 ago, cause: power up
3: Last reset 01:19:32 ago, cause: power up
So all four boards were reset at the same time.
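The "same time" conclusion can be checked by converting each "Last reset ... ago" figure to seconds. This is a minimal Python sketch using only the four values quoted above; the helper name `uptime_seconds` is made up for illustration, and the few seconds of difference between boards is assumed to come from the M122 commands being sent one after another rather than from separate resets.

```python
import re

# "Last reset" text copied from the console dump above, keyed by board number.
REPORTS = {
    0: "Last reset 01:19:09 ago, cause: power up",
    1: "Last reset 01:19:18 ago, cause: power up",
    2: "Last reset 01:19:25 ago, cause: power up",
    3: "Last reset 01:19:32 ago, cause: power up",
}


def uptime_seconds(line):
    """Return the hh:mm:ss uptime found in a 'Last reset ... ago' line, in seconds."""
    match = re.search(r"Last reset (\d+):(\d+):(\d+) ago", line)
    if match is None:
        raise ValueError(f"no reset time found in: {line!r}")
    hours, minutes, seconds = (int(group) for group in match.groups())
    return hours * 3600 + minutes * 60 + seconds


uptimes = {board: uptime_seconds(text) for board, text in REPORTS.items()}
spread = max(uptimes.values()) - min(uptimes.values())

for board, seconds in uptimes.items():
    print(f"board {board}: up for {seconds} s")
print(f"spread between boards: {spread} s")
```

A board that had reset mid-print would show an uptime far shorter than the others, not one that differs by a few seconds.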
Looking in more detail at those logs, I found a couple of interesting parts:
- The M122 B3 report when paused shows the following:
Driver 0: position -1182960, 1600.0 steps/mm, standstill, reads 45627, writes 0 timeouts 0, SG min/max 14/326, steps req 4320 done 4320
Driver 1: position 3200, 80.0 steps/mm, ok, reads 45627, writes 0 timeouts 0, SG min/max 0/1019, steps req 1418006 done 1419078
Driver 2: position -145076, 80.0 steps/mm, ok, reads 45626, writes 0 timeouts 0, SG min/max 0/373, steps req 1426723 done 1427794
Moves scheduled 16699, completed 16698, in progress 1, hiccups 0, step errors 0, maxPrep 85, maxOverdue 5, maxInc 2, mcErrs 0, gcmErrs 0
It's reporting 1 move in progress, yet the number of steps done on drivers 1 and 2 is greater than the number of steps requested. However, I think this may be caused by running the previous M122 when the printer was not paused or otherwise idle, so that steps from moves that were in the queue when the previous M122 was run have been included in the steps-done count.
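The difference between steps requested and steps done can be pulled out of the driver lines directly. The following Python sketch parses the three lines quoted above; the regular expression and the function name `step_excess` are illustrative only and assume the line layout shown in this particular report.

```python
import re

# Driver lines copied from the M122 B3 report above.
REPORT = """\
Driver 0: position -1182960, 1600.0 steps/mm, standstill, reads 45627, writes 0 timeouts 0, SG min/max 14/326, steps req 4320 done 4320
Driver 1: position 3200, 80.0 steps/mm, ok, reads 45627, writes 0 timeouts 0, SG min/max 0/1019, steps req 1418006 done 1419078
Driver 2: position -145076, 80.0 steps/mm, ok, reads 45626, writes 0 timeouts 0, SG min/max 0/373, steps req 1426723 done 1427794
"""

DRIVER_LINE = re.compile(r"Driver (\d+):.*steps req (\d+) done (\d+)")


def step_excess(report):
    """Map driver number to (steps done - steps requested)."""
    excess = {}
    for line in report.splitlines():
        match = DRIVER_LINE.search(line)
        if match:
            driver, requested, done = (int(group) for group in match.groups())
            excess[driver] = done - requested
    return excess


excess = step_excess(REPORT)
for driver, extra in excess.items():
    print(f"driver {driver}: done - requested = {extra}")
```

For these lines the excess is 0 on driver 0, 1072 on driver 1 and 1071 on driver 2. At 80 steps/mm that is roughly 13.4 mm of motor travel on each of the two drivers.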
- The M122 report for the main board shows this:
Tasks: NETWORK(ready,224) ETHERNET(notifyWait,124) HEAT(delaying,284) CanReceiv(notifyWait,795) CanSender(notifyWait,359) CanClock(delaying,349) TMC(notifyWait,18) MAIN(running,924) IDLE(ready,20)
The numbers are the remaining stack space. I have never seen the TMC stack space go as low as that. It sometimes happens that the actual stack space used is greater than the reported amount due to the compiler allocating stack but not using the bit that is monitored. So it's possible that the TMC task stack is overflowing. There is no other evidence to suggest this, however I will increase the stack size as a precaution.
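To compare the remaining stack figures at a glance, the Tasks line can be split into name, state and remaining stack. This is a small Python sketch that assumes the `NAME(state,number)` layout seen in the line above; the function name `stack_margins` is hypothetical, and no threshold for "too low" is implied.

```python
import re

# Tasks line copied from the main board M122 report above.
TASKS = (
    "Tasks: NETWORK(ready,224) ETHERNET(notifyWait,124) HEAT(delaying,284) "
    "CanReceiv(notifyWait,795) CanSender(notifyWait,359) CanClock(delaying,349) "
    "TMC(notifyWait,18) MAIN(running,924) IDLE(ready,20)"
)


def stack_margins(line):
    """Map each task name to its reported remaining stack space."""
    entries = re.findall(r"(\w+)\((\w+),(\d+)\)", line)
    return {name: int(free) for name, _state, free in entries}


margins = stack_margins(TASKS)

# List tasks from the smallest remaining stack to the largest.
for name, free in sorted(margins.items(), key=lambda item: item[1]):
    print(f"{name:10s} {free}")
```

Sorted this way, TMC (18) and IDLE (20) appear at the top of the list, well below every other task.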
There are no indications in the reports of lost CAN messages: no send timeouts in the main board M122, and no 'oos' count in the M122 B3.
I plan to proceed as follows:
- Review the changes between the beta1 main board firmware and the version I provided on Dropbox, and the CAN transmit fifo driver that the new firmware uses.
- I already planned to add a 3HC board to my tool changer and use it to drive the X and Y axes, so that I have a machine (not just a bench setup) that uses a 3HC to drive axes. I will do that and then try your print.
Three questions for you:
- Is the amount of X shift in the photo you posted consistent with the amount of shift being the length of the box it was printing? Or do you think it may have been more?
- Have you already shared that GCode file, and if so, where?
- You said that you were unable to download the console and you had to copy-and-paste it. Do you mean that you tried to click on the list icon at the top right of the console (to get the "Download as text" option), but it didn't respond?
-
@oliof said in Poor print quality with RRF3 - especially 3.2.2.:
@o_lampe still looking to see whether PA routine is skipped if value is set to 0.0 or not. I am not that familiar with the RRF code yet, as I have mainly messed around in contained places (Kinematics) so far.
RRF does not distinguish between PA never having been set, and being set to 0.0.
-
@dc42 said in Poor print quality with RRF3 - especially 3.2.2.:
Three questions for you:
- Is the amount of X shift in the photo you posted consistent with the amount of shift being the length of the box it was printing? Or do you think it may have been more?
- Have you already shared that GCode file, and if so, where?
- You said that you were unable to download the console and you had to copy-and-paste it. Do you mean that you tried to click on the list icon at the top right of the console (to get the "Download as text" option), but it didn't respond?
#1 The print is actually a box plus lid. The lid is only 3mm thick (tall) and an unremarkable rectangle, so I haven't included any pics of that before. But at the point of failure, the machine had long finished the lid and was printing just the box. That box is a tad over 70mm wide in X. From the witness mark on the bed, I can see that the right hand edge of that box is about 110mm from the right hand edge of the build plate. So for the head to crash into the frame and try to repeatedly beat the hell out of it, means that there would have to have been a shift of >110 mm - significantly larger than the width of the part it was printing.
#2. No, I hadn't shared the file. I have now uploaded it to the folder on Google drive that I last linked to (the one that has the console dump file). In fact, I've uploaded the original version as sliced plus the version which has UVAB moves added. The latter is distinguishable by the "UVAB" suffix which is added to the end of the file name - that's the one I've been using but unless you can simulate a CoreXYUVAB, it won't be much use (hence the inclusion of the original).
Note. You'll need to remove the call to the "pre-print" macro. You'll then have to heat the bed and nozzle and send a "T0" before attempting to print or simulate that file (as well as home the printer). I could post the "pre-print" macro but it calls other macros to set the tool temperatures, home the printer, purge and wipe the nozzle etc, so it all gets complicated and it'll just be easier to manually heat the bed and hot end.
#3. That's correct. In all other instances, I've simply selected the "download as text" option but after pressing "pause", that didn't work. Firefox shows me when a file is being downloaded but that didn't happen - there was no indication in Firefox that a download was happening. And no additional files appeared in the list of downloads (only those console.text files that were downloaded prior to the crash).
I can't remember if I pressed cancel before or after attempting to download the console text. I think I pressed pause, tried to download, then pressed cancel and tried to download again, but I can't be 100% sure of that. I've uploaded both the pause.g and cancel.g files to the same place. I can't see anything in either of those that would have prevented downloads from working but you might spot something I've missed.
-
For anyone watching this thread, and for those who have contributed, I just want to say that the Duet team and I have opened up the communication medium that we used at the very start (when Gen 3 was still at the pre-production stage), in order to work together to resolve these issues. That's nothing personal - just that these forums are maybe not the best way to post messages rapidly back and forth between us.
I thought it important to state that fact, in case anyone got the impression that I had been abandoned by the Duet team - that isn't the case and we are working together to try and get to the bottom of wtf is happening.
-
Thank you for letting us know.
-
Keep at it and let us know what you find out.
I'm likely affected by many of the issues that you're discussing - I also have a large machine with many extruders and I'd love to continue using it.
@dc42 keep it up as well! I'm happy and willing to give some tests as well, but I won't have the time to dedicate to all of the detailed troubleshooting that deckingman is doing.
Luke
-
@dc42
why are there different driver reports for main board and 3hc? for example, main board does not show steps/mm and steps req/done.
@deckingman sorry to see your machine damaged, at least it's nothing serious. did you consider using some kind of fail safe between the carriages? for example, connect a wire that is shorter than the rest of the stuff, so that it will disconnect earlier in case of misalignment and stop the machine.
i looked at your gcode, xy and ab positions should be the same, right?
are you using different steps/mm for xy and ab?
i cannot see any relation between the xy driver position and ab position on the main board. maybe this can explain something.
-
@hackinistrator Forget about the AB gantry. That's a force cancelling/load balancing gantry that is completely separate to the XY and UV gantries. It simply carries a container with lumps of lead. It's just another CoreXY mechanism that sits at the very top of the machine and is in no way connected to the hot end or the extruders. The motor directions are reversed so it does the exact opposite of the XY gantry to stop the printer rocking around when I throw my heavy hot end around at high speed. It does nothing for print quality and the only reason it's still there is that I can't be bothered to remove it.
The XY gantry carries the hot end. The UV gantry carries the 6 extruders and two of the expansion boards. The extruders are connected to the hot end with short Bowden tubes. All the wires for the heater, fans, temp sensor, end stop etc for the hot end are connected to the expansion boards that are fitted to the UV gantry above. The UV gantry is the one that "follows" the XY gantry with an "envelope" that is +- 20mm of the XY position.
I've never before seen a wild excursion of the XY gantry that was not mimicked by the UV gantry. Nor did I foresee that possibility.
-
@deckingman said in Poor print quality with RRF3 - especially 3.2.2.:
That's a force cancelling/load balancing gantry that is completely separate to the XY and UV gantries. It simply carries a container with lumps of lead. It's just another CoreXY mechanism that sits at the very top of the machine and is in no way connected to the hot end or the extruders. The motor directions are reversed so it does the exact opposite of the XY gantry to stop the printer rocking around when I throw my heavy hot end around at high speed.
That is most interesting. Where did you hear about that idea or was it something you devised on your own?
Thanks.
Frederick
-
Forget about the AB gantry. That's a force cancelling/load balancing gantry that is completely separate to the XY and UV gantries.
i think in your case it's important.
x is related to a, and y is related to b, so i would expect these relations to stay the same during print. your reports show otherwise (at least i think so)
-
@fcwilt It was my own idea - another world first that has since been copied. I did a write up of it on my blog. I'm using my phone right now so can't easily provide a link but there is a link to my blog in my signature.
-
@hackinistrator I can't say that I noticed, but yes A should always be the same as X and B should always be the same as Y. The X and Y motors are on expansion board 3 and the A and B motors are on the main board. That might be a clue.....
-
@deckingman
about the XYUVAB axes:
I read in the Wiki that you have to use definitions of the extra axes in a certain order. E.g. you can't define V-axis without having a U-axis. If you still do, the FW automatically defines a virtual axis.
I believe that you have to define UVW before using ABC or you have an unknown W-axis in your system, which might do spooky things.
Your printer has been running for a long time with these axes, but maybe a newer FW treats these ghosts differently?
-
@o_lampe said in Poor print quality with RRF3 - especially 3.2.2.:
@deckingman
about the XYUVAB axes:
I read in the Wiki that you have to use definitions of the extra axes in a certain order. E.g. you can't define V-axis without having a U-axis. If you still do, the FW automatically defines a virtual axis.
I believe that you have to define UVW before using ABC or you have an unknown W-axis in your system, which might do spooky things.
Your printer has been running for a long time with these axes, but maybe a newer FW treats these ghosts differently?
I have this
M584 X3.2 Y3.1 Z3.0 U0.0 V0.1 A0.2 B0.3 E1.0:1.1:1.2:2.0:2.1:2.2 R0 P7;
and
M669 K8 A0:0:0:0:0:1:1 B0:0:0:0:0:1:-1;
I'm fairly sure using those matrix values should sort out any ghost axes but I'll leave it to @dc42 - he's best placed to know.
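The M584 line above assigns drivers in board.driver form, so it can be read as a table of which board and driver each axis letter uses. The Python sketch below does only that; the function name `parse_m584` is made up, it handles only the board.driver form used in this particular line, and it ignores the R and P parameters because they are not driver assignments.

```python
# M584 line copied from the post above.
M584 = "M584 X3.2 Y3.1 Z3.0 U0.0 V0.1 A0.2 B0.3 E1.0:1.1:1.2:2.0:2.1:2.2 R0 P7;"

# Letters treated as axis or extruder assignments in this sketch.
AXIS_LETTERS = "XYZUVWABCE"


def parse_m584(command):
    """Map each axis letter to a list of (board, driver) tuples."""
    mapping = {}
    words = command.split(";")[0].split()  # drop the trailing comment marker
    for word in words[1:]:                 # skip the "M584" word itself
        letter, value = word[0], word[1:]
        if letter not in AXIS_LETTERS:
            continue
        drivers = []
        for item in value.split(":"):
            board, driver = item.split(".")  # only the board.driver form is handled
            drivers.append((int(board), int(driver)))
        mapping[letter] = drivers
    return mapping


mapping = parse_m584(M584)
for letter, drivers in mapping.items():
    print(letter, drivers)
```

Read this way, X, Y and Z are on board 3 (drivers 2, 1 and 0), U, V, A and B are on board 0 (drivers 0 to 3), and the six extruders are on boards 1 and 2.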
-
@fcwilt Here is a link to the first post I did with the load balancing gantry
https://somei3deas.wordpress.com/2018/10/04/dynamic-force-cancellationload-balancing/. Note that it was later discovered that this has no effect on print quality. That is to say, even when the entire printer rocked from side to side, the actual print was fine. So eliminating the rocking was largely a solution to a problem that doesn't exist.
-
@deckingman said in Poor print quality with RRF3 - especially 3.2.2.:
@fcwilt It was my own idea - another world first that has since been copied. I did a write up of it on my blog. I'm using my phone right now so can't easily provide a link but there is a link to my blog in my signature.
Thanks. That was quite educational.
But I think I've found the source of ALL of your printer problems.
It is not big enough.
Frederick
-
@fcwilt The trouble is, I can't make it much taller because I'm not tall enough. I have to stand on a step to see the very top of it as it is now. I did mention to my wife that if we did away with the freezer in the garage, I could make it wider. But for some reason, she prefers to have that second freezer. Strange creatures women - always have the wrong priorities.
-
@deckingman said in Poor print quality with RRF3 - especially 3.2.2.:
@o_lampe said in Poor print quality with RRF3 - especially 3.2.2.:
@deckingman
about the XYUVAB axes:
I read in the Wiki that you have to use definitions of the extra axes in a certain order. E.g. you can't define V-axis without having a U-axis. If you still do, the FW automatically defines a virtual axis.
I believe that you have to define UVW before using ABC or you have an unknown W-axis in your system, which might do spooky things.
Your printer has been running for a long time with these axes, but maybe a newer FW treats these ghosts differently?
I have this
M584 X3.2 Y3.1 Z3.0 U0.0 V0.1 A0.2 B0.3 E1.0:1.1:1.2:2.0:2.1:2.2 R0 P7;
and
M669 K8 A0:0:0:0:0:1:1 B0:0:0:0:0:1:-1;
I'm fairly sure using those matrix values should sort out any ghost axes but I'll leave it to @dc42 - he's best placed to know.
did you copy and paste m669 from your config.g? are you sure it's not a mistake?
in m584, a and b are defined as drivers 2 and 3.
in m669 you are setting a and b to move with drives 5 and 6? there is no drive 6 on main board.
i don't know what's going on here.
also in m584 you define x as driver 2 on 3hc (b3) and y as driver 1? any reason for that? usually x comes before y.
these are your motor positions per your dump files:
3hc #3
Driver 0: position -1182960 z axis
Driver 1: position 3200 y axis
Driver 2: position -145076 x axis
main board
Driver 0: position 31360 u axis
Driver 1: position 6604 v axis
Driver 2: position 16320 a axis
Driver 3: position 31840 b axis
Driver 4: position 4756 ?
Driver 5: position 31360 ?
when you paused the print, a and b (if they are configured right, not sure anymore) should be:
a around 200mm from its end stop
b around 400mm from its end stop
is this logical ?
i don't know how many steps/mm your xy has. if it's the same as ab (80 steps/mm) your x should be around 1.8 meters from end stop. i don't think your printer is that big.
so maybe you're using different steps/mm for xy, but anyway i think their relation to ab is way off.
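The distances quoted above come from dividing each reported driver position by the steps/mm for that driver. A short Python sketch of that arithmetic follows. It treats each driver as if it moved one axis directly, uses the steps/mm values printed in the M122 B3 report for board 3, and assumes 80 steps/mm for the main board drivers because the main board report does not show that value.

```python
# Driver positions (microsteps) copied from the M122 dumps quoted in this thread.
BOARD3_POSITIONS = {"z": -1182960, "y": 3200, "x": -145076}
BOARD3_STEPS_PER_MM = {"z": 1600.0, "y": 80.0, "x": 80.0}  # from the M122 B3 report

MAIN_POSITIONS = {"a": 16320, "b": 31840}
# Assumption: the main board report shows no steps/mm, so 80 is assumed here.
MAIN_STEPS_PER_MM = 80.0


def steps_to_mm(steps, steps_per_mm):
    """Convert a net microstep count into millimetres of travel."""
    return steps / steps_per_mm


board3_mm = {
    axis: steps_to_mm(position, BOARD3_STEPS_PER_MM[axis])
    for axis, position in BOARD3_POSITIONS.items()
}
main_mm = {
    axis: steps_to_mm(position, MAIN_STEPS_PER_MM)
    for axis, position in MAIN_POSITIONS.items()
}

for axis, travel in {**board3_mm, **main_mm}.items():
    print(f"{axis}: {travel:.2f} mm")
```

With these inputs x works out at about -1813 mm, a at 204 mm and b at 398 mm, which is where the 1.8 meters, 200mm and 400mm figures come from.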
can you send M669 command and post the report?
edit:
almost forgot, drives 5,6 on main board not defined, but i still see changing position on them, what's going on?
-
@hackinistrator Dunno. Driver position (as reported by M122) is fairly new so I've never looked at it before. Maybe there is something odd about the way they are reported - maybe something different for expansion boards and the main board?
Don't forget this is CoreXY kinematics, not Cartesian. So both motors contribute to movement. That's why I prefer to call them Alpha and Beta but for configuration purposes, they have to be called X and Y (or U and V or A and B). Also, the A and B motor directions are reversed. Oh and the axis lengths are a bit longer than the XY.
I can confirm that they all use the same belt and pulley pitch and all motors are 0.9 degree, so the steps per mm are the same for all axes (80). I can also confirm that it all works as it should.
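Because both motors contribute to every move, a single driver position cannot be read as a single axis position. The Python sketch below illustrates this with one common CoreXY convention (alpha = X + Y, beta = X - Y). The sign convention and the function names are assumptions for illustration; the actual signs depend on the kinematics matrix and motor directions in the configuration.

```python
STEPS_PER_MM = 80.0  # stated above for all of these axes


def xy_to_motors(x_mm, y_mm, steps_per_mm=STEPS_PER_MM):
    """Convert an XY displacement (mm) into alpha/beta motor steps.

    Assumed convention: alpha = x + y, beta = x - y.
    """
    alpha = round((x_mm + y_mm) * steps_per_mm)
    beta = round((x_mm - y_mm) * steps_per_mm)
    return alpha, beta


def motors_to_xy(alpha_steps, beta_steps, steps_per_mm=STEPS_PER_MM):
    """Invert xy_to_motors: recover the XY displacement (mm) from motor steps."""
    alpha_mm = alpha_steps / steps_per_mm
    beta_mm = beta_steps / steps_per_mm
    return (alpha_mm + beta_mm) / 2.0, (alpha_mm - beta_mm) / 2.0


pure_x = xy_to_motors(10.0, 0.0)   # both motors turn the same way
pure_y = xy_to_motors(0.0, 10.0)   # the motors turn in opposite directions
print("10 mm in X:", pure_x)
print("10 mm in Y:", pure_y)
print("back to XY:", motors_to_xy(*pure_x), motors_to_xy(*pure_y))
```

Under this convention a 10 mm move in X alone turns both motors by 800 steps, so comparing one motor's step count with one axis coordinate will not line up.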
No idea why Drivers 4 and 5 report anything - as you say, they aren't configured and physically, don't have any motors connected. You'll have to ask @dc42.
-
On the expansion boards, the driver positions are the net number of microsteps moved since the board was powered up or reset. Therefore, for a drive used to move an axis, the position is where the axis is now compared to where it was at power up. However, if after homing it you move the axis to a particular position and record the corresponding driver position, then when you next move the axis back to that position, the driver position should read the same.
For the main board, if drivers 4 and 5 have never been configured, I would expect the positions to read back as zero.