Macro files SBC v Standalone
-
Hi @dc42 and @chrishamm
I've been trying to duplicate the klipper resonance test using a macro file. In the process of this I noticed the following:
- The SBC version of my macro runs much more slowly than the same file on a standalone system.
- There seems to be a difference in rounding between the two when using expressions enclosed with {}
When I run the following macro:
var stime = state.upTime
var min_freq = 5.0
var max_freq = 200
var accel_per_hz = 75
var hz_per_sec = 1
var freq = var.min_freq
var max_accel = var.max_freq * var.accel_per_hz
var X = 150
var Y = 150
var sign = 1
echo "max_accel " ^ var.max_accel
;M201 X{var.max_accel} Y{var.max_accel}
;G1 X{var.X} Y{var.Y}
echo "Testing " ^ var.freq
var old_freq = var.freq
while var.freq < var.max_freq + 0.000001
  var t_seg = 0.25 / var.freq
  var accel = var.accel_per_hz * var.freq
  var max_v = var.accel * var.t_seg
  ;M204 P{var.accel}
  var L = 0.5 * var.accel * (var.t_seg*var.t_seg)
  var nX = var.X + (var.sign * var.L)
  ;G1 X{var.nX} F{var.max_v*60}
  ;G1 X{var.X}
  set var.sign = -var.sign
  set var.freq = var.freq + (2. * var.t_seg * var.hz_per_sec)
  if floor(var.freq) >= floor(var.old_freq) + 5
    echo "Testing " ^ var.freq
    set var.old_freq = var.freq
echo "total time " ^ state.upTime - var.stime
It executes in approx 120 seconds on a standalone system but takes 969 seconds when using a SBC. I suspect that most people would expect the SBC to be faster for this sort of thing.
These timings are from the v3.4beta6 STM32F4 port but @jay_s_uk has confirmed similar timing (140 seconds and 1240 seconds) using Duet3D hardware
For the second problem if I run the following test program:
var R1 = 0.0468923
var R2 = 0.0469111
var R3 = 0.0469017
var R4 = 0.0469205
var X = 150.0
var X1 = var.X + var.R1
var X2 = var.X + var.R2
var X3 = var.X - var.R3
var X4 = var.X - var.R4
echo "X1 " ^ var.X1 ^ " {X1} " ^ {var.X1}
echo "X2 " ^ var.X2 ^ " {X2} " ^ {var.X2}
echo "X3 " ^ var.X3 ^ " {X3} " ^ {var.X3}
echo "X4 " ^ var.X4 ^ " {X4} " ^ {var.X4}
On a standalone system then I get the following output
14/11/2021, 10:20:18 M98 P"0:/macros/roundtest.g"
X1 150.0469 {X1} 150.0469
X2 150.0469 {X2} 150.0469
X3 149.9531 {X3} 149.9531
X4 149.9531 {X4} 149.9531
Which seems fine to me, but on the SBC version I get:
14/11/2021, 10:18:10 X4 149.9531{X4} 149.95
14/11/2021, 10:18:10 X3 149.9531{X3} 150.0
14/11/2021, 10:18:09 X2 150.0469{X2} 150.0
14/11/2021, 10:18:09 X1 150.0469{X1} 150.05
Which seems to round the {} version to two decimal places, but the rounding does not seem correct to me.
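For comparison, here is a small Python sketch of what ordinary fixed-point formatting of the same four values gives at four and at two decimal places. It is only an illustration: it uses Python's double-precision floats, which may differ from the firmware's arithmetic, and the helper names `format_fixed` and `rounding_table` are made up for this example. It is not how RRF or DSF format numbers internally.

```python
# Values taken from the roundtest.g macro above.
X = 150.0
OFFSETS = {
    "X1": +0.0468923,
    "X2": +0.0469111,
    "X3": -0.0469017,
    "X4": -0.0469205,
}


def format_fixed(value: float, places: int) -> str:
    """Format a value with a fixed number of decimal places."""
    return f"{value:.{places}f}"


def rounding_table() -> dict:
    """Map each name to its (4-decimal, 2-decimal) string representation."""
    return {
        name: (format_fixed(X + offset, 4), format_fixed(X + offset, 2))
        for name, offset in OFFSETS.items()
    }


if __name__ == "__main__":
    for name, (four_dp, two_dp) in rounding_table().items():
        print(name, four_dp, two_dp)
```

With conventional rounding, the four-decimal column matches the standalone output, and the two-decimal column gives 150.05 for X1 and X2 and 149.95 for X3 and X4. The SBC values of 150.0 for X2 and X3 fit neither pattern.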
I've only tested the 2nd set of tests on STM32F4 hardware, but I suspect the issue will be present on the Duet3D version.
Let me know if you need any further information.
-
Hi @gloomyandy,
With v3.3 and v3.4 the performance of meta G-code expressions can be slower than in standalone mode, but this is negligible for most applications (e.g. an if statement has a marginal effect on the execution speed). The reason for this is that DCS flushes pending codes when each meta G-code is executed, and a flush request is, at present, only resolved after a full SPI transfer. Two notes on this:
- To confirm this and to speed up the execution of your macro at least partially, you can use M576 at the start and end of your macro file to increase the number of SPI transfers per second (M576 S0), although this will increase CPU usage as well.
- I plan to refactor much of the G-code flow in DCS 3.5, which will speed up the flush requests significantly. Until then, M576 should let you boost the throughput.
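To put rough numbers on the per-command overhead described above, the following Python sketch replays only the frequency update from the macro's while loop, counts the iterations, and divides the reported run times (120 s standalone, 969 s with an SBC) by that count. It is a back-of-the-envelope estimate under stated assumptions: Python doubles stand in for the firmware's arithmetic, so the count may differ slightly from the real run, and the function names are invented for this illustration. The resulting figures cover everything done in an iteration, not only the flush.

```python
def count_iterations(min_freq: float = 5.0,
                     max_freq: float = 200.0,
                     hz_per_sec: float = 1.0) -> int:
    """Replay the macro's frequency sweep and count loop iterations."""
    freq = min_freq
    iterations = 0
    while freq < max_freq + 0.000001:
        t_seg = 0.25 / freq                 # same as var.t_seg in the macro
        freq += 2.0 * t_seg * hz_per_sec    # same update as "set var.freq"
        iterations += 1
    return iterations


def ms_per_iteration(total_seconds: float, iterations: int) -> float:
    """Average wall-clock time per loop iteration, in milliseconds."""
    return 1000.0 * total_seconds / iterations


if __name__ == "__main__":
    n = count_iterations()
    print("loop iterations:", n)
    # Run times as reported in the first post of this thread.
    for label, seconds in (("standalone", 120.0), ("SBC", 969.0)):
        print(f"{label}: {ms_per_iteration(seconds, n):.1f} ms per iteration")
```

The loop runs close to 40,000 times, and each pass executes about ten meta commands (the while test, five var lines, two set lines and the if), so even a small fixed delay per meta G-code adds up to minutes over the whole sweep.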
@gloomyandy said in Macro files SBC v Standalone:
It executes in approx 120 seconds on a standalone system but takes 969 seconds when using a SBC. I suspect that most people would expect the SBC to be faster for this sort of thing.
These timings are from the v3.4beta6 STM32F4 port but @jay_s_uk has confirmed similar timing (140 seconds and 1240 seconds) using Duet3D hardware
This is a misconception because RRF still does most of the expression evaluation. DSF simply replaces certain object model values maintained by DSF before the final expressions are sent to RRF. Of course I could write my own expression parser in DSF, but that still won't address the main bottleneck here: availability of correct live data. RRF has to be queried at regular intervals anyway, and asking it to evaluate an expression may still be faster than merging an object model JSON before evaluating everything else directly on the SBC.
Thanks for pointing out the formatting differences between standalone and SBC mode, I'm going to have a look at it.
-
@chrishamm can this be optimised at all when there are several meta commands in a row? e.g. only flush the pipeline before the first one?
-
@dc42 With the modifications I have in mind for v3.5 I think this could be achieved. I'd like to replace the current task-driven approach for handling G-codes with a dataflow pipeline which should simplify the code flushing concept as well.
-
@chrishamm @dc42 Thanks for taking a look at this and for the M576 S0 suggestion, I'll give it a go!
On the SBC/Standalone speed: yes, I understand the issue, but I suspect most people don't and would expect the SBC to boost performance. If you are thinking of reworking things... this may be a little drastic, but now that you have remote file I/O, I wonder if just running the gcode file on the board and using the remote file access code to pull the code from the SBC might make things simpler? I'm not sure what gcodes are currently handled on the SBC or how much functionality would be lost by doing this, but it was just a thought I had the other day when looking at the code.
-
@gloomyandy That's an absolute no-go. We already have a number of plugins intercepting the G-code stream, so using the remote file I/O from RRF to read G-code files would undermine this concept and render all these plugins useless. Also bear in mind that there is no buffer mechanism in RRF yet for file data read from the SBC, even though that's something I'd like to change in v3.5 as well. As I said, the bottleneck here is the flush mechanism and I will improve that in v3.5. Once that is done, the performance of macros like yours will be higher as well.
The SBC does give you a performance boost, but it's network connectivity that benefits from it in the first place. I still see that along with plugin capability and more dynamic configurations as the main advantages of SBC configurations.
-
@chrishamm I'm curious which plugins intercept the gcode stream? I assume they are SBC only?
-
@gloomyandy Yes, correct, for example DuetPiManagementPlugin or ExecOnMCode.
-
@chrishamm Hi Christian/@dc42 I've just seen what I think is another difference between standalone and SBC mode with a script file. The following file
echo "; this should work"
Works fine in standalone mode but with an SBC I get:
Error: Failed to evaluate """: control character in string of testquote.g
Not sure exactly what is going on, perhaps it is treating the ";" inside a string as a comment?
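The guess above can be illustrated with a short Python sketch that contrasts a naive comment stripper, which cuts the line at the first ";", with one that tracks whether it is inside a double-quoted string. This is purely hypothetical: it is not DSF's parser, both function names are invented for this example, and it ignores other G-code comment forms such as parentheses.

```python
def strip_comment_naive(line: str) -> str:
    """Cut at the first ';' regardless of context (the suspected behaviour)."""
    return line.split(";", 1)[0].rstrip()


def strip_comment_quote_aware(line: str) -> str:
    """Cut at the first ';' that is not inside a double-quoted string."""
    in_string = False
    for index, char in enumerate(line):
        if char == '"':
            # A doubled quote ("") toggles twice, so the state stays correct.
            in_string = not in_string
        elif char == ";" and not in_string:
            return line[:index].rstrip()
    return line.rstrip()


if __name__ == "__main__":
    macro_line = 'echo "; this should work"'
    print(repr(strip_comment_naive(macro_line)))
    print(repr(strip_comment_quote_aware(macro_line)))
```

If something along the lines of the naive version were applied, the expression handed to the evaluator would be a lone opening quote, which would be consistent with the error text shown above. The quote-aware version leaves the line untouched.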
-
@gloomyandy Thanks, I'll check what's going on. It will be fixed in the upcoming version.
-