Duet3 6CH + 3CH expansion board - Missing steps.
-
@evomotors, I believe at least some of these issues are caused by CAN clock sync jitter, meaning that the expansion board clocks don't run perfectly in sync with the main board. When I measured it, I found that it was higher than I expected, frequently reaching 300us and sometimes more under conditions of high step rates and short move segments. Fortunately we've had a plan to reduce this jitter ever since we designed Duet 3. I've been working on this all day, and the jitter is now reduced to 13us even with heavy traffic. It may be possible to reduce it still further.
I have two other related issues to investigate before we release an early 3.3 beta.
-
@deckingman sure, I watched the vid.
I'm not sure the issue occurs at random times.
If you check the vid, you can see the printer completed pretty fast accel moves without issues while moving in the XY plane.
The issue occurred when it was making a fast diagonal move, with a single motor running.
What is the default motor current on those drivers anyway?
Maybe it's a good idea to try reducing the motor current on the exp. board to a very low value and see if it has any effect.
There are no CAN errors, so I don't see how it can be a comm. error.
-
@hackinistrator there is some timing jitter (as @dc42 has pointed out) that is not currently reported in 3.2.
@evomotors thanks for all the work in detailing the problem, we are working on it!
-
-
I am pleased to report that we have made substantial progress on this issue. In addition to fixing the CAN clock jitter, we have identified and fixed an issue that could cause whole moves to be abandoned on CAN-connected expansion boards. We have only been able to reproduce this using high step rates and short moves, however the cause of the issue (floating point rounding error) could conceivably happen at lower step rates too.
I expect to provide unofficial 3.3beta versions of all firmwares tomorrow.
-
@dc42 Holy crap! Big if true!!!!!! I can't wait to try out the firmware to see if it solves my issues!
-
@dc42
Do you need a beta tester?
-
@evomotors said in Duet3 6CH + 3CH expansion board - Missing steps.:
Do you need a beta tester?
always welcome.
the firmware will be available for all here
https://github.com/Duet3D/RepRapFirmware/releases
-
@dc42 Any updates? I guess there are some roadblocks with other issues.
-
-
@dc42 said in Duet3 6CH + 3CH expansion board - Missing steps.:
@evomotors, https://forum.duet3d.com/topic/20991/early-rrf-3-3beta-files-available
Issue still exists after 3.3beta install. Visually no difference.
This is after starting and canceling print:
M122
=== Diagnostics ===
RepRapFirmware for Duet 3 MB6HC version 3.3beta running on Duet 3 MB6HC v1.01 or later (SBC mode)
Board ID: 08DJM-956L2-G43S8-6JKDL-3SJ6L-1802G
Used output buffers: 1 of 40 (15 max)
=== RTOS ===
Static ram: 149772
Dynamic ram: 64248 of which 104 recycled
Never used RAM 140548, free system stack 126 words
Tasks: Linux(ready,141) HEAT(blocked,299) CanReceiv(blocked,893) CanSender(blocked,346) CanClock(blocked,326) TMC(blocked,52) MAIN(running,613) IDLE(ready,19)
Owned mutexes: HTTP(MAIN)
=== Platform ===
Last reset 00:20:29 ago, cause: power up
Last software reset details not available
Error status: 0x00
Aux0 errors 0,0,0
Aux1 errors 0,0,0
MCU temperature: min 23.3, current 35.4, max 35.6
Supply voltage: min 24.1, current 24.2, max 24.5, under voltage events: 0, over voltage events: 0, power good: yes
12V rail voltage: min 12.1, current 12.1, max 12.2, under voltage events: 0
Driver 0: position 71266, standstill, reads 15769, writes 24 timeouts 0, SG min/max 0/243
Driver 1: position 8428, standstill, reads 15769, writes 24 timeouts 0, SG min/max 0/248
Driver 2: position 1039, standstill, reads 15769, writes 24 timeouts 0, SG min/max 0/269
Driver 3: position 0, standstill, reads 15770, writes 24 timeouts 0, SG min/max 0/267
Driver 4: position 0, standstill, reads 15783, writes 11 timeouts 0, SG min/max 0/0
Driver 5: position 0, standstill, reads 15783, writes 11 timeouts 0, SG min/max 0/0
Date/time: 2021-01-16 15:50:32
Slowest loop: 178.46ms; fastest: 0.03ms
=== Storage ===
Free file entries: 10
SD card 0 not detected, interface speed: 37.5MBytes/sec
SD card longest read time 0.0ms, write time 0.0ms, max retries 0
=== Move ===
DMs created 125, maxWait 700263ms, bed compensation in use: mesh, comp offset 0.000
=== MainDDARing ===
Scheduled moves 2249, completed moves 2249, hiccups 0, stepErrors 0, LaErrors 0, Underruns [0, 0, 3], CDDA state -1
=== AuxDDARing ===
Scheduled moves 0, completed moves 0, hiccups 0, stepErrors 0, LaErrors 0, Underruns [0, 0, 0], CDDA state -1
=== Heat ===
Bed heaters = 0 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1, chamberHeaters = -1 -1 -1 -1
Heater 0 is on, I-accum = 0.4
Heater 1 is on, I-accum = 0.3
=== GCodes ===
Segments left: 0
Movement lock held by null
HTTP* is doing "M122" in state(s) 0
Telnet is idle in state(s) 0
File* is idle in state(s) 0
USB is idle in state(s) 0
Aux is idle in state(s) 0
Trigger* is idle in state(s) 0
Queue* is idle in state(s) 0
LCD is idle in state(s) 0
SBC is idle in state(s) 0
Daemon is idle in state(s) 0
Aux2 is idle in state(s) 0
Autopause is idle in state(s) 0
Code queue is empty.
=== CAN ===
Messages queued 7096, send timeouts 0, received 38, lost 0, longest wait 1ms for reply type 6018, peak Tx sync delay 414, free buffers 48
=== SBC interface ===
State: 4, failed transfers: 0
Last transfer: 1ms ago
RX/TX seq numbers: 44700/44700
SPI underruns 0, overruns 0
Number of disconnects: 0, IAP RAM available 0x2c478
Buffer RX/TX: 0/0-0
=== Duet Control Server ===
Duet Control Server v3.2.0
Code buffer space: 4096
Configured SPI speed: 8000000 Hz
Full transfers per second: 31.34
Maximum length of RX/TX data transfers: 2656/1664
M122 B1
Diagnostics for board 1:
Duet EXP3HC firmware version 3.3beta (2021-01-16 08:18:18)
Bootloader ID: not available
Never used RAM 154492, free system stack 166 words
HEAT 92 CanAsync 92 CanRecv 86 TMC 30 MAIN 306 AIN 259
Last reset 00:21:14 ago, cause: power up
Last software reset data not available
Driver 0: position -343134, 160.0 steps/mm, standstill, reads 45658, writes 24 timeouts 0, SG min/max 0/218, steps req 1681012 done 1337262
Driver 1: position -74772, 160.0 steps/mm, standstill, reads 45661, writes 24 timeouts 0, SG min/max 0/349, steps req 1442638 done 1203360
Driver 2: position 5964, 172.0 steps/mm, standstill, reads 45669, writes 20 timeouts 0, SG min/max 0/198, steps req 32691 done 32719
Moves scheduled 2198, completed 2198, in progress 0, hiccups 185, step errors 0
Peak sync jitter 14, peak Rx sync delay 183, resyncs 0, no step interrupt scheduled
VIN: 24.7V, V12: 12.3V
MCU temperature: min 43.8C, current 44.0C, max 44.2C
Ticks since heat task active 246, ADC conversions started 1274238, completed 1274236, timed out 0
Last sensors broadcast 0x00000000 found 0 0 ticks ago, loop time 0
CAN messages queued 58, send timeouts 0, received 13653, lost 0, free buffers 36, error reg 11004e
And this is just after boot, before printing:
m122 B1
Diagnostics for board 1:
Duet EXP3HC firmware version 3.3beta (2021-01-16 08:18:18)
Bootloader ID: not available
Never used RAM 154420, free system stack 200 words
HEAT 79 CanAsync 92 CanRecv 86 TMC 64 MAIN 298 AIN 259
Last reset 00:00:49 ago, cause: software
Last software reset data not available
Driver 0: position 0, 160.0 steps/mm, standstill, reads 49081, writes 23 timeouts 0, SG min/max 0/0, steps req 0 done 0
Driver 1: position 0, 160.0 steps/mm, standstill, reads 49083, writes 23 timeouts 0, SG min/max 0/0, steps req 0 done 0
Driver 2: position 0, 172.0 steps/mm, standstill, reads 49087, writes 23 timeouts 0, SG min/max 0/0, steps req 0 done 0
Moves scheduled 0, completed 0, in progress 0, hiccups 0, step errors 0
Peak sync jitter 11, peak Rx sync delay 181, resyncs 1, no step interrupt scheduled
VIN: 24.7V, V12: 12.3V
MCU temperature: min 43.8C, current 43.8C, max 44.0C
Ticks since heat task active 236, ADC conversions started 49228, completed 49226, timed out 0
Last sensors broadcast 0x00000000 found 0 240 ticks ago, loop time 0
CAN messages queued 60, send timeouts 3, received 403, lost 0, free buffers 36, error reg 11004f
-
Thanks, from the M122 B1 report there's evidently still an issue with steps being lost on the 3HC.
Have you already posted your config.g and the GCode file somewhere?
-
@dc42 said in Duet3 6CH + 3CH expansion board - Missing steps.:
Thanks, from the M122 B1 report there's evidently still an issue with steps being lost on the 3HC.
Have you already posted your config.g and the GCode file somewhere?
Yes, it is the same configuration and stl file as posted above in this thread
https://forum.duet3d.com/topic/20713/duet3-6ch-3ch-expansion-board-missing-steps/79?_=1610818681447
The config is the one where XY are connected to the expansion board.
-
Thanks, I'll run those files on my bench system. I guess I should have done that earlier, rather than assuming that the bug that Tony and I reproduced was the one that affected your machine.
-
@dc42 said in Duet3 6CH + 3CH expansion board - Missing steps.:
Thanks, I'll run those files on my bench system. I guess I should have done that earlier, rather than assuming that the bug that Tony and I reproduced was the one that affected your machine.
Let me know if you need me to run some other tests.
-
It's running. So far I am seeing the extruder have more reported steps than commanded steps, but X and Y doing exactly the correct number of steps.
It's occurred to me that any homing commands or other G1 H1 moves will mess up the step count, making steps done be less than steps requested. Does your cancel.g file have a homing command in it?
-
@dc42 said in Duet3 6CH + 3CH expansion board - Missing steps.:
It's running. So far I am seeing the extruder have more reported steps than commanded steps, but X and Y doing exactly the correct number of steps.
It's occurred to me that any homing commands or other G1 H1 moves will mess up the step count, making steps done be less than steps requested. Does your cancel.g file have a homing command in it?
No, no homing in cancel.g
I did another print with print and move speeds of 40mm/s. Still the same issue. Is your kinematics the same? Mine is CoreXY.
-
@dc42
I don't have cancel.g
-
This was a longer run at slow speeds.
M122 B1
Diagnostics for board 1:
Duet EXP3HC firmware version 3.3beta (2021-01-16 08:18:18)
Bootloader ID: not available
Never used RAM 154420, free system stack 164 words
HEAT 92 CanAsync 92 CanRecv 86 TMC 30 MAIN 282 AIN 259
Last reset 03:11:52 ago, cause: power up
Last software reset data not available
Driver 0: position -687060, 160.0 steps/mm, standstill, reads 32218, writes 22 timeouts 0, SG min/max 0/622, steps req 7128898 done 6785821
Driver 1: position -104279, 160.0 steps/mm, standstill, reads 32217, writes 22 timeouts 0, SG min/max 0/1023, steps req 6693175 done 6454161
Driver 2: position 64560, 172.0 steps/mm, standstill, reads 32221, writes 18 timeouts 0, SG min/max 0/693, steps req 218156 done 219933
Moves scheduled 20756, completed 20756, in progress 0, hiccups 25, step errors 0
Peak sync jitter 15, peak Rx sync delay 49081, resyncs 93, no step interrupt scheduled
VIN: 24.7V, V12: 12.3V
MCU temperature: min 43.8C, current 43.8C, max 44.2C
Ticks since heat task active 94, ADC conversions started 11512336, completed 11512334, timed out 0
Last sensors broadcast 0x00000000 found 0 98 ticks ago, loop time 0
CAN messages queued 77, send timeouts 0, received 110725, lost 356, free buffers 36, error reg 1
-
Thanks. I can see three possible issues in that M122 report:
-
The X and Y steps done are not the same as steps requested. However, I think this must be because you are doing a homing command somewhere in the sequence, because I don't see this on my bench system. Perhaps your pause.g or start.g file contains a G28 command? Or you ran the initial M122 command, then homed the printer, then started the print? If you can find and remove that G28 command (or pause and run M122 after doing it) then please do that and do another run, to check that the X and Y steps are correct. To be clear, in the M122 B1 report I expect the X and Y steps done to be the same as steps requested, provided that the axes were not moving when this M122 or the previous M122 was run and there have been no G1 H1 or G1 H3 moves executed between the two M122 commands.
-
The extruder steps done is greater than the steps commanded. The primary reason for this is that pressure advance causes retraction steps to be generated at the end of a move that ends at low speed, and the steps commanded figure doesn't take account of that. However, if I set pressure advance to zero then the extruder steps done is slightly less than the steps commanded, which is wrong. So I think there is a rounding error in the pressure advance calculation.
-
Probably the most serious issue is that it has reported that some CAN messages have been lost and the peak Rx sync delay is very high. This suggests that movement messages are not being queued as fast as they are arriving. I was already planning to do that part of the move processing in a different way with lower latency, and I will now work on that urgently.
Thanks for your patience. With your permission, I will retain and use your GCode file as one of our test files.
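For anyone who wants to check their own M122 B1 output against the points above, here is a rough Python sketch that pulls out the "steps req ... done ..." figures per driver and the CAN "lost" count. It is only an illustration: the regular expressions are assumptions based on the report text posted in this thread, the function names are made up for this example, and the report format may differ in other firmware versions.

```python
import re
from typing import Dict, Optional

# Patterns assumed from the 3.3beta EXP3HC report text shown in this thread.
DRIVER_RE = re.compile(r"Driver (\d+): position -?\d+,.*?steps req (\d+) done (\d+)")
LOST_RE = re.compile(r"received \d+, lost (\d+)")


def step_shortfall(report: str) -> Dict[int, int]:
    """Return {driver number: steps requested minus steps done}.

    A positive value means fewer steps were done than requested;
    a negative value means more steps were done than requested.
    """
    return {
        int(drv): int(req) - int(done)
        for drv, req, done in DRIVER_RE.findall(report)
    }


def lost_can_messages(report: str) -> Optional[int]:
    """Return the CAN 'lost' count, or None if the report has no such field."""
    match = LOST_RE.search(report)
    return int(match.group(1)) if match else None


# Excerpt of the long slow-speed run posted above (driver and CAN lines only).
SAMPLE = (
    "Driver 0: position -687060, 160.0 steps/mm, standstill, reads 32218, "
    "writes 22 timeouts 0, SG min/max 0/622, steps req 7128898 done 6785821 "
    "Driver 1: position -104279, 160.0 steps/mm, standstill, reads 32217, "
    "writes 22 timeouts 0, SG min/max 0/1023, steps req 6693175 done 6454161 "
    "Driver 2: position 64560, 172.0 steps/mm, standstill, reads 32221, "
    "writes 18 timeouts 0, SG min/max 0/693, steps req 218156 done 219933 "
    "CAN messages queued 77, send timeouts 0, received 110725, lost 356, "
    "free buffers 36, error reg 1"
)

if __name__ == "__main__":
    for driver, diff in step_shortfall(SAMPLE).items():
        print(f"Driver {driver}: requested - done = {diff}")
    print(f"CAN messages lost: {lost_can_messages(SAMPLE)}")
```

On that excerpt, drivers 0 and 1 (X and Y) come out short, driver 2 (the extruder) comes out slightly over, and the lost count is non-zero, which matches the three points listed above. A mismatch is only meaningful if no homing or other G1 H1 / G1 H3 moves ran between the two M122 commands.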
-