Duet3 6CH + 3CH expansion board - Missing steps.
-
@dc42 said in Duet3 6CH + 3CH expansion board - Missing steps.:
Thanks. I'll do that E drive position reporting check on my tool changer. It may be just a reporting issue.
Here are promised videos. Use bookmarks to go directly to issue.
XY Motors connected to Duet 3 board
XY Motors connected to expansion board
Edit: attaching config files
xy_on_expansion.config.g
xy_on_duet3.config.g
Compressed gcode, renamed to *.txt
ZR_BearingTopMount_FR.zip.txt
Diagnostic after failed print (for M122 B1)
M122 B1
Diagnostics for board 1:
Duet EXP3HC firmware version 3.2 (2021-01-05)
Bootloader ID: not available
Never used RAM 154728, free system stack 154 words
HEAT 90 CanAsync 94 CanRecv 84 TMC 30 MAIN 315 AIN 257
Last reset 08:10:45 ago, cause: power up
Last software reset data not available
Driver 0: position -591153, 160.0 steps/mm, standstill, reads 62793, writes 52 timeouts 0, SG min/max 0/1023
Driver 1: position -76259, 160.0 steps/mm, standstill, reads 62795, writes 52 timeouts 0, SG min/max 0/1023
Driver 2: position 52354, 172.0 steps/mm, standstill, reads 62807, writes 44 timeouts 0, SG min/max 0/156
Moves scheduled 4073, completed 4073, in progress 0, hiccups 218
No step interrupt scheduled
VIN: 24.7V, V12: 12.3V
MCU temperature: min 43.8C, current 44.0C, max 44.2C
Ticks since heat task active 245, ADC conversions started 29445736, completed 29445736, timed out 0
Last sensors broadcast 0x00000000 found 0 248 ticks ago, loop time 0
CAN messages queued 122, send timeouts 0, received 269109, lost 0, free buffers 36
Diagnostic after failed print (for M122)
M122
=== Diagnostics ===
RepRapFirmware for Duet 3 MB6HC version 3.2 running on Duet 3 MB6HC v1.01 or later (SBC mode)
Board ID: 08DJM-956L2-G43S8-6JKDL-3SJ6L-1802G
Used output buffers: 1 of 40 (11 max)
=== RTOS ===
Static ram: 149788
Dynamic ram: 63136 of which 68 recycled
Never used RAM 145840, free system stack 128 words
Tasks: Linux(ready,87) HEAT(blocked,296) CanReceiv(blocked,849) CanSender(blocked,344) CanClock(blocked,352) TMC(blocked,53) MAIN(running,673) IDLE(ready,19)
Owned mutexes: HTTP(MAIN)
=== Platform ===
Last reset 01:37:31 ago, cause: watchdog
Last software reset at 2021-01-11 20:53, reason: User, none spinning, available RAM 146296, slot 1
Software reset code 0x0012 HFSR 0x00000000 CFSR 0x00000000 ICSR 0x00400000 BFAR 0x00000000 SP 0x00000000 Task Linu Freestk 0 n/a
Error status: 0x00
Aux0 errors 0,0,0
Aux1 errors 0,0,0
MCU temperature: min 35.9, current 37.4, max 38.4
Supply voltage: min 24.1, current 24.3, max 24.4, under voltage events: 0, over voltage events: 0, power good: yes
12V rail voltage: min 12.1, current 12.1, max 12.2, under voltage events: 0
Driver 0: position 70760, standstill, reads 54741, writes 24 timeouts 0, SG min/max 0/238
Driver 1: position 8428, standstill, reads 54741, writes 24 timeouts 0, SG min/max 0/236
Driver 2: position 1039, standstill, reads 54741, writes 24 timeouts 0, SG min/max 0/264
Driver 3: position 0, standstill, reads 54741, writes 24 timeouts 0, SG min/max 0/255
Driver 4: position 0, standstill, reads 54754, writes 11 timeouts 0, SG min/max 0/0
Driver 5: position 0, standstill, reads 54754, writes 11 timeouts 0, SG min/max 0/0
Date/time: 2021-01-11 22:30:54
Slowest loop: 187.54ms; fastest: 0.03ms
=== Storage ===
Free file entries: 10
SD card 0 not detected, interface speed: 37.5MBytes/sec
SD card longest read time 0.0ms, write time 0.0ms, max retries 0
=== Move ===
DMs created 125, maxWait 272324ms, bed compensation in use: mesh, comp offset 0.000
=== MainDDARing ===
Scheduled moves 1805, completed moves 1805, hiccups 0, stepErrors 0, LaErrors 0, Underruns [0, 0, 2], CDDA state -1
=== AuxDDARing ===
Scheduled moves 0, completed moves 0, hiccups 0, stepErrors 0, LaErrors 0, Underruns [0, 0, 0], CDDA state -1
=== Heat ===
Bed heaters = 0 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1, chamberHeaters = -1 -1 -1 -1
Heater 0 is on, I-accum = 0.4
Heater 1 is on, I-accum = 0.3
=== GCodes ===
Segments left: 0
Movement lock held by null
HTTP* is doing "M122" in state(s) 0
Telnet is idle in state(s) 0
File* is idle in state(s) 0
USB is idle in state(s) 0
Aux is idle in state(s) 0
Trigger* is idle in state(s) 0
Queue* is idle in state(s) 0
LCD is idle in state(s) 0
SBC is idle in state(s) 0
Daemon is idle in state(s) 0
Aux2 is idle in state(s) 0
Autopause is idle in state(s) 0
Code queue is empty.
=== CAN ===
Messages queued 25218, send timeouts 0, received 41, lost 0, longest wait 1ms for reply type 6018, free buffers 48
=== SBC interface ===
State: 4, failed transfers: 0
Last transfer: 2ms ago
RX/TX seq numbers: 13476/13476
SPI underruns 0, overruns 0
Number of disconnects: 0, IAP RAM available 0x2c8a8
Buffer RX/TX: 0/0-0
=== Duet Control Server ===
Duet Control Server v3.2.0
Code buffer space: 4096
Configured SPI speed: 8000000 Hz
Full transfers per second: 35.91
Maximum length of RX/TX data transfers: 2800/1684
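One detail worth pulling out of the two reports above is the hiccup count: the expansion board reports 218, the main board 0. A minimal Python sketch for extracting that figure from pasted M122 text might look like the following. The function name `hiccup_counts` is made up for illustration, and the two strings are excerpts copied from the reports in this post rather than live output.

```python
import re

# Excerpts copied from the two M122 reports in this post.
EXP_BOARD = "Moves scheduled 4073, completed 4073, in progress 0, hiccups 218"
MAIN_BOARD = "Scheduled moves 1805, completed moves 1805, hiccups 0, stepErrors 0"


def hiccup_counts(report: str) -> list:
    """Return every 'hiccups N' value found in a pasted M122 report, in order."""
    return [int(n) for n in re.findall(r"hiccups (\d+)", report)]


print(hiccup_counts(EXP_BOARD))   # [218]
print(hiccup_counts(MAIN_BOARD))  # [0]
```

The regular expression only looks for the literal word "hiccups" followed by a number, so it works on the report whether it is pasted as one long line or one field per line.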
-
Thanks for the videos. It always helps to be able to see what's going on. Thanks for the time you've put into troubleshooting the issue.
While we wait for DC42 to get a chance to see your videos, could you do a test print for me using
M566 P1
set?
-
@Phaedrux said in Duet3 6CH + 3CH expansion board - Missing steps.:
M566 P1
Sure, do I need another video of it?
-
@evomotors Let's just see if it makes a difference at all.
The default jerk policy is 0, which replicates the behaviour of earlier versions of RRF (jerk is only applied between two printing moves, or between two travel moves, and only if they both involve XY movement or neither does). Changing the jerk policy to 1 allows jerk to be applied between any pair of moves.
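To make the difference between the two policies concrete, here is a rough Python sketch of the rule as it is worded above. It models only that description and is not taken from the RepRapFirmware source; the `Move` class and `jerk_allowed` function are hypothetical names used for illustration.

```python
from dataclasses import dataclass


@dataclass
class Move:
    is_printing: bool  # True for a printing move, False for a travel move
    has_xy: bool       # True if the move involves X or Y movement


def jerk_allowed(prev: Move, nxt: Move, policy: int) -> bool:
    """Say whether jerk may be applied at the junction of two moves."""
    if policy == 1:
        # Policy 1: jerk may be applied between any pair of moves.
        return True
    # Policy 0: both moves must be the same kind (printing or travel),
    # and either both involve XY movement or neither does.
    same_kind = prev.is_printing == nxt.is_printing
    same_xy = prev.has_xy == nxt.has_xy
    return same_kind and same_xy


travel_xy = Move(is_printing=False, has_xy=True)
print_xy = Move(is_printing=True, has_xy=True)
print(jerk_allowed(travel_xy, print_xy, policy=0))  # False
print(jerk_allowed(travel_xy, print_xy, policy=1))  # True
```

Under this reading, a travel move followed by a printing move is the kind of junction that behaves differently once P1 is set.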
-
@Phaedrux said in Duet3 6CH + 3CH expansion board - Missing steps.:
@evomotors Let's just see if it makes a difference at all.
The default jerk policy is 0, which replicates the behaviour of earlier versions of RRF (jerk is only applied between two printing moves, or between two travel moves, and only if they both involve XY movement or neither does). Changing the jerk policy to 1 allows jerk to be applied between any pair of moves.
I ran the test with the following config change:
M566 X500.00 Y500.00 Z50.00 E3000.00 P1
It failed the same way.
-
Thanks for testing.
-
Not trying to hijack the thread, but I have "similar" issues on a 6HC + Toolboard running 3.2.
I only see the issue on round surfaces, where there are no retractions (retraction is set at 0.2). From about the middle of the hole I changed PA from 0.005 to 0, which made it worse?!
A previous print with a bigger retraction (0.3) and a bigger PA (0.02) had issues, but not of this magnitude, which is curious.
I can start a new thread with all the info if necessary.
-
@jbarros said in Duet3 6CH + 3CH expansion board - Missing steps.:
I can start a new thread with all the info if necessary.
Please do.
-
@evomotors
Are you 100% sure that it's "data loss" causing the issue?
It seems you set everything correctly (I'm far from being an expert...). Maybe the board ignores your current limit settings.
Can you try running the same gcode (or a simple high-accel macro) with the motors disconnected from the load (belts removed)?
I'm not sure it's a comm speed issue. Maybe the tboard simply ignores the motor current settings.
-
@hackinistrator said in Duet3 6CH + 3CH expansion board - Missing steps.:
@evomotors
Are you 100% sure that it's "data loss" causing the issue?
It seems you set everything correctly (I'm far from being an expert...). Maybe the board ignores your current limit settings.
Can you try running the same gcode (or a simple high-accel macro) with the motors disconnected from the load (belts removed)?
I'm not sure it's a comm speed issue. Maybe the tboard simply ignores the motor current settings.
I'm not 100% sure that it's caused by data loss. If I set my motor current limits correctly, the board should not ignore them, unless there is a bug in the firmware.
I'd prefer not to remove the belts, but even if I do, what is it going to prove? How can you tell whether the gantry is in the correct position if the belts are disconnected?
But you may be correct; it does feel like the motors are either underpowered or the pulses are irregular on these fast moves. Unfortunately there is no way for me to tell, as I don't have an oscilloscope on hand. But it should be really easy for @dc42 to test.
I'm still waiting for @dc42 to look at the videos and reply.
-
@hackinistrator Have you watched the videos that @evomotors posted? If the board ignores the motor current, then it does so selectively at some seemingly random time after a print starts, then restores it again immediately after.
-
I'm getting lonely here after proving the existence of the issue.
@dc42 Do you want me to perform some additional tests?
-
Thanks for your patience. He's aware and will respond ASAP.
-
@Phaedrux said in Duet3 6CH + 3CH expansion board - Missing steps.:
Thanks for your patience. He's aware and will respond ASAP.
Oh thanks! The good news is that I'm not alone here.
-
@evomotors For want of anything better to say or suggest, have you tried commenting out the M593 (DAA) to see if that makes any difference? No reason for suggesting it other than it's something which modifies motion planning, which might get applied to main boards but not expansion boards. I don't know if that's even possible but testing without M593 would at least eliminate it as a potential cause. Just a thought if you've got nothing better to do while you are waiting...........
-
@evomotors, I believe at least some of these issues are caused by CAN clock sync jitter, meaning that the expansion board clocks don't run perfectly in sync with the main board. When I measured it, I found that it was higher than I expected, frequently reaching 300us and sometimes more under conditions of high step rates and short move segments. Fortunately we've had a plan to reduce this jitter ever since we designed Duet 3. I've been working on this all day, and the jitter is now reduced to 13us even with heavy traffic. It may be possible to reduce it still further.
I have two other related issues to investigate before we release an early 3.3 beta.
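To put the 300us and 13us figures into perspective, a back-of-the-envelope Python sketch can express the clock error as a number of step intervals. The 160 steps/mm value comes from the M122 B1 report earlier in the thread; the 300 mm/s speed is purely an assumed example, and the calculation treats a single axis moving at that speed, ignoring the printer's actual kinematics. The function name `jitter_in_steps` is made up for illustration.

```python
def jitter_in_steps(jitter_us: float, steps_per_mm: float, speed_mm_s: float) -> float:
    """Express a clock error (in microseconds) as a number of step intervals."""
    step_rate_hz = steps_per_mm * speed_mm_s  # steps per second at this speed
    return jitter_us * 1e-6 * step_rate_hz


STEPS_PER_MM = 160.0  # from the M122 B1 report (drivers 0 and 1)
SPEED_MM_S = 300.0    # assumed example speed, not taken from the thread

for jitter_us in (300.0, 13.0):
    steps = jitter_in_steps(jitter_us, STEPS_PER_MM, SPEED_MM_S)
    print(f"{jitter_us:.0f}us of jitter is about {steps:.2f} step intervals")
```

With these assumed numbers the step rate is 48000 steps/s, so 300us spans roughly 14 step intervals while 13us is well under one.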
-
@deckingman Sure, I watched the video.
I'm not sure the issue occurs at a random time.
If you check the video, you can see the printer completed some pretty fast accel moves without issues while moving in the XY plane.
The issue occurred when it was making a fast diagonal move, with a single motor running.
What is the default motor current on those drivers anyway?
Maybe it's a good idea to try reducing the motor current on the expansion board to a very low value and see if it has any effect.
There are no CAN errors, so I don't see how it can be a comm error.
-
@hackinistrator there is some timing jitter (as @dc42 has pointed out) that is not currently reported in 3.2.
@evomotors thanks for all the work in detailing the problem, we are working on it!
-
I am pleased to report that we have made substantial progress on this issue. In addition to fixing the CAN clock jitter, we have identified and fixed an issue that could cause whole moves to be abandoned on CAN-connected expansion boards. We have only been able to reproduce this using high step rates and short moves; however, the cause of the issue (a floating-point rounding error) could conceivably occur at lower step rates too.
I expect to provide unofficial 3.3beta versions of all firmwares tomorrow.