Duet3D Logo Duet3D
    • Tags
    • Documentation
    • Order
    • Register
    • Login

    3.3b1CAN timeout resets itself?

    Scheduled Pinned Locked Moved Solved
    Beta Firmware
    4
    25
    956
    Loading More Posts
    • Oldest to Newest
    • Newest to Oldest
    • Most Votes
    Reply
    • Reply as topic
    Log in to reply
    This topic has been deleted. Only users with topic management privileges can see it.
    • TypQxQundefined
      TypQxQ @dc42
      last edited by

      @dc42 said in 3.3b1CAN timeout resets itself?:

      @TypQxQ, please provide your sys/custom/tool_0.g and sys/custom/conf_Endstops.g files.

      Here is everything in my config and customconfig folder.
      https://we.tl/t-xwMjBHKrGd

      Thank you!

      1 Reply Last reply Reply Quote 0
      • dc42undefined
        dc42 administrators
        last edited by

        Thanks, I am running that job now. How long did you leave it running last time, before you ran M122 B1 and got this message?

        CAN messages queued 436976, send timeouts 0, received 689661, lost 0, free buffers 36, min 36, error reg 0
        dup 0, oos 16, bm 8, wbm 9772795
        

        Duet WiFi hardware designer and firmware engineer
        Please do not ask me for Duet support via PM or email, use the forum
        http://www.escher3d.com, https://miscsolutions.wordpress.com

        1 Reply Last reply Reply Quote 0
        • dc42undefined
          dc42 administrators
          last edited by dc42

          Progress update. Running your job and your configuration, I am seeing a lot of send timeouts (in the M122 report) on the MB6HC. The number of out-of-sequence messages received on the EXP3HC ('oos' in the M122 B1 report) is much lower, suggesting that most of the send timeouts are spurious; but the fact that they occur at all indicates a problem. I will keep you informed.

          Duet WiFi hardware designer and firmware engineer
          Please do not ask me for Duet support via PM or email, use the forum
          http://www.escher3d.com, https://miscsolutions.wordpress.com

          1 Reply Last reply Reply Quote 0
          • dc42undefined
            dc42 administrators
            last edited by dc42

            It appears that with a code change I have resolved the send timeouts and associated 'oos' reports. Your print has been running for 76 minutes now with no error reports.

            I still haven't seen any of the 'bm' reports on my system. They are likely to have a different cause.

            I forgot to ask one thing. As I understand it, your system has just a MB6HC and one EXP3HC connected via CAN, and no other CAN-connected boards. Have you fitted the CAN termination jumpers to the EXP3HC board?

            PS - if you want to try the 6HC and 3HC firmware binaries that I am running, you can find them here https://www.dropbox.com/sh/q5uqqkjbmhgvlhq/AACYqG0ynLME9ogoLd1zLB2Xa?dl=0.

            Duet WiFi hardware designer and firmware engineer
            Please do not ask me for Duet support via PM or email, use the forum
            http://www.escher3d.com, https://miscsolutions.wordpress.com

            TypQxQundefined 1 Reply Last reply Reply Quote 0
            • TypQxQundefined
              TypQxQ @dc42
              last edited by TypQxQ

              @dc42 said in 3.3b1CAN timeout resets itself?:

              It appears that with a code change I have resolved the send timeouts and associated 'oos' reports. Your print has been running for 76 minutes now with no error reports.

              I still haven't seen any of the 'bm' reports on my system. They are likely to have a different cause.

              I forgot to ask one thing. As I understand it, your system has just a MB6HC and one EXP3HC connected via CAN, and no other CAN-connected boards. Have you fitted the CAN termination jumpers to the EXP3HC board?

              This is great news!
              I can confirm my CAN termination jumpers are closed on 3HC.
              I plan is to buy another 3HC if I get multiple motors on XY to work 😄

              @dc42 said in 3.3b1CAN timeout resets itself?:

              Thanks, I am running that job now. How long did you leave it running last time, before you ran M122 B1 and got this message?

              Not sure what if you mean time since reboot or since stopped printing.
              According to the eventlog above, "Last reset 10:14:33 ago" and was idle for almost 8h.

              When it was printing and had an uptime of 00:07:53 it read:

              CAN messages queued 3305, send timeouts 222, received 2369, lost 0, free buffers 36, min 36, error reg 100000
              dup 0, oos 0, bm 4, wbm 5559827
              
              1 Reply Last reply Reply Quote 0
              • dc42undefined
                dc42 administrators
                last edited by dc42

                What I meant was, how long did you have to print for to get a nonzero 'bm' value when you ran M122 B1? Because I haven't seen any yet. Looks like 7 minutes may have been enough for you.

                Duet WiFi hardware designer and firmware engineer
                Please do not ask me for Duet support via PM or email, use the forum
                http://www.escher3d.com, https://miscsolutions.wordpress.com

                TypQxQundefined 1 Reply Last reply Reply Quote 0
                • TypQxQundefined
                  TypQxQ @dc42
                  last edited by TypQxQ

                  @dc42 said in 3.3b1CAN timeout resets itself?:

                  What I meant was, how long did you have to print for to get a nonzero 'bm' value when you ran M122 B1? Because I haven't seen any yet. Looks like 7 minutes may have been enough for you.

                  Trying the new firmware right now.
                  I don't know what BM and WBM stands for but I got them again after 00:13:24 :

                  CAN messages queued 6740, send timeouts 0, received 6156, lost 0, free buffers 36, min 36, error reg 0
                  dup 0, oos 0, bm 12, wbm 15972458
                  

                  Also before printing I had some error reg. at uptime of 00:03:24:

                  CAN messages queued 2083, send timeouts 0, received 1735, lost 0, free buffers 36, min 36, error reg 110000
                  dup 0, oos 0, bm 0, wbm 0
                  

                  Now printing without any timeouts so far but only 20min inside print.

                  1 Reply Last reply Reply Quote 0
                  • dc42undefined
                    dc42 administrators
                    last edited by dc42

                    'bm' is the bad move command count, which I put in temporarily to debug a previous issue. That's the count of move commands that were scheduled to start before the previous move had ended. 'wbm' is a record of the worst bad instance; the figure is how many 750kHz clock ticks the new move was supposed to start before the previous move had finished. So your 'wbm' figure of 5559827 indicates that a move was scheduled to start 7.4 seconds before the previous move ended.

                    I don't see a nonzero bad move command count on my machine. However, it's occurred to me that you might legitimately have this situation after homing. I don't do homing of my simulation of your machine, I just use G92 to set all the axis positions. But if you are homing an axis connected to the EXP3HC, then that homing move might be scheduled to take e.g. 10 seconds but stop after 3 seconds when it triggers the homing switch. Then the next move will start before the homing move was originally scheduled to end, triggering the bad move command logging.

                    I will fix the bad move command counter to take account of moves that complete before they were originally scheduled to complete.

                    PS:

                    1. I've just replaced the EXP3HC binary at https://www.dropbox.com/sh/q5uqqkjbmhgvlhq/AACYqG0ynLME9ogoLd1zLB2Xa?dl=0 to include that change to the bad move count logging.
                    2. My print has been going for almost 5 hours now. During that time it has recorded just one send timeout on the main board while sending a movement message. I think it must be spurious, because there wasn't a corresponding 'oos' (out of sequence) error on the EXP3HC, which I would expect if the 3HC hadn't received the move.

                    Duet WiFi hardware designer and firmware engineer
                    Please do not ask me for Duet support via PM or email, use the forum
                    http://www.escher3d.com, https://miscsolutions.wordpress.com

                    TypQxQundefined 1 Reply Last reply Reply Quote 2
                    • TypQxQundefined
                      TypQxQ @dc42
                      last edited by TypQxQ

                      @dc42 I have finished printing and after 9h16m the print looks good and I also only received one timeout:

                      Messages queued 1750358, send timeouts 1, received 362388, lost 0, longest wait 24ms for reply type 6024, peak Tx sync delay 578, free buffers 49 (min 47)
                      CAN messages queued 361984, send timeouts 0, received 1747438, lost 0, free buffers 36, min 36, error reg 0
                      dup 0, oos 0, bm 0, wbm 0
                      

                      I loaded the new 3HC firmware with time 22:51 on Dropbox and will run it again. But the time compile time in M122 hasn't changed after updating this morning: "Duet EXP3HC firmware version 3.3beta1+1 (2021-03-02 15:56:53)"

                      Thank You!!! It this looks sooo good!

                      2021-03-03 05:32:39 Finished printing file 0:/gcodes/RearLeft v6.2.gcode, print time was 9h 16m
                      2021-03-03 06:30:00 === Diagnostics ===
                      RepRapFirmware for Duet 3 MB6HC version 3.3beta1+1 running on Duet 3 MB6HC v0.6 or 1.0 (SBC mode)
                      Board ID: 08DJM-956L2-G43S4-6JKDD-3SJ6L-TB62G
                      Used output buffers: 1 of 40 (17 max)
                      === RTOS ===
                      Static ram: 148212
                      Dynamic ram: 62952 of which 0 recycled
                      Never used RAM 143028, free system stack 98 words
                      Tasks: Linux(ready,63) HEAT(delaying,284) CanReceiv(notifyWait,774) CanSender(notifyWait,359) CanClock(delaying,342) TMC(notifyWait,18) MAIN(running,649) IDLE(ready,20)
                      Owned mutexes: HTTP(MAIN)
                      === Platform ===
                      Last reset 10:23:58 ago, cause: power up
                      Last software reset at 2021-03-02 19:50, reason: User, none spinning, available RAM 139760, slot 2
                      Software reset code 0x0012 HFSR 0x00000000 CFSR 0x00000000 ICSR 0x00400000 BFAR 0x00000000 SP 0x00000000 Task Linu Freestk 0 n/a
                      Error status: 0x00
                      Aux0 errors 0,0,0
                      Aux1 errors 0,0,0
                      MCU temperature: min 43.3, current 43.5, max 44.8
                      Supply voltage: min 31.8, current 31.9, max 32.0, under voltage events: 0, over voltage events: 0, power good: yes
                      12V rail voltage: min 12.0, current 12.1, max 12.1, under voltage events: 0
                      Driver 0: position 160000, standstill, reads 5226, writes 2 timeouts 0, SG min/max 0/1023
                      Driver 1: position 0, standstill, reads 5227, writes 2 timeouts 0, SG min/max 0/1023
                      Driver 2: position 311549, standstill, reads 5227, writes 2 timeouts 0, SG min/max 0/158
                      Driver 3: position 0, standstill, reads 5227, writes 2 timeouts 0, SG min/max 0/1023
                      Driver 4: position 0, standstill, reads 5227, writes 2 timeouts 0, SG min/max 0/1023
                      Driver 5: position 0, standstill, reads 5226, writes 2 timeouts 0, SG min/max 0/1023
                      Date/time: 2021-03-03 06:29:58
                      Slowest loop: 90.33ms; fastest: 0.05ms
                      === Storage ===
                      Free file entries: 10
                      SD card 0 not detected, interface speed: 37.5MBytes/sec
                      SD card longest read time 0.0ms, write time 0.0ms, max retries 0
                      === Move ===
                      DMs created 125, maxWait 286ms, bed compensation in use: mesh, comp offset 0.000
                      === MainDDARing ===
                      Scheduled moves 1889328, completed moves 1889328, hiccups 0, stepErrors 0, LaErrors 0, Underruns [0, 0, 0], CDDA state -1
                      === AuxDDARing ===
                      Scheduled moves 0, completed moves 0, hiccups 0, stepErrors 0, LaErrors 0, Underruns [0, 0, 0], CDDA state -1
                      === Heat ===
                      Bed heaters = 0 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1, chamberHeaters = -1 -1 -1 -1
                      Heater 0 is on, I-accum = 0.1
                      === GCodes ===
                      Segments left: 0
                      Movement lock held by null
                      HTTP* is doing "M122 B0" in state(s) 0
                      Telnet is idle in state(s) 0
                      File* is idle in state(s) 0
                      USB is idle in state(s) 0
                      Aux is idle in state(s) 0
                      Trigger* is idle in state(s) 0
                      Queue* is idle in state(s) 0
                      LCD is idle in state(s) 0
                      SBC is idle in state(s) 0
                      Daemon is idle in state(s) 0
                      Aux2 is idle in state(s) 0
                      Autopause is idle in state(s) 0
                      Code queue is empty.
                      === CAN ===
                      Messages queued 1750358, send timeouts 1, received 362388, lost 0, longest wait 24ms for reply type 6024, peak Tx sync delay 578, free buffers 49 (min 47)
                      Last cancelled message type 51 dest 1
                      === SBC interface ===
                      State: 4, failed transfers: 0
                      Last transfer: 1ms ago
                      RX/TX seq numbers: 40775/40775
                      SPI underruns 0, overruns 0
                      Number of disconnects: 0, IAP RAM available 0x2d3ac
                      Buffer RX/TX: 0/0-0
                      2021-03-03 06:30:03 Diagnostics for board 1:
                      Duet EXP3HC firmware version 3.3beta1+1 (2021-03-02 15:56:53)
                      Bootloader ID: SAME5x bootloader version 2.2 (2021-01-16b1)
                      Never used RAM 154884, free system stack 0 words
                      Tasks: Move(notifyWait,80) HEAT(delaying,69) CanAsync(notifyWait,72) CanRecv(notifyWait,82) CanClock(notifyWait,74) TMC(notifyWait,30) MAIN(running,301) AIN(delaying,260)
                      Last reset 10:24:02 ago, cause: power up
                      Last software reset at 2020-11-10 18:16, reason: StackOverflow, available RAM 154112, slot 0
                      Software reset code 0x0100 HFSR 0x00000000 CFSR 0x00000000 ICSR 0x0000080e BFAR 0xe000ed38 SP 0x2002ffb4 Task � Freestk 4294967295 ok
                      Stack: 2000137c 200013b0 0002525d 00000000 0003e418 20001598 20003b28 00000000 00024885 200014b4 20003ab0 00000000 00f00000 e000ef34 c0000000 20003b30 000249c5 0002472c 61000000 ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff
                      Driver 0: position -1440, 160.0 steps/mm,  standstill, reads 17400, writes 2 timeouts 0, SG min/max 0/1023, steps req 268204856 done 268206165
                      Driver 1: position -144480, 160.0 steps/mm,  standstill, reads 17399, writes 2 timeouts 0, SG min/max 0/1023, steps req 271278668 done 271279392
                      Driver 2: position 48970973, 830.0 steps/mm,  standstill, reads 17397, writes 2 timeouts 0, SG min/max 0/1023, steps req 47753506 done 47753659
                      Moves scheduled 1888753, completed 1888753, in progress 0, hiccups 0, step errors 0, maxPrep 191, maxOverdue 5, maxInc 1, mcErrs 0, gcmErrs 0
                      Peak sync jitter 11, peak Rx sync delay 198, resyncs 30, no step interrupt scheduled
                      VIN: 24.1V, V12: 12.1V
                      MCU temperature: min 39.8C, current 40.0C, max 40.2C
                      Ticks since heat task active 64, ADC conversions started 37442556, completed 37442554, timed out 0
                      Last sensors broadcast 0x00000002 found 1 68 ticks ago, loop time 0
                      CAN messages queued 361984, send timeouts 0, received 1747438, lost 0, free buffers 36, min 36, error reg 0
                      dup 0, oos 0, bm 0, wbm 0
                      2021-03-03 06:55:17 G10 P0 S220
                      

                      Before and after the timeout fix:
                      6D0D0F5E-5821-4C72-9BA1-52FD15B74508.jpeg

                      And final again:

                      2021-03-03 18:14:05	m122 b0
                      === Diagnostics ===
                      RepRapFirmware for Duet 3 MB6HC version 3.3beta1+1 running on Duet 3 MB6HC v0.6 or 1.0 (SBC mode)
                      Board ID: 08DJM-956L2-G43S4-6JKDD-3SJ6L-TB62G
                      Used output buffers: 1 of 40 (21 max)
                      === RTOS ===
                      Static ram: 148212
                      Dynamic ram: 62904 of which 0 recycled
                      Never used RAM 143076, free system stack 98 words
                      Tasks: Linux(ready,69) HEAT(delaying,284) CanReceiv(notifyWait,774) CanSender(notifyWait,359) CanClock(delaying,342) TMC(notifyWait,18) MAIN(running,649) IDLE(ready,20)
                      Owned mutexes: HTTP(MAIN)
                      === Platform ===
                      Last reset 10:54:16 ago, cause: software
                      Last software reset at 2021-03-03 07:19, reason: User, none spinning, available RAM 143444, slot 2
                      Software reset code 0x0012 HFSR 0x00000000 CFSR 0x00000000 ICSR 0x00400000 BFAR 0x00000000 SP 0x00000000 Task Linu Freestk 0 n/a
                      Error status: 0x00
                      Aux0 errors 0,0,0
                      Aux1 errors 0,0,0
                      MCU temperature: min 42.1, current 43.6, max 45.1
                      Supply voltage: min 31.8, current 31.9, max 32.0, under voltage events: 0, over voltage events: 0, power good: yes
                      12V rail voltage: min 12.0, current 12.0, max 12.1, under voltage events: 0
                      Driver 0: position 160000, standstill, reads 42231, writes 2 timeouts 0, SG min/max 0/1023
                      Driver 1: position 0, standstill, reads 42231, writes 2 timeouts 0, SG min/max 0/1023
                      Driver 2: position 311549, standstill, reads 42227, writes 6 timeouts 0, SG min/max 0/156
                      Driver 3: position 0, standstill, reads 42231, writes 2 timeouts 0, SG min/max 0/1023
                      Driver 4: position 0, standstill, reads 42231, writes 2 timeouts 0, SG min/max 0/1023
                      Driver 5: position 0, standstill, reads 42231, writes 2 timeouts 0, SG min/max 0/1023
                      Date/time: 2021-03-03 18:14:02
                      Slowest loop: 1001.41ms; fastest: 0.05ms
                      === Storage ===
                      Free file entries: 10
                      SD card 0 not detected, interface speed: 37.5MBytes/sec
                      SD card longest read time 0.0ms, write time 0.0ms, max retries 0
                      === Move ===
                      DMs created 125, maxWait 7881ms, bed compensation in use: mesh, comp offset 0.000
                      === MainDDARing ===
                      Scheduled moves 1889328, completed moves 1889328, hiccups 0, stepErrors 0, LaErrors 0, Underruns [0, 1, 0], CDDA state -1
                      === AuxDDARing ===
                      Scheduled moves 0, completed moves 0, hiccups 0, stepErrors 0, LaErrors 0, Underruns [0, 0, 0], CDDA state -1
                      === Heat ===
                      Bed heaters = 0 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1, chamberHeaters = -1 -1 -1 -1
                      Heater 0 is on, I-accum = 0.1
                      === GCodes ===
                      Segments left: 0
                      Movement lock held by null
                      HTTP* is doing "M122 B0" in state(s) 0
                      Telnet is idle in state(s) 0
                      File* is idle in state(s) 0
                      USB is idle in state(s) 0
                      Aux is idle in state(s) 0
                      Trigger* is idle in state(s) 0
                      Queue* is idle in state(s) 0
                      LCD is idle in state(s) 0
                      SBC is idle in state(s) 0
                      Daemon is idle in state(s) 0
                      Aux2 is idle in state(s) 0
                      Autopause is idle in state(s) 0
                      Code queue is empty.
                      === CAN ===
                      Messages queued 2236102, send timeouts 4, received 463095, lost 0, longest wait 20ms for reply type 6024, peak Tx sync delay 554, free buffers 49 (min 47)
                      Last cancelled message type 51 dest 1
                      === SBC interface ===
                      State: 4, failed transfers: 0
                      Last transfer: 1ms ago
                      RX/TX seq numbers: 39306/39306
                      SPI underruns 0, overruns 0
                      Number of disconnects: 0, IAP RAM available 0x2d3dc
                      Buffer RX/TX: 0/0-0
                      === Duet Control Server ===
                      Duet Control Server v3.2.2
                      Code buffer space: 4096
                      Configured SPI speed: 8000000 Hz
                      Full transfers per second: 51.37
                      Maximum length of RX/TX data transfers: 5092/1688
                      2021-03-03 18:13:58	m122 b1
                      Diagnostics for board 1:
                      Duet EXP3HC firmware version 3.3beta1+1 (2021-03-02 15:56:53)
                      Bootloader ID: SAME5x bootloader version 2.2 (2021-01-16b1)
                      Never used RAM 154884, free system stack 0 words
                      Tasks: Move(notifyWait,80) HEAT(delaying,69) CanAsync(notifyWait,72) CanRecv(notifyWait,82) CanClock(notifyWait,74) TMC(notifyWait,30) MAIN(running,331) AIN(delaying,260)
                      Last reset 10:54:27 ago, cause: software
                      Last software reset at 2020-11-10 18:16, reason: StackOverflow, available RAM 154112, slot 0
                      Software reset code 0x0100 HFSR 0x00000000 CFSR 0x00000000 ICSR 0x0000080e BFAR 0xe000ed38 SP 0x2002ffb4 Task � Freestk 4294967295 ok
                      Stack: 2000137c 200013b0 0002525d 00000000 0003e418 20001598 20003b28 00000000 00024885 200014b4 20003ab0 00000000 00f00000 e000ef34 c0000000 20003b30 000249c5 0002472c 61000000 ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff
                      Driver 0: position -480, 160.0 steps/mm,  standstill, reads 62626, writes 2 timeouts 0, SG min/max 0/1023, steps req 333181004 done 333181004
                      Driver 1: position -48160, 160.0 steps/mm,  standstill, reads 62625, writes 2 timeouts 0, SG min/max 0/1023, steps req 340878416 done 340878416
                      Driver 2: position 48970973, 830.0 steps/mm,  standstill, reads 62625, writes 3 timeouts 0, SG min/max 0/1023, steps req 60334361 done 60334361
                      Moves scheduled 1888677, completed 1888677, in progress 0, hiccups 0, step errors 0, maxPrep 180, maxOverdue 4, maxInc 2, mcErrs 0, gcmErrs 0
                      Peak sync jitter 11, peak Rx sync delay 204, resyncs 36, no step interrupt scheduled
                      VIN: 24.1V, V12: 12.1V
                      MCU temperature: min 39.8C, current 40.0C, max 40.2C
                      Ticks since heat task active 155, ADC conversions started 39267896, completed 39267896, timed out 0
                      Last sensors broadcast 0x00000002 found 1 160 ticks ago, loop time 0
                      CAN messages queued 462985, send timeouts 0, received 2236041, lost 0, free buffers 36, min 36, error reg 0
                      dup 0, oos 0, bm 0, wbm 0
                      2021-03-03 16:46:53	Finished printing file 0:/gcodes/RearLeft v6.2.gcode, print time was 9h 18m
                      
                      1 Reply Last reply Reply Quote 1
                      • ctilley79undefined
                        ctilley79 @dc42
                        last edited by

                        @dc42 said in 3.3b1CAN timeout resets itself?:

                        The messages timed out and messages lost counters get cleared immediately after you run M122.

                        The only time I have seen a problem with the CAN communication was under the following conditions:

                        • The EXP3HC had separate power wires from the PSU

                        I may have screwed this up. So It's best to power the 3HC from the 6HC Mainboard instead of running a separate wire from the PSU? Why?

                        dc42undefined 1 Reply Last reply Reply Quote 0
                        • dc42undefined
                          dc42 administrators @ctilley79
                          last edited by

                          @ctilley79 said in 3.3b1CAN timeout resets itself?:

                          I may have screwed this up. So It's best to power the 3HC from the 6HC Mainboard instead of running a separate wire from the PSU? Why?

                          It only matters if you are powering a high-powered bed heater directly from the Duet and you have long wires between the Duet and the PSU. Under those conditions, the changes in ground voltage caused by the bed heater being switching on/off rapidly with PWM can affect the CAN communication.

                          Duet WiFi hardware designer and firmware engineer
                          Please do not ask me for Duet support via PM or email, use the forum
                          http://www.escher3d.com, https://miscsolutions.wordpress.com

                          1 Reply Last reply Reply Quote 1
                          • First post
                            Last post
                          Unless otherwise noted, all forum content is licensed under CC-BY-SA