
    Dangerous CAN bus failure

    General Discussion
    • gnydick

      I walked into my office and saw that a print had finished and a communication failure had occurred. The LEDs on tools 0 and 1 were blinking fast.

      This was a single-tool print. The heater on tool 0 was still on, but the PanelDue showed no tool selected and no heater active, yet the temperature readout was still at the print temperature.

      Sure enough it was still hot. The part cooling fan and cold section fan were still running too.

      • oliof @gnydick

        @gnydick did you by chance collect the output of M122 when the machine was in this state? If your system is an SBC one, logs from the SBC would also be of interest.

        <>RatRig V-Minion Fly Super5Pro RRF<> V-Core 3.1 IDEX k*****r <> RatRig V-Minion SKR 2 Marlin<>

        • Phaedrux Moderator

          It would be good to gather some diagnostics and details. In general, though, we are investigating how to handle cases where CAN bus expansion boards lose contact.

          Z-Bot CoreXY Build | Thingiverse Profile

          • gnydick @Phaedrux

            @Phaedrux Fail-safe. There needs to be a heartbeat. Without the heartbeat, all nodes should shoot themselves in the head.
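The heartbeat idea can be sketched roughly as follows. This is only an illustration of the concept in Python, not RepRapFirmware code: the class name `HeartbeatWatchdog`, the `timeout_s` parameter and the latching behaviour are all assumptions made for the example.

```python
import time


class HeartbeatWatchdog:
    """Hypothetical node-side watchdog (illustrative, not RRF firmware)."""

    def __init__(self, timeout_s, clock=time.monotonic):
        self.timeout_s = timeout_s      # assumed maximum gap between heartbeats
        self.clock = clock              # injectable clock so the logic can be tested
        self.last_beat = clock()
        self.outputs_enabled = True

    def beat(self):
        """Record a heartbeat received from the main board."""
        self.last_beat = self.clock()

    def poll(self):
        """Disable outputs once the heartbeat is overdue; return the output state."""
        if self.clock() - self.last_beat > self.timeout_s:
            # Latched: a late heartbeat does not silently re-enable outputs.
            self.outputs_enabled = False
        return self.outputs_enabled
```

In this sketch the node would call `poll()` from its main loop and switch heaters and fans off as soon as it returns `False`. Whether outputs should stay latched off until an explicit reset is a design choice, not something stated in this thread.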

            • gnydick @oliof

              @oliof where are the SBC logs accumulated?

              • OwenD @gnydick

                @gnydick said in Dangerous CAN bus failure:

                @Phaedrux Fail-safe. There needs to be a heartbeat. Without the heartbeat, all nodes should shoot themselves in the head.

                I agree that a heartbeat sounds appropriate. The tool board should shut down all outputs if it's lost communication.

                As a workaround, could you not monitor the communication and temperatures in daemon.g?
                I have my bed and heater power supplies controlled by an SSR.
                In daemon.g I monitor heater state, temperatures and other things.
                If a heater is "off" but its temperature is not falling, the SSR is shut off.
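The check described above (heater reported "off", temperature not falling) could look something like this. It is a Python sketch of the decision logic only, not a daemon.g macro. The function name `heater_stuck_on` and the thresholds `ambient_c` and `min_drop_c` are made up for illustration, and the `"off"` state string is assumed to match what the object model reports.

```python
def heater_stuck_on(state, samples, ambient_c=40.0, min_drop_c=2.0):
    """Return True when a heater reported as off is still hot and not cooling.

    state:   heater state string (assumed to be "off" when the heater is off)
    samples: temperature readings in degrees C, oldest first, over the check window
    """
    if state != "off" or len(samples) < 2:
        return False
    still_hot = samples[-1] > ambient_c                  # hypothetical "hot" threshold
    cooling = samples[0] - samples[-1] >= min_drop_c     # hypothetical minimum drop
    return still_hot and not cooling
```

A monitoring loop would collect a few readings per heater and cut the SSR when this returns `True`. The window length and thresholds would need tuning for a real machine.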

                • oliof @gnydick

                  @gnydick

                  1. log into your Single Board Computer
                  2. run sudo journalctl -x -u duetcontrolserver

                  <>RatRig V-Minion Fly Super5Pro RRF<> V-Core 3.1 IDEX k*****r <> RatRig V-Minion SKR 2 Marlin<>

                  • oliof @gnydick

                    @gnydick I agree that the main board should have failed loudly and safely. You'd probably need a relay on the power line to the toolboard to be able to shut it down when it can no longer do so itself, and then shut down the main control board.

                    <>RatRig V-Minion Fly Super5Pro RRF<> V-Core 3.1 IDEX k*****r <> RatRig V-Minion SKR 2 Marlin<>

                    • gloomyandy @gnydick

                      @gnydick Was the board still reporting the hotend temperatures? (You can usually see small fluctuations.) Was there any error message reported in the DWC console? If it happens again, try sending M122 to the toolboards to get more information and to check whether the mainboard is still able to talk to them. The rapid flashing light does not necessarily mean that the toolboard has lost all contact with the mainboard; it may just have lost time sync, which can be caused by a total loss of communications but also by other things.

                      • gnydick @OwenD

                        @OwenD If there is a communication loss, you can't tell the toolboard to turn off the heater. That's why it has to be self-governed.

                        • gnydick @gloomyandy

                          @gloomyandy it was still reporting temperature. I'm going to grab the logs to see what there is.

                          • OwenD @gnydick

                            @gnydick
                            I didn't mean telling the tool board to turn off.
                            I meant that if the main board can't communicate or senses a heater anomaly, you turn off power to the heaters (via SSRs).
                            If it can't communicate, that should be reflected in the object model (maybe via the CAN address?).
                            You said DWC showed the heater turned off but a stable high temperature, so presumably the thermistor is on the main board?

                            • gnydick @OwenD

                              @OwenD no, the heater and thermistor are both on the toolboard.

                              • gnydick

                                These were happening and stopped at 00:09:

                                Dec 18 00:07:24 Duet3 DuetControlServer[26090]: [warn] Resending packet #0 (request GetObjectModel)
                                Dec 18 00:07:24 Duet3 DuetControlServer[26090]: [warn] Resending packet #0 (request GetObjectModel)
                                Dec 18 00:07:24 Duet3 DuetControlServer[26090]: [warn] Resending packet #0 (request GetObjectModel)
                                Dec 18 00:07:24 Duet3 DuetControlServer[26090]: [warn] Resending packet #0 (request GetObjectModel)
                                Dec 18 00:07:24 Duet3 DuetControlServer[26090]: [warn] Resending packet #0 (request GetObjectModel)
                                Dec 18 00:07:24 Duet3 DuetControlServer[26090]: [warn] Resending packet #0 (request GetObjectModel)
                                Dec 18 00:07:25 Duet3 DuetControlServer[26090]: [warn] Resending packet #0 (request GetObjectModel)
                                Dec 18 00:07:25 Duet3 DuetControlServer[26090]: [warn] Resending packet #0 (request GetObjectModel)
                                Dec 18 00:07:25 Duet3 DuetControlServer[26090]: [warn] Resending packet #0 (request GetObjectModel)
                                Dec 18 00:07:25 Duet3 DuetControlServer[26090]: [warn] Resending packet #0 (request GetObjectModel)
                                Dec 18 00:07:25 Duet3 DuetControlServer[26090]: [warn] Resending packet #0 (request GetObjectModel)
                                Dec 18 00:07:25 Duet3 DuetControlServer[26090]: [warn] Resending packet #0 (request GetObjectModel)
                                Dec 18 00:07:25 Duet3 DuetControlServer[26090]: [warn] Resending packet #0 (request GetObjectModel)
                                Dec 18 00:07:25 Duet3 DuetControlServer[26090]: [warn] Resending packet #0 (request GetObjectModel)
                                Dec 18 00:07:25 Duet3 DuetControlServer[26090]: [warn] Resending packet #0 (request GetObjectModel)
                                Dec 18 00:07:25 Duet3 DuetControlServer[26090]: [warn] Resending packet #0 (request GetObjectModel)
                                

                                The print then finished at 00:19.
                                These are all of the log entries around that time:

                                Dec 18 00:19:12 Duet3 DuetControlServer[26090]: [info] Finished macro file daemon.g
                                Dec 18 00:19:22 Duet3 DuetControlServer[26090]: [info] Starting macro file daemon.g on channel Daemon
                                Dec 18 00:19:22 Duet3 DuetControlServer[26090]: [info] Finished macro file daemon.g
                                Dec 18 00:19:31 Duet3 DuetControlServer[26090]: [error] Response timeout: CAN addr 121, req type 6013, RID=595
                                Dec 18 00:19:32 Duet3 DuetControlServer[26090]: [error] Response timeout: CAN addr 122, req type 6013, RID=596
                                Dec 18 00:19:32 Duet3 DuetControlServer[26090]: [info] Starting macro file tfree0.g on channel File
                                Dec 18 00:19:32 Duet3 DuetControlServer[26090]: [info] Starting macro file /macros/tool_unlock on channel File
                                Dec 18 00:19:32 Duet3 DuetControlServer[26090]: [info] Starting macro file daemon.g on channel Daemon
                                Dec 18 00:19:32 Duet3 DuetControlServer[26090]: [info] Finished macro file daemon.g
                                Dec 18 00:19:35 Duet3 DuetControlServer[26090]: [info] Finished macro file /macros/tool_unlock
                                Dec 18 00:19:36 Duet3 DuetControlServer[26090]: [info] Finished macro file tfree0.g
                                Dec 18 00:19:38 Duet3 DuetControlServer[26090]: [error] Response timeout: CAN addr 121, req type 6013, RID=603
                                Dec 18 00:19:42 Duet3 DuetControlServer[26090]: [info] Starting macro file daemon.g on channel Daemon
                                Dec 18 00:19:42 Duet3 DuetControlServer[26090]: [info] Finished macro file daemon.g
                                Dec 18 00:19:48 Duet3 DuetControlServer[26090]: [info] Finished job file
                                Dec 18 00:19:48 Duet3 DuetControlServer[26090]: [info] Finished printing file 0:/gcodes/bowden anchor x 4.gcode, print time was 1h 10m
                                Dec 18 00:19:52 Duet3 DuetControlServer[26090]: [info] Starting macro file daemon.g on channel Daemon
                                Dec 18 00:19:52 Duet3 DuetControlServer[26090]: [info] Finished macro file daemon.g
                                Dec 18 00:20:02 Duet3 DuetControlServer[26090]: [info] Starting macro file daemon.g on channel Daemon
                                Dec 18 00:20:02 Duet3 DuetControlServer[26090]: [info] Finished macro file daemon.g
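For anyone sifting through a longer journal, a small script along these lines could tally the timeout errors per CAN address. It is a sketch that assumes the log lines keep exactly the format shown above; the names `TIMEOUT_RE` and `count_can_timeouts` are invented for the example.

```python
import re
from collections import Counter

# Matches lines such as:
# "... [error] Response timeout: CAN addr 121, req type 6013, RID=595"
TIMEOUT_RE = re.compile(
    r"\[error\] Response timeout: CAN addr (\d+), req type (\d+), RID=(\d+)"
)


def count_can_timeouts(lines):
    """Count 'Response timeout' errors per CAN address in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts
```

Fed the excerpt above, this would report two timeouts for address 121 and one for address 122. The lines could come from `sudo journalctl -u duetcontrolserver` piped into the script and read from `sys.stdin`.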
                                
                                • T3P3Tony administrators @gnydick

                                  @gnydick said in Dangerous CAN bus failure:

                                  @Phaedrux Fail-safe. There needs to be a heartbeat. Without the heartbeat, all nodes should shoot themselves in the head.

                                  This is not always the case for all users. Some users want to be able to swap out an inactive tool, for example, so having the CAN bus drop for that tool may be acceptable to them.

                                  I do agree that turning off all heaters and motors on CAN bus disconnect is probably a good default.
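A "safe by default, opt out per board" policy like the one discussed here might be modelled as below. This is purely hypothetical: `OnCanLoss`, `action_on_can_loss` and the per-address override table are invented for illustration and do not correspond to any existing RepRapFirmware setting.

```python
from enum import Enum


class OnCanLoss(Enum):
    """Hypothetical per-board behaviour when CAN contact is lost."""
    SHUTDOWN = "shutdown"  # turn off heaters and motors
    HOLD = "hold"          # tolerate the drop, e.g. for a tool being swapped out


def action_on_can_loss(overrides, can_addr):
    """Return the configured action for a board, defaulting to SHUTDOWN."""
    return overrides.get(can_addr, OnCanLoss.SHUTDOWN)
```

With `overrides = {122: OnCanLoss.HOLD}`, board 121 would fall back to the safe default while board 122 would be allowed to drop off the bus.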

                                  www.duet3d.com

                                  • gnydick @T3P3Tony

                                    @T3P3Tony That's true, but that's the exception that should be coded for. One observation: the tool boards don't seem to like being hot-swapped; they pop when you plug power in while it's on.

                                    I don't know how aware everyone is, but a few years ago, when 3D printers proliferated and cheap clones started popping up with no thermal runaway safeguards, all of the influencers started shredding them, ripping their reputations apart and telling followers not to buy anything from brand X, Y or Z, and maybe, even after a fix, to wait a couple of generations.

                                    I don't think it would be very good PR for Duet if that started happening again.

                                    • dc42 administrators @gnydick

                                      @gnydick for your information, the reason that RRF requires the sensor that controls a heater to be on the same board as the heater is so that in the event of any CAN issues, temperature control is never lost. So loss of CAN communication does not have the same consequences as "thermal runaway".

                                      What firmware versions were you running on the main board and tool boards when this issue occurred?

                                      Duet WiFi hardware designer and firmware engineer
                                      Please do not ask me for Duet support via PM or email, use the forum
                                      http://www.escher3d.com, https://miscsolutions.wordpress.com

                                      • gnydick @dc42

                                        @dc42 That's what I would expect, but it's not happening. Temperature control is lost for all intents and purposes, because the firmware in the toolboard doesn't handle it. Also, you're right, it isn't exactly the same thing as thermal runaway, but it presents the same risk. No control over heater == END GAME.

                                        • gloomyandy @gnydick

                                          @gnydick I'm confused. In your original post you said that the toolboard was reporting the temperature to be the print temperature, which would imply that it was continuing to control the temperature at the last set value that it saw (which I think is what dc42 said it should do). Is that not happening?

                                          • gnydick @gloomyandy

                                            @gloomyandy for all intents and purposes, the toolboard was sending messages, but not receiving them.

                                            Unless otherwise noted, all forum content is licensed under CC-BY-SA