Dangerous CAN bus failure

General Discussion
    gnydick @oliof
    last edited by 18 Dec 2022, 19:46

    @oliof where are the SBC logs accumulated?

      OwenD @gnydick
      last edited by 18 Dec 2022, 20:11

      @gnydick said in Dangerous CAN bus failure:

      @Phaedrux Fail-safe. There needs to be a heartbeat. Without the heartbeat, all nodes should shoot themselves in the head.

      I agree that a heartbeat sounds appropriate. The tool board should shut down all outputs if it's lost communication.

      As a workaround, could you not monitor the communication and temperatures in daemon.g?
      I have my bed and heater power supplies controlled by SSRs.
      In daemon.g I monitor heater state, temperatures and other things.
      If a heater is "off" but the temperature is not falling, then the SSR is shut off.
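
Editor's illustration: the check described above (heater reported "off" but temperature not falling, so cut the SSR) can be sketched roughly as follows. This is a plain Python model of the decision logic only, not RepRapFirmware code and not the poster's actual daemon.g. The names `HeaterSample` and `should_cut_power`, and the thresholds (60 s window, 2 °C minimum drop, 50 °C "still hot" level), are all assumptions chosen for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class HeaterSample:
    t: float       # seconds since monitoring started
    state: str     # heater state as reported, e.g. "off" or "active"
    temp_c: float  # reported temperature in degrees C


def should_cut_power(samples: List[HeaterSample],
                     window_s: float = 60.0,
                     min_drop_c: float = 2.0,
                     hot_above_c: float = 50.0) -> bool:
    """Return True when the heater has been 'off' for at least window_s,
    is still hot, and has cooled by less than min_drop_c in that time.
    Samples are assumed to be sorted by time (oldest first)."""
    if not samples:
        return False
    latest = samples[-1]

    # Baseline: the newest sample that is at least window_s older than the latest.
    baseline_idx = None
    for i, s in enumerate(samples):
        if latest.t - s.t >= window_s:
            baseline_idx = i
    if baseline_idx is None:
        return False  # not enough history yet

    recent = samples[baseline_idx:]
    if any(s.state != "off" for s in recent):
        return False  # heater was commanded on during the window
    if latest.temp_c < hot_above_c:
        return False  # already cool enough, nothing to do
    return recent[0].temp_c - latest.temp_c < min_drop_c


# Heater reported "off" for 60 s but stuck at 210 C: cut the SSR.
stuck = [HeaterSample(t, "off", 210.0) for t in range(0, 70, 10)]
# Heater reported "off" and cooling by 10 C per sample: leave power alone.
cooling = [HeaterSample(t, "off", 210.0 - t) for t in range(0, 70, 10)]
```

Here `should_cut_power(stuck)` is true and `should_cut_power(cooling)` is false. Note that this kind of check runs on the main board side, so it only helps if power to the heater can be cut independently of the tool board.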

        oliof @gnydick
        last edited by 18 Dec 2022, 21:13

        @gnydick

        1. log into your Single Board Computer
        2. run sudo journalctl -x -u duetcontrolserver

        <>RatRig V-Minion Fly Super5Pro RRF<> V-Core 3.1 IDEX k*****r <> RatRig V-Minion SKR 2 Marlin<>

          oliof @gnydick
          last edited by 18 Dec 2022, 21:18

          @gnydick I agree that the main board should have failed loudly and safely. You'd probably need a relay on the power line to the toolboard, so that the toolboard can be shut down when it can't do so itself any more, and then shut down the main control board.

            gloomyandy @gnydick
            last edited by 18 Dec 2022, 22:28

            @gnydick Was the board still reporting the hotend temperatures (you can usually see small fluctuations)? Was there any error message reported in the DWC console? If it happens again, try sending M122 to the toolboards to get more information and to check whether the mainboard is still able to talk to the board. The rapid flashing light does not necessarily mean that the toolboard has lost all contact with the mainboard; it may just have lost time sync (which can be caused by a total loss of communications, but can also be caused by other things).

              gnydick @OwenD
              last edited by 18 Dec 2022, 23:24

              @OwenD If there is communication loss, you couldn't tell the toolboard to turn off the heater. That's why it has to be self-governed.
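
Editor's illustration: the "self-governed" behaviour asked for here is essentially a heartbeat watchdog running on the node itself. The sketch below is a generic Python model of that idea, assuming a hypothetical `HeartbeatWatchdog` class with an injected clock value. It does not describe how RepRapFirmware or the Duet tool board firmware behaves.

```python
class HeartbeatWatchdog:
    """Latching watchdog: if no heartbeat arrives within timeout_s,
    call on_timeout once and stay tripped until reset() is called."""

    def __init__(self, timeout_s, on_timeout, start_time):
        self.timeout_s = timeout_s
        self.on_timeout = on_timeout
        self.last_beat = start_time
        self.tripped = False

    def beat(self, now):
        # Called whenever a heartbeat message from the main board is received.
        self.last_beat = now

    def poll(self, now):
        # Called periodically from the node's own main loop.
        if not self.tripped and now - self.last_beat > self.timeout_s:
            self.tripped = True
            self.on_timeout()
        return self.tripped

    def reset(self, now):
        # Explicit recovery, e.g. after the operator has checked the machine.
        self.tripped = False
        self.last_beat = now


outputs = {"heater": True, "fan": True}


def all_outputs_off():
    # Hypothetical fail-safe action: switch every output off.
    for name in outputs:
        outputs[name] = False


wd = HeartbeatWatchdog(timeout_s=2.0, on_timeout=all_outputs_off, start_time=0.0)
wd.beat(1.0)   # heartbeat received at t = 1.0 s
wd.poll(2.5)   # 1.5 s since the last beat: still fine
wd.poll(3.5)   # 2.5 s since the last beat: trips and switches outputs off
```

The watchdog latches on purpose: a late heartbeat after a trip does not silently re-enable the heater, which is the fail-safe behaviour argued for in this thread. Whether that should be the default is exactly what is under discussion here.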

                gnydick @gloomyandy
                last edited by 18 Dec 2022, 23:25

                @gloomyandy it was still reporting temperature. I'm going to grab the logs to see what there is.

                  OwenD @gnydick
                  last edited by 19 Dec 2022, 00:04

                  @gnydick
                  I didn't mean telling the tool board to turn off.
                  I meant that if the main board can't communicate or senses a heater anomaly, then you turn off power to the heaters (via SSRs).
                  If it can't communicate, that should be reflected in the object model (maybe via the CAN address?).
                  You said DWC showed the heater turned off but a stable high temperature, so presumably the thermistor is on the main board?

                    gnydick @OwenD
                    last edited by 19 Dec 2022, 02:53

                    @OwenD no, the heater and thermistor are both on the toolboard.

                      gnydick
                      last edited by 19 Dec 2022, 02:57

                      These were happening and stopped at 00:09:

                      Dec 18 00:07:24 Duet3 DuetControlServer[26090]: [warn] Resending packet #0 (request GetObjectModel)
                      Dec 18 00:07:24 Duet3 DuetControlServer[26090]: [warn] Resending packet #0 (request GetObjectModel)
                      Dec 18 00:07:24 Duet3 DuetControlServer[26090]: [warn] Resending packet #0 (request GetObjectModel)
                      Dec 18 00:07:24 Duet3 DuetControlServer[26090]: [warn] Resending packet #0 (request GetObjectModel)
                      Dec 18 00:07:24 Duet3 DuetControlServer[26090]: [warn] Resending packet #0 (request GetObjectModel)
                      Dec 18 00:07:24 Duet3 DuetControlServer[26090]: [warn] Resending packet #0 (request GetObjectModel)
                      Dec 18 00:07:25 Duet3 DuetControlServer[26090]: [warn] Resending packet #0 (request GetObjectModel)
                      Dec 18 00:07:25 Duet3 DuetControlServer[26090]: [warn] Resending packet #0 (request GetObjectModel)
                      Dec 18 00:07:25 Duet3 DuetControlServer[26090]: [warn] Resending packet #0 (request GetObjectModel)
                      Dec 18 00:07:25 Duet3 DuetControlServer[26090]: [warn] Resending packet #0 (request GetObjectModel)
                      Dec 18 00:07:25 Duet3 DuetControlServer[26090]: [warn] Resending packet #0 (request GetObjectModel)
                      Dec 18 00:07:25 Duet3 DuetControlServer[26090]: [warn] Resending packet #0 (request GetObjectModel)
                      Dec 18 00:07:25 Duet3 DuetControlServer[26090]: [warn] Resending packet #0 (request GetObjectModel)
                      Dec 18 00:07:25 Duet3 DuetControlServer[26090]: [warn] Resending packet #0 (request GetObjectModel)
                      Dec 18 00:07:25 Duet3 DuetControlServer[26090]: [warn] Resending packet #0 (request GetObjectModel)
                      Dec 18 00:07:25 Duet3 DuetControlServer[26090]: [warn] Resending packet #0 (request GetObjectModel)
                      

                      The print then finished at 00:19.
                      These are all of the log entries around that time:

                      Dec 18 00:19:12 Duet3 DuetControlServer[26090]: [info] Finished macro file daemon.g
                      Dec 18 00:19:22 Duet3 DuetControlServer[26090]: [info] Starting macro file daemon.g on channel Daemon
                      Dec 18 00:19:22 Duet3 DuetControlServer[26090]: [info] Finished macro file daemon.g
                      Dec 18 00:19:31 Duet3 DuetControlServer[26090]: [error] Response timeout: CAN addr 121, req type 6013, RID=595
                      Dec 18 00:19:32 Duet3 DuetControlServer[26090]: [error] Response timeout: CAN addr 122, req type 6013, RID=596
                      Dec 18 00:19:32 Duet3 DuetControlServer[26090]: [info] Starting macro file tfree0.g on channel File
                      Dec 18 00:19:32 Duet3 DuetControlServer[26090]: [info] Starting macro file /macros/tool_unlock on channel File
                      Dec 18 00:19:32 Duet3 DuetControlServer[26090]: [info] Starting macro file daemon.g on channel Daemon
                      Dec 18 00:19:32 Duet3 DuetControlServer[26090]: [info] Finished macro file daemon.g
                      Dec 18 00:19:35 Duet3 DuetControlServer[26090]: [info] Finished macro file /macros/tool_unlock
                      Dec 18 00:19:36 Duet3 DuetControlServer[26090]: [info] Finished macro file tfree0.g
                      Dec 18 00:19:38 Duet3 DuetControlServer[26090]: [error] Response timeout: CAN addr 121, req type 6013, RID=603
                      Dec 18 00:19:42 Duet3 DuetControlServer[26090]: [info] Starting macro file daemon.g on channel Daemon
                      Dec 18 00:19:42 Duet3 DuetControlServer[26090]: [info] Finished macro file daemon.g
                      Dec 18 00:19:48 Duet3 DuetControlServer[26090]: [info] Finished job file
                      Dec 18 00:19:48 Duet3 DuetControlServer[26090]: [info] Finished printing file 0:/gcodes/bowden anchor x 4.gcode, print time was 1h 10m
                      Dec 18 00:19:52 Duet3 DuetControlServer[26090]: [info] Starting macro file daemon.g on channel Daemon
                      Dec 18 00:19:52 Duet3 DuetControlServer[26090]: [info] Finished macro file daemon.g
                      Dec 18 00:20:02 Duet3 DuetControlServer[26090]: [info] Starting macro file daemon.g on channel Daemon
                      Dec 18 00:20:02 Duet3 DuetControlServer[26090]: [info] Finished macro file daemon.g
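
Editor's illustration: when scanning a long journalctl dump like the one above, a small script can tally the "Response timeout" errors per CAN address. The sketch below is plain Python. The regular expression is written against the exact line format shown in these logs and is an assumption about that format, not an official DuetControlServer log specification; `count_can_timeouts` is a hypothetical helper name.

```python
import re
from collections import Counter

# Matches lines such as:
# "... [error] Response timeout: CAN addr 121, req type 6013, RID=595"
TIMEOUT_RE = re.compile(
    r"\[error\] Response timeout: CAN addr (\d+), req type (\d+), RID=(\d+)"
)


def count_can_timeouts(log_text):
    """Count response-timeout errors per CAN address in a block of log text."""
    counts = Counter()
    for line in log_text.splitlines():
        match = TIMEOUT_RE.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


SAMPLE = """\
Dec 18 00:19:31 Duet3 DuetControlServer[26090]: [error] Response timeout: CAN addr 121, req type 6013, RID=595
Dec 18 00:19:32 Duet3 DuetControlServer[26090]: [error] Response timeout: CAN addr 122, req type 6013, RID=596
Dec 18 00:19:32 Duet3 DuetControlServer[26090]: [info] Starting macro file tfree0.g on channel File
Dec 18 00:19:38 Duet3 DuetControlServer[26090]: [error] Response timeout: CAN addr 121, req type 6013, RID=603
"""

timeouts = count_can_timeouts(SAMPLE)
```

For the excerpt above this gives two timeouts for CAN address 121 and one for address 122. The full log could be fed in the same way, for example from the saved output of the journalctl command given earlier in the thread.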
                      
                        T3P3Tony administrators @gnydick
                        last edited by 20 Dec 2022, 14:53

                        @gnydick said in Dangerous CAN bus failure:

                        @Phaedrux Fail-safe. There needs to be a heartbeat. Without the heartbeat, all nodes should shoot themselves in the head.

                        This is not always the case for all users. Some users want to be able to swap out an inactive tool, for example, so having the CAN bus drop for that may be acceptable to them.

                        I do agree that turning off all heaters and motors on CAN bus disconnect is probably a good default.

                        www.duet3d.com

                          gnydick @T3P3Tony
                          last edited by gnydick 21 Dec 2022, 00:52

                          @T3P3Tony That's true, but that's the exception that should be coded for. One observation: the tool boards don't seem to like being hot-swapped. They pop when you plug one in while the power is on.

                          I don't know how aware everyone is, but a few years ago when 3D printers proliferated and cheap clones started popping up with no thermal runaway safeguards, all of the influencers started shredding them, just ripping their reputations apart, telling followers to not buy any of brand X, Y, Z and maybe not even after they fix it, wait a couple generations.

                          I don't think it would be very good PR for Duet if that started happening again.

                            dc42 administrators @gnydick
                            last edited by 21 Dec 2022, 12:46

                            @gnydick for your information, the reason that RRF requires the sensor that controls a heater to be on the same board as the heater is so that in the event of any CAN issues, temperature control is never lost. So loss of CAN communication does not have the same consequences as "thermal runaway".

                            What firmware versions were you running on the main board and tool boards when this issue occurred?

                            Duet WiFi hardware designer and firmware engineer
                            Please do not ask me for Duet support via PM or email, use the forum
                            http://www.escher3d.com, https://miscsolutions.wordpress.com

                              gnydick @dc42
                              last edited by 9 Apr 2023, 08:57

                              @dc42 That's what I would expect, but it's not happening. Temperature control is lost for all intents and purposes, because the firmware in the toolboard doesn't handle it. Also, you're right, it isn't exactly the same thing as thermal runaway, but it presents the same risk. No control over heater == END GAME.

                                gloomyandy @gnydick
                                last edited by 9 Apr 2023, 09:35

                                @gnydick I'm confused. In your original post you said that the toolboard was reporting the temperature to be the print temperature, which would imply that it was continuing to control the temperature at the last set value that it saw (which I think is what DC42 said it should do). Is that not happening?

                                  gnydick @gloomyandy
                                  last edited by 9 Apr 2023, 09:37

                                  @gloomyandy for all intents and purposes, the toolboard was sending messages, but not receiving them.

                                    gloomyandy @gnydick
                                    last edited by 9 Apr 2023, 09:43

                                    @gnydick Yes, but was it maintaining the last set temperature?

                                      gloomyandy @gnydick
                                      last edited by 9 Apr 2023, 10:26

                                      @gnydick Just to be clear, I'm not trying to debate what should happen if a message is lost or if communication with a toolboard is interrupted. I'm trying to establish whether the toolboard did what it is currently supposed to do and maintained the last set temperature.

                                        gnydick @gloomyandy
                                        last edited by 10 Apr 2023, 00:55

                                        @gloomyandy I realize that, no worries. It's a good question. I don't know how it's supposed to work or what exactly was happening, just that it wasn't responsive.

                                        Unless otherwise noted, all forum content is licensed under CC-BY-SA