Dangerous CAN bus failure
-
These were happening and stopped at 00:09:
Dec 18 00:07:24 Duet3 DuetControlServer[26090]: [warn] Resending packet #0 (request GetObjectModel)
Dec 18 00:07:24 Duet3 DuetControlServer[26090]: [warn] Resending packet #0 (request GetObjectModel)
Dec 18 00:07:24 Duet3 DuetControlServer[26090]: [warn] Resending packet #0 (request GetObjectModel)
Dec 18 00:07:24 Duet3 DuetControlServer[26090]: [warn] Resending packet #0 (request GetObjectModel)
Dec 18 00:07:24 Duet3 DuetControlServer[26090]: [warn] Resending packet #0 (request GetObjectModel)
Dec 18 00:07:24 Duet3 DuetControlServer[26090]: [warn] Resending packet #0 (request GetObjectModel)
Dec 18 00:07:25 Duet3 DuetControlServer[26090]: [warn] Resending packet #0 (request GetObjectModel)
Dec 18 00:07:25 Duet3 DuetControlServer[26090]: [warn] Resending packet #0 (request GetObjectModel)
Dec 18 00:07:25 Duet3 DuetControlServer[26090]: [warn] Resending packet #0 (request GetObjectModel)
Dec 18 00:07:25 Duet3 DuetControlServer[26090]: [warn] Resending packet #0 (request GetObjectModel)
Dec 18 00:07:25 Duet3 DuetControlServer[26090]: [warn] Resending packet #0 (request GetObjectModel)
Dec 18 00:07:25 Duet3 DuetControlServer[26090]: [warn] Resending packet #0 (request GetObjectModel)
Dec 18 00:07:25 Duet3 DuetControlServer[26090]: [warn] Resending packet #0 (request GetObjectModel)
Dec 18 00:07:25 Duet3 DuetControlServer[26090]: [warn] Resending packet #0 (request GetObjectModel)
Dec 18 00:07:25 Duet3 DuetControlServer[26090]: [warn] Resending packet #0 (request GetObjectModel)
Dec 18 00:07:25 Duet3 DuetControlServer[26090]: [warn] Resending packet #0 (request GetObjectModel)
The print then finished at 00:19.
These are all of the log entries around that time:
Dec 18 00:19:12 Duet3 DuetControlServer[26090]: [info] Finished macro file daemon.g
Dec 18 00:19:22 Duet3 DuetControlServer[26090]: [info] Starting macro file daemon.g on channel Daemon
Dec 18 00:19:22 Duet3 DuetControlServer[26090]: [info] Finished macro file daemon.g
Dec 18 00:19:31 Duet3 DuetControlServer[26090]: [error] Response timeout: CAN addr 121, req type 6013, RID=595
Dec 18 00:19:32 Duet3 DuetControlServer[26090]: [error] Response timeout: CAN addr 122, req type 6013, RID=596
Dec 18 00:19:32 Duet3 DuetControlServer[26090]: [info] Starting macro file tfree0.g on channel File
Dec 18 00:19:32 Duet3 DuetControlServer[26090]: [info] Starting macro file /macros/tool_unlock on channel File
Dec 18 00:19:32 Duet3 DuetControlServer[26090]: [info] Starting macro file daemon.g on channel Daemon
Dec 18 00:19:32 Duet3 DuetControlServer[26090]: [info] Finished macro file daemon.g
Dec 18 00:19:35 Duet3 DuetControlServer[26090]: [info] Finished macro file /macros/tool_unlock
Dec 18 00:19:36 Duet3 DuetControlServer[26090]: [info] Finished macro file tfree0.g
Dec 18 00:19:38 Duet3 DuetControlServer[26090]: [error] Response timeout: CAN addr 121, req type 6013, RID=603
Dec 18 00:19:42 Duet3 DuetControlServer[26090]: [info] Starting macro file daemon.g on channel Daemon
Dec 18 00:19:42 Duet3 DuetControlServer[26090]: [info] Finished macro file daemon.g
Dec 18 00:19:48 Duet3 DuetControlServer[26090]: [info] Finished job file
Dec 18 00:19:48 Duet3 DuetControlServer[26090]: [info] Finished printing file 0:/gcodes/bowden anchor x 4.gcode, print time was 1h 10m
Dec 18 00:19:52 Duet3 DuetControlServer[26090]: [info] Starting macro file daemon.g on channel Daemon
Dec 18 00:19:52 Duet3 DuetControlServer[26090]: [info] Finished macro file daemon.g
Dec 18 00:20:02 Duet3 DuetControlServer[26090]: [info] Starting macro file daemon.g on channel Daemon
Dec 18 00:20:02 Duet3 DuetControlServer[26090]: [info] Finished macro file daemon.g
-
@gnydick said in Dangerous CAN bus failure:
@Phaedrux Fail-safe. There needs to be a heartbeat. Without the heartbeat, all nodes should shoot themselves in the head.
This is not always the case for all users. Some users want to be able to swap out an inactive tool, for example, so having the CAN bus drop may be acceptable for them.
I do agree that turning off all heaters and motors on CAN bus disconnect is probably a good default.
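To make the heartbeat idea concrete, here is a minimal sketch of a heartbeat watchdog with a fail-safe that is on by default but can be switched off for the tool-swap case. This is not RepRapFirmware code: the class names, the `fail_safe_enabled` flag, and the timeout value are all hypothetical, chosen only to illustrate the behaviour being discussed.

```python
import time


class HeartbeatWatchdog:
    """Remembers when the last heartbeat arrived and reports if it is overdue."""

    def __init__(self, timeout_s, clock=time.monotonic):
        self.timeout_s = timeout_s
        self.clock = clock          # injectable so the logic can be tested
        self.last_beat = clock()

    def beat(self):
        """Call whenever a heartbeat message is received from the main board."""
        self.last_beat = self.clock()

    def expired(self):
        """True if more than timeout_s has passed since the last heartbeat."""
        return self.clock() - self.last_beat > self.timeout_s


class ToolNode:
    """Hypothetical tool board: two outputs plus a fail-safe policy."""

    def __init__(self, watchdog, fail_safe_enabled=True):
        self.watchdog = watchdog
        self.fail_safe_enabled = fail_safe_enabled  # assumed opt-out setting
        self.heater_on = True
        self.motors_on = True

    def poll(self):
        """Call periodically; turns outputs off if the heartbeat is overdue."""
        if self.fail_safe_enabled and self.watchdog.expired():
            self.heater_on = False
            self.motors_on = False
```

With `fail_safe_enabled=True` the node shuts its heater and motors off once the heartbeat has been missing for longer than the timeout; a user who wants to tolerate a dropped bus would set it to `False`.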
-
@T3P3Tony That's true, but that's the exception that should be coded for. One observation: the tool boards don't seem to like being hot-swapped. They pop when you plug power in while it's on.
I don't know how aware everyone is, but a few years ago, when 3D printers proliferated and cheap clones started popping up with no thermal runaway safeguards, all of the influencers started shredding them. They ripped their reputations apart, telling followers not to buy anything from brand X, Y, or Z, and maybe not even after the problem was fixed, but to wait a couple of generations.
I don't think it would be very good PR for Duet if that started happening again.
-
@gnydick For your information, the reason that RRF requires the sensor that controls a heater to be on the same board as the heater is so that temperature control is never lost in the event of any CAN issues. So loss of CAN communication does not have the same consequences as "thermal runaway".
What firmware versions were you running on the main board and tool boards when this issue occurred?
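The point about keeping the sensor and heater on the same board can be illustrated with a small sketch. It is only an illustration under stated assumptions: it uses simple bang-bang control with hysteresis rather than whatever control algorithm the firmware actually runs, and `LocalHeaterLoop`, `on_can_setpoint`, and `step` are hypothetical names. What it shows is that the loop only needs the locally measured temperature and the last setpoint it received, so it keeps regulating even if no further CAN messages arrive.

```python
class LocalHeaterLoop:
    """Hypothetical on-board heater loop: local sensor, local heater output."""

    def __init__(self, hysteresis_c=1.0):
        self.setpoint_c = 0.0           # last setpoint received over CAN
        self.hysteresis_c = hysteresis_c
        self.heater_on = False

    def on_can_setpoint(self, setpoint_c):
        """Called only when a setpoint message arrives from the main board."""
        self.setpoint_c = setpoint_c

    def step(self, measured_c):
        """One control step using the local sensor reading; needs no CAN traffic."""
        if measured_c < self.setpoint_c - self.hysteresis_c:
            self.heater_on = True
        elif measured_c > self.setpoint_c + self.hysteresis_c:
            self.heater_on = False
        # Inside the hysteresis band the previous output state is kept.
        return self.heater_on
```

In this sketch a lost bus means `on_can_setpoint` is never called again, so the board holds the last set temperature; it also means a later "heater off" command would never arrive.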
-
@dc42 That's what I would expect, but it's not happening. Temperature control is lost for all intents and purposes because the firmware in the toolboard doesn't handle it. Also, you're right, it isn't exactly the same thing as thermal runaway, but it presents the same risk. No control over heater == END GAME.
-
@gnydick I'm confused. In your original post you said that the toolboard was reporting the temperature to be the print temperature, which would imply that it was continuing to control the temperature at the last set value that it saw (which I think is what dc42 said it should do). Is that not happening?
-
@gloomyandy for all intents and purposes, the toolboard was sending messages, but not receiving them.
-
@gnydick Yes, but was it maintaining the last set temperature?
-
@gnydick Just to be clear, I'm not trying to debate what should happen if a message is lost or if communication with a toolboard is interrupted. I'm trying to establish whether the toolboard did what it is currently supposed to do and maintained the last set temperature.
-
@gloomyandy I realize that, no worries. It's a good question. I don't know how it's supposed to work or what exactly was happening, just that it wasn't responsive.