Incident report: RRF 3 RC6 DWC 2.1.0 Lockup during print.
-
Stopped during a print, about 30 min into print. Pi was not running anything extra. ssh sessions stayed intact. DWC lost contact with printer, did not regain until sudo systemctl restart duetcontrolserver. I captured the logs below before the restart. I was messing around with file names, attempting to suppress M73 messages, trying to get the capitalization correct from memory. But, as you can see from the logs, I got that several minutes prior to the hang.
Duet Web Control 2.1.0
Board: Duet 3 MB6HC (MB6HC)
DSF Version: 1.3.0.0
Firmware: RepRapFirmware for Duet 3 MB6HC 3.01-RC6 (2020-04-03b3)Apr 11 21:10:08 duet3 DuetControlServer[350]: [info] Selected file /opt/dsf/sd/gcodes/Double_Filter_Cover.gcode Apr 11 21:10:08 duet3 DuetControlServer[350]: [info] Starting file print Apr 11 21:10:08 duet3 DuetControlServer[350]: [info] File: Optional macro file start.g not found Apr 11 21:10:08 duet3 DuetControlServer[350]: [info] File: Optional macro file M73.g not found Apr 11 21:10:08 duet3 DuetControlServer[350]: [warn] M73: Command is not supported Apr 11 21:10:18 duet3 DuetControlServer[350]: [info] Executing nested macro file tpre0.g on channel File Apr 11 21:10:18 duet3 DuetControlServer[350]: [info] File: Finished intermediate macro file tpre0.g Apr 11 21:10:18 duet3 DuetControlServer[350]: [info] Executing nested macro file tpost0.g on channel File Apr 11 21:12:15 duet3 DuetControlServer[350]: [info] Executing nested macro file tool_lock.g on channel File Apr 11 21:12:16 duet3 DuetControlServer[350]: [info] File: Finished macro file tool_lock.g Apr 11 21:12:19 duet3 DuetControlServer[350]: [info] File: Finished macro file tpost0.g Apr 11 21:12:19 duet3 DuetControlServer[350]: [info] File: Optional macro file M900.g not found Apr 11 21:12:19 duet3 DuetControlServer[350]: [warn] M900: Command is not supported Apr 11 21:12:19 duet3 DuetControlServer[350]: [info] File: Optional macro file M73.g not found Apr 11 21:12:19 duet3 DuetControlServer[350]: [warn] M73: Command is not supported Apr 11 21:13:23 duet3 DuetControlServer[350]: [info] File: Optional macro file M73.g not found Apr 11 21:13:24 duet3 DuetControlServer[350]: [warn] M73: Command is not supported Apr 11 21:14:33 duet3 DuetControlServer[350]: [info] File: Optional macro file M73.g not found Apr 11 21:14:33 duet3 DuetControlServer[350]: [warn] M73: Command is not supported Apr 11 21:15:42 duet3 DuetControlServer[350]: [info] File: Optional macro file M73.g not found Apr 11 21:15:42 duet3 DuetControlServer[350]: [warn] M73: Command is not supported Apr 11 21:16:51 duet3 DuetControlServer[350]: [info] File: Optional macro file M73.g not found Apr 11 21:16:51 duet3 DuetControlServer[350]: [warn] M73: Command is not supported Apr 11 21:17:59 duet3 DuetControlServer[350]: [info] File: Optional macro file M73.g not found Apr 11 21:17:59 duet3 DuetControlServer[350]: [warn] M73: Command is not supported Apr 11 21:19:09 duet3 DuetControlServer[350]: [info] File: Optional macro file M73.g not found Apr 11 21:19:09 duet3 DuetControlServer[350]: [warn] M73: Command is not supported Apr 11 21:20:15 duet3 DuetControlServer[350]: [info] File: Optional macro file M73.g not found Apr 11 21:20:15 duet3 DuetControlServer[350]: [warn] M73: Command is not supported Apr 11 21:21:19 duet3 DuetControlServer[350]: [info] File: Optional macro file M73.g not found Apr 11 21:21:19 duet3 DuetControlServer[350]: [warn] M73: Command is not supported Apr 11 21:22:31 duet3 DuetControlServer[350]: [info] File: Optional macro file M73.g not found Apr 11 21:22:31 duet3 DuetControlServer[350]: [warn] M73: Command is not supported Apr 11 21:23:44 duet3 DuetControlServer[350]: [info] File: Optional macro file M73.g not found Apr 11 21:23:44 duet3 DuetControlServer[350]: [warn] M73: Command is not supported Apr 11 21:24:56 duet3 DuetControlServer[350]: [info] File: Optional macro file M73.g not found Apr 11 21:24:56 duet3 DuetControlServer[350]: [warn] M73: Command is not supported Apr 11 21:26:05 duet3 DuetControlServer[350]: [info] File: Optional macro file M73.g not found Apr 11 21:26:05 duet3 DuetControlServer[350]: [warn] M73: Command is not supported Apr 11 21:27:17 duet3 DuetControlServer[350]: [info] File: Optional macro file M73.g not found Apr 11 21:27:17 duet3 DuetControlServer[350]: [warn] M73: Command is not supported Apr 11 21:28:29 duet3 DuetControlServer[350]: [info] File: Optional macro file M73.g not found Apr 11 21:28:29 duet3 DuetControlServer[350]: [warn] M73: Command is not supported Apr 11 21:29:38 duet3 DuetControlServer[350]: [info] File: Optional macro file M73.g not found Apr 11 21:29:38 duet3 DuetControlServer[350]: [warn] M73: Command is not supported Apr 11 21:30:52 duet3 DuetControlServer[350]: [info] File: Optional macro file M73.g not found Apr 11 21:30:52 duet3 DuetControlServer[350]: [warn] M73: Command is not supported Apr 11 21:32:06 duet3 DuetControlServer[350]: [info] File: Optional macro file M73.g not found Apr 11 21:32:06 duet3 DuetControlServer[350]: [warn] M73: Command is not supported Apr 11 21:33:17 duet3 DuetControlServer[350]: [info] Executing nested macro file M73.g on channel File Apr 11 21:33:18 duet3 DuetControlServer[350]: [info] File: Finished macro file M73.g Apr 11 21:34:29 duet3 DuetControlServer[350]: [info] Executing nested macro file M73.g on channel File Apr 11 21:34:29 duet3 DuetControlServer[350]: [info] File: Finished macro file M73.g Apr 11 21:35:40 duet3 DuetControlServer[350]: [info] Executing nested macro file M73.g on channel File Apr 11 21:35:40 duet3 DuetControlServer[350]: [info] File: Finished macro file M73.g Apr 11 21:36:54 duet3 DuetControlServer[350]: [info] Executing nested macro file M73.g on channel File Apr 11 21:36:54 duet3 DuetControlServer[350]: [info] File: Finished macro file M73.g Apr 11 21:38:05 duet3 DuetControlServer[350]: [info] Executing nested macro file M73.g on channel File Apr 11 21:38:05 duet3 DuetControlServer[350]: [info] File: Finished macro file M73.g Apr 11 21:39:15 duet3 DuetControlServer[350]: [info] Executing nested macro file M73.g on channel File Apr 11 21:39:16 duet3 DuetControlServer[350]: [info] File: Finished macro file M73.g Apr 11 21:40:27 duet3 DuetControlServer[350]: [info] Executing nested macro file M73.g on channel File Apr 11 21:40:27 duet3 DuetControlServer[350]: [info] File: Finished macro file M73.g pi@duet3:~ $
Apr 11 21:44:54 duet3 DuetWebServer[458]: Request starting HTTP/1.1 GET http://192.168.7.101/Job/Status Apr 11 21:44:54 duet3 DuetWebServer[458]: info: DuetWebServer.Middleware.CustomEndpointMiddleware[0] Apr 11 21:44:54 duet3 DuetWebServer[458]: No endpoint found for GET request via /Job/Status Apr 11 21:44:54 duet3 DuetWebServer[458]: warn: DuetWebServer.Middleware.FallbackMiddleware[0] Apr 11 21:44:54 duet3 DuetWebServer[458]: Could not find resource /Job/Status. Redirecting to / Apr 11 21:44:54 duet3 DuetWebServer[458]: info: Microsoft.AspNetCore.Hosting.Diagnostics[2] Apr 11 21:44:54 duet3 DuetWebServer[458]: Request finished in 5.1706ms 302 Apr 11 21:44:54 duet3 DuetWebServer[458]: info: Microsoft.AspNetCore.Hosting.Diagnostics[1] Apr 11 21:44:54 duet3 DuetWebServer[458]: Request starting HTTP/1.1 GET http://192.168.7.101/ Apr 11 21:44:54 duet3 DuetWebServer[458]: info: Microsoft.AspNetCore.StaticFiles.StaticFileMiddleware[6] Apr 11 21:44:54 duet3 DuetWebServer[458]: The file /index.html was not modified Apr 11 21:44:54 duet3 DuetWebServer[458]: info: Microsoft.AspNetCore.Hosting.Diagnostics[2] Apr 11 21:44:54 duet3 DuetWebServer[458]: Request finished in 19.7337ms 304 text/html Apr 11 21:44:54 duet3 DuetWebServer[458]: info: DuetWebServer.Controllers.WebSocketController[0] Apr 11 21:44:54 duet3 DuetWebServer[458]: WebSocket disconnected from ::ffff:192.168.7.210:61964 Apr 11 21:44:55 duet3 DuetWebServer[458]: info: Microsoft.AspNetCore.Hosting.Diagnostics[1] Apr 11 21:44:55 duet3 DuetWebServer[458]: Request starting HTTP/1.1 GET http://192.168.7.101/rr_connect?password=r Apr 11 21:44:55 duet3 DuetWebServer[458]: info: DuetWebServer.Middleware.CustomEndpointMiddleware[0] Apr 11 21:44:55 duet3 DuetWebServer[458]: No endpoint found for GET request via /rr_connect Apr 11 21:44:55 duet3 DuetWebServer[458]: info: Microsoft.AspNetCore.Hosting.Diagnostics[2] Apr 11 21:44:55 duet3 DuetWebServer[458]: Request finished in 6.0337ms 404 Apr 11 21:44:55 duet3 DuetWebServer[458]: info: Microsoft.AspNetCore.Hosting.Diagnostics[1] Apr 11 21:44:55 duet3 DuetWebServer[458]: Request starting HTTP/1.1 GET http://192.168.7.101/machine Apr 11 21:44:55 duet3 DuetWebServer[458]: info: Microsoft.AspNetCore.Routing.EndpointMiddleware[0] Apr 11 21:44:55 duet3 DuetWebServer[458]: Executing endpoint 'DuetWebServer.Controllers.WebSocketController.Get (D Apr 11 21:44:55 duet3 DuetWebServer[458]: info: Microsoft.AspNetCore.Mvc.Infrastructure.ControllerActionInvoker[3] Apr 11 21:44:55 duet3 DuetWebServer[458]: Route matched with {action = "Get", controller = "WebSocket"}. Executing Apr 11 21:44:55 duet3 DuetWebServer[458]: info: Microsoft.AspNetCore.Hosting.Diagnostics[1] Apr 11 21:44:55 duet3 DuetWebServer[458]: Request starting HTTP/1.1 GET http://192.168.7.101/favicon.ico Apr 11 21:44:55 duet3 DuetWebServer[458]: info: DuetWebServer.Controllers.WebSocketController[0] Apr 11 21:44:55 duet3 DuetWebServer[458]: WebSocket connected from ::ffff:192.168.7.210:61974 Apr 11 21:44:55 duet3 DuetWebServer[458]: info: Microsoft.AspNetCore.StaticFiles.StaticFileMiddleware[6] Apr 11 21:44:55 duet3 DuetWebServer[458]: The file /favicon.ico was not modified Apr 11 21:44:55 duet3 DuetWebServer[458]: info: Microsoft.AspNetCore.Hosting.Diagnostics[2] Apr 11 21:44:55 duet3 DuetWebServer[458]: Request finished in 6.1615ms 304 image/x-icon
-
And again. About four minutes this time.
Apr 11 22:47:30 duet3 DuetControlServer[1699]: [info] Selected file /opt/dsf/sd/gcodes/Double_Filter_Cover.gcode Apr 11 22:47:30 duet3 DuetControlServer[1699]: [info] Starting file print Apr 11 22:47:30 duet3 DuetControlServer[1699]: [info] File: Optional macro file start.g not found Apr 11 22:47:30 duet3 DuetControlServer[1699]: [info] Executing nested macro file M73.g on channel File Apr 11 22:47:30 duet3 DuetControlServer[1699]: [info] File: Finished macro file M73.g Apr 11 22:48:20 duet3 DuetControlServer[1699]: [info] Executing nested macro file tpre0.g on channel File Apr 11 22:48:21 duet3 DuetControlServer[1699]: [info] File: Finished intermediate macro file tpre0.g Apr 11 22:48:21 duet3 DuetControlServer[1699]: [info] Executing nested macro file tpost0.g on channel File Apr 11 22:50:15 duet3 DuetControlServer[1699]: [info] Executing nested macro file tool_lock.g on channel File Apr 11 22:50:17 duet3 DuetControlServer[1699]: [info] File: Finished macro file tool_lock.g Apr 11 22:50:19 duet3 DuetControlServer[1699]: [info] File: Finished macro file tpost0.g Apr 11 22:50:19 duet3 DuetControlServer[1699]: [info] Executing nested macro file M900.g on channel File Apr 11 22:50:20 duet3 DuetControlServer[1699]: [info] File: Finished macro file M900.g Apr 11 22:50:20 duet3 DuetControlServer[1699]: [info] Executing nested macro file M73.g on channel File Apr 11 22:50:20 duet3 DuetControlServer[1699]: [info] File: Finished macro file M73.g Apr 11 22:51:24 duet3 DuetControlServer[1699]: [info] Executing nested macro file M73.g on channel File Apr 11 22:51:24 duet3 DuetControlServer[1699]: [info] File: Finished macro file M73.g
Apr 11 22:47:31 duet3 DuetWebServer[464]: Request finished in 20.2554ms 200 application/json Apr 11 22:47:31 duet3 DuetWebServer[464]: info: Microsoft.AspNetCore.Hosting.Diagnostics[1] Apr 11 22:47:31 duet3 DuetWebServer[464]: Request starting HTTP/1.1 GET http://192.168.7.101/machine/fileinfo/0:%2Fgcodes%2FDouble_Filter_Cover.gcode application/json Apr 11 22:47:31 duet3 DuetWebServer[464]: info: Microsoft.AspNetCore.Routing.EndpointMiddleware[0] Apr 11 22:47:31 duet3 DuetWebServer[464]: Executing endpoint 'DuetWebServer.Controllers.MachineController.GetFileinfo (DuetWebServer)' Apr 11 22:47:31 duet3 DuetWebServer[464]: info: Microsoft.AspNetCore.Mvc.Infrastructure.ControllerActionInvoker[3] Apr 11 22:47:31 duet3 DuetWebServer[464]: Route matched with {action = "GetFileinfo", controller = "Machine"}. Executing controller action with signature System.Threading.Tasks.Task`1[Microsoft.AspNetCore.Mvc.IActionResult] GetFileinfo(System.String) on controller DuetWebSe Apr 11 22:47:31 duet3 DuetWebServer[464]: info: Microsoft.AspNetCore.Mvc.Infrastructure.ContentResultExecutor[1] Apr 11 22:47:31 duet3 DuetWebServer[464]: Executing ContentResult with HTTP Response ContentType of application/json Apr 11 22:47:31 duet3 DuetWebServer[464]: info: Microsoft.AspNetCore.Mvc.Infrastructure.ControllerActionInvoker[2] Apr 11 22:47:31 duet3 DuetWebServer[464]: Executed action DuetWebServer.Controllers.MachineController.GetFileinfo (DuetWebServer) in 197.7868ms Apr 11 22:47:31 duet3 DuetWebServer[464]: info: Microsoft.AspNetCore.Routing.EndpointMiddleware[1] Apr 11 22:47:31 duet3 DuetWebServer[464]: Executed endpoint 'DuetWebServer.Controllers.MachineController.GetFileinfo (DuetWebServer)' Apr 11 22:47:31 duet3 DuetWebServer[464]: info: Microsoft.AspNetCore.Hosting.Diagnostics[2] Apr 11 22:47:31 duet3 DuetWebServer[464]: Request finished in 215.5405ms 200 application/json Apr 11 23:02:12 duet3 DuetWebServer[464]: info: DuetWebServer.Controllers.WebSocketController[0] Apr 11 23:02:12 duet3 DuetWebServer[464]: WebSocket disconnected from ::ffff:192.168.7.210:62174
-
It seems to suspiciously correlate with giving the tab running DWC focus. I'm going to start the job and disconnect.
-
I’ve been having a similar issue. I’m going to do some more testing, but I’m starting to suspect that it happens when I open another an instance on another computer to check on the print. Typically I have one window always opened on my main computer looking at the web page, it on occasion I will open a web page on an iPad when I’m downstairs hanging out with the family. I think that the times I’ve come up to find the printer stalled and unresponsive until I power cycle the printer, might be related to that. Otherwise, I have no repeatable factor to why it’s started acting in the manner it has, mirroring the behavior you’re seeing.
-
Just had the third failure in a row, about 30 minutes in. Absolutely no DWC after starting the job.
Now regressing. Deciding exactly how and to what...
-
My attempt at regressing to 1.2.5.0.
apt will not automatically deal with regressed dependencies, but it will tell you what would work. I've edited out 'false starts' and left only the commands that worked, in the order that they worked.
mkdir backup cp -r /opt/dsf/* backup ls -al backup/sd/sys sudo apt remove duetsoftwareframework sudo apt autopurge sudo apt autoclean sudo apt list -a duetsoftwareframework sudo apt install duetruntime=1.2.5.0 sudo apt install duetcontrolserver=1.2.5.0 sudo apt install duetsd=1.0.5 sudo apt install duettools=1.2.5.0 sudo apt install duetwebserver=1.2.3.1 sudo apt install duetwebcontrol=2.0.7-1 sudo apt install reprapfirmware=1.2.5.0-1 sudo apt install duetsoftwareframework=1.2.5.0 sudo reboot
The backup is not strictly required, apt uninstall won't delete non-empty directories.
-
At this point, duetcontrolserver is trying over and over to connect to the board. It can't, probably because of RRF mismatch. Time to 'bossac' the board. From the Pi:
~/BOSSA/bin/bossac -e -w -v -b /opt/dsf/sd/sys/Duet3Firmware_MB6HC.bin
This should be the correct release of firmware to match the DSF and friends, as installed by apt.
Worked... Had to press the reset button (or power cycle) after bossac. Also, good thing I made the backup, config.g was overlaid.
Now on:
Duet Web Control 2.0.7
Board: Duet 3 MB6HC v0.6 or 1.0 (MB6HC)
DSF Version: 1.2.5.0
Firmware: RepRapFirmware for Duet 3 MB6HC 3.01-RC4 (2020-03-16b1) -
Next step:
M997 B1
M997 B2M115 B0
M115 B1
M115 B24/12/2020, 1:24:31 AM M115 B2 Board EXP3HC firmware 3.01-RC4 (2020-03-16b1) 4/12/2020, 1:24:25 AM M115 B1 Board EXP3HC firmware 3.01-RC4 (2020-03-16b1) 4/12/2020, 1:24:16 AM M115 B0 FIRMWARE_NAME: RepRapFirmware for Duet 3 MB6HC FIRMWARE_VERSION: 3.01-RC4 ELECTRONICS: Duet 3 MB6HC v0.6 or 1.0 FIRMWARE_DATE: 2020-03-16b1
Now to power cycle one more time, home, level, and run same print job.
-
@Danal have exactly the same issue...
-
@dc42 said:
There are no plans for a RRF 3.01-RC7 release, because there are no bugs in 3.01-RC6.
-
@chas2706 said in Incident report: RRF 3 RC6 DWC 2.1.0 Lockup during print.:
because there are no bugs in 3.01-RC6.
I would think its obvious that we're talking about "known" bugs; and plans change.
Virtually no amount of testing can prove the absence of bugs.
-
Same print job finished just fine, about 3 hours, on RC4 and friends.
Duet Web Control 2.0.7
Board: Duet 3 MB6HC v0.6 or 1.0 (MB6HC)
DSF Version: 1.2.5.0
Firmware: RepRapFirmware for Duet 3 MB6HC 3.01-RC4 (2020-03-16b1) -
I would report all DSF issues to @chrishamm on github...
https://github.com/chrishamm/DuetSoftwareFramework/issues
or in this thread... https://forum.duet3d.com/topic/15343/dsf-1-3-1-unstable-released -
@gtj0 How would I determine whether to post in RRF 3.0RC6 vs. DSF 1.3?
-
@Danal run the job in Duet 3 standalone with the same settings and firmware version?
Ian
-
@Danal said in Incident report: RRF 3 RC6 DWC 2.1.0 Lockup during print.:
@gtj0 How would I determine whether to post in RRF 3.0RC6 vs. DSF 1.3?
I have asked that question many times and I've never received a satisfactory response.
@droftarts said in Incident report: RRF 3 RC6 DWC 2.1.0 Lockup during print.:
@Danal run the job in Duet 3 standalone with the same settings and firmware version?
Ian
Don't get me started.
It's not always easy to just "run in standalone mode". In my case, I have to remove the covers from the printer to get to the sd card and remove the cable between the Duet and the SBC.
I also don't believe it's the user's responsibility to have to determine which component of the system is at fault. That should be Duet3D's responsibility. The fact that RRF and DSF seem to be owned and operated by 2 separate companies irks me to no end.
-
I agree. From a user's perspective, DSF and RRF are the same thing. Gcode goes in one end and motion comes out the other.
For example, it is incredibly weird that M999 restarts RRF but not DSF, even when it is absolutely reproducible that any number of hangs are cleared ONLY by restarting DSF. Which a user of this "gcode everywhere" system has no way to do (other than on the Pi, and with sudo no less!)
I love Duet and have been a happy advocate for at least a few years. I am aghast at the latest directions and actions. I sincerely hope they course correct. (To be clear, I'm all in favor of SBC/Pi integration. It is the way that's being accomplished that is going to send a great company downhill if they don't change something).
-
Also, as regards how easy to "reproduce in stand-alone", I don't have any ethernet that will reach. I am not the first owner of this house, and there is not an inch of Cat anything in it (except about 1 foot (1/2 meter) between the cable modem and the main wireless router). I'm also not real motivated to make a special SD card, run special ether, etc, etc, to run the printer in a mode which I will literally never run. It goes back to DSF and RRF being layers of the same thing, and they should be supported that way.
-
OK, rant over. Sort of.
-
And here I am, just sad that RRF 2 isn't getting the attention it needs and deserves! We need an LTS team!