DCS Crash with 3.01-R10 / DWC 2.1.5 / DSF 2.1.1
-
@Garfield said in DCS Crash with 3.01-R10 / DWC 2.1.5 / DSF 2.1.1:
Could really do with that 'version compatibility' matrix ....
Your wish, and all that. See https://forum.duet3d.com/topic/15928/new-tool-duet-v3-xx-rcx-to-apt-dsf-2-x-x-version-mapping for more info.
Meanwhile a couple of examples:
./DuetVersionsAll.sh
Expect this to take several seconds per line of output.
duetsoftwareframework 2.1.1 contains reprapfirmware version 2.1.1-1 which is internal version 3.01-RC10
duetsoftwareframework 2.1.0 contains reprapfirmware version 2.1.0-1 which is internal version 3.01-RC9
duetsoftwareframework 2.0.0 contains reprapfirmware version 2.0.0-1 which is internal version 3.01-RC8
duetsoftwareframework 1.3.2 contains reprapfirmware version 1.3.2-1 which is internal version 3.01-RC7
duetsoftwareframework 1.3.1 contains reprapfirmware version 1.3.1-1 which is internal version 3.01-RC6
...snip...
./DuetVersion.sh
Highest Available DuetSoftwareFramework = 2.1.1
Currently Installed DuetSoftwareFramework = 2.1.1
Dependencies for DSF version 2.1.1 are:
duetcontrolserver (= 2.1.1)
duetsd (= 1.0.6)
duettools (= 2.1.1)
duetwebserver (= 2.1.0)
duetwebcontrol (= 2.1.5)
reprapfirmware (>= 2.1.1-1)
reprapfirmware (<= 2.1.1-999) )
reprapfirmware apt version 2.1.1-1 is internal version 3.01-RC10
./DuetVersion.sh 22
Highest Available DuetSoftwareFramework = 2.1.1
Currently Installed DuetSoftwareFramework = 2.1.1
Command line arg specified to request Release -22 from highest available.
Highest-22 is DuetSoftwareFramework = 1.0.4.1
Dependencies for DSF version 1.0.4.1 are:
duetcontrolserver (= 1.0.4.1)
duetsd (= 1.0.3)
duettools (= 1.0.4.1)
duetwebserver (= 1.1.1.0)
duetwebcontrol (= 2.0.0-5)
reprapfirmware (>= 1.0.4.1-1)
reprapfirmware (<= 1.0.4.1-999) )
reprapfirmware apt version 1.0.4.1-1 is internal version 3.0beta10+1
The '22' argument is asking about the release 22 lines "back" from highest available.
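If you just want the dependency list for one DSF version without the scripts, something along these lines might do. This is only a rough sketch and not how DuetVersion.sh itself works: it assumes the Duet package feed is already configured in apt, and split_depends is a made-up helper name.

```shell
# Hypothetical helper: split an apt "Depends:" line (read from stdin)
# into one dependency per line.
split_depends() {
    sed 's/^Depends: *//' | tr ',' '\n' | sed 's/^ *//'
}

# Possible usage on the Pi (assumes the Duet apt feed is configured):
#   apt-cache show duetsoftwareframework=2.1.1 | grep '^Depends:' | split_depends
```

Note this only lists the apt package versions; it does not give the "internal version" mapping that the scripts report.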
-
@Garfield I can reproduce your crash. It's the M591 command for the filament sensor. I'll add more info to the GitHub issue.
-
Data Point: 2.5 hour print successful, nothing else running on the Pi, most of the time disconnected from DWC.
Edit. Several more data points:
It crashed just sitting there. I only noticed after I clicked the "jobs" tab in DWC and the file listing was blank; this was maybe an hour after the print mentioned above finished. Eventually got an "Error retrieving file list" pop-up. Existing SSH session was dead. New SSH session will not connect.
I had done an "emergency stop" about an hour before the above. DWC did reconnect after that; in fact, I homed the printer. Of course, that would not affect SSH, and it did not at that time.
So... No way in... Power cycle coming up.
Power cycle #1 did not bring it back. Absolutely nothing on DWC and SSH won't connect, IP of Pi does not respond to ping. HDMI and USB console shows Chromium with DWC up, but will not show, nor move, a mouse cursor. Totally hung, even on the hardware console. (The HDMI/USB was "hot plugged" after boot; this has always worked in the past.)
But interesting that it booted to Chromium and DWC before it hung. The green popup "Connected to..." was still there. It should have timed out and gone away.
Power Cycle #2 worked. DWC and SSH fine. Going to start another print.
-
@Danal
Yup, that sounds in line with the sort of things I'm seeing. I wonder if your first failed power cycle was temperature related? The first time I left the crash unnoticed it cooked the Pi's CPU - I noticed the smell of something warm before realising it had all crashed. After this it took a couple of resets to get things to boot normally again.
I've had occasions where it'll boot and I can SSH in with a couple of simultaneous sessions, but as soon as I click connect on the DWC session on the Pi (Chrome and DWC auto-load on boot), it sometimes causes the crash.
@gtj0 said in DCS Crash with 3.01-R10 / DWC 2.1.5 / DSF 2.1.1:
@Garfield I can reproduce your crash. It's the M591 command for the filament sensor. I'll add more info to the GitHub issue.
Which filament sensor do you use? I have a laser one that I disconnected a few versions back as it caused another bug and because I never managed to get useful data from it. I might try re-connecting to see if it's the same bug I was seeing before as the current descriptions sound very similar.
-
I use the Duet magnetic one. It has worked pretty well up till now, although the data reported wasn't always 'accurate'.
-
Data point: Printing for 4.5 hours as I type this.
Filament Sensor: None.
Also, not that it should make any difference, it is a toolchanger.
-
@ChrisP Thanks for reporting this, I've been able to reproduce it and it looks like this is related to CPUSchedulingPolicy=fifo. Once that line is removed from /usr/lib/systemd/system/duetcontrolserver.service and the Pi is restarted, the problem does not seem to occur any more. So I'll remove that particular line in the next version again.
@Garfield I could reproduce your problem, too, and it will be fixed in the next DSF version. This error won't occur again in future versions. Sorry for the inconvenience.
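For anyone who wants to try the CPUSchedulingPolicy=fifo workaround before the next version lands, here is a minimal sketch. It comments the line out rather than deleting it and keeps a .bak copy; disable_fifo_policy is just an illustrative helper name, and the sketch assumes the line sits at the start of a line in the unit file exactly as quoted above.

```shell
# Hypothetical helper: comment out CPUSchedulingPolicy=fifo in the given
# unit file, leaving a backup next to it as <file>.bak.
disable_fifo_policy() {
    unit_file="$1"
    sed -i.bak 's/^CPUSchedulingPolicy=fifo/#&/' "$unit_file"
}

# Possible usage on the Pi (run as root, then restart the Pi):
#   disable_fifo_policy /usr/lib/systemd/system/duetcontrolserver.service
#   reboot
```

A package update will likely overwrite the unit file anyway, so this is only meant as a stopgap.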
I'm happy to help but please keep in mind that the unstable package feed is primarily for testing purposes and that bugs can occur - even though I perfectly understand they can be quite annoying. So thanks again to everyone testing it and reporting issues.
-
No complaints here, apology not needed, got into the RC stream, knew the risks, happy to help diagnose and fix.
I started work on a test plan but it is kind of hard to know what to test beyond normal user interaction. When things blow up as a result of previously functional code it is hard to know how to report it, and harder to know the information needed to assist in the debug / analysis / remediation.
I find it frustrating and interesting - just sometimes it is more frustration - and even then more frustration at a lack of ability to trace through the fault.
Blunt question - is it possible to separate the apps such that loss of DCS does not prevent access to the web console? Yes, I know information won't be updating, but a big red status flag 'No DCS Communication' or similar could be added. Make it possible to access diagnostic logs and still be able to edit .g files from the web GUI. Possibly have some 'macro' to collect basic diagnostic information for submission.
I'm not a Linux fan, I don't fully understand the code, I'm trying to understand, sometimes the frustration hits the keyboard.
If you want me to do some tests, just reach out. I think I'm getting pretty good at switching versions in and out, which is a big thank you to all that have stepped in and helped.
-
Other 'problems' - well, not a problem, but how can I get 'All' my fans to display - and I do mean all, not just those allocated to a tool?
My MCU fan will not appear, and the thermostatic one doesn't either (would be nice to have that up there with adjustable on / off thresholds).
How's that for off topic?
-
@chrishamm confirmed functional thank you.
-
@Garfield Thanks for confirming! Thermostatic fans don't show up yet. If other fans aren't visible, check the "Print Status" page -> Fans -> Change visibility.
-
They don't appear in the 'visibility' options; the only one that does is my part cooling fan. Will start another 'thread' ...
-
Put 2.1.2 on, and this is a very soft data point, I could easily be letting expectation affect perception...
The Pi seems "snappier" at the command prompt.
Will print and post hard data later.
-
@Danal With the DCS no longer classified as a "real-time" process, the command line should be a little snappier.
-
@chrishamm said in DCS Crash with 3.01-R10 / DWC 2.1.5 / DSF 2.1.1:
@ChrisP Thanks for reporting this, I've been able to reproduce it and it looks like this is related to CPUSchedulingPolicy=fifo. Once that line is removed from /usr/lib/systemd/system/duetcontrolserver.service and the Pi is restarted, the problem does not seem to occur any more. So I'll remove that particular line in the next version again.
Yeh, this was the conclusion I came to too after commenting it out. Have updated to 2.1.2 and it's now all running well - thanks
@Garfield said in DCS Crash with 3.01-R10 / DWC 2.1.5 / DSF 2.1.1:
No complaints here, apology not needed, got into the RC stream, knew the risks, happy to help diagnose and fix.
Yup, I'm the same. While the ideal is to have a working printer, tracking down the bugs is interesting too