DCS Crash with 3.01-R10 / DWC 2.1.5 / DSF 2.1.1
-
Okay, so I managed to induce it this time simply by restarting the Pi from the command line. I can tell when it's going to happen before the GUI fully loads on the Pi, because PuTTY takes an age to show the login prompt. This time I managed to get 3 sessions open: one to run
top
to see the CPU usage, one for the journal log (below) and one to restart the DCS service. The service session up to 17:43:37 was exhibiting the problem. I then restarted the process with
sudo systemctl restart duetcontrolserver.service
(at 17:44:38), at which point DCS restarted and everything was fine. The log shows nothing else though:
pi@starttex:~ $ sudo journalctl -u duetcontrolserver -f
-- Logs begin at Sun 2020-04-26 17:43:32 BST. --
Apr 26 17:43:35 starttex systemd[1]: Started Duet Control Server.
Apr 26 17:43:38 starttex DuetControlServer[359]: Duet Control Server v2.1.1
Apr 26 17:43:38 starttex DuetControlServer[359]: Written by Christian Hammacher for Duet3D
Apr 26 17:43:38 starttex DuetControlServer[359]: Licensed under the terms of the GNU Public License Version 3
Apr 26 17:43:39 starttex DuetControlServer[359]: [info] Settings loaded
Apr 26 17:43:40 starttex DuetControlServer[359]: [info] Environment initialized
Apr 26 17:43:40 starttex DuetControlServer[359]: [info] Connection to Duet established
Apr 26 17:43:40 starttex DuetControlServer[359]: [info] IPC socket created at /var/run/dsf/dcs.sock
Apr 26 17:44:37 starttex DuetControlServer[359]: [info] System time has been changed
Apr 26 17:44:38 starttex systemd[1]: Stopping Duet Control Server...
Apr 26 17:44:38 starttex DuetControlServer[359]: [warn] Received SIGTERM, shutting down...
Apr 26 17:44:38 starttex systemd[1]: duetcontrolserver.service: Main process exited, code=exited, status=143/n/a
Apr 26 17:44:38 starttex systemd[1]: duetcontrolserver.service: Failed with result 'exit-code'.
Apr 26 17:44:38 starttex systemd[1]: Stopped Duet Control Server.
Apr 26 17:44:38 starttex systemd[1]: Started Duet Control Server.
Apr 26 17:44:38 starttex DuetControlServer[1719]: Duet Control Server v2.1.1
Apr 26 17:44:38 starttex DuetControlServer[1719]: Written by Christian Hammacher for Duet3D
Apr 26 17:44:38 starttex DuetControlServer[1719]: Licensed under the terms of the GNU Public License Version 3
Apr 26 17:44:39 starttex DuetControlServer[1719]: [info] Settings loaded
Apr 26 17:44:39 starttex DuetControlServer[1719]: [info] Environment initialized
Apr 26 17:44:39 starttex DuetControlServer[1719]: [info] Connection to Duet established
Apr 26 17:44:39 starttex DuetControlServer[1719]: [info] IPC socket created at /var/run/dsf/dcs.sock
Apr 26 17:45:09 starttex DuetControlServer[1719]: [info] System time has been changed
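Since most of that log is [info] start-up chatter, one way to narrow things down is to filter the journal for lines DCS tags as warnings or errors. This is only a minimal sketch: the [warn] tag is taken from the log above, while the [error] tag and the helper name dcs_problems are assumptions for illustration.

```shell
# Keep only lines that DCS tags as [warn] or [error]; reads journal text on stdin.
# [warn] appears in the log above; [error] is an assumed tag.
dcs_problems() {
    grep -E '\[(warn|error)\]'
}

# Usage on the Pi (current boot only, no pager):
#   sudo journalctl -u duetcontrolserver -b --no-pager | dcs_problems
```

Note that the journal above begins at boot time, so anything logged before a power cycle may not be kept unless persistent journal storage is enabled.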
-
@gloomyandy said in DCS Crash with 3.01-R10 / DWC 2.1.5 / DSF 2.1.1:
...I forced a reload of DWC (to make sure I was using the updated DWC) and it refreshed and said it was trying to connect, at that point the window I had open running ssh to the pi popped up a notification that the pi connection had been lost. Following this I was unable to ssh back to the pi and the DWC web server was no longer responding...
This is exactly the behaviour I observe if I actually manage to connect.
-
Concur - RPi is unusable, unpredictable, certainly wouldn't trust this printer to do anything useful currently.
Can't even access via SSH
Only way to get it back is power cycle.
No point going any further with this version, now need to try and roll back this version - can't use it, can't even reliably help debug.
-
If you're unable to access SSH, there is a serial console on the GPIO header. It might be a bit tricky to get to with the Duet ribbon cable in place, unless you made your own cable for this purpose.
(Of course there is also an HDMI port and a USB port for a keyboard.)
-
More hassle than I'm in the mood for, it is clear that the RPi runs for a while, then no more connection after a time ,,,,
Currently trying to roll back (never done it before) ... so have RPi SD card on the bench ....
-
@Garfield said in DCS Crash with 3.01-R10 / DWC 2.1.5 / DSF 2.1.1:
More hassle than I'm in the mood for, it is clear that the RPi runs for a while, then no more connection after a time ,,,,
Currently trying to roll back (never done it before) ... so have RPi SD card on the bench ....
I've done this a few times now. The easiest way I've found is to remove what's there first:
sudo apt remove 'duet*' 'reprap*'
Then to go back you have to install the components by specifying the version for each package. This will get you back to RC9 etc:
sudo apt install duetsoftwareframework=2.1.0 duetcontrolserver=2.1.0 duettools=2.1.0 duetwebcontrol=2.1.4 reprapfirmware=2.1.0-1 duetruntime=2.1.0
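To stop a later apt upgrade from pulling the newer packages straight back in, the rolled-back packages can be put on hold with apt-mark hold. A minimal sketch that only prints the commands for review rather than running them; PINNED and rollback_plan are hypothetical names, and the versions are copied from the install line above.

```shell
# Package=version pairs copied from the install command above.
PINNED="duetsoftwareframework=2.1.0 duetcontrolserver=2.1.0 duettools=2.1.0 duetwebcontrol=2.1.4 reprapfirmware=2.1.0-1 duetruntime=2.1.0"

# Print (do not run) the install command and a matching apt-mark hold command.
rollback_plan() {
    names=""
    for spec in $PINNED; do
        names="$names ${spec%%=*}"   # strip "=version" to get the bare package name
    done
    echo "sudo apt install $PINNED"
    echo "sudo apt-mark hold$names"
}

rollback_plan
```

When a fixed release is available, sudo apt-mark unhold with the same package names releases the hold again.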
-
Very much appreciate the heads up.
Notes taken and stored ....
-
I just had a crash with this in the console
Warning: Lost connection to Duet (Timeout while waiting for transfer ready pin)
Don't know if it's the same.
I'm having to reboot the Pi to get back in.
-
@jay_s_uk said in DCS Crash with 3.01-R10 / DWC 2.1.5 / DSF 2.1.1:
Timeout while waiting for transfer ready pin
That would imply DCS is blaming the Duet, but it's no guarantee that DCS isn't still at fault, I guess.
-
@bearer
The Pi has been rebooted, and as soon as I try to do something I lose all connection to it. I can't even SSH into it.
Cutting power gets it back up and running again. First time round, the first thing I tried to do was run my tool unlock macro, and the same thing happened again.
-
Spoke too soon. The whole thing has died again.
-
Back down at RC9 but my CPU fan is still not working correctly - really don't want to go back to RC7 but can't handle the constant noise.
I KNOW this fan can work correctly - has done so since RC1 .... why can't I see fan 2 on the dashboard ? (doesn't even appear in the display filter)
Has something changed in gcode ????
M308 S2 Y"mcu-temp" A"CPU"
M950 F2 C"!out4" A"MCU" Q32000 L5
M106 P2 T40:45 H2 ; set Duet cooling fan
-
@Garfield said in DCS Crash with 3.01-R10 / DWC 2.1.5 / DSF 2.1.1:
More hassle than I'm in the mood for
You could probably ssh in, stop the DuetControlServer service and run it in the foreground to try and capture any relevant debugging info. Installing and using screen would prevent DCS from being terminated if the ssh session is terminated.
sudo apt install -y screen
and then run
sudo systemctl stop duetcontrolserver
followed by
sudo screen /opt/dsf/bin/DuetControlServer -l debug
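A variation on the above, as a sketch only: giving the screen session a name makes it easier to reattach, and screen's -L option logs the window output to a file (screenlog.0 in the current directory by default), so the debug output is still on disk if the session or the Pi dies. The session name dcs and the function name start_dcs_debug are arbitrary choices, and this assumes a screen version where -L takes no argument.

```shell
# Stop the service, then run DCS in the foreground inside a named, logged screen session.
start_dcs_debug() {
    sudo systemctl stop duetcontrolserver &&
    sudo screen -L -S dcs /opt/dsf/bin/DuetControlServer -l debug
}

# Detach with Ctrl-A then D; reattach later with:
#   sudo screen -r dcs
```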
-
At the time you couldn't ssh - the RPi wasn't responsive at all, if you had an SSH session open it just stopped responding.
I will try the screen though - what does that offer? - a non DWC web gui ?
-
@jay_s_uk said in DCS Crash with 3.01-R10 / DWC 2.1.5 / DSF 2.1.1:
Spoke too soon.
If you've got access to console/ssh, could you also run something like top and see if it spots anything to suggest DCS gets stuck in a loop?
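If the terminal goes away before top shows anything useful, another option is to append samples to a file on the SD card so they can be read after a reboot. A rough sketch, assuming DCS is running as the duetcontrolserver service; sample_dcs and the log path are made-up names. Bear in mind that the %CPU reported by ps is averaged over the life of the process, so it is cruder than what top shows.

```shell
# Append one "HH:MM:SS %CPU %MEM" sample for the DCS service to a log file.
sample_dcs() {
    logfile=$1
    # Main PID of the service as tracked by systemd (0 if it is not running).
    pid=$(systemctl show -p MainPID --value duetcontrolserver)
    usage=$(ps -o pcpu= -o pmem= -p "$pid" 2>/dev/null)
    printf '%s %s\n' "$(date '+%H:%M:%S')" "${usage:-not running}" >> "$logfile"
    sync    # push the sample out to disk before taking the next one
}

# Usage: sample every 2 seconds until interrupted
#   while true; do sample_dcs "$HOME/dcs-cpu.log"; sleep 2; done
```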
-
@Garfield said in DCS Crash with 3.01-R10 / DWC 2.1.5 / DSF 2.1.1:
I will try the screen though - what does that offer?
It's a terminal multiplexer / window manager, or something like that. It means DCS will keep running if you have a network glitch. If you run DCS in the foreground and ssh stops, all the processes in that shell are terminated; with screen they can keep running.
-
First message
[warn] RepRapFirmware got a bad header checksum
and then [screen is terminating]
-
@Garfield said in DCS Crash with 3.01-R10 / DWC 2.1.5 / DSF 2.1.1:
[warn] RepRapFirmware got a bad header checksum
I was getting those when I very first set up my D3 & Pi. Firmly re-seating the ribbon cable on both ends cleared them up.
-
@bearer said in DCS Crash with 3.01-R10 / DWC 2.1.5 / DSF 2.1.1:
@jay_s_uk said in DCS Crash with 3.01-R10 / DWC 2.1.5 / DSF 2.1.1:
Spoke too soon.
If you've got access to console/ssh, could you also run something like top and see if it spots anything to suggest DCS gets stuck in a loop?
Terminal dies as soon as the web connection does.
I'll try and run DCS through the session and see what it spits out. It'll be later on though, as I'm on bedtime duty now.
-
The only thing I can think of with respect to DCS complaining about the RDY pin is basically an interrupt storm, which can grind the Pi to a halt. Not sure if it's relevant though.
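One way to check for that, if a shell stays alive long enough, is to compare two snapshots of /proc/interrupts and see which counters jumped. A minimal sketch; irq_delta is a made-up helper name, and which line (if any) corresponds to the transfer ready pin is not something established in this thread.

```shell
# Print per-IRQ count increases between two saved copies of /proc/interrupts,
# largest first, as: <increase> <irq label> <last word of the description>.
irq_delta() {
    awk '
        FNR == 1 { ncpu = NF; next }            # header row: CPU0 CPU1 ...
        {
            sum = 0                              # total across all CPU columns
            for (i = 2; i <= ncpu + 1; i++) sum += $i
        }
        NR == FNR { before[$1] = sum; next }     # first file: remember the totals
        sum > before[$1] { print sum - before[$1], $1, $NF }
    ' "$1" "$2" | sort -rn
}

# Usage:
#   cat /proc/interrupts > /tmp/irq.before; sleep 5
#   cat /proc/interrupts > /tmp/irq.after
#   irq_delta /tmp/irq.before /tmp/irq.after | head
```

A counter climbing by many thousands per second while the Pi is sluggish would be consistent with an interrupt storm, though it would not prove which side is at fault.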