    DCS Crash with 3.01-R10 / DWC 2.1.5 / DSF 2.1.1

    Beta Firmware
    • ChrisP

      @chrishamm
      I've been having big issues since the latest update to 3.01-R10 / DWC 2.1.5 / DSF 2.1.1 where the Pi has become incredibly unstable to the extent that it drops WiFi connection and totally locks up the Pi. I can move the mouse around on the screen but can't otherwise interact. If left in this state long enough it ends up nuking the CPU such that the max temperature warning icon shows on the LCD, meaning that it hit 85 degrees.

      It's taken a while to track down and provide any proof, as the system either works perfectly or is completely unresponsive. It first happened immediately after I updated, and I had to power cycle the whole system to get things working again. While I haven't found a 100% foolproof way of recreating the issue, it usually seems to happen immediately after a cold start rather than mid print etc. If I power cycle, it seems to boot fine after. I'm wondering whether it's something to do with communicating with the D3 and waiting for a response? Twice it also appears to have happened when a new DWC was opened on a remote PC, but I wouldn't like to say whether those were anything more than coincidence. Either way, it's frequent enough that I'll be going back to RC9 etc for now.

      The last time it happened (turning on from cold), I just about managed to SSH in and grab the screen shot below. 400% is clearly as bad as things can get as it's effectively maxing out all 4 cores. In the 30 seconds or so that I managed to maintain connection, the CPU usage for DCS didn't drop below 374%. After that the whole Pi crashes. Reverting back to RC9 etc and everything seems fine.
      [Screenshot: 2020-04-26 (3).png]

      • gtj0

        Can you do a journalctl -fu duetcontrolserver and see if there are any issues?
        If you do a systemctl stop duetwebserver, does the usage go back to normal?

        • ChrisP

          To be honest, the time I got the screenshot was the only time that SSH has managed to connect long enough to do anything useful. I was about to try pulling the journal log before it died again. I can't do it straight on the Pi as aside from the mouse moving, the gui is unresponsive. Similarly, I have no way to try stopping until I can maintain a connection long enough, but my suspicion is that stopping the service will recover the system, but it'd be interesting to know what restarting does.

          Since I got the screenshot I've managed to recreate the issue a good number of times, but not in a way that lets me interact with the system 😕

          Edit: Is there an easy way to load a terminal in the gui on boot and run journalctl -fu duetcontrolserver?

          • gtj0

            Try this... With the Pi running, even on the previous DSF release, do a systemctl stop duetwebserver duetcontrolserver and systemctl disable duetwebserver duetcontrolserver. That will keep the services from starting on reboot. Upgrade to the latest packages. I don't remember if the upgrade re-enables the services so, to be safe, do the disable again and reboot.

            Once the system is back up and running, in one terminal window do the journalctl -fu duetcontrolserver, in another run top, and in another do a systemctl start duetcontrolserver. See what happens. If everything is stable, do a start on the duetwebserver and connect via DWC and see what happens.
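
A minimal sketch of the staged bring-up described above, not a tested procedure. The `run` helper and the `DRY_RUN` variable are hypothetical names added for illustration, and using `apt update` / `apt upgrade` as the "upgrade to the latest packages" step is an assumption. By default the script only prints the commands so they can be reviewed first.

```shell
#!/bin/sh
# Dry-run by default: commands are printed, not executed.
# Set DRY_RUN=0 in the environment to actually run them.
DRY_RUN="${DRY_RUN:-1}"
SERVICES="duetwebserver duetcontrolserver"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Step 1: stop both services and keep them from starting at boot.
quiesce() {
    # $SERVICES is left unquoted on purpose so it splits into two unit names.
    run sudo systemctl stop $SERVICES
    run sudo systemctl disable $SERVICES
}

# Step 2: upgrade (assumed to be apt update/upgrade), disable again in case
# the upgrade re-enabled the services, then reboot.
upgrade_and_reboot() {
    run sudo apt update
    run sudo apt upgrade
    run sudo systemctl disable $SERVICES
    run sudo reboot
}

# Step 3 (after the reboot): start each service by hand, DCS first, while
# journalctl and top are running in other terminals.
start_dcs() { run sudo systemctl start duetcontrolserver; }
start_dws() { run sudo systemctl start duetwebserver; }
```

After sourcing the file, calling `quiesce`, `upgrade_and_reboot`, then `start_dcs` and (only if things stay stable) `start_dws` follows the order given in the post.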

            • gloomyandy

              I'm pretty sure I have also seen this problem, but I've not been able to reproduce it. It was just after updating to RC10/2.1.1. I did the rPi update/upgrade and then went to check if it was working using a browser and DWC. I forced a reload of DWC (to make sure I was using the updated DWC) and it refreshed and said it was trying to connect, at that point the window I had open running ssh to the pi popped up a notification that the pi connection had been lost. Following this I was unable to ssh back to the pi and the DWC web server was no longer responding.

              I rebooted the pi and everything seemed fine. I was then moving around testing various things and the same thing happened again. To be honest I thought at the time it was my pi overheating or something, but having seen this report I now suspect it was the same issue.

              • ChrisP

                Okay, so I managed to induce it this time simply by restarting the Pi from the command line. I can tell when it's going to happen before the GUI fully loads on the Pi because PuTTY takes an age to show the login prompt. This time I managed to get 3 sessions open: one to run top to see the CPU usage, one for the journal log (below) and one to restart the DCS service. The service session up to 17:43:37 was exhibiting the problem. I then restarted the process with sudo systemctl restart duetcontrolserver.service (at 17:44:38), at which point DCS restarted and everything was fine. The log shows nothing else though 😕

                pi@starttex:~ $ sudo journalctl -u duetcontrolserver -f
                -- Logs begin at Sun 2020-04-26 17:43:32 BST. --
                Apr 26 17:43:35 starttex systemd[1]: Started Duet Control Server.
                Apr 26 17:43:38 starttex DuetControlServer[359]: Duet Control Server v2.1.1
                Apr 26 17:43:38 starttex DuetControlServer[359]: Written by Christian Hammacher for Duet3D
                Apr 26 17:43:38 starttex DuetControlServer[359]: Licensed under the terms of the GNU Public License Version 3
                Apr 26 17:43:39 starttex DuetControlServer[359]: [info] Settings loaded
                Apr 26 17:43:40 starttex DuetControlServer[359]: [info] Environment initialized
                Apr 26 17:43:40 starttex DuetControlServer[359]: [info] Connection to Duet established
                Apr 26 17:43:40 starttex DuetControlServer[359]: [info] IPC socket created at /var/run/dsf/dcs.sock
                Apr 26 17:44:37 starttex DuetControlServer[359]: [info] System time has been changed
                Apr 26 17:44:38 starttex systemd[1]: Stopping Duet Control Server...
                Apr 26 17:44:38 starttex DuetControlServer[359]: [warn] Received SIGTERM, shutting down...
                Apr 26 17:44:38 starttex systemd[1]: duetcontrolserver.service: Main process exited, code=exited, status=143/n/a
                Apr 26 17:44:38 starttex systemd[1]: duetcontrolserver.service: Failed with result 'exit-code'.
                Apr 26 17:44:38 starttex systemd[1]: Stopped Duet Control Server.
                Apr 26 17:44:38 starttex systemd[1]: Started Duet Control Server.
                Apr 26 17:44:38 starttex DuetControlServer[1719]: Duet Control Server v2.1.1
                Apr 26 17:44:38 starttex DuetControlServer[1719]: Written by Christian Hammacher for Duet3D
                Apr 26 17:44:38 starttex DuetControlServer[1719]: Licensed under the terms of the GNU Public License Version 3
                Apr 26 17:44:39 starttex DuetControlServer[1719]: [info] Settings loaded
                Apr 26 17:44:39 starttex DuetControlServer[1719]: [info] Environment initialized
                Apr 26 17:44:39 starttex DuetControlServer[1719]: [info] Connection to Duet established
                Apr 26 17:44:39 starttex DuetControlServer[1719]: [info] IPC socket created at /var/run/dsf/dcs.sock
                Apr 26 17:45:09 starttex DuetControlServer[1719]: [info] System time has been changed
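
The log above contains two DCS instances (PIDs 359 and 1719), one before and one after the restart. As a rough sketch, assuming the journal line layout shown above, a small filter can pull out each start so restarts are easy to spot in a longer log. The function name `dcs_starts` is made up for this example.

```shell
#!/bin/sh
# Reads `journalctl -u duetcontrolserver` output on stdin and prints one line
# per DCS start: the time, PID and version from the "Duet Control Server v..."
# banner. Assumes the field layout seen in the log above:
#   month day time host DuetControlServer[PID]: message...
dcs_starts() {
    awk '
        /DuetControlServer\[[0-9]+\]: Duet Control Server v/ {
            pid = $5
            sub(/^DuetControlServer\[/, "", pid)   # strip the name and "["
            sub(/\]:$/, "", pid)                   # strip the trailing "]:"
            print $3, "pid=" pid, "version=" $NF
        }
    '
}
```

Typical use would be `sudo journalctl -u duetcontrolserver --no-pager | dcs_starts`.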
                
                • ChrisP @gloomyandy

                  @gloomyandy said in DCS Crash with 3.01-R10 / DWC 2.1.5 / DSF 2.1.1:

                  ...I forced a reload of DWC (to make sure I was using the updated DWC) and it refreshed and said it was trying to connect, at that point the window I had open running ssh to the pi popped up a notification that the pi connection had been lost. Following this I was unable to ssh back to the pi and the DWC web server was no longer responding...

                  This is exactly the behaviour I observe if I actually manage to connect.

                  • Garfield

                    Concur - RPi is unusable, unpredictable, certainly wouldn't trust this printer to do anything useful currently.

                    Can't even access via SSH

                    Only way to get it back is power cycle.

                    No point going any further with this version, now need to try and roll back this version - can't use it, can't even reliably help debug.

                    • A Former User

                      If you're unable to access SSH there is a serial console on the GPIO header, might be a bit tricky to get to it with the Duet ribbon cable in place unless you made your own cable for this purpose.

                      (of course there is also a HDMI port and a USB port for a keyboard..)

                      • Garfield

                        More hassle than I'm in the mood for. It is clear that the RPi runs for a while, then no more connection after a time ....

                        Currently trying to roll back (never done it before) ... so have RPi SD card on the bench ....

                        • ChrisP @Garfield

                          @Garfield said in DCS Crash with 3.01-R10 / DWC 2.1.5 / DSF 2.1.1:

                          More hassle than I'm in the mood for. It is clear that the RPi runs for a while, then no more connection after a time ....

                          Currently trying to roll back (never done it before) ... so have RPi SD card on the bench ....

                          I've done this a few times now. The easiest way I've found is to remove what's there first: sudo apt remove 'duet*' 'reprap*'

                          Then to go back you have to install the components by specifying the version for each package. This will get you back to RC9 etc: sudo apt install duetsoftwareframework=2.1.0 duetcontrolserver=2.1.0 duettools=2.1.0 duetwebcontrol=2.1.4 reprapfirmware=2.1.0-1 duetruntime=2.1.0
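
A short sketch of the rollback above, using only the package versions quoted in the post. It prints the commands rather than running them, so the version list can be checked first. The function names are hypothetical, and the glob patterns are quoted here so the shell does not expand them against files in the current directory.

```shell
#!/bin/sh
# Package=version pairs exactly as quoted in the post (the "RC9" set).
ROLLBACK_PKGS="duetsoftwareframework=2.1.0 duetcontrolserver=2.1.0 duettools=2.1.0 duetwebcontrol=2.1.4 reprapfirmware=2.1.0-1 duetruntime=2.1.0"

# Print the two rollback commands for review; nothing is executed.
print_rollback() {
    echo "sudo apt remove 'duet*' 'reprap*'"
    echo "sudo apt install $ROLLBACK_PKGS"
}

# List the currently installed versions of matching packages, to confirm
# the rollback took effect. Only meaningful on the Pi itself.
show_installed() {
    dpkg-query -W -f='${Package}=${Version}\n' 'duet*' 'reprap*'
}
```

Running `print_rollback` shows the commands to paste; `show_installed` afterwards should list the 2.1.0 / 2.1.4 versions if the rollback worked.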

                          • Garfield

                            Very much appreciate the heads up.

                            Notes taken and stored ....

                            • jay_s_uk

                              I just had a crash with this in the console

                              Warning: Lost connection to Duet (Timeout while waiting for transfer ready pin)
                              

                              Don't know if it's the same.
                              I'm having to reboot the pi to get back in.

                              Owns various duet boards and is the main wiki maintainer for the Teamgloomy LPC/STM32 port of RRF. Assume I'm running whatever the latest beta/stable build is

                              • A Former User @jay_s_uk

                                @jay_s_uk said in DCS Crash with 3.01-R10 / DWC 2.1.5 / DSF 2.1.1:

                                Timeout while waiting for transfer ready pin

                                That would imply DCS is blaming the Duet, but it's no guarantee that DCS isn't still at fault, I guess.

                                • jay_s_uk

                                  @bearer

                                  The pi has been rebooted and as soon as I try and do something, I lose all connection to the pi. I can't even SSH into it.
                                  Cutting power gets it back up and running again.

                                  First time round, the first thing I tried to do was run my tool unlock macro and the same thing happened again


                                  • jay_s_uk

                                    Spoke too soon. The whole thing has died again.


                                    • Garfield

                                      Back down at RC9 but my CPU fan is still not working correctly - really don't want to go back to RC7 but can't handle the constant noise.

                                      I KNOW this fan can work correctly - has done so since RC1 .... why can't I see fan 2 on the dashboard ? (doesn't even appear in the display filter)

                                      Has something changed in gcode ????

                                      M308 S2 Y"mcu-temp" A"CPU"
                                      M950 F2 C"!out4" A"MCU" Q32000 L5  
                                      M106 P2 T40:45 H2      ; set Duet cooling fan	
                                      
                                      • A Former User @Garfield

                                        @Garfield said in DCS Crash with 3.01-R10 / DWC 2.1.5 / DSF 2.1.1:

                                        More hassle than I'm in the mood for

                                        You could probably ssh in, stop the DuetControlServer service and run it in the foreground to try and capture any relevant debugging info. Installing and using screen would prevent DCS from being terminated if the ssh session is terminated.

                                        sudo apt install -y screen and then run sudo systemctl stop duetcontrolserver followed by sudo screen /opt/dsf/bin/DuetControlServer -l debug
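
The three commands above can be sketched as one script. This is only an outline: the `run` helper and `DRY_RUN` variable are hypothetical, and the `-L` flag (which asks screen to log the window's output to a file, `screenlog.0` by default) is an addition not in the original suggestion, included so the debug output survives if the Pi locks up.

```shell
#!/bin/sh
# Dry-run by default: commands are printed, not executed.
# Set DRY_RUN=0 in the environment to actually run them.
DRY_RUN="${DRY_RUN:-1}"
DCS_BIN="/opt/dsf/bin/DuetControlServer"   # path as given in the post

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

debug_dcs() {
    run sudo apt install -y screen
    # Stop the systemd-managed instance so only the foreground one runs.
    run sudo systemctl stop duetcontrolserver
    # Run DCS in the foreground inside screen with debug logging.
    # -L (an addition here) also writes the output to a log file.
    run sudo screen -L "$DCS_BIN" -l debug
}
```

Inside screen, Ctrl-A then d detaches without stopping DCS, and `sudo screen -r` reattaches from a later ssh session.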

                                        • Garfield

                                          At the time you couldn't ssh - the RPi wasn't responsive at all, if you had an SSH session open it just stopped responding.

                                          I will try the screen though - what does that offer? - a non DWC web gui ?

                                          • A Former User @jay_s_uk

                                            @jay_s_uk said in DCS Crash with 3.01-R10 / DWC 2.1.5 / DSF 2.1.1:

                                            Spoke too soon.

                                            If you've got access to console/ssh, could you also run something like top and see if it spots something to suggest DCS gets stuck in a loop?
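
Since the ssh session may only survive for a few seconds, one option is to capture top in batch mode and keep just the DCS line. The filter below is a sketch under two assumptions: top uses its default column layout (PID in column 1, %CPU in column 9, COMMAND in column 12), and the process shows up with a command name starting with "DuetControl" (top may truncate the full name). The function name `dcs_cpu` is made up for this example.

```shell
#!/bin/sh
# Reads `top -b -n 1` output on stdin and prints the PID and %CPU of any
# process whose command name starts with "DuetControl".
# Assumes top's default columns: $1 = PID, $9 = %CPU, $12 = COMMAND.
dcs_cpu() {
    awk '$12 ~ /^DuetControl/ { print "pid=" $1, "cpu=" $9 }'
}
```

A single sample would be `top -b -n 1 | dcs_cpu`. Repeating it in a loop such as `while sleep 5; do top -b -n 1 | dcs_cpu; done >> dcs_cpu.log` leaves a record on the SD card even if the connection drops. A value that stays near 400% on a 4-core Pi would be consistent with the busy loop suspected earlier in the thread.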

                                            Unless otherwise noted, all forum content is licensed under CC-BY-SA