Crashes during printing - "SPI connection has been reset".

General Discussion
    jbjhjm
    last edited by jbjhjm 10 Dec 2021, 06:52 12 Oct 2021, 06:38

    Updated to the latest beta 4. Reading through the changelog I don't think it will change anything regarding the SPI connection issue.

    Any ideas how to continue debugging next time it happens?
    What technical reason is there for the SPI connection failure message to appear?

      chrishamm administrators @jbjhjm
      last edited by 12 Oct 2021, 10:00

      @jbjhjm Those log settings are far from ideal; in fact, I got the same symptoms with persistent logging and long prints as well. The reason is that systemd flushes a huge number of messages to the SD card at regular intervals, which probably stalls IO access and/or the stdout line at some point (the log volume is massive: more than 3x the regular G-code file length per print). Either reset LogLevel back to "info" or change the journald storage to "volatile".

      When DCS becomes unresponsive at some point (probably during a full SPI transfer and longer than 500ms), RRF thinks the Pi lost communication so it invalidates everything.

      If the resets persist with the standard log level, try out a different SD card and/or reduce IO load on the Pi as far as possible.
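To check which journald storage mode is configured, a minimal Python sketch can parse the text of `journald.conf`. The helper name `journald_storage` is hypothetical. The sketch assumes the file uses a `[Journal]` section with a `Storage=` key and that an unset key means journald's default of `auto`. It only looks at the text it is given, so drop-in files are not considered.

```python
import configparser


def journald_storage(conf_text: str) -> str:
    """Return the Storage= value found in journald.conf text.

    Hypothetical helper for illustration. Falls back to "auto",
    which is journald's default when the key is not set.
    """
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.optionxform = str  # keep key case as written in the file
    parser.read_string(conf_text)
    # Commented-out lines such as "#Storage=auto" are skipped by the parser.
    value = parser.get("Journal", "Storage", fallback="auto")
    return value.strip().lower()


def is_volatile(storage: str) -> bool:
    """True when the journal is kept in memory only or disabled entirely."""
    return storage in {"volatile", "none"}
```

On the Pi this could be called with the contents of `/etc/systemd/journald.conf`, for example `journald_storage(open("/etc/systemd/journald.conf").read())`. A result of `persistent` (or `auto` with an existing `/var/log/journal` directory) means log messages are written to the SD card.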

      Duet software engineer

        jbjhjm @chrishamm
        last edited by 12 Oct 2021, 10:33

        @chrishamm oh dear, thanks for the warning. Will revert the log settings to be more lightweight.
        I did only tweak these yesterday, so all crashes until now happened with standard log settings.
        I'm using the SD card shipped with the 6HC, but can try to get a different one.

        So your guess is the error appears because the Pi is too busy to respond to RRF in time?
        Besides the RRF communication it handles streaming camera data. I can try to lower fps/resolution.

          chrishamm administrators @jbjhjm
          last edited by 12 Oct 2021, 11:35

          @jbjhjm Yes, I think so. If the logging provider hangs during SPI transfers, it's likely to reset the connection state.
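The timeout behaviour described in this exchange can be illustrated with a small Python sketch. It is not RRF's or DCS's actual implementation: the class `LinkWatchdog` and its method names are hypothetical, and the only value taken from the thread is the 500 ms limit. Timestamps are passed in explicitly so the logic is easy to follow.

```python
class LinkWatchdog:
    """Toy model of a link that is reset when the peer stays silent too long."""

    def __init__(self, timeout_s: float = 0.5):
        self.timeout_s = timeout_s  # 500 ms, the figure mentioned in the thread
        self.last_transfer = None   # time of the previous completed transfer
        self.resets = 0             # how often the connection state was invalidated

    def transfer(self, now: float) -> bool:
        """Record a transfer at time `now` (seconds).

        Returns True if the gap since the previous transfer exceeded the
        timeout, meaning the connection would have been reset.
        """
        stalled = (
            self.last_transfer is not None
            and (now - self.last_transfer) > self.timeout_s
        )
        if stalled:
            self.resets += 1
        self.last_transfer = now
        return stalled
```

In this model, a Pi that is blocked on SD card IO for 0.8 s between two transfers triggers one reset, while transfers 0.1 s apart do not.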

          Duet software engineer

            jbjhjm
            last edited by 13 Oct 2021, 13:17

            Experienced a crash again, this time it's been different though.
            Duet suddenly stated the print was 100% done. No other errors.

            I was not able to do a M115 / M122 this time, but the Raspi still has persistent logging enabled.
            I will check these logs later and see if anything useful is in there.

            One thing making me suspicious is I'm having a terrible lot of network disconnects.
            Whenever a print fails, DWC is offline too, and even outside of erroneous situations DWC and the webcam sometimes react very slowly.

            I'll do some investigation on how to monitor CPU and network load on the Pi.

              T3P3Tony administrators @jbjhjm
              last edited by 14 Oct 2021, 11:00

              @jbjhjm It's also worth ensuring you are giving the Pi enough voltage. Look for undervoltage events in the logs.
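Besides the logs, the Raspberry Pi firmware reports undervoltage through `vcgencmd get_throttled`, which prints a line such as `throttled=0x50005`. The Python sketch below decodes that value using the bit meanings from the Raspberry Pi documentation; the function name `decode_throttled` is hypothetical, and it works on the command's output text rather than calling the tool itself.

```python
from typing import List

# Bit positions reported by `vcgencmd get_throttled`.
THROTTLE_FLAGS = {
    0: "under-voltage detected",
    1: "ARM frequency capped",
    2: "currently throttled",
    3: "soft temperature limit active",
    16: "under-voltage has occurred",
    17: "ARM frequency capping has occurred",
    18: "throttling has occurred",
    19: "soft temperature limit has occurred",
}


def decode_throttled(output: str) -> List[str]:
    """Turn 'throttled=0x...' into a list of human-readable flags."""
    value = int(output.strip().split("=", 1)[1], 16)
    return [
        name
        for bit, name in sorted(THROTTLE_FLAGS.items())
        if value & (1 << bit)
    ]
```

A result of `throttled=0x0` decodes to an empty list, meaning no undervoltage or throttling has been recorded since boot.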

              www.duet3d.com

                jbjhjm @T3P3Tony
                last edited by 14 Oct 2021, 11:12

                @t3p3tony thank you, will do!

                I applied a number of changes today, let's see how they work out:

                • using 3.4.0b5 now
                • disabled logging as suggested by @chrishamm - yesterday's print logged a whopping 1.5 GB. 😵
                • reduced webcam resolution
                • installed htop to track CPU/Mem performance
                • [external] modified my wifi setup; a wifi repeater was causing network performance issues. It would not surprise me if it also affected DWC / pi performance

                htop stats say that 70-90 % of CPU load is caused by a chromium process. Unfortunately chromium always runs many parallel processes so it is difficult to investigate what this is actually doing.
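Because Chromium spreads its work over many processes, summing CPU usage per command name can make the total easier to read. The Python sketch below does this for the text output of `ps -eo pcpu,comm`. The function name `cpu_by_command` is hypothetical, and note that `ps` reports CPU time averaged over each process's lifetime, not the live figure htop shows.

```python
from collections import defaultdict


def cpu_by_command(ps_output: str) -> dict:
    """Sum the %CPU column per command name, highest total first.

    Expects the output of `ps -eo pcpu,comm`, including its header line.
    """
    totals = defaultdict(float)
    for line in ps_output.strip().splitlines()[1:]:  # skip the header
        pcpu, _, command = line.strip().partition(" ")
        totals[command.strip()] += float(pcpu)
    return dict(sorted(totals.items(), key=lambda kv: kv[1], reverse=True))
```

The output could be captured with `subprocess.run(["ps", "-eo", "pcpu,comm"], capture_output=True, text=True).stdout`. Since `ps -e` lists processes rather than threads, each process is counted once.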

                  T3P3Tony administrators @jbjhjm
                  last edited by 14 Oct 2021, 12:17

                  @jbjhjm If you are not running DWC locally on the Pi, then you don't need to run Chromium at all. If you are running DWC on that Pi, then see what the load is with only DWC open.

                  www.duet3d.com

                    jbjhjm @T3P3Tony
                    last edited by 14 Oct 2021, 16:03

                    @t3p3tony no voltage issues reported so far by vcgencmd.
                    I'm not sure I understand what you meant with your last comment.

                    DWC is running on the Pi (at least as far as I understand Duet's SBC mode, all that is handled by the Pi while the mainboard only handles printing and reports back values?).
                    DWC also is the only opened chromium tab.
                    Nonetheless, chromium runs a bunch of different processes.
                    In Time/CPU columns you can see though that there is just one chromium process that uses lots of CPU time.

                    b686affd-ce45-475b-a89b-fb58d995947a-image.png

                      T3P3Tony administrators @jbjhjm
                      last edited by 14 Oct 2021, 16:32

                      @jbjhjm I mean are you running the Pi headless and connecting via a network interface on the Pi to the webserver, or do you have a screen connected to a pi and running DWC in a browser on the Pi?

                      www.duet3d.com

                        jbjhjm @T3P3Tony
                        last edited by 14 Oct 2021, 16:35

                        @t3p3tony ah now I get you. It's both, the pi has a permanently attached screen, and I often access DWC through network too for more complex tasks and if I'm not in the same room.

                          jbjhjm
                          last edited by jbjhjm 14 Oct 2021, 16:59

                          OK, it seems that the Pi's network connection crashed again just a few minutes ago; it's still listed in the router's active devices list and responds to pings, but DWC does not load anymore. The print is still being executed though.
                          So I checked what happened on the touch panel: Chrome displayed a white page plus a note that it had crashed, asking whether it should reload.
                          Now this is weird: I dismissed that message, exited fullscreen and then saw another instance of Chrome running below the crashed instance!
                          I have no clue why it is there. I did not tweak the startup routine provided by duetPi.
                          After closing the crashed chrome window, it seems that the network connection was recreated too...
                          Nothing really useful in the journal (I re-enabled logging hoping to hunt down networking issues). Just way too many network connection losses and reconnects. This is related to the bad wifi that I still have to improve. I disabled automatic switching between frequency bands (2.4/5 GHz); hopefully that will make the connection a bit more robust.

                            T3P3Tony administrators @jbjhjm
                            last edited by 14 Oct 2021, 17:11

                            @jbjhjm I hope @chrishamm has some ideas about what to look for in the logs as a cause of this.

                            www.duet3d.com

                              jbjhjm @T3P3Tony
                              last edited by 14 Oct 2021, 17:51

                              @t3p3tony when my next print is completed I'll do a full restart and check chrome status right after, if two instances are running and such.
                              If someone can point me in the right direction for finding the duetPi startup script, I'll check if there's anything unusual going on.

                              Attached bootlog.txt by the way.
                              Don't know enough about raspi + linux to spot anything useful unfortunately.

                                jbjhjm @jbjhjm
                                last edited by 14 Oct 2021, 18:15

                                The duplicate chromium seems to be related to beta5.
                                Just did a full restart and the screen showed a crashed chromium window right away.
                                This has never happened before so I'm quite sure it has to do with beta5.
                                Opened a bug report to discuss this separately.
                                https://forum.duet3d.com/topic/25542/3-4-b5-bug-chromium-crashes-on-startup-sbc

                                  T3P3Tony administrators @jbjhjm
                                  last edited by 15 Oct 2021, 09:58

                                  @jbjhjm As I outlined in the other thread, it has not really crashed as such; rather, it's showing that Chrome was not shut down properly. I will leave discussion of that to the other thread, but the huge number of Chrome tasks is unusual and I am not seeing those.

                                  www.duet3d.com

                                    jbjhjm @T3P3Tony
                                    last edited by 15 Oct 2021, 18:10

                                    @t3p3tony The chrome shutdown/crash is fixed by the solution proposed in the other topic!
                                    About the number of processes, that's my fault. I just noticed that htop was showing not only processes but every thread too. I do still see a dozen processes but that is not unusual for chrome.

                                    a9b58230-96a3-475c-9b3d-fbdcf8a3d1f6-image.png

                                      T3P3Tony administrators @jbjhjm
                                      last edited by 15 Oct 2021, 18:23

                                      @jbjhjm ok, so we still need to see if you have SPI disconnect errors now.

                                      www.duet3d.com

                                        jbjhjm @T3P3Tony
                                        last edited by 16 Oct 2021, 08:58

                                        @t3p3tony I will let you know if anything new occurs. Maybe b5 and the tweaked raspi settings helped to make it go away. As the error did not occur often in the past, I'll continue and observe for some days.

                                        Unless otherwise noted, all forum content is licensed under CC-BY-SA