Duet3D

CAN bus anomalies with 6HC and 3HC

Solved
Duet Hardware and wiring
  • undefined
    Phaedrux Moderator
    last edited by 14 Apr 2022, 04:01

    You could measure the continuity and resistance on the ribbon cable, that would tell us if it's acceptable or not.

    Z-Bot CoreXY Build | Thingiverse Profile

    • undefined
      chrishamm administrators @adammhaile
      last edited by 14 Apr 2022, 19:10

      @adammhaile is this with the new SanDisk card?

      Duet software engineer

      • undefined
        adammhaile @chrishamm
        last edited by 14 Apr 2022, 19:11

        @chrishamm said in CAN bus anomalies with 6HC and 3HC:

        is this with the new SanDisk card?

        Yes. Same card recommended above.

        • undefined
          chrishamm administrators @adammhaile
          last edited by 14 Apr 2022, 19:33

          @adammhaile Please check if the disconnects persist with the new card. If they do, I'll be happy to share a new firmware build that tells us whether the timeout is caused by the SBC or by RepRapFirmware. We've got another trace but I cannot comment on that one yet.

          Duet software engineer

          • undefined
            adammhaile @chrishamm
            last edited by adammhaile 14 Apr 2022, 19:37

            @chrishamm said in CAN bus anomalies with 6HC and 3HC:

            Please check if the disconnects persist with the new card. If they do, I'll be happy to share a new firmware build that tells us whether the timeout is caused by the SBC or by RepRapFirmware. We've got another trace but I cannot comment on that one yet.

            Sure - been running prints off this SD all morning. So far so good - but it was pretty random before so we'll see.
            And by timeouts do you mean the SPI connection reset?

            I'll keep putting this through its paces either through tomorrow morning or until it fails again - then I'll remove and pack up to send back.

            • undefined
              adammhaile @adammhaile
              last edited by 14 Apr 2022, 23:57

              Alright @chrishamm @Phaedrux @dc42 - had it printing since early this morning and now (8pm) it locked up while not printing... couldn't even run any commands to get diagnostics. I could talk to the Pi, but no comms with the controllers until I power cycled.
              I'm going to get these boards taken out of the machine now to

              • undefined
                T3P3Tony administrators @adammhaile
                last edited by 15 Apr 2022, 10:54

                @adammhaile thanks for confirming that. I hope the replacement sorts the issue.

                www.duet3d.com

                • undefined
                  adammhaile @T3P3Tony
                  last edited by adammhaile 25 Apr 2022, 13:52

                  @t3p3tony @chrishamm @Phaedrux
                  <sigh> Got the replacements, installed them, and all seemed to be going fine... but was just running a print and it stopped again mid-print. I unfortunately wasn't even able to view the duetcontrolserver log. I could try to run journalctl but it just never returned - and this was from the Pi terminal directly. I couldn't ssh into it.
                  I was able to get this though, which is the streaming output of the CodeLogger -t executed -q

                  0794bf43-04c7-4492-8cc9-27754bf19830-image.png

                  So, I'll admit - probably not something with the actual Duet boards - though I'm completely stumped as to what it could be.
                  Guess maybe I'll try swapping the Pi again - the fact that even SSH locks up is suspect to me. Implies that it's not the Duet failing... I guess?

                  I'm running mjpg-streamer and that gcode scroll on the display above from the SBC Pi - we've previously checked that the CPU usage is still low... but could it maybe still be one of those causing it? I've got other Duet 3 SBC machines that have a camera running off the same Pi without issue.

                  • undefined
                    Phaedrux Moderator
                    last edited by 26 Apr 2022, 05:45

                    Start eliminating extras until you find the smoking gun.

                    Z-Bot CoreXY Build | Thingiverse Profile

                    • undefined
                      adammhaile @Phaedrux
                      last edited by 26 Apr 2022, 15:44

                      @phaedrux said in CAN bus anomalies with 6HC and 3HC:

                      Start eliminating extras until you find the smoking gun.

                      Yup - working on it. Re-running the same ~12 hour print every day, changing one thing each time.

                      • undefined
                        chrishamm administrators @adammhaile
                        last edited by chrishamm 26 Apr 2022, 17:57

                        @adammhaile Those M905 codes are usually a symptom of excessive load on the Pi - DCS sends them whenever a scheduled delay takes +5s longer than expected (the current acceptable maximum for a 4s delay is already set to 9s, so plenty of time) or when the datetime has been changed (hence the corresponding messages in the duetcontrolserver journal).

                        So it would be interesting to see what actually prevents DCS from getting computing power and/or IO access to linked libs. Just to exclude IO from/to the microSD card, you could temporarily copy the entire DSF directory to /tmp and run DCS from there (that is from the Pi's memory):

                        cp -r /opt/dsf/bin /tmp/dsf
                        sudo systemctl stop duetcontrolserver
                        /tmp/dsf/DuetControlServer
                        

                        Just be aware that you'll have to keep the terminal where you run this open, else a potential print would be aborted. If the same disconnects persist, it must be related to the CPU, memory, or kernel. It would be interesting to see what happens then.
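One thing worth checking before relying on this test: /tmp is only held in memory if it is mounted as tmpfs, which depends on how the image is configured. The sketch below assumes GNU coreutils `stat` is available; the helper names `fs_type` and `is_ram_fs` are made up for illustration and are not part of DSF.

```shell
#!/bin/sh
# Hypothetical helpers (not part of DSF): check whether a directory lives on a
# RAM-backed filesystem before copying DCS there.

# Print the filesystem type of the given path (GNU coreutils stat).
fs_type() {
    stat -f -c %T "$1"
}

# Succeed only for filesystem types that are held in memory.
is_ram_fs() {
    case "$1" in
        tmpfs|ramfs) return 0 ;;
        *) return 1 ;;
    esac
}

if is_ram_fs "$(fs_type /tmp)"; then
    echo "/tmp is RAM-backed"
else
    echo "/tmp is not RAM-backed"
fi
```

If /tmp turns out to be on the microSD card, copying to a tmpfs mount instead (for example /dev/shm, if it allows executing files on the system in question) may serve the same purpose.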

                        Duet software engineer

                        • undefined
                          adammhaile @chrishamm
                          last edited by 26 Apr 2022, 20:59

                          @chrishamm Interesting - will keep that in mind.
                          Strangely the SD card speed test is exactly the same with the new Duet boards, new Pi, and new faster SD card (the one recommended above). So I'd say that test is a red herring. I'm currently seeing 150+ MB/s reads on this card from the Pi.

                          Will test with DCS in RAM if my current batch of tests fails to rule anything out.

                          • undefined
                            adammhaile
                            last edited by 29 Apr 2022, 04:26

                            @t3p3tony @chrishamm @Phaedrux @dc42
                            Update: I've been stress testing the machine after swapping the Pi again and tweaking some of the wiring runs to help avoid signal noise. The last 4 days I've started a 12 hour print every morning so that I could have it printing as long as possible without having to leave it running while I'm asleep.

                            The first 3 days went beautifully but today I got another of the same failures.
                            However, this time was a little different...

                            Previously the SBC Pi was running a dhcp server to provide an IP for a direct ethernet connected Pi running android and a touch screen DWC interface.
                            I ditched that when I rebuilt the DuetPi Lite image and instead installed a small WiFi/Ethernet router. So now the SBC Pi and DWC Android Pi are wired to that and I can connect to the machine over its own WiFi access point (this is so I can get to it at MRRF 🙂 ).

                            The advantage of this was that previously when it would fail, I would never be able to connect to the Pi over SSH which limited my ability to diagnose. But this time I could hit its ethernet connection (which, unlike the WiFi, doesn't go down when the failure occurs it would seem) via the WiFi router I added.

                            So that led me to digging into various system logs and eventually dmesg, where I found a stream of errors, such as these:

                            [31055.833034] ieee80211 phy0: brcmf_cfg80211_get_station: GET STA INFO failed, -110
                            [31062.393030] ieee80211 phy0: brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg failed w/status -110
                            [31064.953033] ieee80211 phy0: brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg failed w/status -110
                            [31064.953060] ieee80211 phy0: brcmf_cfg80211_get_station: GET STA INFO failed, -110
                            [31071.513043] ieee80211 phy0: brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg failed w/status -110
                            
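For context, status -110 is the Linux kernel's ETIMEDOUT error code, which suggests that requests from the brcmfmac WiFi driver to the WiFi chip were timing out. A minimal sketch for counting such lines in kernel log output follows; the function name `count_brcmf_timeouts` and the match pattern are illustrative assumptions, not part of any Duet tooling.

```shell
#!/bin/sh
# Hypothetical helper: count brcmf* kernel messages that report status -110
# (ETIMEDOUT). Reads log text from standard input and prints the count.
count_brcmf_timeouts() {
    # grep -c prints 0 and exits non-zero when nothing matches;
    # "|| true" keeps the function usable under "set -e".
    grep -c 'brcmf.*-110' || true
}

# Usage (may need sudo, depending on kernel settings):
#   dmesg | count_brcmf_timeouts
```

A count that keeps growing between checks would point at the same WiFi failure rather than a one-off glitch.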

                            And it was those that led me to this Duet Forum post:

                            https://forum.duet3d.com/topic/23889/pi4-network-disconnect

                            Now, the crazy thing is: they were seeing exactly what I was.

                            • SPI comms reset
                            • WiFi dropping
                            • Print stopping midway through and DWC showing it was done
                            • Using a Pi 4 with a Duet 3 (mini, but still)

                            Fortunately, they seem to have a solution! In that post they recommended not using 5GHz networks and disabling the WiFi power management features. From further research it seems that 5GHz WiFi is somewhat of an ongoing issue with the Pi 4 and it will sometimes just kill WiFi when it shouldn't... Now, why that seems to affect SPI, I have no idea. But they seem to be related...
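For anyone wanting to try the power management part, here is a rough sketch using the `iw` tool. The interface name wlan0 is an assumption, and `run`, `disable_wifi_power_save` and `show_wifi_power_save` are hypothetical helper names; the linked post may have used a different method.

```shell
#!/bin/sh
# Hypothetical wrapper: with DRY_RUN=1 it prints the command instead of
# executing it, which is handy for reviewing what would be run.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Turn off WiFi power management for one interface (wlan0 is an assumption).
disable_wifi_power_save() {
    iface="${1:-wlan0}"
    run sudo iw dev "$iface" set power_save off
}

# Query the current power save setting of the interface.
show_wifi_power_save() {
    iface="${1:-wlan0}"
    run iw dev "$iface" get power_save
}
```

Running `DRY_RUN=1 disable_wifi_power_save wlan0` prints the command without executing it. A setting made this way does not survive a reboot, so it would have to be reapplied at startup.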

                            While I could disable power saving on WiFi, the 2.4GHz network part was a problem for silly reasons with my home WiFi access point.
                            So, for now I've switched to only using ethernet and have connected a separate USB ethernet adapter for the connection between the SBC Pi and the DWC Android Pi.

                            Whether or not this will also fix the problem for me... time will tell. I'll be back to long test prints tomorrow morning and let you know how it goes in a few days.
                            But so far this seems promising given they were seeing pretty identical symptoms. My fallback at this point is tearing a Pi 3B+ out of something else in my house and using that. I have another Duet 3 machine with a 3B+ and it's been rock solid.

                            • undefined
                              chrishamm administrators @adammhaile
                              last edited by 29 Apr 2022, 10:47

                              @adammhaile Thanks for the info, that's very good to know. When I build the next DuetPi image, I'll disable WiFi power saving by default.

                              Duet software engineer

                              • undefined
                                adammhaile @chrishamm
                                last edited by 29 Apr 2022, 11:25

                                @chrishamm said in CAN bus anomalies with 6HC and 3HC:

                                @adammhaile Thanks for the info, that's very good to know. When I build the next DuetPi image, I'll disable WiFi power saving by default.

                                Awesome.
                                Silly request while you are in there... is there any chance you would consider setting up the dsf user to have a home directory and default shell so that you can log into it?
                                With complex configurations, such as mine, it's extremely helpful to be able to ssh directly as that user. I currently use usermod to do this on each Duet SBC I have and then use Visual Studio Code's remote features to directly connect to the sd directory contents over SSH. This has also allowed me to keep the entire config backed up on GitHub.

                                • undefined
                                  adammhaile @chrishamm
                                  last edited by 13 May 2022, 16:02

                                  @chrishamm @dc42 @Phaedrux @T3P3Tony
                                  Just wanted to follow up on this and let you know that, at this point, I'm quite confident we found the problem. So, thank you all for sticking with me through all the ups and downs of solving this. It's been a crazy ride for sure.
                                  And hopefully with the patches made to the Pi image this won't be a problem for people in the future who want to use a Pi4.

                                  • undefined
                                    T3P3Tony administrators @adammhaile
                                    last edited by 13 May 2022, 16:20

                                    @adammhaile thank you for following up, and for your patience and help in finding the issue. It's only through such valuable (and patient!) feedback that we can support such a wide range of machines, configurations and uses.

                                    www.duet3d.com

                                    • undefined Phaedrux marked this topic as a question 13 May 2022, 18:06
                                    • undefined Phaedrux has marked this topic as solved 13 May 2022, 18:06
                                    • undefined
                                      chrishamm administrators @adammhaile
                                      last edited by 13 May 2022, 19:59

                                      @adammhaile Thanks again for the feedback - looks like I missed your previous question. You should be able to assign a default home directory to the dsf user (usermod -d /home/dsf dsf) and allow it to log in (chsh -s /bin/bash dsf + passwd dsf). At least on DuetPi the pi user should have access to the dsf files as well.
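Put together, the commands above might look like the following sketch. It assumes the home directory does not exist yet and has to be created first; `run` and `setup_dsf_login` are hypothetical helper names, and /home/dsf and /bin/bash are simply the values from the post.

```shell
#!/bin/sh
# Hypothetical wrapper: with DRY_RUN=1 it prints the command instead of
# executing it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Give a service user a home directory, a login shell and a password.
setup_dsf_login() {
    user="${1:-dsf}"
    home="/home/$user"
    run sudo mkdir -p "$home"            # assumption: directory is missing
    run sudo chown "$user:" "$home"      # owner plus that user's login group
    run sudo usermod -d "$home" "$user"  # register the new home directory
    run sudo chsh -s /bin/bash "$user"   # allow interactive logins
    run sudo passwd "$user"              # prompts for a new password
}
```

`DRY_RUN=1 setup_dsf_login` lists the commands without changing anything. Note that usermod may refuse to change the home directory while processes owned by that user are running, so the DSF services might need to be stopped first.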

                                      Duet software engineer

                                      • undefined
                                        adammhaile @chrishamm
                                        last edited by 13 May 2022, 20:03

                                        @chrishamm said in CAN bus anomalies with 6HC and 3HC:

                                        At least on DuetPi the pi user should have access to the dsf files as well.

                                        Read yes, write no. And if I make a file via the pi user it seems that DWC doesn't see it. It was kind of weird... it didn't act quite right.
                                        And yes, I did exactly what you said to make the directory and set the shell. Just thought there was no reason it couldn't be set up like that from the start 🙂

                                        Unless otherwise noted, all forum content is licensed under CC-BY-SA