Duet3D Logo Duet3D
    • Tags
    • Documentation
    • Order
    • Register
    • Login

    Chasing bad SPI checksums.

    Scheduled Pinned Locked Moved Solved
    Beta Firmware
    7
    24
    1.1k
    Loading More Posts
    • Oldest to Newest
    • Newest to Oldest
    • Most Votes
    Reply
    • Reply as topic
    Log in to reply
    This topic has been deleted. Only users with topic management privileges can see it.
    • dc42undefined
      dc42 administrators @Danal
      last edited by

      They are between the Pi and the Duet.

      Duet WiFi hardware designer and firmware engineer
      Please do not ask me for Duet support via PM or email, use the forum
      http://www.escher3d.com, https://miscsolutions.wordpress.com

      1 Reply Last reply Reply Quote 0
      • T3P3Tonyundefined
        T3P3Tony administrators @Danal
        last edited by

        @Danal what speed are you running the SPI bus at?

        www.duet3d.com

        1 Reply Last reply Reply Quote 0
        • Danalundefined
          Danal
          last edited by

          Hmmm... to all of my reading, this is determined by the application at the "open" of the connection.

          The app specifies this as a divisor of the system clock frequency. That means two things that we should think about:

          1. This is an RPI 4, and therefore may operate at a higher clock speed than other SBCs that have been tested with the Duet 3 6HC. Have you guys used an R4?

          2. When the System-on-a-Chip manages thermals, it can vary the clocks without notice. This does affect the "divided down" SPI speed. And, it can occur in the middle of an SPI transaction. In theory, given that SPI uses explicit clocking, this shouldn't hurt anything. In practice? There seems to be lots of discussion around this and no firm conclusions.

          Delta / Kossel printer fanatic

          T3P3Tonyundefined 1 Reply Last reply Reply Quote 0
          • Danalundefined
            Danal
            last edited by

            And... I found this in the config.json for the SPI device driver in the duetsoftwareframework:

             "SpiDevice": "/dev/spidev0.0",
             "SpiFrequency": 2000000,
             "SpiTransferTimeout": 500,
             "MaxSpiRetries": 3,
             "SpiPollDelay": 25,

            Delta / Kossel printer fanatic

            1 Reply Last reply Reply Quote 0
            • T3P3Tonyundefined
              T3P3Tony administrators @Danal
              last edited by

              @Danal said in Bad checksums - Which connection?:

              Hmmm... to all of my reading, this is determined by the application at the "open" of the connection.

              The app specifies this as a divisor of the system clock frequency. That means two things that we should think about:

              1. This is an RPI 4, and therefore may operate at a higher clock speed than other SBCs that have been tested with the Duet 3 6HC. Have you guys used an R4?

              2. When the System-on-a-Chip manages thermals, it can vary the clocks without notice. This does affect the "divided down" SPI speed. And, it can occur in the middle of an SPI transaction. In theory, given that SPI uses explicit clocking, this shouldn't hurt anything. In practice? There seems to be lots of discussion around this and no firm conclusions.

              I use a RPI4 on my Toolchanger and i beleive a few other users are using them as well. I do have a heatsink case on mine though.

              Yes in config.json the SPI speed is set. 2Mhz is not that high - it is capable of much higher - but you might want to try 1Mhz to see if it makes any difference to the CRC errors. Also normal things like keeping the ribbon cable length short and not too close to other noise emitters. Also i assume you are using up todate DSF/RRF?

              www.duet3d.com

              1 Reply Last reply Reply Quote 0
              • Danalundefined
                Danal
                last edited by

                The ribbon cable is the one in the box with the 6HC. The RPI is mounted "alongside" (I'll get a photo in a few min).

                Up to date: sudo apt-get update followed by upgrade does not upgrade anything.

                I am on the duet3d.list; should I be on duet3d-unstable.list?

                Delta / Kossel printer fanatic

                T3P3Tonyundefined 1 Reply Last reply Reply Quote 0
                • T3P3Tonyundefined
                  T3P3Tony administrators @Danal
                  last edited by

                  @Danal I think that stable and unstable match right now. Unstable gives me:

                  duetcontrolserver 1.1.0.5
                  duetruntime 1.1.0.5
                  duetsd 1.0.4
                  duetsoftwareframework 1.1.0.5
                  duettools 1.1.0.5
                  duetwebcontrol 2.0.4-1
                  duetwebserver 1.1.2.1

                  The ribbon cable out the box is short and the mounting scheme is as i have it so that all looks good.

                  www.duet3d.com

                  1 Reply Last reply Reply Quote 0
                  • Danalundefined
                    Danal
                    last edited by

                    I will give 1000000 a try, for debugging if nothing else.

                    Delta / Kossel printer fanatic

                    1 Reply Last reply Reply Quote 0
                    • Danalundefined
                      Danal
                      last edited by Danal

                      And, I do have a restart that occured about an hour into a multi hour print.

                      Edit: There are a few bad checksums before the restart; at the same time, there have been similar messages that did not cause a restart.

                      reset-subset.txt

                      Delta / Kossel printer fanatic

                      chrishammundefined 1 Reply Last reply Reply Quote 0
                      • T3P3Tonyundefined
                        T3P3Tony administrators
                        last edited by

                        so you got a CRC arror that caused a reset? of what RRF, DCS or? Hopefully @chrishamm can investigate

                        www.duet3d.com

                        1 Reply Last reply Reply Quote 0
                        • chrishammundefined
                          chrishamm administrators @Danal
                          last edited by

                          @Danal It looks like your Duet 3 restarted/crashed before the print was cancelled:

                          Nov 08 00:41:17 duet3 DuetControlServer[351]: [warn] Controller has been reset

                          What RRF version are you using, can you share the output of M122?

                          Duet software engineer

                          Danalundefined 1 Reply Last reply Reply Quote 0
                          • wilrikerundefined
                            wilriker
                            last edited by

                            Just a short one from me: I have it running with a RPi4 at ~20.83MHz (set it to 21MHz and it will be get to the actual value through core clock divisor) stable. But I also have fixed my core clock in /boot/config.txt to not have those uncontrollable frequency changes:

                            core_freq=250
                            core_freq_min=250
                            

                            Manuel
                            Duet 3 6HC (v0.6) with RPi 4B on a custom Cartesian
                            with probably always latest firmware/DWC (incl. betas or self-compiled)
                            My Tool Collection

                            Danalundefined 1 Reply Last reply Reply Quote 0
                            • Danalundefined
                              Danal @chrishamm
                              last edited by

                              @chrishamm said in Bad checksums - Which connection?:

                              @Danal It looks like your Duet 3 restarted/crashed before the print was cancelled:

                              Nov 08 00:41:17 duet3 DuetControlServer[351]: [warn] Controller has been reset

                              What RRF version are you using, can you share the output of M122?

                              Board: Duet 3 MB6HC (MB6HC)
                              DSF Version: 1.1.0.5
                              Firmware: RepRapFirmware for Duet 3 MB6HC v0.6 or 1.0 3.0beta12 (2019-11-02b1)
                              Duet Web Control 2.0.4

                              Unfortunately, I did not obtain M122 prior to a power cycle.

                              Delta / Kossel printer fanatic

                              1 Reply Last reply Reply Quote 0
                              • Danalundefined
                                Danal @wilriker
                                last edited by

                                @wilriker said in Bad checksums - Which connection?:

                                Just a short one from me: I have it running with a RPi4 at ~20.83MHz (set it to 21MHz and it will be get to the actual value through core clock divisor) stable. But I also have fixed my core clock in /boot/config.txt to not have those uncontrollable frequency changes:

                                core_freq=250
                                core_freq_min=250
                                

                                I was thinking about trying that, as the Pi4 is plenty fast enough.

                                Thanks for the detailed settings, I will do that next.

                                Delta / Kossel printer fanatic

                                1 Reply Last reply Reply Quote 0
                                • T3P3Tonyundefined
                                  T3P3Tony administrators
                                  last edited by

                                  @Danal said in Bad checksums - Which connection?:

                                  core_freq=250
                                  core_freq_min=250

                                  @wilriker are there any downsides to this approach?

                                  www.duet3d.com

                                  T3P3Tonyundefined wilrikerundefined 2 Replies Last reply Reply Quote 0
                                  • T3P3Tonyundefined
                                    T3P3Tony administrators @T3P3Tony
                                    last edited by

                                    @T3P3Tony I also see from the documentation:

                                    For Pi 4B the default value is 500, 600 is the only other accepted value;

                                    www.duet3d.com

                                    1 Reply Last reply Reply Quote 0
                                    • wilrikerundefined
                                      wilriker @T3P3Tony
                                      last edited by

                                      @T3P3Tony said in Bad checksums - Which connection?:

                                      @Danal said in Bad checksums - Which connection?:

                                      core_freq=250
                                      core_freq_min=250

                                      @wilriker are there any downsides to this approach?

                                      CPU scales independently from the core clock and I could not find what else this has an influence on. So far I have not encountered any downsides.

                                      Re 500 and 600: that's for overclocking. But you can always underclock. Confirmed on mine.

                                      I will post some more details findings later. In short SPI clock is only about 40% of what your set it to on RPi 4 unless you fix the core clock.

                                      Manuel
                                      Duet 3 6HC (v0.6) with RPi 4B on a custom Cartesian
                                      with probably always latest firmware/DWC (incl. betas or self-compiled)
                                      My Tool Collection

                                      1 Reply Last reply Reply Quote 1
                                      • T3P3Tonyundefined
                                        T3P3Tony administrators
                                        last edited by

                                        @T3P3Tony said in Bad checksums - Which connection?:

                                        core_freq=250
                                        core_freq_min=250

                                        but i i want to be greedy, don't care about power, and have a heatsink i would potentially fix this at 500 or 600?

                                        I just checked (vcgencmd measure_clock arm) and i am running at 600 right now (probably due to using VNC with a big screen resolution which is quite processor intensive i guess).

                                        might test:

                                        core_freq=600
                                        core_freq_min=600
                                        

                                        Also what is not clear is what the actual difference between doing that and using force_turbo=1
                                        is

                                        www.duet3d.com

                                        wilrikerundefined 1 Reply Last reply Reply Quote 0
                                        • wilrikerundefined
                                          wilriker @T3P3Tony
                                          last edited by

                                          @T3P3Tony said in Bad checksums - Which connection?:

                                          @T3P3Tony said in Bad checksums - Which connection?:

                                          core_freq=250
                                          core_freq_min=250

                                          but i i want to be greedy, don't care about power, and have a heatsink i would potentially fix this at 500 or 600?

                                          I don't think this brings very much improvement but yes, you could fix that at 500 (I would not overclock something I don't even know what it all influences).

                                          I just had a thought that it might influence RAM clock but I have no evidence to back that up.

                                          I just checked (vcgencmd measure_clock arm) and i am running at 600 right now (probably due to using VNC with a big screen resolution which is quite processor intensive i guess).

                                          That's the CPU clock you checked. So although you are running VNC it's still stepped down to it's minimum speed. 😁 It's the clock core you were looking for.

                                          Also what is not clear is what the actual difference between doing that and using force_turbo=1
                                          is

                                          force_turbo=1 will do some more things, some of them being very sketchy when it comes to system stability - but it also blows an "efuse" (i.e. it sets a non-reversible flag) to void your warranty.

                                          Now to my (most likely confusing) explanation about SPI speed: what you set is the maximum frequency. For the first two Pi generations that is equal to the frequency at any point in time but starting with RPi 3 this changed. Before that the core clock (which is the base for the SPI frequency) was fixed at 250MHz. RPi 3 has a scaling core clock 250-400MHz. And RPi 4 has a range of 200-500MHz (that lower minimum will bite is later).

                                          SPI frequency is derived from the core frequency divided by an even number and also scales with it. And since all RPis should be interchangeable they all use the common denominator of 250MHz for that. The first even number is 2 and thus 125MHz is the maximum SPI frequency any RPi can achieve.
                                          But since they introduced possibly higher core frequencies with RPi 3 and later they had to make sure that the above mentioned maximum frequency is not exceeded. They solved it by first deriving the actual maximum by checking for the next lower divider (e.g. setting to 21MHz will result in 125/6 =~ 20.83MHz) and then scale that value again so this will only be reached at maximum core clock.

                                          Simple example: set SPI frequency to 125MHz. On a RPi 4 with a maximum of 500MHz this means when the core clock actually runs at 250MHz the SPI frequency will also be halfed to 62.5MHz. And now the core clock in RPi 4 can even go down below that to 200MHz and guess what: SPI frequency will follow that as well.

                                          The biggest problem here is that whenever core clock changes SPI frequency will change with it not awaiting any data transfer to finish - some devices handle this better than others.
                                          And of course also you get lower speeds than what you requested without even noticing most of the time since with current resource usage when attached to a Duet 3 mine runs at only insignificantly higher usage than fully idle. core clock in mine was virtually fixed at 200MHz.

                                          That's why I fixed my minimum (new with RPi 4) and maximum to 250MHz. I don't know how it influences SPI frequency if you fix it at any value above that (my scope is orders of magnitude too slow to check that - might get myself a more capable one some day) and whenever I try to formulate a guess in my head I find a reason why this might still be different. So I just don't know.

                                          Manuel
                                          Duet 3 6HC (v0.6) with RPi 4B on a custom Cartesian
                                          with probably always latest firmware/DWC (incl. betas or self-compiled)
                                          My Tool Collection

                                          gtj0undefined 1 Reply Last reply Reply Quote 0
                                          • gtj0undefined
                                            gtj0 @wilriker
                                            last edited by

                                            @wilriker Can I just say "ew"?

                                            Get a Jetson Nano. 🤣

                                            1 Reply Last reply Reply Quote 0
                                            • First post
                                              Last post
                                            Unless otherwise noted, all forum content is licensed under CC-BY-SA