Duet3D Logo Duet3D
    • Tags
    • Documentation
    • Order
    • Register
    • Login

    [3.6.0-beta.3] DWC connection issues Duet2 / Duex5

    Scheduled Pinned Locked Moved
    Beta Firmware
    6
    33
    960
    Loading More Posts
    • Oldest to Newest
    • Newest to Oldest
    • Most Votes
    Reply
    • Reply as topic
    Log in to reply
    This topic has been deleted. Only users with topic management privileges can see it.
    • timschneiderundefined
      timschneider @timschneider
      last edited by

      @dc42
      can you give me a hint what I can do to track this error down?

      I'm able to reproduce it within seconds.

      Thanks
      Tim

      DaveAundefined droftartsundefined 2 Replies Last reply Reply Quote 0
      • DaveAundefined
        DaveA @timschneider
        last edited by

        @timschneider
        Is it possible you have a variation of my problem which has been solver?

        https://forum.duet3d.com/topic/37329/3-6-0-beta-3-duet-2-issues-lost-communication/3

        1 Reply Last reply Reply Quote 0
        • droftartsundefined
          droftarts administrators @timschneider
          last edited by

          @timschneider I've updated my Mini 5+ to WiFi firmware v2.2.1, which is the beta.3 release, and tried the same test, but I don't get a disconnect. I hit M122 repeatedly 22 times. It seems like I only got 18 responses, but may be I'd hit return before the M122 had returned fully. I also tested with the WiFi firmware v2.2.0 (in the 3.5.4 release), and that was fine.

          Can you revert the WiFi firmware to v2.2.0 and see if still causes the same problem?

          @DaveA May possibly be related, but yours was specific to the Duet 2 Ethernet, this is WiFi. But then I don't write firmware!

          Ian

          Bed-slinger - Mini5+ WiFi/1LC | RRP Fisher v1 - D2 WiFi | Polargraph - D2 WiFi | TronXY X5S - 6HC/Roto | CNC router - 6HC | Tractus3D T1250 - D2 Eth

          timschneiderundefined 1 Reply Last reply Reply Quote 0
          • timschneiderundefined
            timschneider @droftarts
            last edited by

            @droftarts
            thank you for your tests!

            I was able to hook up via USB and grab some debug logs.

            It is somehow related to the connection mangement and kind of ran out of buffer, even if it appears that the buffers are wasted.

            HTTP req, command words { GET /rr_model HTTP/1.0 }, parameters { key= flags=fd99 }<\n>
            New conn on socket 2 for local port 80<\n>
            HTTP connection accepted<\n>
            Found responder<\n>
            Received 287 bytes<\n>
            HTTP req, command words { GET /rr_model HTTP/1.0 }, parameters { key= flags=fd99 }<\n>
            New conn on socket 3 for local port 80<\n>
            New conn on socket 3 for local port 80<\n>
            New conn on socket 3 for local port 80<\n>
            New conn on socket 3 for local port 80<\n>
            New conn on socket 3 for local port 80<\n>
            New conn on socket 3 for local port 80<\n>
            ...
            New conn on socket 3 for local port 80<\n>
            New conn on socket 3 for local port 80<\n>
            New conn on socket 3 for local port 80<\n>
            New conn on socket 3 for local port 80<\n>
            New conn on socket 3 for local port 80<\n>
            New conn on socket 3 for local port 80<\n>
            New conn on socket 3 for local port 80<\n>
            New conn on socket 3 for local port 80<\n>
            New conn on socket 3 for local port 80<\n>
            New conn on socket 3 for local port 80<\n>
            Ran out of output buffers at ../src/Networking/HttpResponder.cpp(1109)<\n>
            New conn on socket 3 for local port 80<\n>
            HTTP connection accepted<\n>
            Found responder<\n>
            Received 287 bytes<\n>
            
            HTTP req, command words { GET /rr_model HTTP/1.0 }, parameters { key= flags=fd99 }<\n>
            New conn on socket 0 for local port 80<\n>
            New conn on socket 0 for local port 80<\n>
            New conn on socket 0 for local port 80<\n>
            New conn on socket 0 for local port 80<\n>
            New conn on socket 0 for local port 80<\n>
            New conn on socket 0 for local port 80<\n>
            New conn on socket 0 for local port 80<\n>
            New conn on socket 0 for local port 80<\n>
            New conn on socket 0 for local port 80<\n>
            New conn on socket 0 for local port 80<\n>
            New conn on socket 0 for local port 80<\n>
            New conn on socket 0 for local port 80<\n>
            New conn on socket 0 for local port 80<\n>
            Ran out of output buffers at ../src/Networking/HttpResponder.cpp(1109)<\n>
            New conn on socket 0 for local port 80<\n>
            New conn on socket 0 for local port 80<\n>
            HTTP connection accepted<\n>
            Found responder<\n>
            Received 302 bytes<\n>
            New conn on socket 1 for local port 80<\n>
            
            New conn on socket 1 for local port 80<\n>
            New conn on socket 1 for local port 80<\n>
            New conn on socket 1 for local port 80<\n>
            New conn on socket 1 for local port 80<\n>
            New conn on socket 1 for local port 80<\n>
            New conn on socket 1 for local port 80<\n>
            Ran out of output buffers at ../src/Networking/HttpResponder.cpp(1109)<\n>
            New conn on socket 1 for local port 80<\n>
            New conn on socket 1 for local port 80<\n>
            HTTP connection accepted<\n>
            Found responder<\n>
            Received 288 bytes<\n>
            HTTP req, command words { GET /rr_model HTTP/1.0 }, parameters { key=job flags=d5 }<\n>
            Ran out of output buffers at ../src/Networking/HttpResponder.cpp(1109)<\n>
            Ran out of output buffers at ../src/Networking/HttpResponder.cpp(1109)<\n>
            New conn on socket 0 for local port 80<\n>
            HTTP connection accepted<\n>
            Found responder<\n>
            Received 419 bytes<\n>
            Sending reply, file = no<\n>
            HTTP req, command words { GET /rr_reply HTTP/1.0 }, parameters { }<\n>
            Sending G-Code reply to HTTP client 1 of 2 (length 3324)<\n>
            New conn on socket 0 for local port 80<\n>
            HTTP connection accepted<\n>
            Found responder<\n>
            Received 431 bytes<\n>
            HTTP req, command words { GET /rr_model HTTP/1.0 }, parameters { flags=d99fn }<\n>
            New conn on socket 2 for local port 80<\n>
            HTTP connection accepted<\n>
            Found responder<\n>
            
            New conn on socket 1 for local port 80<\n>
            New conn on socket 1 for local port 80<\n>
            Ran out of output buffers at ../src/Networking/HttpResponder.cpp(1109)<\n>
            New conn on socket 1 for local port 80<\n>
            HTTP connection accepted<\n>
            Found responder<\n>
            Received 288 bytes<\n>
            HTTP req, command words { GET /rr_model HTTP/1.0 }, parameters { key=job flags=d5 }<\n>
            Ran out of output buffers at ../src/Networking/HttpResponder.cpp(1109)<\n>
            Ran out of output buffers at ../src/Networking/HttpResponder.cpp(1109)<\n>
            Ran out of output buffers at ../src/Networking/HttpResponder.cpp(1109)<\n>
            New conn on socket 0 for local port 80<\n>
            HTTP connection accepted<\n>
            Found responder<\n>
            Received 445 bytes<\n>
            HTTP req, command words { GET /rr_gcode HTTP/1.0 }, parameters { gcode=M122 }<\n>
            Sending JSON reply, length 12<\n>
            New conn on socket 0 for local port 80<\n>
            HTTP connection accepted<\n>
            Found responder<\n>
            Received 288 bytes<\n>
            New conn on socket 1 for local port 80<\n>
            HTTP connection accepted<\n>
            Found responder<\n>
            Received 288 bytes<\n>
            New conn on socket 2 for local port 80<\n>
            HTTP connection accepted<\n>
            Found responder<\n>
            Received 431 bytes<\n>
            New conn on socket 3 for local port 80<\n>
            HTTP connection accepted<\n>
            Found responder<\n>
            Received 288 bytes<\n>
            Peer disconnected on socket 2<\n>
            New conn on socket 2 for local port 80<\n>
            HTTP connection accepted<\n>
            Found responder<\n>
            Received 302 bytes<\n>
            Timed out gcodeReply, free buffers=1<\n>
            Timed out gcodeReply, free buffers=2<\n>
            Timed out gcodeReply, free buffers=3<\n>
            Timed out gcodeReply, free buffers=5<\n>
            Timed out gcodeReply, free buffers=7<\n>
            Timed out gcodeReply, free buffers=9<\n>
            Timed out gcodeReply, free buffers=11<\n>
            Timed out gcodeReply, free buffers=12<\n>
            Timed out gcodeReply, free buffers=13<\n>
            Released gcodeReply, free buffers=26<\n>
            Released gcodeReply, free buffers=26<\n>
            Peer disconnected on socket 1<\n>
            Peer disconnected on socket 0<\n>
            New conn on socket 1 for local port 80<\n>
            HTTP connection accepted<\n>
            Found responder<\n>
            Received 309 bytes<\n>
            HTTP req, command words { GET /rr_model HTTP/1.0 }, parameters { key=sensors.filamentMonitors flags=d4 }<\n>
            Sending reply, file = no<\n>
            Webserver: rejecting message with: 401 Not authorized<\n>
            New conn on socket 0 for local port 80<\n>
            HTTP connection accepted<\n>
            Found responder<\n>
            Received 302 bytes<\n>
            Sending reply, file = no<\n>
            HTTP req, command words { GET /rr_model HTTP/1.0 }, parameters { key=move.compensation flags=d4 }<\n>
            Webserver: rejecting message with: 401 Not authorized<\n>
            New conn on socket 1 for local port 80<\n>
            HTTP connection accepted<\n>
            Found responder<\n>
            

            Full logs attached
            output_2025-01-27_14-41-50.txt output_2025-01-27_14-26-22.txt output_2025-01-27_14-16-17.txt output_2025-01-27_14-14-06.txt output_2025-01-27_14-12-58.txt

            timschneiderundefined 1 Reply Last reply Reply Quote 0
            • timschneiderundefined
              timschneider @timschneider
              last edited by

              @droftarts
              it is on both WiFi Versions - just checked v2.2.0

              HTTP req, command words { GET /rr_gcode HTTP/1.0 }, parameters { gcode=m122 }<\n>
              Sending JSON reply, length 12<\n>
              Ran out of output buffers at ../src/Networking/HttpResponder.cpp(1109)<\n>
              Ran out of output buffers at ../src/Networking/HttpResponder.cpp(1109)<\n>
              New conn on socket 0 for local port 80<\n>
              HTTP connection accepted<\n>
              Found responder<\n>
              Received 287 bytes<\n>
              New conn on socket 1 for local port 80<\n>
              HTTP connection accepted<\n>
              Found responder<\n>
              Received 308 bytes<\n>
              New conn on socket 2 for local port 80<\n>
              HTTP connection accepted<\n>
              Found responder<\n>
              Received 442 bytes<\n>
              Peer disconnected on socket 2<\n>
              New conn on socket 2 for local port 80<\n>
              HTTP connection accepted<\n>
              Found responder<\n>
              Received 458 bytes<\n>
              Timed out gcodeReply, free buffers=1<\n>
              Timed out gcodeReply, free buffers=2<\n>
              Timed out gcodeReply, free buffers=3<\n>
              Timed out gcodeReply, free buffers=5<\n>
              Timed out gcodeReply, free buffers=7<\n>
              Timed out gcodeReply, free buffers=9<\n>
              Timed out gcodeReply, free buffers=11<\n>
              Timed out gcodeReply, free buffers=12<\n>
              Timed out gcodeReply, free buffers=13<\n>
              Released gcodeReply, free buffers=26<\n>
              Released gcodeReply, free buffers=26<\n>
              Peer disconnected on socket 2<\n>
              New conn on socket 2 for local port 80<\n>
              HTTP connection accepted<\n>
              

              full log
              output_2025-01-27_14-56-34.txt

              droftartsundefined 1 Reply Last reply Reply Quote 0
              • droftartsundefined
                droftarts administrators @timschneider
                last edited by

                @timschneider Great, thanks, I'll poke @dc42 with this info.

                Ian

                Bed-slinger - Mini5+ WiFi/1LC | RRP Fisher v1 - D2 WiFi | Polargraph - D2 WiFi | TronXY X5S - 6HC/Roto | CNC router - 6HC | Tractus3D T1250 - D2 Eth

                1 Reply Last reply Reply Quote 0
                • chrishammundefined
                  chrishamm administrators @timschneider
                  last edited by

                  @timschneider What HTTP clients do you have connected and do you have a PanelDue attached, too? It looks like the G-code reply (e.g. for M122) queues up because at least one client does not request it in regular intervals.

                  Duet software engineer

                  timschneiderundefined 1 Reply Last reply Reply Quote 0
                  • timschneiderundefined
                    timschneider @chrishamm
                    last edited by timschneider

                    @chrishamm ah good catch.

                    in the logs above I've one DWC and sometimes more, upto 6 DWCs from the same machine via browser and one python client querying the object model.

                    HTTP req, command words { GET /rr_model HTTP/1.0 }, parameters { key=move.compensation flags=d4 }<\n>
                    

                    is this client supposed to query rr_reply even if it is not asking for? all clients (should) use propper session keys.

                    the upto 6 DWCs just to poke the out of buffer and check if it makes a difference.
                    To be clear, either 1 DWC and python client or 6 DWC and 1 python clients.

                    EDIT: no panel due / standalone mode

                    EDIT 2:

                    rr_reply
                    Retrieve the last G-code reply. The G-code reply is buffered per connected HTTP client and it is discarded when every HTTP client has fetched it or when the firmware is short on memory and the client has not requested it within reasonable time (1 second).
                    
                    chrishammundefined 1 Reply Last reply Reply Quote 0
                    • chrishammundefined
                      chrishamm administrators @timschneider
                      last edited by

                      @timschneider Yes, for the time being I suggest you request rr_reply as well. AFAIR G-code replies are saved up to 8 seconds. We need to discuss how connections from "bots" could be handled better, perhaps an extra key for rr_connect could be added to indicate that the new client does not wish to poll G-code replies.

                      Duet software engineer

                      timschneiderundefined 1 Reply Last reply Reply Quote 0
                      • timschneiderundefined
                        timschneider @chrishamm
                        last edited by

                        @chrishamm
                        ok, checked it and indeed reducing the clients to one DWC session helped to mitigate the problem - or to say I was not able to trigger it via DWC with repeating M122 gcodes. But, i took the chance to rush down the rabbit hole and was able to trigger it with M122 M122 M122 as one command.

                        HTTP connection accepted<\n>
                        Found responder<\n>
                        Received 466 bytes<\n>
                        HTTP req, command words { GET /rr_gcode HTTP/1.0 }, parameters { gcode=M122 M122 M122 M122 }<\n>
                        Sending JSON reply, length 12<\n>
                        New conn on socket 0 for local port 80<\n>
                        HTTP connection accepted<\n>
                        Found responder<\n>
                        Received 431 bytes<\n>
                        Peer disconnected on socket 0<\n>
                        New conn on socket 0 for local port 80<\n>
                        HTTP connection accepted<\n>
                        Found responder<\n>
                        Received 431 bytes<\n>
                        Peer disconnected on socket 0<\n>
                        New conn on socket 0 for local port 80<\n>
                        HTTP connection accepted<\n>
                        Found responder<\n>
                        Received 431 bytes<\n>
                        Released gcodeReply, free buffers=26<\n>
                        Peer disconnected on socket 0<\n>
                        New conn on socket 0 for local port 80<\n>
                        HTTP connection accepted<\n>
                        Found responder<\n>
                        Received 458 bytes<\n>
                        HTTP req, command words { GET /rr_connect HTTP/1.0 }, parameters { password=meltingplot time=2025-1-27T15:33:25 sessionKey=yes }<\n>
                        Sending JSON reply, length 94<\n>
                        New conn on socket 0 for local port 80<\n>
                        HTTP connection accepted<\n>
                        Found responder<\n>
                        
                        
                        { GET /rr_gcode HTTP/1.0 }, parameters { gcode=m122 }Bad recv status sizeError: failed to retrieve WiFi status message: another SPI transfer is pendingBad recv status sizeBad recv status sizeError: failed to retrieve WiFi status message: another SPI transfer is pendingBad recv status sizeError: failed to retrieve WiFi status message: SPI timeoutError: failed to retrieve WiFi status message: SPI timeoutBad recv status sizeError: failed to retrieve WiFi status message: SPI timeoutBad recv status sizeError: failed to retrieve WiFi status message: SPI timeoutBad recv status sizeError: failed to retrieve WiFi status message: SPI timeoutBad recv status sizeError: failed to retrieve WiFi status message: SPI timeoutBad recv status sizeError: failed to retrieve WiFi status message: SPI timeoutBad recv status sizeError: failed to retrieve WiFi status message: SPI timeoutBad recv status sizeError: failed to retrieve WiFi status message: SPI timeoutBad recv status sizeError: failed to retrieve WiFi status message: SPI timeoutBad recv status sizeError: failed to retrieve WiFi status message: SPI timeoutBad recv status sizeError: failed to retrieve WiFi status message: SPI timeoutBad recv status sizeError: failed to retrieve WiFi status message: SPI timeoutBad recv status sizeError: failed to retrieve WiFi status message: SPI timeoutBad recv status sizeError: failed to retrieve WiFi status message: SPI timeoutBad recv status sizeError: failed to retrieve WiFi status message: SPI timeoutBad recv status sizeError: failed to retrieve WiFi status message: SPI timeoutBad recv status sizeError: failed to retrieve WiFi status message: SPI timeoutBad recv status sizeError: failed to retrieve WiFi status message: SPI timeoutBad recv status sizeError: failed to retrieve WiFi status message: SPI timeoutBad recv status sizeError: failed to retrieve WiFi status message: SPI timeoutBad recv status sizeError: failed to retrieve WiFi status message: SPI timeoutBad recv status sizeError: failed to retrieve WiFi status message: SPI timeoutBad recv status sizeError: failed to retrieve WiFi status message: SPI timeoutBad recv status sizeError: failed to retrieve WiFi status message: SPI timeoutBad recv status sizeError: failed to retrieve WiFi status message: SPI timeoutBad recv status sizeError: failed to retrieve WiFi status message: SPI timeoutBad recv status sizeError: failed to retrieve WiFi status message: SPI timeoutBad recv status sizeError: failed to retrieve WiFi status message: SPI timeoutBad recv status sizeError: failed to retrieve WiFi status message: SPI timeoutBad recv status sizeError: failed to retrieve WiFi status message: SPI timeoutBad recv status sizeError: failed to retrieve WiFi status message: SPI timeoutBad recv status sizeError: failed to retrieve WiFi status message: SPI timeoutBad recv status sizeError: failed to retrieve WiFi status message: SPI timeoutBad recv status sizeError: failed to retrieve WiFi status message: SPI timeoutBad recv status sizeError: failed to retrieve WiFi status message: SPI timeoutBad recv status sizeError: failed to retrieve WiFi status message: SPI timeoutBad recv status sizeReleased gcodeReply, free buffers=26Error: failed to retrieve WiFi status message: SPI timeoutBad recv status size
                        

                        don't know if this is somehow related, just catched it as a by-catch.

                        OK, lesson learned - teach the bot to query rr_reply every second, just in case. But still there is some problem with memory management and error handling.

                        timschneiderundefined 1 Reply Last reply Reply Quote 2
                        • timschneiderundefined
                          timschneider @timschneider
                          last edited by timschneider

                          @chrishamm @droftarts
                          it was to early - it will crash even with only 2 DWC connections opened. I also tried to poll rr_reply ever second but even that is not enough. If you have bad luck and two DWC sessions open, it will crash on the first or second try.

                          EDIT:
                          same for
                          M409 K"" F"d2"

                          chrishammundefined 1 Reply Last reply Reply Quote 0
                          • chrishammundefined
                            chrishamm administrators @timschneider
                            last edited by

                            @timschneider Thanks for the report. I've logged this at https://github.com/Duet3D/RepRapFirmware/issues/1073

                            chrishamm created this issue in Duet3D/RepRapFirmware

                            closed [Bug]: Long G-code replies can crash WiFi boards in standalone mode #1073

                            Duet software engineer

                            timschneiderundefined 1 Reply Last reply Reply Quote 1
                            • timschneiderundefined
                              timschneider @chrishamm
                              last edited by timschneider

                              @droftarts
                              @chrishamm thank you for open up a bug report for it, but i'm almost certain that this is just one bug out of at least two.

                              somehow the rr_reply can add up and even a quit short and simple output can crash the webserver / frontend.
                              the following output shows the output in the console when at least two clients are connected, in this case two DWC sessions

                              15721bd2-60c3-48f4-a537-53c0975f5fb0-grafik.png

                              after adding up, it will kill all open sessions
                              e488bf63-74f0-4b26-a17c-33ffe633c98a-grafik.png

                              if only one DWC session is open the output looks like the following and it is impossible (via DWC) to add up the output
                              c293a3d7-594f-4c60-83a2-6a0df5c3bfbc-grafik.png

                              We use a reverse proxy to connect to duet and I observed the above mentioned behavior in the past, but alwas thought it is related to the reverse proxy and was not paying attention to it. The above shown screenshots are NOT connected to the reverse proxy - I was directly connected to the duet.

                              EDIT: I've checked if i'm able to add up the output in DSF setup or crash the frontend - and it was not possible.

                              chrishammundefined dc42undefined 2 Replies Last reply Reply Quote 0
                              • chrishammundefined
                                chrishamm administrators @timschneider
                                last edited by

                                @timschneider DSF has a lot more memory available so it's very unlikely to get it to crash. If long G-code replies are not released quickly enough in standalone mode, that can lead to output buffer starvation which makes the whole webserver unresponsive (for a few seconds at least). That's why I proposed to call rr_reply relatively frequently from "bot" clients. The other issue which had to do with failed SPI transfers is being investigated.

                                Duet software engineer

                                timschneiderundefined 1 Reply Last reply Reply Quote 0
                                • timschneiderundefined
                                  timschneider @chrishamm
                                  last edited by timschneider

                                  @chrishamm This explanation does not seem to explain the behaviour shown above.

                                  In the above example there are only two DWC sessions open and the overall output is just about 1100 characters.

                                  I did another test with the following macros:

                                  M98 P"0:/macros/test-bufferoverflow.g"

                                  echo "123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890"
                                  echo "123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890"
                                  echo "123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890"
                                  

                                  M98 P"0:/macros/test-bufferoverflow2.g" -> two times the content of test-bufferoverflow.g (6 times echo with 210 chars)
                                  M98 P"0:/macros/test-bufferoverflow3.g" -> two times the content of test-bufferoverflow2.g (12 times echo with 210 chars)
                                  M98 P"0:/macros/test-bufferoverflow4.g" -> two times the content of test-bufferoverflow3.g (24 times echo with 210 chars)

                                  Executing macro M98 P"0:/macros/test-bufferoverflow4.g" will work
                                  537915c1-d331-41f9-b478-5c6dbf4c4d26-grafik.png

                                  Executing two times macro M98 P"0:/macros/test-bufferoverflow4.g" will fail
                                  30bb2aa2-f63b-430c-bc4c-d08c53500383-grafik.png

                                  Executing two times macro M98 P"0:/macros/test-bufferoverflow3.g" will fail
                                  40aed0e4-3513-40c2-9da8-aca34277ce0b-grafik.png

                                  Executing two times macro M98 P"0:/macros/test-bufferoverflow2.g" will sometimes fail
                                  176098bd-3145-41e1-a078-3870551de1ec-grafik.png

                                  Executing three times macro M98 P"0:/macros/test-bufferoverflow2.g" will fail
                                  32014b1b-a69f-4bf2-9711-7aed9d92a653-grafik.png

                                  so it looks like it is realted to the memory, but it is no "out of memory" due to the size of the output.

                                  with only one DWC session i'm able to get it crashing with two times M98 P"0:/macros/test-bufferoverflow4.g" - but not with the other macros.

                                  The problem from my side is, that when two clients are connected and issue a G-Code with response, this sometimes crashes the webserver and both DWC connections are reconnecting. From theorie it should work as long as there are output buffers, but somehow it appears they get wasted somewhere.

                                  Output of M122

                                  30.1.2025, 18:36:37 	M122
                                  === Diagnostics ===
                                  RepRapFirmware for Duet 2 WiFi/Ethernet version 3.6.0-beta.3 (2025-01-16 19:09:12) running on Duet WiFi 1.02 or later + DueX5v0.11
                                  Board ID: 08DJM-9178L-L4MSJ-6J1FL-3S86J-TB2LN
                                  Used output buffers: 2 of 26 (26 max)
                                  === RTOS ===
                                  Static ram: 23520
                                  Dynamic ram: 68716 of which 220 recycled
                                  Never used RAM 27540, free system stack 182 words
                                  Tasks: NETWORK(1,ready,16.6%,195) HEAT(3,nWait 5,0.1%,283) Move(4,nWait 5,0.0%,361) DUEX(5,nWait 5,0.0%,23) MAIN(1,running,82.8%,712) IDLE(0,ready,0.5%,29), total 100.0%
                                  Owned mutexes:
                                  === Platform ===
                                  Last reset 30:48:46 ago, cause: power up
                                  Last software reset at 2025-01-27 15:25, reason: User, Gcodes spinning, available RAM 28108, slot 2
                                  Software reset code 0x0003 HFSR 0x00000000 CFSR 0x00000000 ICSR 0x00400000 BFAR 0xe000ed38 SP 0x00000000 Task MAIN Freestk 0 n/a
                                  Error status: 0x0c
                                  MCU temperature: min 28.7, current 29.2, max 30.3
                                  Supply voltage: min 23.8, current 24.0, max 24.3, under voltage events: 0, over voltage events: 0, power good: yes
                                  Heap OK, handles allocated/used 99/7, heap memory allocated/used/recyclable 2048/1484/1348, gc cycles 5713
                                  Events: 0 queued, 0 completed
                                  Date/time: 2025-01-30 18:36:30
                                  Slowest loop: 17.84ms; fastest: 0.18ms
                                  I2C nak errors 0, send timeouts 0, receive timeouts 0, finishTimeouts 0, resets 0
                                  === Storage ===
                                  Free file entries: 8
                                  SD card 0 detected, interface speed: 20.0MBytes/sec
                                  SD card longest read time 8.5ms, write time 0.0ms, max retries 0
                                  === Move ===
                                  Segments created 0, maxWait 0ms, bed comp in use: none, height map offset 0.000, hiccups added 0/0 (0.00ms), max steps late 0, ebfmin 0.00, ebfmax 0.00
                                  Pos req/act/dcf: 0.00/0/0.00 0.00/0/0.00 0.00/0/0.00
                                  no step interrupt scheduled
                                  Driver 0: standstill, SG min n/a
                                  Driver 1: standstill, SG min n/a
                                  Driver 2: standstill, SG min n/a
                                  Driver 3: standstill, SG min n/a
                                  Driver 4: standstill, SG min n/a
                                  Driver 5: standstill, SG min n/a
                                  Driver 6: standstill, SG min n/a
                                  Driver 7: standstill, SG min n/a
                                  Driver 8: standstill, SG min n/a
                                  Driver 9: standstill, SG min n/a
                                  Driver 10: 
                                  Driver 11: 
                                  === DDARing 0 ===
                                  Scheduled moves 0, completed 0, LaErrors 0, Underruns [0, 0, 0]
                                  Segments left 0
                                  Code queue is empty
                                  === Heat ===
                                  Bed heaters 0 -1 -1 -1, chamber heaters -1 -1 -1 -1, ordering errs 0
                                  === GCodes ===
                                  Movement locks held by null
                                  HTTP is idle in state(s) 0
                                  Telnet is idle in state(s) 0
                                  File is idle in state(s) 0
                                  USB is idle in state(s) 0
                                  Aux is idle in state(s) 0
                                  Trigger is idle in state(s) 0
                                  Queue is idle in state(s) 0
                                  LCD is idle in state(s) 0
                                  Daemon is doing "G4 P100" in state(s) 0 0, running macro
                                  Autopause is idle in state(s) 0
                                  === Filament sensors ===
                                  check 0 clear 399166717
                                  Extruder 0: pos 2160.00, errs: frame 0 parity 0 ovrun 0 pol 0 ovdue 0
                                  === DueX ===
                                  Read count 0, 0.00 reads/min
                                  === Network ===
                                  Slowest loop: 21.90ms; fastest: 0.07ms
                                  Responder states: HTTP(0) HTTP(0) HTTP(0) FTP(0) Telnet(0)
                                  HTTP sessions: 1 of 8
                                  === WiFi ===
                                  Interface state: active
                                  Module is connected to access point 
                                  Failed messages: pending 0, notrdy 0, noresp 0
                                  Firmware version 2.2.1
                                  MAC address bc:dd:c2:31:9a:40
                                  Module reset reason: Turned on by main processor, Vcc 3.46, flash size 4194304, free heap 42496
                                  WiFi IP address 10.42.0.2
                                  Signal strength -21dBm, channel 11, mode 802.11n, reconnections 0
                                  Clock register 00002002
                                  Socket states: 0 0 0 0 0 0 0 0
                                  
                                  1 Reply Last reply Reply Quote 0
                                  • dc42undefined
                                    dc42 administrators @timschneider
                                    last edited by dc42

                                    @timschneider if you enable all Move debugging in recent versions of RRF then huge amounts of debug output are generated, which means all bets are off. M111 should only be used to enable debugging selectively, using the P and D parameters so select the modules and subsystems to debug.

                                    Duet 2 is somewhat memory constrained so it has less output buffer memory available than Duet 3 boards. That said, it does appear from the output you have posted that there may be a particular issue with the interface to the WiFi module when there is a lot of data to output. This is on my list to investigate.

                                    Duet WiFi hardware designer and firmware engineer
                                    Please do not ask me for Duet support via PM or email, use the forum
                                    http://www.escher3d.com, https://miscsolutions.wordpress.com

                                    timschneiderundefined 1 Reply Last reply Reply Quote 1
                                    • timschneiderundefined
                                      timschneider @dc42
                                      last edited by timschneider

                                      @dc42 @chrishamm
                                      thank you both for your time and patience.
                                      I moved forward with the knowledge of the low memory issue and reprogrammed the python client to be a fully fledged digital twin of the duet printer as per object model. In order to archive this I followed the description of M409 and implemented the a and next handling aswell as the seqs handling. So what I do now is, poll the rr_model endpoint with key=seqs to keep track of the sequence numbers and keep the load on the duet as low as possible. If a change happens, it will act accordingly, e.g. fetch a reply or partially fetch the updated object model recursive with max depth of level + 1, again to keep the load as low as possible.

                                      but even that didn't solved the problem - it appears in some cases to make it even worse (i guess two output buffers per json response)

                                      In the following there is one DWC Client and one python client connected.

                                      Anyhow I wanted to understand the problem and at least find a workaround, but I was not able to.

                                      What I found was that on an error - DWC is doing a reconnect straight, which makes it even worse, as this will increase the number of connected clients and will leave gcode replies unreceived and the buffer allocated, I guess?

                                      3e8b16bb-46b9-4129-b06d-7b3b0200399a-grafik.png

                                      Using the same macro as above. Directly after sending the command, the following rr_model request in DWC will fail - python client will also receive a HTTP 503 busy and will backoff 5 sec. before even trying to do anything (leaving the session open / timeout=8s - and leaving the reply probably not fetched for this session / due to the 503 busy response). DWC will open a new session and the internal connected clients count is increased, still keeping the buffer allocated. DWC tries to fetch a rr_model and will fail again with HTTP 503 - again reconnect / client count +1 / buffer allocated and so on ...

                                      a5881d65-8451-47a2-b801-e79fe44ea5b3-grafik.png

                                      The last rr_reply in the timeline is empty - and the reconnect and following rr_model request are successfull.

                                      Maybe it is possible to change the behaviour of the DWC reconnect mechanism and check if the HTTP response is 503 - wait and do not reconnect and on the duet side if multiple json requests fail - drop pending gcode replies? Or as an alternative always serve the rr_model seqs requests to give a client a chance to consume rr_reply to free up output buffers. Just randomly polling the rr_reply endpoint will steal the response from DWC, so thats also not an option.

                                      timschneiderundefined chrishammundefined 2 Replies Last reply Reply Quote 1
                                      • timschneiderundefined
                                        timschneider @timschneider
                                        last edited by timschneider

                                        @timschneider
                                        wow that did the trick, when the python client receives an 503 it will just query the rr_reply immediately until the response is empty to free up buffers on the duet - DWC will not reconnect and everything works 🙂 even with M98 P"0:/macros/test-bufferoverflow3.g", but DWC will sometimes not receive the output of that macro as the python client consumed both replies.

                                        It will even survive M98 P"0:/macros/test-bufferoverflow4.g"

                                        Using this method having 6 concurrent DWC connections open and one python client is no problem anymore - it will even work with the M98 P"0:/macros/test-bufferoverflow4.g"

                                        Edit: there is still some problem in data and memory management, sometimes i do receive partial data Response payload is not completed: <ContentLengthError: 400, message='Not enough data for satisfy content length header.'> due to connection closing mid transmitting from duet to the reverse proxy.

                                        1 Reply Last reply Reply Quote 0
                                        • chrishammundefined
                                          chrishamm administrators @timschneider
                                          last edited by

                                          @timschneider What do you have "AJAX retries" in the DWC settings set to?

                                          Duet software engineer

                                          timschneiderundefined 2 Replies Last reply Reply Quote 0
                                          • timschneiderundefined
                                            timschneider @chrishamm
                                            last edited by

                                            @chrishamm
                                            d8570a95-0de9-412c-b915-f1a40a72e98c-grafik.png

                                            1 Reply Last reply Reply Quote 0
                                            • First post
                                              Last post
                                            Unless otherwise noted, all forum content is licensed under CC-BY-SA