• Tags
  • Documentation
  • Order
  • Register
  • Login
Duet3D Logo Duet3D
  • Tags
  • Documentation
  • Order
  • Register
  • Login

[3.6.0-beta.3] DWC connection issues Duet2 / Duex5

Scheduled Pinned Locked Moved
Beta Firmware
6
33
858
Loading More Posts
  • Oldest to Newest
  • Newest to Oldest
  • Most Votes
Reply
  • Reply as topic
Log in to reply
This topic has been deleted. Only users with topic management privileges can see it.
  • undefined
    timschneider @timschneider
    last edited by 27 Jan 2025, 12:17

    @dc42
    can you give me a hint what I can do to track this error down?

    I'm able to reproduce it within seconds.

    Thanks
    Tim

    undefined undefined 2 Replies Last reply 27 Jan 2025, 12:46 Reply Quote 0
    • undefined
      DaveA @timschneider
      last edited by 27 Jan 2025, 12:46

      @timschneider
      Is it possible you have a variation of my problem which has been solver?

      https://forum.duet3d.com/topic/37329/3-6-0-beta-3-duet-2-issues-lost-communication/3

      1 Reply Last reply Reply Quote 0
      • undefined
        droftarts administrators @timschneider
        last edited by 27 Jan 2025, 13:17

        @timschneider I've updated my Mini 5+ to WiFi firmware v2.2.1, which is the beta.3 release, and tried the same test, but I don't get a disconnect. I hit M122 repeatedly 22 times. It seems like I only got 18 responses, but may be I'd hit return before the M122 had returned fully. I also tested with the WiFi firmware v2.2.0 (in the 3.5.4 release), and that was fine.

        Can you revert the WiFi firmware to v2.2.0 and see if still causes the same problem?

        @DaveA May possibly be related, but yours was specific to the Duet 2 Ethernet, this is WiFi. But then I don't write firmware!

        Ian

        Bed-slinger - Mini5+ WiFi/1LC | RRP Fisher v1 - D2 WiFi | Polargraph - D2 WiFi | TronXY X5S - 6HC/Roto | CNC router - 6HC | Tractus3D T1250 - D2 Eth

        undefined 1 Reply Last reply 27 Jan 2025, 13:43 Reply Quote 0
        • undefined
          timschneider @droftarts
          last edited by 27 Jan 2025, 13:43

          @droftarts
          thank you for your tests!

          I was able to hook up via USB and grab some debug logs.

          It is somehow related to the connection mangement and kind of ran out of buffer, even if it appears that the buffers are wasted.

          HTTP req, command words { GET /rr_model HTTP/1.0 }, parameters { key= flags=fd99 }<\n>
          New conn on socket 2 for local port 80<\n>
          HTTP connection accepted<\n>
          Found responder<\n>
          Received 287 bytes<\n>
          HTTP req, command words { GET /rr_model HTTP/1.0 }, parameters { key= flags=fd99 }<\n>
          New conn on socket 3 for local port 80<\n>
          New conn on socket 3 for local port 80<\n>
          New conn on socket 3 for local port 80<\n>
          New conn on socket 3 for local port 80<\n>
          New conn on socket 3 for local port 80<\n>
          New conn on socket 3 for local port 80<\n>
          ...
          New conn on socket 3 for local port 80<\n>
          New conn on socket 3 for local port 80<\n>
          New conn on socket 3 for local port 80<\n>
          New conn on socket 3 for local port 80<\n>
          New conn on socket 3 for local port 80<\n>
          New conn on socket 3 for local port 80<\n>
          New conn on socket 3 for local port 80<\n>
          New conn on socket 3 for local port 80<\n>
          New conn on socket 3 for local port 80<\n>
          New conn on socket 3 for local port 80<\n>
          Ran out of output buffers at ../src/Networking/HttpResponder.cpp(1109)<\n>
          New conn on socket 3 for local port 80<\n>
          HTTP connection accepted<\n>
          Found responder<\n>
          Received 287 bytes<\n>
          HTTP req, command words { GET /rr_model HTTP/1.0 }, parameters { key= flags=fd99 }<\n>
          New conn on socket 0 for local port 80<\n>
          New conn on socket 0 for local port 80<\n>
          New conn on socket 0 for local port 80<\n>
          New conn on socket 0 for local port 80<\n>
          New conn on socket 0 for local port 80<\n>
          New conn on socket 0 for local port 80<\n>
          New conn on socket 0 for local port 80<\n>
          New conn on socket 0 for local port 80<\n>
          New conn on socket 0 for local port 80<\n>
          New conn on socket 0 for local port 80<\n>
          New conn on socket 0 for local port 80<\n>
          New conn on socket 0 for local port 80<\n>
          New conn on socket 0 for local port 80<\n>
          Ran out of output buffers at ../src/Networking/HttpResponder.cpp(1109)<\n>
          New conn on socket 0 for local port 80<\n>
          New conn on socket 0 for local port 80<\n>
          HTTP connection accepted<\n>
          Found responder<\n>
          Received 302 bytes<\n>
          New conn on socket 1 for local port 80<\n>
          New conn on socket 1 for local port 80<\n>
          New conn on socket 1 for local port 80<\n>
          New conn on socket 1 for local port 80<\n>
          New conn on socket 1 for local port 80<\n>
          New conn on socket 1 for local port 80<\n>
          New conn on socket 1 for local port 80<\n>
          Ran out of output buffers at ../src/Networking/HttpResponder.cpp(1109)<\n>
          New conn on socket 1 for local port 80<\n>
          New conn on socket 1 for local port 80<\n>
          HTTP connection accepted<\n>
          Found responder<\n>
          Received 288 bytes<\n>
          HTTP req, command words { GET /rr_model HTTP/1.0 }, parameters { key=job flags=d5 }<\n>
          Ran out of output buffers at ../src/Networking/HttpResponder.cpp(1109)<\n>
          Ran out of output buffers at ../src/Networking/HttpResponder.cpp(1109)<\n>
          New conn on socket 0 for local port 80<\n>
          HTTP connection accepted<\n>
          Found responder<\n>
          Received 419 bytes<\n>
          Sending reply, file = no<\n>
          HTTP req, command words { GET /rr_reply HTTP/1.0 }, parameters { }<\n>
          Sending G-Code reply to HTTP client 1 of 2 (length 3324)<\n>
          New conn on socket 0 for local port 80<\n>
          HTTP connection accepted<\n>
          Found responder<\n>
          Received 431 bytes<\n>
          HTTP req, command words { GET /rr_model HTTP/1.0 }, parameters { flags=d99fn }<\n>
          New conn on socket 2 for local port 80<\n>
          HTTP connection accepted<\n>
          Found responder<\n>
          New conn on socket 1 for local port 80<\n>
          New conn on socket 1 for local port 80<\n>
          Ran out of output buffers at ../src/Networking/HttpResponder.cpp(1109)<\n>
          New conn on socket 1 for local port 80<\n>
          HTTP connection accepted<\n>
          Found responder<\n>
          Received 288 bytes<\n>
          HTTP req, command words { GET /rr_model HTTP/1.0 }, parameters { key=job flags=d5 }<\n>
          Ran out of output buffers at ../src/Networking/HttpResponder.cpp(1109)<\n>
          Ran out of output buffers at ../src/Networking/HttpResponder.cpp(1109)<\n>
          Ran out of output buffers at ../src/Networking/HttpResponder.cpp(1109)<\n>
          New conn on socket 0 for local port 80<\n>
          HTTP connection accepted<\n>
          Found responder<\n>
          Received 445 bytes<\n>
          HTTP req, command words { GET /rr_gcode HTTP/1.0 }, parameters { gcode=M122 }<\n>
          Sending JSON reply, length 12<\n>
          New conn on socket 0 for local port 80<\n>
          HTTP connection accepted<\n>
          Found responder<\n>
          Received 288 bytes<\n>
          New conn on socket 1 for local port 80<\n>
          HTTP connection accepted<\n>
          Found responder<\n>
          Received 288 bytes<\n>
          New conn on socket 2 for local port 80<\n>
          HTTP connection accepted<\n>
          Found responder<\n>
          Received 431 bytes<\n>
          New conn on socket 3 for local port 80<\n>
          HTTP connection accepted<\n>
          Found responder<\n>
          Received 288 bytes<\n>
          Peer disconnected on socket 2<\n>
          New conn on socket 2 for local port 80<\n>
          HTTP connection accepted<\n>
          Found responder<\n>
          Received 302 bytes<\n>
          Timed out gcodeReply, free buffers=1<\n>
          Timed out gcodeReply, free buffers=2<\n>
          Timed out gcodeReply, free buffers=3<\n>
          Timed out gcodeReply, free buffers=5<\n>
          Timed out gcodeReply, free buffers=7<\n>
          Timed out gcodeReply, free buffers=9<\n>
          Timed out gcodeReply, free buffers=11<\n>
          Timed out gcodeReply, free buffers=12<\n>
          Timed out gcodeReply, free buffers=13<\n>
          Released gcodeReply, free buffers=26<\n>
          Released gcodeReply, free buffers=26<\n>
          Peer disconnected on socket 1<\n>
          Peer disconnected on socket 0<\n>
          New conn on socket 1 for local port 80<\n>
          HTTP connection accepted<\n>
          Found responder<\n>
          Received 309 bytes<\n>
          HTTP req, command words { GET /rr_model HTTP/1.0 }, parameters { key=sensors.filamentMonitors flags=d4 }<\n>
          Sending reply, file = no<\n>
          Webserver: rejecting message with: 401 Not authorized<\n>
          New conn on socket 0 for local port 80<\n>
          HTTP connection accepted<\n>
          Found responder<\n>
          Received 302 bytes<\n>
          Sending reply, file = no<\n>
          HTTP req, command words { GET /rr_model HTTP/1.0 }, parameters { key=move.compensation flags=d4 }<\n>
          Webserver: rejecting message with: 401 Not authorized<\n>
          New conn on socket 1 for local port 80<\n>
          HTTP connection accepted<\n>
          Found responder<\n>

          Full logs attached
          output_2025-01-27_14-41-50.txt output_2025-01-27_14-26-22.txt output_2025-01-27_14-16-17.txt output_2025-01-27_14-14-06.txt output_2025-01-27_14-12-58.txt

          undefined 1 Reply Last reply 27 Jan 2025, 13:57 Reply Quote 0
          • undefined
            timschneider @timschneider
            last edited by 27 Jan 2025, 13:57

            @droftarts
            it is on both WiFi Versions - just checked v2.2.0

            HTTP req, command words { GET /rr_gcode HTTP/1.0 }, parameters { gcode=m122 }<\n>
            Sending JSON reply, length 12<\n>
            Ran out of output buffers at ../src/Networking/HttpResponder.cpp(1109)<\n>
            Ran out of output buffers at ../src/Networking/HttpResponder.cpp(1109)<\n>
            New conn on socket 0 for local port 80<\n>
            HTTP connection accepted<\n>
            Found responder<\n>
            Received 287 bytes<\n>
            New conn on socket 1 for local port 80<\n>
            HTTP connection accepted<\n>
            Found responder<\n>
            Received 308 bytes<\n>
            New conn on socket 2 for local port 80<\n>
            HTTP connection accepted<\n>
            Found responder<\n>
            Received 442 bytes<\n>
            Peer disconnected on socket 2<\n>
            New conn on socket 2 for local port 80<\n>
            HTTP connection accepted<\n>
            Found responder<\n>
            Received 458 bytes<\n>
            Timed out gcodeReply, free buffers=1<\n>
            Timed out gcodeReply, free buffers=2<\n>
            Timed out gcodeReply, free buffers=3<\n>
            Timed out gcodeReply, free buffers=5<\n>
            Timed out gcodeReply, free buffers=7<\n>
            Timed out gcodeReply, free buffers=9<\n>
            Timed out gcodeReply, free buffers=11<\n>
            Timed out gcodeReply, free buffers=12<\n>
            Timed out gcodeReply, free buffers=13<\n>
            Released gcodeReply, free buffers=26<\n>
            Released gcodeReply, free buffers=26<\n>
            Peer disconnected on socket 2<\n>
            New conn on socket 2 for local port 80<\n>
            HTTP connection accepted<\n>

            full log
            output_2025-01-27_14-56-34.txt

            undefined 1 Reply Last reply 27 Jan 2025, 14:03 Reply Quote 0
            • undefined
              droftarts administrators @timschneider
              last edited by 27 Jan 2025, 14:03

              @timschneider Great, thanks, I'll poke @dc42 with this info.

              Ian

              Bed-slinger - Mini5+ WiFi/1LC | RRP Fisher v1 - D2 WiFi | Polargraph - D2 WiFi | TronXY X5S - 6HC/Roto | CNC router - 6HC | Tractus3D T1250 - D2 Eth

              1 Reply Last reply Reply Quote 0
              • undefined
                chrishamm administrators @timschneider
                last edited by 27 Jan 2025, 14:10

                @timschneider What HTTP clients do you have connected and do you have a PanelDue attached, too? It looks like the G-code reply (e.g. for M122) queues up because at least one client does not request it in regular intervals.

                Duet software engineer

                undefined 1 Reply Last reply 27 Jan 2025, 14:14 Reply Quote 0
                • undefined
                  timschneider @chrishamm
                  last edited by timschneider 27 Jan 2025, 14:14

                  @chrishamm ah good catch.

                  in the logs above I've one DWC and sometimes more, upto 6 DWCs from the same machine via browser and one python client querying the object model.

                  HTTP req, command words { GET /rr_model HTTP/1.0 }, parameters { key=move.compensation flags=d4 }<\n>

                  is this client supposed to query rr_reply even if it is not asking for? all clients (should) use propper session keys.

                  the upto 6 DWCs just to poke the out of buffer and check if it makes a difference.
                  To be clear, either 1 DWC and python client or 6 DWC and 1 python clients.

                  EDIT: no panel due / standalone mode

                  EDIT 2:

                  rr_reply
                  Retrieve the last G-code reply. The G-code reply is buffered per connected HTTP client and it is discarded when every HTTP client has fetched it or when the firmware is short on memory and the client has not requested it within reasonable time (1 second).
                  undefined 1 Reply Last reply 27 Jan 2025, 14:26 Reply Quote 0
                  • undefined
                    chrishamm administrators @timschneider
                    last edited by 27 Jan 2025, 14:26

                    @timschneider Yes, for the time being I suggest you request rr_reply as well. AFAIR G-code replies are saved up to 8 seconds. We need to discuss how connections from "bots" could be handled better, perhaps an extra key for rr_connect could be added to indicate that the new client does not wish to poll G-code replies.

                    Duet software engineer

                    undefined 1 Reply Last reply 27 Jan 2025, 14:44 Reply Quote 0
                    • undefined
                      timschneider @chrishamm
                      last edited by 27 Jan 2025, 14:44

                      @chrishamm
                      ok, checked it and indeed reducing the clients to one DWC session helped to mitigate the problem - or to say I was not able to trigger it via DWC with repeating M122 gcodes. But, i took the chance to rush down the rabbit hole and was able to trigger it with M122 M122 M122 as one command.

                      HTTP connection accepted<\n>
                      Found responder<\n>
                      Received 466 bytes<\n>
                      HTTP req, command words { GET /rr_gcode HTTP/1.0 }, parameters { gcode=M122 M122 M122 M122 }<\n>
                      Sending JSON reply, length 12<\n>
                      New conn on socket 0 for local port 80<\n>
                      HTTP connection accepted<\n>
                      Found responder<\n>
                      Received 431 bytes<\n>
                      Peer disconnected on socket 0<\n>
                      New conn on socket 0 for local port 80<\n>
                      HTTP connection accepted<\n>
                      Found responder<\n>
                      Received 431 bytes<\n>
                      Peer disconnected on socket 0<\n>
                      New conn on socket 0 for local port 80<\n>
                      HTTP connection accepted<\n>
                      Found responder<\n>
                      Received 431 bytes<\n>
                      Released gcodeReply, free buffers=26<\n>
                      Peer disconnected on socket 0<\n>
                      New conn on socket 0 for local port 80<\n>
                      HTTP connection accepted<\n>
                      Found responder<\n>
                      Received 458 bytes<\n>
                      HTTP req, command words { GET /rr_connect HTTP/1.0 }, parameters { password=meltingplot time=2025-1-27T15:33:25 sessionKey=yes }<\n>
                      Sending JSON reply, length 94<\n>
                      New conn on socket 0 for local port 80<\n>
                      HTTP connection accepted<\n>
                      Found responder<\n>
                      { GET /rr_gcode HTTP/1.0 }, parameters { gcode=m122 }Bad recv status sizeError: failed to retrieve WiFi status message: another SPI transfer is pendingBad recv status sizeBad recv status sizeError: failed to retrieve WiFi status message: another SPI transfer is pendingBad recv status sizeError: failed to retrieve WiFi status message: SPI timeoutError: failed to retrieve WiFi status message: SPI timeoutBad recv status sizeError: failed to retrieve WiFi status message: SPI timeoutBad recv status sizeError: failed to retrieve WiFi status message: SPI timeoutBad recv status sizeError: failed to retrieve WiFi status message: SPI timeoutBad recv status sizeError: failed to retrieve WiFi status message: SPI timeoutBad recv status sizeError: failed to retrieve WiFi status message: SPI timeoutBad recv status sizeError: failed to retrieve WiFi status message: SPI timeoutBad recv status sizeError: failed to retrieve WiFi status message: SPI timeoutBad recv status sizeError: failed to retrieve WiFi status message: SPI timeoutBad recv status sizeError: failed to retrieve WiFi status message: SPI timeoutBad recv status sizeError: failed to retrieve WiFi status message: SPI timeoutBad recv status sizeError: failed to retrieve WiFi status message: SPI timeoutBad recv status sizeError: failed to retrieve WiFi status message: SPI timeoutBad recv status sizeError: failed to retrieve WiFi status message: SPI timeoutBad recv status sizeError: failed to retrieve WiFi status message: SPI timeoutBad recv status sizeError: failed to retrieve WiFi status message: SPI timeoutBad recv status sizeError: failed to retrieve WiFi status message: SPI timeoutBad recv status sizeError: failed to retrieve WiFi status message: SPI timeoutBad recv status sizeError: failed to retrieve WiFi status message: SPI timeoutBad recv status sizeError: failed to retrieve WiFi status message: SPI timeoutBad recv status sizeError: failed to retrieve WiFi status message: SPI timeoutBad recv status sizeError: failed to retrieve WiFi status message: SPI timeoutBad recv status sizeError: failed to retrieve WiFi status message: SPI timeoutBad recv status sizeError: failed to retrieve WiFi status message: SPI timeoutBad recv status sizeError: failed to retrieve WiFi status message: SPI timeoutBad recv status sizeError: failed to retrieve WiFi status message: SPI timeoutBad recv status sizeError: failed to retrieve WiFi status message: SPI timeoutBad recv status sizeError: failed to retrieve WiFi status message: SPI timeoutBad recv status sizeError: failed to retrieve WiFi status message: SPI timeoutBad recv status sizeError: failed to retrieve WiFi status message: SPI timeoutBad recv status sizeError: failed to retrieve WiFi status message: SPI timeoutBad recv status sizeError: failed to retrieve WiFi status message: SPI timeoutBad recv status sizeError: failed to retrieve WiFi status message: SPI timeoutBad recv status sizeError: failed to retrieve WiFi status message: SPI timeoutBad recv status sizeError: failed to retrieve WiFi status message: SPI timeoutBad recv status sizeError: failed to retrieve WiFi status message: SPI timeoutBad recv status sizeReleased gcodeReply, free buffers=26Error: failed to retrieve WiFi status message: SPI timeoutBad recv status size
                      

                      don't know if this is somehow related, just catched it as a by-catch.

                      OK, lesson learned - teach the bot to query rr_reply every second, just in case. But still there is some problem with memory management and error handling.

                      undefined 1 Reply Last reply 27 Jan 2025, 16:47 Reply Quote 2
                      • undefined
                        timschneider @timschneider
                        last edited by timschneider 27 Jan 2025, 16:47

                        @chrishamm @droftarts
                        it was to early - it will crash even with only 2 DWC connections opened. I also tried to poll rr_reply ever second but even that is not enough. If you have bad luck and two DWC sessions open, it will crash on the first or second try.

                        EDIT:
                        same for
                        M409 K"" F"d2"

                        undefined 1 Reply Last reply 28 Jan 2025, 13:59 Reply Quote 0
                        • undefined
                          chrishamm administrators @timschneider
                          last edited by 28 Jan 2025, 13:59

                          @timschneider Thanks for the report. I've logged this at https://github.com/Duet3D/RepRapFirmware/issues/1073

                          chrishamm created this issue 28 Jan 2025, 13:58 in Duet3D/RepRapFirmware

                          closed [Bug]: Long G-code replies can crash WiFi boards in standalone mode #1073

                          Duet software engineer

                          undefined 1 Reply Last reply 30 Jan 2025, 15:10 Reply Quote 1
                          • undefined
                            timschneider @chrishamm
                            last edited by timschneider 30 Jan 2025, 15:10

                            @droftarts
                            @chrishamm thank you for open up a bug report for it, but i'm almost certain that this is just one bug out of at least two.

                            somehow the rr_reply can add up and even a quit short and simple output can crash the webserver / frontend.
                            the following output shows the output in the console when at least two clients are connected, in this case two DWC sessions

                            15721bd2-60c3-48f4-a537-53c0975f5fb0-grafik.png

                            after adding up, it will kill all open sessions
                            e488bf63-74f0-4b26-a17c-33ffe633c98a-grafik.png

                            if only one DWC session is open the output looks like the following and it is impossible (via DWC) to add up the output
                            c293a3d7-594f-4c60-83a2-6a0df5c3bfbc-grafik.png

                            We use a reverse proxy to connect to duet and I observed the above mentioned behavior in the past, but alwas thought it is related to the reverse proxy and was not paying attention to it. The above shown screenshots are NOT connected to the reverse proxy - I was directly connected to the duet.

                            EDIT: I've checked if i'm able to add up the output in DSF setup or crash the frontend - and it was not possible.

                            undefined undefined 2 Replies Last reply 30 Jan 2025, 15:41 Reply Quote 0
                            • undefined
                              chrishamm administrators @timschneider
                              last edited by 30 Jan 2025, 15:41

                              @timschneider DSF has a lot more memory available so it's very unlikely to get it to crash. If long G-code replies are not released quickly enough in standalone mode, that can lead to output buffer starvation which makes the whole webserver unresponsive (for a few seconds at least). That's why I proposed to call rr_reply relatively frequently from "bot" clients. The other issue which had to do with failed SPI transfers is being investigated.

                              Duet software engineer

                              undefined 1 Reply Last reply 30 Jan 2025, 17:39 Reply Quote 0
                              • undefined
                                timschneider @chrishamm
                                last edited by timschneider 30 Jan 2025, 17:39

                                @chrishamm This explanation does not seem to explain the behaviour shown above.

                                In the above example there are only two DWC sessions open and the overall output is just about 1100 characters.

                                I did another test with the following macros:

                                M98 P"0:/macros/test-bufferoverflow.g"

                                echo "123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890"
                                echo "123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890"
                                echo "123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890"

                                M98 P"0:/macros/test-bufferoverflow2.g" -> two times the content of test-bufferoverflow.g (6 times echo with 210 chars)
                                M98 P"0:/macros/test-bufferoverflow3.g" -> two times the content of test-bufferoverflow2.g (12 times echo with 210 chars)
                                M98 P"0:/macros/test-bufferoverflow4.g" -> two times the content of test-bufferoverflow3.g (24 times echo with 210 chars)

                                Executing macro M98 P"0:/macros/test-bufferoverflow4.g" will work
                                537915c1-d331-41f9-b478-5c6dbf4c4d26-grafik.png

                                Executing two times macro M98 P"0:/macros/test-bufferoverflow4.g" will fail
                                30bb2aa2-f63b-430c-bc4c-d08c53500383-grafik.png

                                Executing two times macro M98 P"0:/macros/test-bufferoverflow3.g" will fail
                                40aed0e4-3513-40c2-9da8-aca34277ce0b-grafik.png

                                Executing two times macro M98 P"0:/macros/test-bufferoverflow2.g" will sometimes fail
                                176098bd-3145-41e1-a078-3870551de1ec-grafik.png

                                Executing three times macro M98 P"0:/macros/test-bufferoverflow2.g" will fail
                                32014b1b-a69f-4bf2-9711-7aed9d92a653-grafik.png

                                so it looks like it is realted to the memory, but it is no "out of memory" due to the size of the output.

                                with only one DWC session i'm able to get it crashing with two times M98 P"0:/macros/test-bufferoverflow4.g" - but not with the other macros.

                                The problem from my side is, that when two clients are connected and issue a G-Code with response, this sometimes crashes the webserver and both DWC connections are reconnecting. From theorie it should work as long as there are output buffers, but somehow it appears they get wasted somewhere.

                                Output of M122

                                30.1.2025, 18:36:37 M122
                                === Diagnostics ===
                                RepRapFirmware for Duet 2 WiFi/Ethernet version 3.6.0-beta.3 (2025-01-16 19:09:12) running on Duet WiFi 1.02 or later + DueX5v0.11
                                Board ID: 08DJM-9178L-L4MSJ-6J1FL-3S86J-TB2LN
                                Used output buffers: 2 of 26 (26 max)
                                === RTOS ===
                                Static ram: 23520
                                Dynamic ram: 68716 of which 220 recycled
                                Never used RAM 27540, free system stack 182 words
                                Tasks: NETWORK(1,ready,16.6%,195) HEAT(3,nWait 5,0.1%,283) Move(4,nWait 5,0.0%,361) DUEX(5,nWait 5,0.0%,23) MAIN(1,running,82.8%,712) IDLE(0,ready,0.5%,29), total 100.0%
                                Owned mutexes:
                                === Platform ===
                                Last reset 30:48:46 ago, cause: power up
                                Last software reset at 2025-01-27 15:25, reason: User, Gcodes spinning, available RAM 28108, slot 2
                                Software reset code 0x0003 HFSR 0x00000000 CFSR 0x00000000 ICSR 0x00400000 BFAR 0xe000ed38 SP 0x00000000 Task MAIN Freestk 0 n/a
                                Error status: 0x0c
                                MCU temperature: min 28.7, current 29.2, max 30.3
                                Supply voltage: min 23.8, current 24.0, max 24.3, under voltage events: 0, over voltage events: 0, power good: yes
                                Heap OK, handles allocated/used 99/7, heap memory allocated/used/recyclable 2048/1484/1348, gc cycles 5713
                                Events: 0 queued, 0 completed
                                Date/time: 2025-01-30 18:36:30
                                Slowest loop: 17.84ms; fastest: 0.18ms
                                I2C nak errors 0, send timeouts 0, receive timeouts 0, finishTimeouts 0, resets 0
                                === Storage ===
                                Free file entries: 8
                                SD card 0 detected, interface speed: 20.0MBytes/sec
                                SD card longest read time 8.5ms, write time 0.0ms, max retries 0
                                === Move ===
                                Segments created 0, maxWait 0ms, bed comp in use: none, height map offset 0.000, hiccups added 0/0 (0.00ms), max steps late 0, ebfmin 0.00, ebfmax 0.00
                                Pos req/act/dcf: 0.00/0/0.00 0.00/0/0.00 0.00/0/0.00
                                no step interrupt scheduled
                                Driver 0: standstill, SG min n/a
                                Driver 1: standstill, SG min n/a
                                Driver 2: standstill, SG min n/a
                                Driver 3: standstill, SG min n/a
                                Driver 4: standstill, SG min n/a
                                Driver 5: standstill, SG min n/a
                                Driver 6: standstill, SG min n/a
                                Driver 7: standstill, SG min n/a
                                Driver 8: standstill, SG min n/a
                                Driver 9: standstill, SG min n/a
                                Driver 10:
                                Driver 11:
                                === DDARing 0 ===
                                Scheduled moves 0, completed 0, LaErrors 0, Underruns [0, 0, 0]
                                Segments left 0
                                Code queue is empty
                                === Heat ===
                                Bed heaters 0 -1 -1 -1, chamber heaters -1 -1 -1 -1, ordering errs 0
                                === GCodes ===
                                Movement locks held by null
                                HTTP is idle in state(s) 0
                                Telnet is idle in state(s) 0
                                File is idle in state(s) 0
                                USB is idle in state(s) 0
                                Aux is idle in state(s) 0
                                Trigger is idle in state(s) 0
                                Queue is idle in state(s) 0
                                LCD is idle in state(s) 0
                                Daemon is doing "G4 P100" in state(s) 0 0, running macro
                                Autopause is idle in state(s) 0
                                === Filament sensors ===
                                check 0 clear 399166717
                                Extruder 0: pos 2160.00, errs: frame 0 parity 0 ovrun 0 pol 0 ovdue 0
                                === DueX ===
                                Read count 0, 0.00 reads/min
                                === Network ===
                                Slowest loop: 21.90ms; fastest: 0.07ms
                                Responder states: HTTP(0) HTTP(0) HTTP(0) FTP(0) Telnet(0)
                                HTTP sessions: 1 of 8
                                === WiFi ===
                                Interface state: active
                                Module is connected to access point
                                Failed messages: pending 0, notrdy 0, noresp 0
                                Firmware version 2.2.1
                                MAC address bc:dd:c2:31:9a:40
                                Module reset reason: Turned on by main processor, Vcc 3.46, flash size 4194304, free heap 42496
                                WiFi IP address 10.42.0.2
                                Signal strength -21dBm, channel 11, mode 802.11n, reconnections 0
                                Clock register 00002002
                                Socket states: 0 0 0 0 0 0 0 0
                                1 Reply Last reply Reply Quote 0
                                • undefined
                                  dc42 administrators @timschneider
                                  last edited by dc42 30 Jan 2025, 20:08

                                  @timschneider if you enable all Move debugging in recent versions of RRF then huge amounts of debug output are generated, which means all bets are off. M111 should only be used to enable debugging selectively, using the P and D parameters so select the modules and subsystems to debug.

                                  Duet 2 is somewhat memory constrained so it has less output buffer memory available than Duet 3 boards. That said, it does appear from the output you have posted that there may be a particular issue with the interface to the WiFi module when there is a lot of data to output. This is on my list to investigate.

                                  Duet WiFi hardware designer and firmware engineer
                                  Please do not ask me for Duet support via PM or email, use the forum
                                  http://www.escher3d.com, https://miscsolutions.wordpress.com

                                  undefined 1 Reply Last reply 2 Feb 2025, 12:46 Reply Quote 1
                                  • undefined
                                    timschneider @dc42
                                    last edited by timschneider 2 Feb 2025, 13:25 2 Feb 2025, 12:46

                                    @dc42 @chrishamm
                                    thank you both for your time and patience.
                                    I moved forward with the knowledge of the low memory issue and reprogrammed the python client to be a fully fledged digital twin of the duet printer as per object model. In order to archive this I followed the description of M409 and implemented the a and next handling aswell as the seqs handling. So what I do now is, poll the rr_model endpoint with key=seqs to keep track of the sequence numbers and keep the load on the duet as low as possible. If a change happens, it will act accordingly, e.g. fetch a reply or partially fetch the updated object model recursive with max depth of level + 1, again to keep the load as low as possible.

                                    but even that didn't solved the problem - it appears in some cases to make it even worse (i guess two output buffers per json response)

                                    In the following there is one DWC Client and one python client connected.

                                    Anyhow I wanted to understand the problem and at least find a workaround, but I was not able to.

                                    What I found was that on an error - DWC is doing a reconnect straight, which makes it even worse, as this will increase the number of connected clients and will leave gcode replies unreceived and the buffer allocated, I guess?

                                    3e8b16bb-46b9-4129-b06d-7b3b0200399a-grafik.png

                                    Using the same macro as above. Directly after sending the command, the following rr_model request in DWC will fail - python client will also receive a HTTP 503 busy and will backoff 5 sec. before even trying to do anything (leaving the session open / timeout=8s - and leaving the reply probably not fetched for this session / due to the 503 busy response). DWC will open a new session and the internal connected clients count is increased, still keeping the buffer allocated. DWC tries to fetch a rr_model and will fail again with HTTP 503 - again reconnect / client count +1 / buffer allocated and so on ...

                                    a5881d65-8451-47a2-b801-e79fe44ea5b3-grafik.png

                                    The last rr_reply in the timeline is empty - and the reconnect and following rr_model request are successfull.

                                    Maybe it is possible to change the behaviour of the DWC reconnect mechanism and check if the HTTP response is 503 - wait and do not reconnect and on the duet side if multiple json requests fail - drop pending gcode replies? Or as an alternative always serve the rr_model seqs requests to give a client a chance to consume rr_reply to free up output buffers. Just randomly polling the rr_reply endpoint will steal the response from DWC, so thats also not an option.

                                    undefined undefined 2 Replies Last reply 2 Feb 2025, 13:01 Reply Quote 1
                                    • undefined
                                      timschneider @timschneider
                                      last edited by timschneider 2 Feb 2025, 14:49 2 Feb 2025, 13:01

                                      @timschneider
                                      wow that did the trick, when the python client receives an 503 it will just query the rr_reply immediately until the response is empty to free up buffers on the duet - DWC will not reconnect and everything works 🙂 even with M98 P"0:/macros/test-bufferoverflow3.g", but DWC will sometimes not receive the output of that macro as the python client consumed both replies.

                                      It will even survive M98 P"0:/macros/test-bufferoverflow4.g"

                                      Using this method having 6 concurrent DWC connections open and one python client is no problem anymore - it will even work with the M98 P"0:/macros/test-bufferoverflow4.g"

                                      Edit: there is still some problem in data and memory management, sometimes i do receive partial data Response payload is not completed: <ContentLengthError: 400, message='Not enough data for satisfy content length header.'> due to connection closing mid transmitting from duet to the reverse proxy.

                                      1 Reply Last reply Reply Quote 0
                                      • undefined
                                        chrishamm administrators @timschneider
                                        last edited by 3 Feb 2025, 11:31

                                        @timschneider What do you have "AJAX retries" in the DWC settings set to?

                                        Duet software engineer

                                        undefined 2 Replies Last reply 3 Feb 2025, 12:37 Reply Quote 0
                                        • undefined
                                          timschneider @chrishamm
                                          last edited by 3 Feb 2025, 12:37

                                          @chrishamm
                                          d8570a95-0de9-412c-b915-f1a40a72e98c-grafik.png

                                          1 Reply Last reply Reply Quote 0
                                          13 out of 33
                                          • First post
                                            13/33
                                            Last post
                                          Unless otherwise noted, all forum content is licensed under CC-BY-SA