    [3.4.5] DSF-Python - timeout failures

    • oozeBot @chrishamm

      Thanks! In the meantime, we added a heartbeat that is run within the daemon to see if that protects against it timing out.

      • Falcounet @chrishamm

        @chrishamm @oozeBot Yes, but I'm abroad for now, so I won't be able to look at this before the end of the week.

        • oozeBot @Falcounet

          @Falcounet @chrishamm

          FYI: adding a call to a custom M-code every 10 seconds through daemon.g did not fix the issue. Thanks

          • oozeBot

            Bumping this so it doesn't get lost. We've also noticed that this is not happening on all our machines, even though they all use the same OS image and all run 3.4.5.

            We've now protected against this with a second "watchdog" service, but we'd obviously prefer that it not time out in the first place, as it's happening several times a day.

            Please let us know what we can help test/research between our machines to help diagnose the issue. Thanks

            • chrishamm administrators @oozeBot

              @Falcounet Any idea?

              Duet software engineer

              • Falcounet @chrishamm

                @chrishamm @oozeBot From what I see, you are not running the latest version of dsf-python, but that shouldn't affect your issue anyway.

                It isn't easy for me to reproduce your issue, so maybe you can try the following:

                1. Back up /usr/local/lib/python3.9/dist-packages/dsf/connections.py as connections.py.bak
                2. Edit /usr/local/lib/python3.9/dist-packages/dsf/connections.py and comment out lines 161 and 162: https://github.com/Duet3D/dsf-python/blob/8fd345ed6455102b4750e1e4470e52028e1b291e/src/dsf/connections.py#L161-L162
                3. See if the issue still persists
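
Steps 1 and 2 above can also be scripted. The following is a minimal Python sketch, not part of dsf-python: `backup_file` and `show_lines` are hypothetical helper names, and the path and line numbers in the usage comment are simply the ones quoted in the post. It copies the file to a `.bak` sibling and returns the lines about to be commented out, so they can be compared with the linked commit before editing.

```python
import shutil
from pathlib import Path


def backup_file(path):
    """Copy `path` to `<path>.bak` (metadata preserved) and return the backup path."""
    source = Path(path)
    backup = source.with_name(source.name + ".bak")
    shutil.copy2(source, backup)
    return backup


def show_lines(path, first, last):
    """Return lines first..last (1-indexed, inclusive) of a text file."""
    lines = Path(path).read_text().splitlines()
    return lines[first - 1:last]


# Example (path and line numbers as given in the post above):
# target = "/usr/local/lib/python3.9/dist-packages/dsf/connections.py"
# backup_file(target)
# print("\n".join(show_lines(target, 161, 162)))
```

Printing the lines first is a cheap guard against commenting out the wrong statements if the installed file differs from the linked revision.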
                • oozeBot

                  Thanks! We'll update to the latest version, make the change, and then let it bake for a while to see if that fixes it. We'll report back soon.

                  • Falcounet @oozeBot

                    @oozeBot If you update first, the file will be /usr/local/lib/python3.9/dist-packages/dsf/connections/base_connection.py and the lines to comment out will be 115 and 116: https://github.com/Duet3D/dsf-python/blob/main/src/dsf/connections/base_connection.py#L115-L116

                    • oozeBot @Falcounet

                      @Falcounet

                      Maybe we are missing something, but after upgrading to the latest version, it appears something has changed with the imports. The snippet below worked fine in the previous version, but with the latest version it fails to import MessageType and LogLevel from dsf.commands.basecommands and InterceptionMode from dsf.initmessages.clientinitmessages.

                      Any thoughts on why, and how to get past this? Were they renamed? That seems unlikely, but reverting to 3.3.2 resolves the issue.

                      Thanks

                      from dsf.commands.basecommands import MessageType
                      from dsf.commands.basecommands import LogLevel
                      from dsf.commands.code import CodeType
                      from dsf.connections import CommandConnection, InterceptConnection
                      from dsf.initmessages.clientinitmessages import InterceptionMode
                      

                      One of the errors:

                      Jun 05 23:41:47 elevate OCS.py[564]:     from dsf.commands.basecommands import MessageType
                      Jun 05 23:41:47 elevate OCS.py[564]: ModuleNotFoundError: No module named 'dsf.commands.basecommands'
                      
                      • Falcounet @oozeBot

                        @oozeBot They were renamed because dsf-python has been refactored, mainly to follow DuetAPI.

                        Your imports should be changed to:

                        from dsf.commands.code import CodeType
                        from dsf.connections import CommandConnection, InterceptConnection, InterceptionMode
                        from dsf.object_model import LogLevel, MessageType
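
If the same script has to run on machines with either the old or the refactored dsf-python layout, the imports can be made tolerant of both. This is only a sketch: `import_first` is a hypothetical helper, and the module paths are the ones quoted in the posts above, not checked against every release.

```python
import importlib


def import_first(name, module_paths):
    """Return attribute `name` from the first module in `module_paths` that has it."""
    for path in module_paths:
        try:
            module = importlib.import_module(path)
        except ImportError:
            # Module missing in this layout: try the next candidate.
            continue
        if hasattr(module, name):
            return getattr(module, name)
    raise ImportError(f"{name} not found in any of: {', '.join(module_paths)}")


if __name__ == "__main__":
    try:
        # New location first, old location second (both taken from this thread).
        MessageType = import_first(
            "MessageType", ["dsf.object_model", "dsf.commands.basecommands"]
        )
        LogLevel = import_first(
            "LogLevel", ["dsf.object_model", "dsf.commands.basecommands"]
        )
        InterceptionMode = import_first(
            "InterceptionMode",
            ["dsf.connections", "dsf.initmessages.clientinitmessages"],
        )
        print("imports resolved")
    except ImportError as exc:
        print(f"dsf-python names not importable here: {exc}")
```

Pinning one dsf-python version on all machines avoids the need for this; the shim is mainly useful while a fleet is being upgraded.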
                        
                        • oozeBot @Falcounet

                          @Falcounet It's been over 24 hours since the change was made and there have been no timeouts with the service. However, I upgraded to the latest version and removed those two lines of code at the same time, so I can't yet tell which change helped. I've added the two lines back and will let it run for another 24 hours to see whether something else in the latest version fixed the issue, and then report back.

                          • chrishamm administrators @oozeBot

                            @oozeBot @Falcounet DSF may send zero-byte payloads to check if the socket is still open. I don't know if the Python client can actually detect that; if it does, those two lines should remain removed.

                            Duet software engineer

                            • oozeBot @chrishamm

                              @chrishamm @Falcounet - just caught an error with the latest version, so it appears Chris is right: DSF is sending zero-byte payloads, which trigger this condition.

                              I'll remove those lines and let it run to see if that, in fact, fixes it.

                              Jun 08 13:45:59 elevate OCS.py[566]: Traceback (most recent call last):
                              Jun 08 13:45:59 elevate OCS.py[566]:   File "/opt/dsf/sd/scripts/OCS.py", line 101, in <module>
                              Jun 08 13:45:59 elevate OCS.py[566]:     cde = intercept_connection.receive_code()
                              Jun 08 13:45:59 elevate OCS.py[566]:   File "/usr/local/lib/python3.9/dist-packages/dsf/connections/intercept_connectio>
                              Jun 08 13:45:59 elevate OCS.py[566]:     return self.receive(commands.code.Code)
                              Jun 08 13:45:59 elevate OCS.py[566]:   File "/usr/local/lib/python3.9/dist-packages/dsf/connections/base_connection.py">
                              Jun 08 13:45:59 elevate OCS.py[566]:     json_string = self.receive_json()
                              Jun 08 13:45:59 elevate OCS.py[566]:   File "/usr/local/lib/python3.9/dist-packages/dsf/connections/base_connection.py">
                              Jun 08 13:45:59 elevate OCS.py[566]:     raise TimeoutError
                              Jun 08 13:45:59 elevate OCS.py[566]: TimeoutError
                              Jun 08 13:46:00 elevate systemd[1]: ocs.service: Main process exited, code=exited, status=1/FAILURE
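
The traceback shows the `TimeoutError` escaping from `receive_code()` and ending the process, which systemd then reports as a failure. As a stopgap (not a fix for the underlying check), the call could be wrapped so that a timeout is retried a few times before giving up. A minimal sketch: `receive_with_retry` is a hypothetical helper, and it assumes the connection is still usable after the `TimeoutError`, which this thread does not confirm.

```python
import time


def receive_with_retry(receive, retries=3, delay=1.0):
    """Call receive(); on TimeoutError retry up to `retries` more times, then re-raise."""
    attempt = 0
    while True:
        try:
            return receive()
        except TimeoutError:
            attempt += 1
            if attempt > retries:
                raise  # give up and let the caller (or systemd) handle it
            time.sleep(delay)


# Example, using the call from the traceback above:
# cde = receive_with_retry(intercept_connection.receive_code)
```

If the connection turns out not to be reusable after a timeout, the retry would need to reconnect instead of calling the same method again.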
                              
                              • oozeBot

                                @chrishamm @Falcounet - it has been over 3 days without a service crash since those two lines were removed on the latest version, so it's pretty clear that's the issue.

                                Please let me know when a new official release is available that removes these two lines so we can update all our machines and test once again. Thanks!

                                • oozeBot @oozeBot

                                  @chrishamm @Falcounet - it has been over 4 months without a service crash since those two lines were removed on the latest version, so it's pretty clear that's the issue.

                                  When will the codebase be updated to correct this issue in a new release? Thanks

                                  • Falcounet @oozeBot

                                    @oozeBot The codebase was updated some months ago, but I forgot to release the new version, sorry.
                                    dsf-python 3.4.6 has been released today.

                                    • oozeBot @Falcounet

                                      @Falcounet Thanks! The refactoring broke our code which worked in 3.4.5. I'm still learning and without more examples, I'm not certain what needs to change. Can you guide me through the changes to our declarations to get this working for the new version?

                                      Our Code:

                                      from dsf.commands.code import CodeType
                                      from dsf.connections import CommandConnection, InterceptConnection, InterceptionMode, SubscribeConnection, SubscriptionMode
                                      from dsf.object_model import LogLevel, MessageType
                                      

                                      Errors presented in 3.4.6:

                                      Nov 01 14:12:45 workbench1 systemd[1]: Started oozeBot Control Server.
                                      Nov 01 14:12:45 workbench1 OCS.py[868]: Traceback (most recent call last):
                                      Nov 01 14:12:45 workbench1 OCS.py[868]:   File "/opt/dsf/sd/scripts/OCS.py", line 17, in <module>
                                      Nov 01 14:12:45 workbench1 OCS.py[868]:     from dsf.commands.code import CodeType
                                      Nov 01 14:12:45 workbench1 OCS.py[868]:   File "/usr/local/lib/python3.7/dist-packages/dsf/__init__.py", line 10, in <module>
                                      Nov 01 14:12:45 workbench1 OCS.py[868]:     from . import commands, connections, http, object_model
                                      Nov 01 14:12:45 workbench1 OCS.py[868]:   File "/usr/local/lib/python3.7/dist-packages/dsf/connections/__init__.py", line 47, in <module>
                                      Nov 01 14:12:45 workbench1 OCS.py[868]:     from .base_command_connection import BaseCommandConnection
                                      Nov 01 14:12:45 workbench1 OCS.py[868]:   File "/usr/local/lib/python3.7/dist-packages/dsf/connections/base_command_connection.py", line 3, in <module>
                                      Nov 01 14:12:45 workbench1 OCS.py[868]:     from .base_connection import BaseConnection
                                      Nov 01 14:12:45 workbench1 OCS.py[868]:   File "/usr/local/lib/python3.7/dist-packages/dsf/connections/base_connection.py", line 6, in <module>
                                      Nov 01 14:12:45 workbench1 OCS.py[868]:     from .init_messages import client_init_messages, server_init_message
                                      Nov 01 14:12:45 workbench1 OCS.py[868]:   File "/usr/local/lib/python3.7/dist-packages/dsf/connections/init_messages/__init__.py", line 1, in <module>
                                      Nov 01 14:12:45 workbench1 OCS.py[868]:     from . import client_init_messages, server_init_message
                                      Nov 01 14:12:45 workbench1 OCS.py[868]:   File "/usr/local/lib/python3.7/dist-packages/dsf/connections/init_messages/client_init_messages.py", line 43, in <module>
                                      Nov 01 14:12:45 workbench1 OCS.py[868]:     auto_flush: bool = True):
                                      Nov 01 14:12:45 workbench1 OCS.py[868]: TypeError: 'type' object is not subscriptable
                                      Nov 01 14:12:45 workbench1 systemd[1]: ocs.service: Main process exited, code=exited, status=1/FAILURE
                                      Nov 01 14:12:45 workbench1 systemd[1]: ocs.service: Failed with result 'exit-code'.
                                      
                                      • Falcounet @oozeBot

                                        @oozeBot I will need more of the source code to understand what is going on, not only the imports.

                                        • oozeBot @Falcounet

                                          For posterity: this must have been an issue with the Python version, as upgrading to Python 3.12 resolved the issue I just reported.
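
The two tracebacks in this thread come from different interpreters: the earlier one references `/usr/local/lib/python3.9/`, the failing one `/usr/local/lib/python3.7/`. `TypeError: 'type' object is not subscriptable` is what Python versions before 3.9 raise when a built-in type is subscripted in an annotation (for example `list[str]`), so an old interpreter is a plausible cause, although the thread does not show the offending annotation. A script can fail early with a clearer message by checking the interpreter version before importing `dsf`. In this sketch `require_python` is a hypothetical helper and the 3.9 minimum is an assumption, not a documented dsf-python requirement.

```python
import sys

# Assumption: 3.9 is the first version where list[str]-style annotations work.
MIN_PYTHON = (3, 9)


def require_python(minimum=MIN_PYTHON, version_info=None):
    """Raise RuntimeError if the interpreter is older than `minimum` (major, minor)."""
    current = tuple((version_info or sys.version_info)[:2])
    if current < tuple(minimum):
        raise RuntimeError(
            "Python {}.{} or newer is required, found {}.{}".format(*minimum, *current)
        )


# Example: call this at the top of the service script, before `import dsf`,
# so an old interpreter fails with a readable message instead of a TypeError.
# require_python()
```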

                                          Unless otherwise noted, all forum content is licensed under CC-BY-SA