
Duet 2.05 memory leak?

Firmware installation
    kazolar
    last edited by kazolar 24 Apr 2020, 15:41

    @dc42 I'll run the test this weekend, Sunday most likely. I need to get as many shields made as possible today and tomorrow -- I have a pickup of 200 shields for a local hospital on Sunday morning.
    @droftarts I actually had some issues with I2C a while back, and David suggested thickening the ground wire between the Duet and DueX5 -- I got the thickest wire that could possibly be inserted into a ferrule and did that. David also added some code to reset I2C if it hits an error, but my M122 has not shown any I2C errors since I improved the grounding. Trust me, there is a very short 14 AWG silicone shielded wire crimped in ferrules connecting my 2 boards, so grounding between them is no longer an issue.

      kazolar
      last edited by kazolar 25 Apr 2020, 06:50

      @dc42 M21 combined with M22 does not help. I just tried it after I replaced the 4 Z stepper connectors. Recall that a while back I had an issue with the included stepper connectors getting singed; you suggested Molex, and I had those in place for well over a year with no singeing until today -- this time the singeing was very bad, melting the housing. I ended up taking the DueX5 out and soldering wires with heavy duty JS style connectors in place of the Z stepper connectors (I have 4 lead screws -- upon further inspection, 2 had developed some singe marks).
      Singing.png

      I didn't save M122 -- I will make sure to save it on Sunday; I hope not to be dealing with unrelated electrical issues. What was telling was that even with M21 being run, the loop times were high -- max loop times until stuttering started were 50-70ms. None were lower than 12ms -- I checked M122 every minute or so. When stuttering started I canceled the print, kept the bed at temperature, cleared everything, then reset the board -- just M999 (or the stop button in this case) -- and started the same print. It's running now -- all loop times are 5ms and lower.
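The rule of thumb in the post above (a healthy run stays at or below about 5 ms, the failing run never dropped below 12 ms) can be written down as a small check over loop-time samples. This is only a sketch: the function name `classify_run` and the sample lists are hypothetical, and the two thresholds are simply the figures quoted in the post, not limits documented for the firmware.

```python
# Thresholds taken from the post: "all loop times are 5ms and lower" on the
# good run, "None were lower than 12ms" on the run that ended in stuttering.
HEALTHY_MAX_MS = 5.0
FAILING_MIN_MS = 12.0


def classify_run(samples_ms):
    """Classify a print run from slowest-loop samples (one per M122 poll)."""
    if not samples_ms:
        raise ValueError("need at least one loop-time sample")
    if max(samples_ms) <= HEALTHY_MAX_MS:
        return "healthy"
    if min(samples_ms) >= FAILING_MIN_MS:
        return "failing"
    return "inconclusive"


# Hypothetical sample lists, for illustration only.
print(classify_run([4.75, 3.9, 4.2]))     # every sample at or below 5 ms
print(classify_run([12.4, 55.0, 70.0]))   # no sample below 12 ms
```

Anything that falls between the two thresholds is reported as inconclusive rather than forced into one bucket, since the post gives no figure for that range.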

        kazolar
        last edited by 25 Apr 2020, 15:56

        @dc42
        here is the M122 right before it starts failing
        M122
        What's interesting -- and what makes this look more like a hardware issue -- is that the first print after the machine has been off for a couple of hours fails: high loop times, then underruns. Prints done afterwards work. I did not do anything different: preheat, reset, hit print -- and the first print fails in the same fashion.

        === Diagnostics ===
        RepRapFirmware for Duet 2 WiFi/Ethernet version 2.05 running on Duet Ethernet 1.02 or later + DueX5
        Board ID: 08DGM-9T6BU-FG3S0-7JTD4-3S06K-1A4ZD
        Used output buffers: 1 of 24 (21 max)
        === RTOS ===
        Static ram: 25708
        Dynamic ram: 96332 of which 0 recycled
        Exception stack ram used: 472
        Never used ram: 8560
        Tasks: NETWORK(ready,616) HEAT(blocked,1144) DUEX(blocked,164) MAIN(running,1668) IDLE(ready,156)
        Owned mutexes: I2C(DUEX)
        === Platform ===
        Last reset 00:20:52 ago, cause: software
        Last software reset at 2020-04-25 11:06, reason: User, spinning module GCodes, available RAM 8504 bytes (slot 1)
        Software reset code 0x0003 HFSR 0x00000000 CFSR 0x00000000 ICSR 0x0041f000 BFAR 0xe000ed38 SP 0xffffffff Task 0x4e49414d
        Error status: 0
        Free file entries: 9
        SD card 0 detected, interface speed: 20.0MBytes/sec
        SD card longest block write time: 0.0ms, max retries 0
        MCU temperature: min 25.6, current 26.0, max 26.2
        Supply voltage: min 24.1, current 24.4, max 24.6, under voltage events: 0, over voltage events: 0, power good: yes
        Driver 0: ok, SG min/max 0/313
        Driver 1: standstill, SG min/max not available
        Driver 2: standstill, SG min/max 0/252
        Driver 3: ok, SG min/max 0/305
        Driver 4: ok, SG min/max 0/332
        Driver 5: standstill, SG min/max not available
        Driver 6: standstill, SG min/max 64/229
        Driver 7: standstill, SG min/max 144/295
        Driver 8: standstill, SG min/max 88/251
        Driver 9: standstill, SG min/max 41/218
        Date/time: 2020-04-25 11:27:19
        Cache data hit count 2554842663
        Slowest loop: 47.93ms; fastest: 0.08ms
        I2C nak errors 0, send timeouts 0, receive timeouts 0, finishTimeouts 0, resets 0
        === Move ===
        Hiccups: 0, FreeDm: 154, MinFreeDm: 8, MaxWait: 0ms
        Bed compensation in use: none, comp offset 0.000
        === DDARing ===
        Scheduled moves: 8715, completed moves: 8675, StepErrors: 0, LaErrors: 0, Underruns: 6, 0
        === Heat ===
        Bed heaters = 0 -1 -1 -1, chamberHeaters = -1 -1
        Heater 0 is on, I-accum = 1.0
        Heater 1 is on, I-accum = 0.6
        Heater 2 is on, I-accum = 0.7
        Heater 4 is on, I-accum = 0.5
        === GCodes ===
        Segments left: 1
        Stack records: 2 allocated, 0 in use
        Movement lock held by null
        http is idle in state(s) 0
        telnet is idle in state(s) 0
        file is doing "G1 X228.135 Y232.279 E1.2574" in state(s) 0
        serial is idle in state(s) 0
        aux is idle in state(s) 0
        daemon is idle in state(s) 0
        queue is idle in state(s) 0
        autopause is idle in state(s) 0
        Code queue is empty.
        === Network ===
        Slowest loop: 7.26ms; fastest: 0.06ms
        Responder states: HTTP(0) HTTP(0) HTTP(0) HTTP(0) FTP(0) Telnet(0) Telnet(0)
        HTTP sessions: 2 of 8
        Interface state 5, link 100Mbps full duplex
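The `Task 0x4e49414d` value in the software reset line of the report above can be read as four packed ASCII characters. A minimal Python sketch, assuming the firmware stores the first four characters of the task name as a little-endian 32-bit word (that encoding is an assumption, not something the report states); the helper name `decode_task_field` is hypothetical.

```python
def decode_task_field(value: int) -> str:
    """Interpret a 32-bit value as four little-endian ASCII bytes."""
    return value.to_bytes(4, "little").decode("ascii")


# Value copied from the "Software reset code" line of the report.
print(decode_task_field(0x4E49414D))  # prints "MAIN"
```

Under that assumption the decoded name matches the `MAIN(running,1668)` entry in the Tasks line of the same report.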

          dc42 administrators
          last edited by 25 Apr 2020, 16:02

          Is there any metalwork (e.g. printer enclosure) close to the SD card socket? If so, is that metalwork connected to Duet ground, either directly or through a resistor? I'm wondering whether static buildup might be a factor.

          Duet WiFi hardware designer and firmware engineer
          Please do not ask me for Duet support via PM or email, use the forum
          http://www.escher3d.com, https://miscsolutions.wordpress.com

            kazolar
            last edited by 25 Apr 2020, 16:41

            @dc42 nope, printer is in a plastic enclosure -- no metal parts touch the network jack
            Here is the actual failure:
            12:22:43 PM M122
            === Diagnostics ===
            RepRapFirmware for Duet 2 WiFi/Ethernet version 2.05 running on Duet Ethernet 1.02 or later + DueX5
            Board ID: 08DGM-9T6BU-FG3S0-7JTD4-3S06K-1A4ZD
            Used output buffers: 1 of 24 (22 max)
            === RTOS ===
            Static ram: 25708
            Dynamic ram: 96332 of which 0 recycled
            Exception stack ram used: 448
            Never used ram: 8584
            Tasks: NETWORK(ready,616) HEAT(blocked,1144) DUEX(suspended,164) MAIN(running,1668) IDLE(ready,156)
            Owned mutexes:
            === Platform ===
            Last reset 00:37:46 ago, cause: software
            Last software reset at 2020-04-25 11:44, reason: User, spinning module GCodes, available RAM 8748 bytes (slot 0)
            Software reset code 0x0003 HFSR 0x00000000 CFSR 0x00000000 ICSR 0x0441f000 BFAR 0xe000ed38 SP 0xffffffff Task 0x4e49414d
            Error status: 0
            Free file entries: 9
            SD card 0 detected, interface speed: 20.0MBytes/sec
            SD card longest block write time: 0.0ms, max retries 0
            MCU temperature: min 25.9, current 26.2, max 26.6
            Supply voltage: min 24.1, current 24.3, max 24.6, under voltage events: 0, over voltage events: 0, power good: yes
            Driver 0: ok, SG min/max 0/333
            Driver 1: standstill, SG min/max not available
            Driver 2: ok, SG min/max 0/1023
            Driver 3: ok, SG min/max 0/721
            Driver 4: ok, SG min/max 0/332
            Driver 5: standstill, SG min/max not available
            Driver 6: standstill, SG min/max 26/241
            Driver 7: standstill, SG min/max 135/297
            Driver 8: standstill, SG min/max 71/249
            Driver 9: standstill, SG min/max 29/222
            Date/time: 2020-04-25 12:22:39
            Cache data hit count 4294967295
            Slowest loop: 151.20ms; fastest: 0.08ms
            I2C nak errors 0, send timeouts 0, receive timeouts 0, finishTimeouts 0, resets 0
            === Move ===
            Hiccups: 0, FreeDm: 120, MinFreeDm: 6, MaxWait: 0ms
            Bed compensation in use: none, comp offset 0.000
            === DDARing ===
            Scheduled moves: 26743, completed moves: 26706, StepErrors: 0, LaErrors: 0, Underruns: 69, 78
            === Heat ===
            Bed heaters = 0 -1 -1 -1, chamberHeaters = -1 -1
            Heater 0 is on, I-accum = 1.0
            Heater 1 is on, I-accum = 0.6
            Heater 2 is on, I-accum = 0.6
            Heater 4 is on, I-accum = 0.5
            === GCodes ===
            Segments left: 0
            Stack records: 2 allocated, 0 in use
            Movement lock held by null
            http is idle in state(s) 0
            telnet is idle in state(s) 0
            file is idle in state(s) 0
            serial is idle in state(s) 0
            aux is idle in state(s) 0
            daemon is idle in state(s) 0
            queue is idle in state(s) 0
            autopause is idle in state(s) 0
            Code queue is empty.
            === Network ===
            Slowest loop: 77.03ms; fastest: 0.06ms
            Responder states: HTTP(0) HTTP(0) HTTP(0) HTTP(0) FTP(0) Telnet(0) Telnet(0)
            HTTP sessions: 1 of 8
            Interface state 5, link 100Mbps full duplex

              dc42 administrators @kazolar
              last edited by 25 Apr 2020, 16:43

              @kazolar said in Duet 2.05 memory leak?:

              Here is the actual failure:

              Do you mean after you have just had the SD card read error?

                kazolar
                last edited by 25 Apr 2020, 16:43

                @dc42 here is the observable behavior -- happened 3 times in a row.
                First print -- high loop times, inevitable failure due to underrun. I tried canceling it when it looked like it would fail, about 15 minutes in, but the next print failed anyway. So basically 30-40 minutes of running is required.
                The next print is running now and is going to be fine -- loop times are great:
                M122
                === Diagnostics ===
                RepRapFirmware for Duet 2 WiFi/Ethernet version 2.05 running on Duet Ethernet 1.02 or later + DueX5
                Board ID: 08DGM-9T6BU-FG3S0-7JTD4-3S06K-1A4ZD
                Used output buffers: 3 of 24 (21 max)
                === RTOS ===
                Static ram: 25708
                Dynamic ram: 96332 of which 0 recycled
                Exception stack ram used: 464
                Never used ram: 8568
                Tasks: NETWORK(ready,748) HEAT(blocked,1236) DUEX(suspended,168) MAIN(running,1668) IDLE(ready,156)
                Owned mutexes:
                === Platform ===
                Last reset 00:14:25 ago, cause: software
                Last software reset at 2020-04-25 12:28, reason: User, spinning module GCodes, available RAM 8584 bytes (slot 1)
                Software reset code 0x0003 HFSR 0x00000000 CFSR 0x00000000 ICSR 0x0441f000 BFAR 0xe000ed38 SP 0xffffffff Task 0x4e49414d
                Error status: 0
                Free file entries: 9
                SD card 0 detected, interface speed: 20.0MBytes/sec
                SD card longest block write time: 0.0ms, max retries 0
                MCU temperature: min 25.9, current 26.3, max 26.5
                Supply voltage: min 24.1, current 24.3, max 24.6, under voltage events: 0, over voltage events: 0, power good: yes
                Driver 0: ok, SG min/max 0/319
                Driver 1: standstill, SG min/max not available
                Driver 2: ok, SG min/max 0/242
                Driver 3: ok, SG min/max 0/290
                Driver 4: ok, SG min/max 0/301
                Driver 5: standstill, SG min/max not available
                Driver 6: standstill, SG min/max 59/235
                Driver 7: standstill, SG min/max 151/286
                Driver 8: standstill, SG min/max 82/248
                Driver 9: standstill, SG min/max 47/214
                Date/time: 2020-04-25 12:43:13
                Cache data hit count 1720212453
                Slowest loop: 4.75ms; fastest: 0.08ms
                I2C nak errors 0, send timeouts 0, receive timeouts 0, finishTimeouts 0, resets 0
                === Move ===
                Hiccups: 0, FreeDm: 152, MinFreeDm: 56, MaxWait: 0ms
                Bed compensation in use: none, comp offset 0.000
                === DDARing ===
                Scheduled moves: 4720, completed moves: 4708, StepErrors: 0, LaErrors: 0, Underruns: 3, 0
                === Heat ===
                Bed heaters = 0 -1 -1 -1, chamberHeaters = -1 -1
                Heater 0 is on, I-accum = 1.0
                Heater 1 is on, I-accum = 0.4
                Heater 2 is on, I-accum = 0.5
                Heater 4 is on, I-accum = 0.4
                === GCodes ===
                Segments left: 1
                Stack records: 2 allocated, 0 in use
                Movement lock held by null
                http is idle in state(s) 0
                telnet is idle in state(s) 0
                file is doing "G1 X110.719 Y48.457 E36.9405" in state(s) 0
                serial is idle in state(s) 0
                aux is idle in state(s) 0
                daemon is idle in state(s) 0
                queue is idle in state(s) 0
                autopause is idle in state(s) 0
                Code queue is empty.
                === Network ===
                Slowest loop: 10.72ms; fastest: 0.06ms
                Responder states: HTTP(0) HTTP(0) HTTP(0) HTTP(0) FTP(0) Telnet(0) Telnet(0)
                HTTP sessions: 1 of 8
                Interface state 5, link 100Mbps full duplex
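The three M122 reports in this thread (before failure at 11:27, at failure at 12:22, after reset at 12:43) are easier to compare side by side. The sketch below pulls a few fields out of fragments copied from those reports. The function name `summarize` and the field selection are illustrative choices, and the regular expressions assume the exact line formats shown above.

```python
import re

U32_MAX = 2**32 - 1  # 4294967295, the largest unsigned 32-bit value


def summarize(report):
    """Extract a few comparison fields from M122 text (None if absent)."""
    def first(pattern, convert):
        match = re.search(pattern, report)
        return convert(match.group(1)) if match else None

    underruns = re.search(r"Underruns:\s*(\d+),\s*(\d+)", report)
    return {
        "slowest_loop_ms": first(r"Slowest loop:\s*([0-9.]+)ms", float),
        "min_free_dm": first(r"MinFreeDm:\s*(\d+)", int),
        "underruns": tuple(int(n) for n in underruns.groups()) if underruns else None,
        "cache_counter_at_u32_max": first(r"Cache data hit count\s+(\d+)", int) == U32_MAX,
    }


# Lines copied from the three reports in this thread.
SNAPSHOTS = {
    "before failure": """Cache data hit count 2554842663
Slowest loop: 47.93ms; fastest: 0.08ms
Hiccups: 0, FreeDm: 154, MinFreeDm: 8, MaxWait: 0ms
Scheduled moves: 8715, completed moves: 8675, StepErrors: 0, LaErrors: 0, Underruns: 6, 0""",
    "failure": """Cache data hit count 4294967295
Slowest loop: 151.20ms; fastest: 0.08ms
Hiccups: 0, FreeDm: 120, MinFreeDm: 6, MaxWait: 0ms
Scheduled moves: 26743, completed moves: 26706, StepErrors: 0, LaErrors: 0, Underruns: 69, 78""",
    "after reset": """Cache data hit count 1720212453
Slowest loop: 4.75ms; fastest: 0.08ms
Hiccups: 0, FreeDm: 152, MinFreeDm: 56, MaxWait: 0ms
Scheduled moves: 4720, completed moves: 4708, StepErrors: 0, LaErrors: 0, Underruns: 3, 0""",
}

for label, text in SNAPSHOTS.items():
    print(label, summarize(text))
```

The failing report is the only one where the cache hit counter equals the 32-bit maximum. Whether that means the counter saturated or is a coincidence cannot be determined from the reports alone.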

                  kazolar @kazolar
                  last edited by 25 Apr 2020, 16:44

                  @dc42 stuttering -- failure of print -- just had it, yes. I reset and started again -- working fine now

                    kazolar
                    last edited by kazolar 25 Apr 2020, 16:52

                    @dc42 I think we're going around in circles -- 2.05.1 would likely fix the software issue, but there is clearly a hardware fault -- I HAVE to let it run for 30+ minutes and fail before I can reset and start a series of prints that will succeed. That 30-minute time frame may be the time it takes for a cold solder joint somewhere to heat up, after which things start running fine. This board was purchased on Aug 3 2018 -- Filastruder order #36936
                    It's either covered by warranty or it's not. If not I'll order a new one from Filastruder.

                    I said that I was ready to go ahead and do this -- I was asked to slice this in different slicers and try different SD cards -- I have. I am printing face shields for hospital workers. I have a donation of 200 being picked up to go to a local hospital tomorrow. I'd rather not lose more sleep and waste time troubleshooting what at this point feels like a bad solder joint -- is it worth my time to take the board out, look at it under a scope, and hunt for it? It could be on the SD card connector or on another trace. It resolves itself after 30 minutes of printing. The question at this point is how the replacement is being handled.

                      kazolar
                      last edited by 25 Apr 2020, 19:21

                      @dc42 what makes you think it's still an SD card error? I had the same underrun error reproduced with OctoPrint. I am using a branded SD card which tests out at the top of the speed spectrum. The predictable behavior now is that after the machine has been either idle or off for some time, it takes 1 failed print due to an underrun -- which happens around 30-40 min in (not the same spot each time anymore) -- and then after a reset the next set of prints work fine if I don't let it cool down. So, as some suggested, a cold solder joint that warms up and works fine after some heat-up time looks like the culprit here. I'm going to wait for a response as to how to proceed -- if I don't get one, I'll order a new Duet 2 from Filastruder.
                      @droftarts Considering the importance of these prints, and my past experience here, I am rather disappointed and surprised with the level of support. YouTubers who do questionable videos (I can name some -- who I used to watch) get free boards sent to them; I guess I don't have that kind of public image. I will be releasing the build vlog of this printer, which is long overdue as I started filming it 3 years ago. The last few episodes will feature my experience with using this machine to print these shields. I am donating countless hours of my time, and our group is getting daily requests from local hospitals -- and even from those further away -- for ear relievers. I am kinda getting burnt out, mostly due to the issues with the Duet board. I was dealing with these behaviors initially, and was willing to keep troubleshooting this. Now we're at a crossroads. I have played the "do this, do that" steps, which would be fine if we weren't in a time crunch in a pandemic where every shield I produce is another doctor, nurse, or EMT worker who won't get sick. I don't quite understand the lack of urgency here, or maybe you're not taking this seriously enough -- I have had first-hand accounts of doctors dying in New York, and young nurses succumbing to the illness in the Bronx.

                        bot @kazolar
                        last edited by 25 Apr 2020, 19:36

                        As for your first question directed at dc42: OctoPrint sometimes waits for an OK response from the firmware, and, maybe for other reasons too, the stuttering might occur on any setup. It doesn't rule out the SD card being a problem.

                        *not actually a robot

                          kazolar @bot
                          last edited by kazolar 25 Apr 2020, 19:39

                          @bot
                          So yes, an SD card problem may be there, which is likely a hardware issue since after 30-40 minutes of a failed print, the next print is fine. In fact the next 3-4 prints are fine, provided I don't let the board/system cool off

                            bot @kazolar
                            last edited by 25 Apr 2020, 19:42

                            I'm just watching this thread with interest, but have no idea or opinion about why it may be occurring. I was just pointing out that the OctoPrint result didn't rule out a faulty SD card.

                              kazolar @bot
                              last edited by kazolar 25 Apr 2020, 19:43

                              @bot I am using a brand new SanDisk 32 GB card. I have used 4 different cards. Are they all faulty?

                              And the fault magically clears after 40 minutes. And works fine for the next 15 hours... hell, I had a night when I caught it in the morning right after it finished a print and was able to run another with no issues. Start cold and the SD card has a fault in it... I'm a software engineer and this makes no sense to me.

                                dc42 administrators @kazolar
                                last edited by dc42 25 Apr 2020, 19:46

                                @kazolar said in Duet 2.05 memory leak?:

                                @dc42 what makes you think it's an SD card error still? I had the same underrun error reproduced with OctoPrint.

                                This thread is long, so it's hard to find your post in which you mentioned Octoprint (the browser search facility doesn't find it). I recall that you said you had experienced underruns with Octoprint, but AFAIR you didn't say you had experienced the same pattern of underruns i.e. the first print working and subsequent prints not. I replied that underruns when using Octoprint/USB were common on all electronics.

                                @kazolar said in Duet 2.05 memory leak?:

                                And the fault magically clears after 40 minutes.

                                40 minutes of what? The machine standing idle, or something else?

                                  bot
                                  last edited by 25 Apr 2020, 19:50

                                  One thought I randomly had last night: Power supplies often have programmed behaviour, based on times or temperatures. Perhaps your PSU is affecting the electronics when a fan kicks in, or something.

                                    kazolar @dc42
                                    last edited by 25 Apr 2020, 19:50

                                    @dc42 OK, so what's next then? It's warmed up and printing fine at present. I'll be able to get 2 more prints in today, but tomorrow morning it will require a failed print to clear the cobwebs and get it warmed up again... that feels like faulty hardware. Are we still searching for some fix? If you tell me it's off warranty, I'll order a new one from Filastruder today.

                                      kazolar @bot
                                      last edited by 25 Apr 2020, 19:53

                                      @bot the fan on the main PSU is hard-wired, with no temp sensor in the PSU, and runs all the time. The fan in the PSU runs off its own 12v regulator. The fans for cooling the Duet and other electronics are 12v fans running off a separate fanless power supply. The Duet now, after fixing the Z motor connectors, is running at a nice 26C. So clearly they were underspec'ed and running hot -- enough to raise the board temp 3C.

                                        kazolar @dc42
                                        last edited by kazolar 25 Apr 2020, 19:56

                                        @dc42 if I start cold, whether the machine has been powered on for hours after finishing a print or has been off, the same thing happens. For the first print I use the same procedure: heat everything up, wait for it to come to temp, hit reset, then start the print. This print fails with high loop times and eventual underruns; that takes 30-40 min. If I repeat the same procedure immediately after this print, subsequent prints work fine.

                                          dc42 administrators @kazolar
                                          last edited by dc42 25 Apr 2020, 20:04

                                          I have just re-read your original post, and spotted something I had forgotten:

                                          @kazolar said in Duet 2.05 memory leak?:

                                          I have done repeated duplication prints without issue -- but this leak happens during triplication -- so obviously more steppers are involved.

                                          I can think of the following ways in which a fault might happen only when doing a triplicate print:

                                          1. A power issue, if the PSU can't handle the additional heater and stepper motor load. What PSU are you using, and are the VIN terminal block screws still tight? However, your M122 report does not show any power outages, assuming you ran the M122 report just once at the end of the print because M122 resets the min and max voltages.

                                          2. A temperature issue, because more stepper drivers and heaters are running. However, your M122 report shows a low MCU temperature, assuming you ran the M122 report just once at the end of the print because M122 resets the min and max temperatures.

                                          3. A firmware issue that is causing DriveMovement objects to be lost from the system. However, your M122 report shows that there were never less than 56 of these free.

                                          4. Noise on the bus between the Duet and DueX being worse when you drive another stepper motor on the DueX. This could cause I2C data corruption, or possibly spurious interrupts from the DueX to the Duet (which would lengthen the loop time).

                                          None of these explain the increased loop time when this issue occurs. Whereas if SD card reads take longer, that would explain the increased loop time. Also, AFAIR you reported that re-uploading the print file cleared the problem - correct?
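Each of the four hypotheses above points at specific fields in the M122 report. The sketch below gathers those fields in one pass so they can be checked together. The function name `hypothesis_evidence` and the dictionary keys are hypothetical, the regular expressions assume the line formats shown in the reports in this thread, and the caveat from the list still applies: min/max values only cover the period since the previous M122.

```python
import re


def hypothesis_evidence(report):
    """Collect the M122 fields that bear on each of the four hypotheses."""
    power = re.search(
        r"under voltage events:\s*(\d+), over voltage events:\s*(\d+)", report)
    temp = re.search(
        r"MCU temperature: min ([0-9.]+), current ([0-9.]+), max ([0-9.]+)", report)
    free_dm = re.search(r"MinFreeDm:\s*(\d+)", report)
    i2c = re.search(
        r"I2C nak errors (\d+), send timeouts (\d+), receive timeouts (\d+), "
        r"finishTimeouts (\d+), resets (\d+)", report)
    return {
        # 1. Power: total under/over voltage events recorded.
        "voltage_events": sum(int(n) for n in power.groups()) if power else None,
        # 2. Temperature: highest MCU temperature recorded.
        "mcu_temp_max": float(temp.group(3)) if temp else None,
        # 3. Firmware: fewest free DriveMovement objects seen.
        "min_free_dm": int(free_dm.group(1)) if free_dm else None,
        # 4. Bus noise: total I2C errors, timeouts and resets.
        "i2c_faults": sum(int(n) for n in i2c.groups()) if i2c else None,
    }


# Lines copied from the 12:43 report earlier in this thread.
sample = """MCU temperature: min 25.9, current 26.3, max 26.5
Supply voltage: min 24.1, current 24.3, max 24.6, under voltage events: 0, over voltage events: 0, power good: yes
I2C nak errors 0, send timeouts 0, receive timeouts 0, finishTimeouts 0, resets 0
Hiccups: 0, FreeDm: 152, MinFreeDm: 56, MaxWait: 0ms"""

print(hypothesis_evidence(sample))
```

The sketch only reports values and applies no pass/fail thresholds, because the thread gives none beyond the observation that 56 free DriveMovement objects is not a shortage.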

                                          PS - it might be better to start a new thread about this issue (linking back to this one), starting by stating what you observe when the issue occurs (stuttering?), what precursors it has (increased underruns? increased loop time?), what conditions are needed to make it occur (triplicate not duplicate?), what has been proven to clear or pre-empt the problem (M999? Power down/up? Upload the file again?) and what has been proven not to clear it.

                                          Unless otherwise noted, all forum content is licensed under CC-BY-SA