Duet 2.05 memory leak?
-
@droftarts yes, I was already doing all those tips, that's how I am getting the yields I am.
I actually used the CNC Kitchen design as inspiration to iterate on the design we started with. I have published the design I've been printing:
Enhanced low weight Modified Prusa Face Shield with a Visor found on #Thingiverse https://www.thingiverse.com/thing:4273009
There are 3 of us in our group. I have much more capacity, so I've distributed many more shields, but the other members have been catching up after adding new machines to their effort. Our employer has stepped up to cover our expenses, even paying for other members to get additional machines (Sidewinder X1). Since I switched to the visor design, our other members have also started printing it, and I'm passing on the tips to squeeze as much speed as possible out of every print. These don't have to be pretty, they just have to work.
-
@dc42 the problem is not fixed. This time it doesn't complain about any driver issues -- just starts stuttering exactly 30 minutes after starting the print. I am trying it after board reset. I may go back to an earlier version. Not sure what's going on.
EDIT:
Went back to 2.03 RC2 -- so far all is good.
Gonna stay on that for now.
-
Did you check M122 to see if there were hiccups?
-
@Phaedrux no hiccups -- it just starts stuttering. I am back on version 2.03 RC2 -- the 2nd print with no resets is running fine. That's no indication that it's bug free, but it's working, so until I see a reason to move from it I'm staying on this build.
-
So I ran a bunch of M122s during the print and finally caught the issue -- underruns. @dc42 the count resets too often, and it would be nice to get an error on the screen when it gets critical. I switched to a brand new class 10 SD card, and the stuttering and all the weirdness stopped -- I'm back on version 2.05.1. As smart as the Duet is, the fact that an SD card is not up to snuff, and/or is dying, should be something you can detect. It took me over 2 weeks of hunting for the issue. The underrun counts keep resetting, so it's almost impossible to go on that. Now underruns are 0,0 -- and the UI on the LCD is more responsive; it shows the list of files and macros in an instant.
-
@kazolar Thanks for your persistence, and your report. SD card problems can have strange, and often not very obvious, effects. I don't know if the firmware can be set to detect SD card issues, that's one for @dc42. You can test an SD card with M122 P104 S[file size in MB]; the reported write speed is usually between 2 and 2.5Mbytes/sec. For me: Duet 2 WiFi - 2.23Mbytes/sec, Duet Maestro 2.42Mbytes/sec for a 10MB file.
Ian
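
For anyone who wants to check a reported figure against that 2 to 2.5Mbytes/sec range automatically, here is a minimal Python sketch. It assumes the reply is worded like the console line quoted later in this thread ("SD write speed for 20.0Mbyte file was 2.18Mbytes/sec"); the function names and the 2.0 threshold are illustrative and not part of the firmware.

```python
import re

# Pattern modelled on a single console line seen in this thread; other
# firmware versions may word the reply differently (assumption).
_SPEED_RE = re.compile(
    r"SD write speed for ([\d.]+)Mbyte file was ([\d.]+)Mbytes/sec"
)


def parse_sd_write_speed(reply):
    """Return (file_size_mb, speed_mb_per_s), or None if the text does not match."""
    match = _SPEED_RE.search(reply)
    if match is None:
        return None
    return float(match.group(1)), float(match.group(2))


def looks_slow(speed_mb_per_s, threshold=2.0):
    """True if the speed is below the lower end of the usual range (hypothetical cutoff)."""
    return speed_mb_per_s < threshold


result = parse_sd_write_speed(
    "10:42:27 AM SD write speed for 20.0Mbyte file was 2.18Mbytes/sec"
)
if result is not None:
    size_mb, speed = result
    print(f"{size_mb} MB test file, {speed} Mbytes/sec, slow: {looks_slow(speed)}")
```

A figure inside the usual range does not rule out a card problem: the card that reports 2.18Mbytes/sec later in this thread still produced underruns.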
-
@kazolar underruns, and any of the other stats like that, are reset each time you run M122.
-
@droftarts there has got to be some way to respond to underruns above a certain level. Clearly the underruns were getting out of hand. If the firmware simply complained about underruns the way it complains about stepper phase warnings and other things of that nature, it would make troubleshooting a lot easier. Also, the underrun counts seem to reset more often than just when running M122: I canceled the print and the underrun line in M122 was cleared out.
-
@kazolar How are the underruns actually reported in the M122? Is it just with the error status, or does it show in some other field? If you managed to save a copy of an M122 that shows it, that would be useful.
Ian
-
@droftarts here is what an M122 report looks like with underruns. This is from my own print just now. For me, it seems the underruns come from tiny segments created by Simplify3D for support structures, combined with high speeds and some amount of PA.
4/13/2020, 9:29:48 AM M122
=== Diagnostics ===
RepRapFirmware for Duet 2 WiFi/Ethernet version 2.05.1.1-simple_dynamic_unretraction running on Duet Ethernet 1.02 or later + DueX2
Board ID: 08DGM-956GU-DJMSN-6J9D4-3SJ6K-1BNBF
Used output buffers: 1 of 24 (16 max)
=== RTOS ===
Static ram: 25712
Dynamic ram: 93652 of which 0 recycled
Exception stack ram used: 480
Never used ram: 11228
Tasks: NETWORK(ready,628) HEAT(blocked,1232) DUEX(suspended,160) MAIN(running,3712) IDLE(ready,160)
Owned mutexes:
=== Platform ===
Last reset 23:30:20 ago, cause: power up
Last software reset at 2020-04-11 22:50, reason: Stuck in spin loop, spinning module GCodes, available RAM 11048 bytes (slot 2)
Software reset code 0x4043 HFSR 0x00000000 CFSR 0x00000000 ICSR 0x0041f80f BFAR 0xe000ed38 SP 0x20001f4c Task 0x5754454e
Stack: 00404463 004047e4 81000000 b0000000 412a3fa5 00000000 00000000 3331bb4c 41880000 3e178897 3e1cd04f bdb7f86e 423985c3 4050ac00 3cce8f96 40a00000 4453b9c2 c0000000 40f4ffb7 20000010 00404459 000003c8 00404aa9
Error status: 0
Free file entries: 9
SD card 0 detected, interface speed: 20.0MBytes/sec
SD card longest block write time: 0.0ms, max retries 0
MCU temperature: min 36.6, current 37.6, max 38.8
Supply voltage: min 23.9, current 24.6, max 25.0, under voltage events: 0, over voltage events: 0, power good: yes
Driver 0: ok, SG min/max 0/1023
Driver 1: standstill, SG min/max 0/1023
Driver 2: standstill, SG min/max 0/135
Driver 3: ok, SG min/max 0/1023
Driver 4: standstill, SG min/max not available
Driver 5: standstill, SG min/max not available
Driver 6: standstill, SG min/max not available
Date/time: 2020-04-13 09:29:42
Cache data hit count 4294967295
Slowest loop: 17.11ms; fastest: 0.07ms
I2C nak errors 0, send timeouts 0, receive timeouts 0, finishTimeouts 0, resets 0
=== Move ===
Hiccups: 0, FreeDm: 158, MinFreeDm: 117, MaxWait: 0ms
Bed compensation in use: none, comp offset 0.000
=== DDARing ===
Scheduled moves: 1295584, completed moves: 1295544, StepErrors: 0, LaErrors: 0, Underruns: 595, 0
=== Heat ===
Bed heaters = 0 -1 -1 -1, chamberHeaters = -1 -1
Heater 0 is on, I-accum = 0.2
Heater 1 is on, I-accum = 0.5
=== GCodes ===
Segments left: 1
Stack records: 1 allocated, 0 in use
Movement lock held by null
http is idle in state(s) 0
telnet is idle in state(s) 0
file is doing "G1 X-29.037 Y13.502 E0.0004" in state(s) 0
serial is idle in state(s) 0
aux is idle in state(s) 0
daemon is idle in state(s) 0
queue is idle in state(s) 0
autopause is idle in state(s) 0
Code queue is empty.
=== Network ===
Slowest loop: 15.90ms; fastest: 0.06ms
Responder states: HTTP(0) HTTP(0) HTTP(0) HTTP(0) FTP(0) Telnet(0) Telnet(0)
HTTP sessions: 1 of 8
Interface state 5, link 100Mbps full duplex
-
@bot said in Duet 2.05 memory leak?:
=== DDARing ===
Scheduled moves: 1295584, completed moves: 1295544, StepErrors: 0, LaErrors: 0, Underruns: 595, 0
Thanks, I know where to look now!
Ian
-
@droftarts I think I read that the first number is a warning, the 2nd number will cause stutter or a pause if it gets bad. I can tell from switching SD cards, my gcode uploads are faster now -- hitting 700kb/sec -- almost maxing out the 100mb link -- never had over 500 before.
-
@kazolar I think the first value isn't a warning, just an indication that the lookahead function couldn't do something (not sure what) with the time given. It doesn't slow down the print, but is likely not ideal. The second number is a prepare move underrun, which means that the move could not be prepared in time and so the movement must wait. This is much worse than the first one.
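
To watch those two counters over a print, the pair can be pulled out of the M122 text. This is a rough Python sketch that assumes the "Underruns: a, b" wording seen in the reports in this thread; the function name and the labels "lookahead" and "prepare" are my own, following the interpretation above rather than any firmware documentation.

```python
import re

# Matches the pair after "Underruns:" as it appears in the reports in this
# thread; the exact wording may differ between firmware versions (assumption).
_UNDERRUNS_RE = re.compile(r"Underruns:\s*(\d+)\s*,\s*(\d+)")


def parse_underruns(m122_text):
    """Return (lookahead_underruns, prepare_underruns), or None if not found."""
    match = _UNDERRUNS_RE.search(m122_text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


report = (
    "Scheduled moves: 24643, completed moves: 24607, "
    "StepErrors: 0, LaErrors: 0, Underruns: 80, 300"
)
counts = parse_underruns(report)
if counts is not None:
    lookahead, prepare = counts
    # A non-zero second value is the case described above as the worse one.
    if prepare > 0:
        print(f"prepare underruns: {prepare} (lookahead: {lookahead})")
```

Because the counters are reset each time M122 runs, a script polling this would need to add up the values from successive reports to get a total for the whole print.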
Also, since I'm interested in SD card performance at the moment, I noticed your last comment and must correct you somewhat, just for your info: 700 kB/s is not nearly maxing out a 100 Mbps link. 100 Mbps = 12.5 MB/s
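
The arithmetic, as a small Python sketch. It assumes decimal units (1 MB = 1000 kB) and ignores protocol overhead, so the real ceiling of the link is somewhat lower; the function names are just for illustration.

```python
def mbps_to_megabytes_per_s(mbps):
    """Convert a link speed in megabits per second to megabytes per second."""
    return mbps / 8.0


def link_utilisation(transfer_kb_per_s, link_mbps):
    """Fraction of the raw link rate used by a transfer given in kB/s."""
    link_kb_per_s = mbps_to_megabytes_per_s(link_mbps) * 1000.0
    return transfer_kb_per_s / link_kb_per_s


link_mb_per_s = mbps_to_megabytes_per_s(100)  # 12.5 MB/s
share = link_utilisation(700, 100)            # 700 kB/s on a 100 Mbps link
print(f"{link_mb_per_s} MB/s link, upload uses {share:.1%} of it")
```

So a 700 kB/s upload is between 5 and 6 percent of what a 100 Mbps link can carry.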
-
@bot yep, the 2nd number is the one that I would hope the Duet would raise an alarm about. And yep, I got my decimal off -- I was thinking 10mb. I think sleeping more than 5 hours per day maybe catching up to me. What's curious though is that I never saw transfers above 500kb/s before, and with the new SD card it was in the high 700s, touching 800. I mean even before I had issues with the SD card -- the old one wasn't that fast to begin with.
-
@dc42 -- what kind of SD card do I need? It's happening again:
Scheduled moves: 24643, completed moves: 24607, StepErrors: 0, LaErrors: 0, Underruns: 80, 300
-
10:42:27 AM SD write speed for 20.0Mbyte file was 2.18Mbytes/sec
10:42:18 AM M122 P104 S20
Testing SD card write speed...
I printed the same file multiple times with no issues, and now I get an error. Does it have anything to do with the Duet being powered on/off? It works fine when I copy over a newly sliced/copied file, but for a file that has been there for a couple of days I start getting underruns again.
Should I send an SD card dismount command (M22, I think) before powering the machine off? Something is definitely getting corrupted. I ran a bunch of commands, started and canceled a print, and a newly copied file works fine.
-
Pending any other surprises -- I am now sending M22 before the printer is powered off (unless it's an unexpected shutdown). With that in place, I have not had any more underrun problems.
-
@dc42 totally reproducible now -- it printed fine with the same file for 2 days; on the 3rd day, 30 minutes in after power up, underruns. I stopped it, didn't reset, didn't power off, deleted the file, and uploaded the exact same file -- now no underruns, working fine. I guess I can keep doing this procedure, but why are "old" files now going stale on the SD card somehow?
-
@kazolar And this is on 2.05? Or 2.02?
-
@kazolar I can't see how anything in the firmware is doing this. It doesn't rewrite the file to the card, except if you run a simulation (it appends the simulation time to the gcode file). My guess is that the SD card is doing some form of wear levelling, but causing issues while doing it. What exact card (make and model) is this? I know you swapped to a new card from the one that was causing problems originally, but have you tried yet another?
Ian