Duet 2.05 memory leak?
-
I haven't run M122 for a couple of hours -- so this is a decent sample
Slowest loop: 8.62ms; fastest: 0.08ms
I2C nak errors 0, send timeouts 0, receive timeouts 0, finishTimeouts 0, resets 0
=== Move ===
Hiccups: 0, FreeDm: 144, MinFreeDm: 6, MaxWait: 0ms
Bed compensation in use: none, comp offset 0.000
=== DDARing ===
Scheduled moves: 196268, completed moves: 196228, StepErrors: 0, LaErrors: 0, Underruns: 399, 0
I mean -- slowest loop is under 10ms -- can't see how anything can be wrong -- and the important underrun number is 0.
-
@kazolar A lot of guessing now, but as someone who has had a lot of SD card issues on embedded systems, what you are describing does not ring a bell. SD cards do behave weirdly, but a card only working when "freshly recorded" is not something I have ever experienced on any system. A reboot making the SD card start working is even weirder. The way you describe the problem looks to me more like the temperature of the board and the thickness of the SD card changing the contact between the PCB and the SD card slot, so a cold joint of a kind. Those card slots are often improperly soldered; sometimes they appear OK but a hairline fracture exists, and temperature or vibration can disconnect it. It's been a long time since I used the FatFs library, but IIRC there is a checksum on data reads, so this should be detected. The question is how RRF handles the retries, as many retries might show up as those stutters and underruns. Maybe run an SD card test on the Duet and add physical stress to the board while it's running (slight bend, twist...).
-
@kazolar said in Duet 2.05 memory leak?:
I haven't run M122 for a couple of hours -- so this is a decent sample
Slowest loop: 8.62ms; fastest: 0.08ms
I2C nak errors 0, send timeouts 0, receive timeouts 0, finishTimeouts 0, resets 0
=== Move ===
Hiccups: 0, FreeDm: 144, MinFreeDm: 6, MaxWait: 0ms
Bed compensation in use: none, comp offset 0.000
=== DDARing ===
Scheduled moves: 196268, completed moves: 196228, StepErrors: 0, LaErrors: 0, Underruns: 399, 0
I mean -- slowest loop is under 10ms -- can't see how anything can be wrong -- and the important underrun number is 0.
What about the max SD retries?
-
@dc42
today with the SanDisk card it was
SD card 0 detected, interface speed: 20.0MBytes/sec
SD card longest block write time: 0.0ms, max retries 0
I did what others have asked and took another look at the microSD card solder joints -- and saw nothing suspicious. Short of halting everything to pull the board out, stick it under my scope and touch up the solder joints, this doesn't look faulty.
Here is a full res picture
https://www.dropbox.com/s/197mtwcmt2vq6co/SDCard.jpg?dl=0
-
So more underruns -- getting to a point where I think I need to format a card clean to get an error-free print. At this point I am pretty sure it's not the card. I tried shaking the enclosure and so on while running a write test and I am getting 3.03 and 3.14 MB/sec -- even faster than it was before. I am trying again with a clean upload of a file. The problem seems to have gotten progressively worse -- before I could at least reset the board and print without issues, now even that doesn't work.
-
That slot looks OK. If the stuttering were due to the SD card, I assume max retries would show up there.
-
Fresh copy -- no reset
SD card 0 detected, interface speed: 20.0MBytes/sec
SD card longest block write time: 0.0ms, max retries 0
MCU temperature: min 27.4, current 27.6, max 28.1
Supply voltage: min 24.1, current 24.4, max 24.6, under voltage events: 0, over voltage events: 0, power good: yes
Driver 0: ok, SG min/max 0/341
Driver 1: standstill, SG min/max not available
Driver 2: ok, SG min/max 0/1023
Driver 3: ok, SG min/max 0/310
Driver 4: ok, SG min/max 0/321
Driver 5: standstill, SG min/max not available
Driver 6: standstill, SG min/max 51/258
Driver 7: standstill, SG min/max 159/297
Driver 8: standstill, SG min/max 0/208
Driver 9: standstill, SG min/max 0/221
Date/time: 2020-04-20 22:01:22
Cache data hit count 4294967295
Slowest loop: 6.01ms; fastest: 0.08ms
I2C nak errors 0, send timeouts 0, receive timeouts 0, finishTimeouts 0, resets 0
=== Move ===
Hiccups: 0, FreeDm: 128, MinFreeDm: 6, MaxWait: 0ms
Bed compensation in use: none, comp offset 0.000
=== DDARing ===
Scheduled moves: 26040, completed moves: 26000, StepErrors: 0, LaErrors: 0, Underruns: 70, 0
36 minutes in -- previous attempt that failed was after examining the SD card. Slowest loop was bad, underruns were piling up -- I didn't notice what the write retries count was -- but I think it was zero, since it's not writing, but reading
-
So this obviously isn't a super common problem, which raises the question: what is unique about your setup compared to most people's? It's a quad-head printer, correct? That's pretty unique. Can you provide your config file and some more details about your setup? We need to find the trigger.
-
@Phaedrux yes, AFAIK this is the only such machine.
The config is big: https://www.dropbox.com/s/dzf96rx23zekyo9/config.g?dl=0
I have a modified firmware to expand to more drivers -- I am using the PT100 thermistor pins (on Duet and DueX5) for more external drivers -- as per dc42's suggestion.
Here is a curious thing -- this only happens while doing triplicate printing. I wanted the ability to have 0 Z offset for each tool to do duplicate, triplicate, or mirror printing, and I can. The other curious thing -- I have done triplicate printing before -- I was making clips to hold the polycarb panels for the machine enclosure, and those prints worked fine, and that was maybe a year ago. Now I am doing a lot of triplicate printing making shields, and the problem came up.
I did do a pair of duplication prints utilizing more of the bed to make some ear relievers and those prints both were fine, no SD card or other issues.
Now here is where I am at -- this one is fun. I ran a successful set of shields, then kept the machine on and started another set -- and got underruns at the 30 minute mark. OK, so it's not power state or some magical boot behavior... something else. I then took the card out, backed it up, formatted it, but this time using 32kb blocks instead of 64kb, and copied everything back. I put the card back -- it wasn't recognized, but after a reset (just a reset, not a full power cycle) it was fine. The next print was fine, I did not power it down, and I am almost an hour into the 2nd -- that's the most underrun-free prints in a row without a restart or a reset.
I was going off the recommendation that 64kb is the best option for formatting -- but maybe not in my case; I have yet to see how it does on the next print. I may be able to squeeze 2 more into the day -- each print produces 12 shields with a visor. I have a pickup of 120 shields going to a nursing home. If I can get more done during the day before the pickup happens, I'll include those.
-
@kazolar said in Duet 2.05 memory leak?:
I was going off the recommendation that 64kb is the best option for formatting
Well, the ideal formatting depends on the specification of your particular card. 64kb would theoretically give better performance with the Duet, but it may not work well with your card.
That's why it's recommended to use the SD card formatter from the SD Association. It formats the card based on the spec of the card. Usually this results in a 32kb cluster size for all but the smaller cards (under 8GB, I think). Doing a full surface format can help remap any bad clusters too.
-
So what exactly is it about triplicate printing that is different than quad, double, or single?
-
@Phaedrux so I'm on the 3rd print with the SanDisk 8GB card (with 32kb cluster size) -- and no issues -- at least I'm past the point when I would normally run into an issue. I got a new 32GB SanDisk card delivered today, but if this keeps working I'm not gung ho to swap it.
If this survives a power down later and power up and start print of the same file -- then I'm keeping the setup as is and holding on to the 32GB card for another time.
As far as triplicate printing vs single/double -- the difference is that more hotends and steppers are involved, and both Y gantries are moving -- presumably quad would involve the 4th X axis as well. I've never done that since my 4th extruder has a smaller nozzle and is dedicated to detail features. I am able to get really good yields with triplicate printing, so I haven't had the need to set up the 4th one with the same nozzle size as the other 3.
I just find it odd that the triplicate print is triggering this issue while the duplicate print of 26 ear relievers (so 52 in total) worked flawlessly twice. And that gcode file was bigger than the shield gcode file I'm running now.
-
The 8GB card lasted 3 prints; the 4th failed. I'm beginning to suspect the microSD slot or related circuitry. I took the new card, left it with whatever formatting it came with, and got underruns -- not right away at the 30 minute mark like before, but at about 40 minutes. I then reformatted it with 64kb cluster size, pressed firmly on the casing of the microSD card, and reseated the Ethernet header -- got everything tightened up, and it's printing fine. Should I bother trying to take the board out and reflow the microSD card slot? It's far from trivial to take the board out with so much connected to it. I was thinking of upgrading to Duet 3 when things were less hectic, but I need this to work just while the PPE shortage is still high. I got this board from Filastruder and nothing had been wrong with it. I'd own up to it if I did something, because the first Duet 2 that was part of this printer -- that was on me -- I didn't come here saying, oh my board is not working, I don't know what I did. I had a short and it killed it. This one was working fine, then not. It feels wasteful to go buy another one, because I'd rather take my time, get a Duet 3 and several expansion modules, and rewire the printer.
-
You've said that re-uploading the file after 3 prints clears the problem. So I don't think it is a problem with the Duet hardware.
Underruns in themselves do not necessarily indicate a problem - you may get some anyway if parts of your print contain long runs of very short segments. However, if SD card read operations become slow for any reason, you will get increased underruns, and eventually stuttering.
Please confirm (I think you have already said it) that resetting the Duet doesn't clear the problem, neither does powering the Duet down (including removing USB or 5V external power, if you are using it).
Regarding cluster size, we recommend using the largest one you can, which is normally 64kb. That reduces the number of SD card accesses needed.
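To put rough numbers on the cluster-size point, here is a minimal Python sketch that counts how many clusters a file spans at 32kb versus 64kb. The 20 MiB file size is a made-up example and `clusters_spanned` is a hypothetical helper, not anything from RRF; it only counts clusters and does not model how the firmware or the card actually schedules reads.

```python
def clusters_spanned(file_size_bytes, cluster_size_bytes):
    """Number of clusters a file occupies (ceiling division)."""
    return -(-file_size_bytes // cluster_size_bytes)

KIB = 1024
gcode_size = 20 * 1024 * KIB  # hypothetical 20 MiB G-code file (assumption)

for cluster_kib in (32, 64):
    n = clusters_spanned(gcode_size, cluster_kib * KIB)
    print(f"{cluster_kib} KiB clusters: file spans {n} clusters")
```

Doubling the cluster size halves the number of clusters the file is spread over, which is the sense in which a larger cluster size means fewer lookups while streaming a long print.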
-
@dc42 I don't have external 5V or USB connected. What I did last night to get it going again was: I powered it down, reformatted the card, and copied all the backed-up data to it -- and it printed fine overnight without ANY underruns (I'm only looking at the last number). When it starts getting underruns in the 2nd number, it starts stuttering very quickly, and the underrun numbers keep going up; I haven't seen it not stutter while the 2nd number is >0. It seems that a fresh format or a fresh copy clears up whatever is wrong. My guess in terms of hardware is that, like others said, 3 prints shook the machine. The difference with triplication is that 2 gantries are involved, with a lot more movement, so whatever wouldn't vibrate the machine on a single or duplication print would vibrate it on triplication. Then me taking the card out, re-seating it and starting a new print makes better contact. I had previously been successful just doing a reset and starting the print on a fresh boot -- but that stopped working at some point, or I stopped trying after the 2.5.1 update. I can try to see if I can get more than 3 prints in a row by doing resets between each.
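For anyone who wants to watch the same two values across many M122 reports, here is a minimal Python sketch that pulls the two underrun counters and the SD max-retries value out of the report text. The regular expressions are assumptions based only on the line formats quoted in this thread, `parse_m122` and `looks_suspicious` are hypothetical helper names, and the "second number greater than 0" rule is just the heuristic described above, not an official diagnostic.

```python
import re

# Patterns assumed from the M122 lines quoted in this thread.
UNDERRUNS_RE = re.compile(r"Underruns:\s*(\d+),\s*(\d+)")
RETRIES_RE = re.compile(r"max retries\s+(\d+)")

def parse_m122(text):
    """Return (first_underruns, second_underruns, max_retries); None where missing."""
    u = UNDERRUNS_RE.search(text)
    r = RETRIES_RE.search(text)
    first = int(u.group(1)) if u else None
    second = int(u.group(2)) if u else None
    retries = int(r.group(1)) if r else None
    return first, second, retries

def looks_suspicious(text):
    """Flag a report when the 2nd underrun number or SD max retries is nonzero."""
    _, second, retries = parse_m122(text)
    return bool(second) or bool(retries)

# Lines taken from the reports quoted earlier in the thread.
report = (
    "SD card longest block write time: 0.0ms, max retries 0\n"
    "Scheduled moves: 196268, completed moves: 196228, "
    "StepErrors: 0, LaErrors: 0, Underruns: 399, 0\n"
)
print(parse_m122(report), looks_suspicious(report))
```

Pasting each report into a text file and running it through something like this would make it easier to see at which point in a print the second number starts to climb.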
-
What slicer are you using to generate the gcode? Can you post a sample?
-
Here is the file
https://www.dropbox.com/s/6myicwelo20mzjd/VisorQUADQuad.gcode?dl=0
I just tried printing it -- and there was stuttering and underruns within 20 minutes -- this was after the board was reset (not powered down), using the file that worked fine for the overnight print. I just did a fresh copy of the file and started the print again -- I saw 22 underruns in the 2nd number after a couple of minutes, but no noticeable stuttering. Not yet. I will keep checking until about an hour in; if it prints stutter-free for 45+ minutes, then it will be fine through the course of the 5 hour print -- stuttering due to underruns starts in the first 30-40 minutes if it happens. Never seen it start later than that.
Looks like this print with a fresh copy is working fine (first 22 underruns appear to be innocuous)
-
I know it's not trivial, but is it possible to switch to a slicer other than S3D?
-
@Phaedrux no other slicer supports different diameter nozzles for different extruders. I tried setting it up in Cura recently and that wasn't an option. My startup script uses S3D-specific variables for temperature setting. Other slicers are better than S3D in many respects, but S3D is still better at multi-extruder setup -- since it can all be done via multiple processes -- so switching is still not an option for me.
-
@Phaedrux -- yep, running fine now with a fresh copy. Underruns are not piling up, doing fine -- I can check the S3D option for decreasing small movements -- not sure if that has anything to do with what's going on.