3500 hours on my printer now skipping steps like crazy



  • Hello,

    I've been using my Duet WiFi since 2017 and it's been extremely reliable. I have a custom-built CoreXY with a machine timer built in so I can see the hours of actual print time. I love my Duet and if it turns out this one is toast I will buy another to replace it.

    The problem is that in the last month I have noticed shifted layers cropping up in my prints. I'm not one to spend a mess of time debugging, so I usually just print again and hope for the best. Well, that solution has run out. I have seen it get worse and worse, to the point where it will intermittently skip continuously. This happened two days ago when I was right next to it, so I was able to shut it down within 30 seconds or so.

    The skipping only happens in X and Y. I suspected a friction problem, but after studying the machine in depth I have ruled that out. I then suspected a heat problem (I have had overtemp driver warnings before when the little fan blowing on the Duet gets moved or knocked over), so I added little microchip heatsinks to the Trinamic drivers.

    This didn't seem to help at all. This morning when the shop was cool I started up the printer, warmed up the bed, homed everything and started a print, thinking (well, I can at least get one or two models printed before it wigs out again). It started skipping every step again first thing in the morning.

    So I think my printer is dead in the water. I ordered two more steppers yesterday thinking/hoping this could be the issue. The more I think about it the more I doubt it.

    So now I'm turning my attention to the Duet. I did some searching on missing steps and found this thread where dc42 pointed out the diagnostics gcode command (very cool).
    His idea was that the power supply could be cutting out and that would show as an undervoltage event.

    My diag showed no such events right after skipping a million steps.

    === Diagnostics ===
    Used output buffers: 2 of 32 (9 max)
    === Platform ===
    RepRapFirmware for Duet WiFi version 1.19.2 running on Duet WiFi 1.0
    Board ID: 08DDM-9FAM2-LW4SD-6J9F6-3SN6K-12ZMY
    Static ram used: 21176
    Dynamic ram used: 96136
    Recycled dynamic ram: 1472
    Stack ram used: 4048 current, 9104 maximum
    Never used ram: 3184
    Last reset 18:04:08 ago, cause: power up
    Last software reset reason: User, spinning module GCodes, available RAM 3136 bytes (slot 0)
    Software reset code 0x0003, HFSR 0x00000000, CFSR 0x00000000, ICSR 0x00400000, BFAR 0xe000ed38, SP 0xffffffff
    Error status: 0
    Free file entries: 10
    SD card 0 detected, interface speed: 20.0MBytes/sec
    SD card longest block write time: 5.5ms
    MCU temperature: min 25.8, current 30.8, max 32.9
    Supply voltage: min 23.7, current 23.9, max 24.2, under voltage events: 0, over voltage events: 0
    Driver 0: stalled standstill
    Driver 1: stalled standstill
    Driver 2: stalled standstill
    Driver 3: stalled standstill
    Driver 4: standstill
    Date/time: 2020-08-18 09:57:52
    Slowest main loop (seconds): 0.201172; fastest: 0.000000
    === Move ===
    MaxReps: 4, StepErrors: 0, FreeDm: 240, MinFreeDm 120, MaxWait: 4007670ms, Underruns: 5, 0
    Scheduled moves: 0, completed moves: 0
    Bed compensation in use: mesh
    Bed probe heights: 0.000 0.000 0.000 0.000 0.000
    === Heat ===
    Bed heater = 0, chamber heater = -1
    Heater 0 is on, I-accum = 0.0
    === GCodes ===
    Segments left: 0
    Stack records: 1 allocated, 0 in use
    Movement lock held by null
    http is ready with "M122 " in state(s) 0
    telnet is idle in state(s) 0
    file is idle in state(s) 0
    serial is idle in state(s) 0
    aux is idle in state(s) 0
    daemon is idle in state(s) 0
    queue is idle in state(s) 0
    autopause is idle in state(s) 0
    Code queue is empty.
    Network state is running
    WiFi module is connected to access point
    WiFi firmware version 1.19.2
    WiFi MAC address 60:01:94:33:84:d5
    WiFi Vcc 3.07, reset reason Exception
    WiFi flash size 4194304, free heap 39120
    WiFi IP address 192.168.15.146
    WiFi signal strength -63dBm
    Reconnections 0
    HTTP sessions: 1 of 8
    Socket states: 2 0 0 0 0 0 0 0
    Responder states: HTTP(1) HTTP(0) HTTP(0) HTTP(0) FTP(0) Telnet(0)
    === Filament sensors ===
    Extruder 0 sensor: ok

    Is there anything else I can look at that could give me a clue as to what is going on?
    Thanks in advance for the help!

    -Chris



  • @gforce Are you a betting man? I'll give you 10:1 that it's a mechanical issue - shifted layers are almost inevitably caused by mechanical issues. I'd start looking for a pulley that is loose on a shaft or some such. The fact that you say it happens in X and Y would indicate (on a CoreXY) that one motor/pulley is affected. I guess it could be the motor but ................



  • @gforce My experience would be to say mechanical too, but a few thoughts:

    • Have you got any pictures of the skipped layers? Might be interesting to see if it's always one motor skipping, or both. Is it consistent?
    • Is there any chance it's skipping because it's hitting your print? Could be an extruder issue that's causing lumps that your nozzle is hitting?
    • Your drivers do say 'stalled' in M122 which suggests they know what has happened. What to do with this info, I'm less sure... 😛
    • How sure are you it isn't mechanical - have you checked all of your bearings for wear? The tiny ones in the 16T pulleys/idlers can often go (especially after 3500hrs!)
    • What firmware version are you using? Have you updated to RRF3? I believe some of the later ones put more stuff in the M122 output which might help (even 2.05)
    • Maybe posting your config might give a bit more insight into what your printer is doing, but not sure exactly what we'd be looking for other than motor currents...

    One other point on the driver heatsinks - general advice is not to use them. If you put them on top of the plastic chip, they won't conduct much heat (because it's plastic), and may do more harm by blocking the airflow from your fan over the board. There is also always the risk of shorting things with them. I doubt any of this is the issue here, but just as an FYI 🙂



  • Thanks for the responses so far guys!

    I thought it was mechanical too so I checked everything in the xy for friction or wear. I have replaced the idler pulleys in the past so I know how they react, and I have been burnt by a loose pulley in the past. I can grab the effector with the motors disabled and move it around in the xy. It feels normal. Plus...

    I walked up to the printer skipping both motors for like 30 seconds. Tons of skipped steps. Both motors. In fact, while it was skipping the first time I grabbed the effector and moved it around to see if I could feel more friction (maybe it was intermittent), and it felt like the motors were off because they were both skipping. At this point the motors were losing steps in the air. Just buzzing away skipping steps, moving around half an inch here, half an inch there (no pattern like 45 degrees or anything). Pretty crazy, I know. I felt the motors, they were about 120 degrees aka normal. The stepper driver heatsinks feel cool to the touch.

    The extruder and Z motor don't give me fits, so I think the drivers say stalled because I hit the estop? I don't know.

    Good to know about the heatsinks. I thought I was helping. Maybe I'll pry them off.

    My firmware has been the same since I bought it in 2017. Says 1.19.2 (2017-09-01)
    I suppose I could upgrade the firmware and try again. Do you have to redo the config file?

    motor currents are 1.9a for x and y. Been the same since hour 1.

    I do leave the duet on all the time and I have seen motor drive temp alarms over the years...



  • @gforce does seem weird - I'm quite intrigued to find out what it is!

    I'd definitely recommend updating the firmware:
    https://duet3d.dozuki.com/Wiki/Installing_and_Updating_Firmware
    I'd go for 2.05.1 (the latest RRF2) for the moment as there's quite a jump up to RRF3. You can download the files from here:
    https://github.com/Duet3D/RepRapFirmware/releases/tag/2.05.1
    Read the updating guide above as I think you'll have to rename a file as you're updating from so far back, but essentially you just upload the files to DWC and click install. As I say, I doubt that's your issue, but my M122 definitely has some more info if it is something board/firmware related, and there have been a lot of bug fixes, upgrades, new features, etc.
    I may be wrong, but most of your config should be able to stay the same. Might be worth double checking it (or doing a fresh one using the configurator and comparing)... If you've not used it before, the Gcode page on the wiki is quite good for cross-checking your config to see what each line does and if the commands have changed in the different releases (https://duet3d.dozuki.com/Wiki/Gcode)

    Now back to the issue in hand...

    "I felt the motors, they were about 120 degrees aka normal. The stepper driver heatsinks feel cool to the touch."

    Please tell me you're working in Fahrenheit... in which case that sounds reasonable 🙂

    "motor currents are 1.9a for x and y. Been the same since hour 1."

    That sounds a bit high, but not crazy. General rule is 50-85% of the rated current of your steppers (so as long as your motors are rated for >2.2A/phase, you should be golden). Above ~1.5A you need a cooling fan on the board, but you've got that. Motors and chips do wear out faster if run hotter though.

    Next question would be the motors and wiring. Does any of the wiring get hot (I'm thinking a dodgy connection or partially broken wire in there)? I'd disconnect each of the motors from the Duet and measure the phase resistance on them all. Check this against the motor datasheets if you have them, or at least between the motors that are the same. The fact that the printer has been working reliably thus far suggests something has worn out.

    How reliably does the printer move if just commanding moves via DWC? Can you repeat the issue from there?

    One option might be to swap the drives over (e.g. swap X/Y with E0/E1), and remap them in your config (using M584). That will at least give you some indication if it is a drive specific issue. I'd do the resistance checks first though just in case...

    Please post your current config.g file as well so we can have a better idea of what's going on under the hood. It may not show the problem, but it at least helps us see what we're dealing with and how your machine is set up.
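
As a rough check of the 50-85% rule of thumb mentioned above, here is a minimal Python sketch. The function names are hypothetical, the 1900 mA setting comes from the M906 line in the posted config, and the 2100 mA rating is the figure the original poster gives later in the thread for the X/Y motors; treat both as assumptions about this particular machine rather than general values.

```python
def current_fraction(set_ma: float, rated_ma: float) -> float:
    """Return the configured motor current as a fraction of the rated current."""
    if rated_ma <= 0:
        raise ValueError("rated current must be positive")
    return set_ma / rated_ma


def within_rule_of_thumb(set_ma: float, rated_ma: float,
                         low: float = 0.50, high: float = 0.85) -> bool:
    """True if the configured current sits inside the 50-85% band from the thread."""
    return low <= current_fraction(set_ma, rated_ma) <= high


# Values taken from the thread: M906 X1900 Y1900, motors said to be rated 2.1 A.
xy_fraction = current_fraction(1900, 2100)
print(f"X/Y current is {xy_fraction:.1%} of rated")            # 90.5%
print("inside 50-85% band:", within_rule_of_thumb(1900, 2100))  # False

# Smallest rating for which 1900 mA would still be at or below 85%.
print(f"minimum rating for 1900 mA: {1900 / 0.85:.0f} mA")      # 2235
```

On those numbers the X/Y setting lands slightly above the suggested band, which matches the ">2.2A/phase" figure quoted in the post; it is a rule of thumb, not a hard limit.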


  • Moderator

    I'd lean towards mechanical as well. Does the mechanism work smoothly by hand when disconnected from the motors?

    Having updated firmware would give us more detailed diagnostic report at any rate. Seeing your config.g might help to see your settings.

    Given the vintage of your firmware, if you do decide to update I suggest generating a new config file with the online tool.

    https://configtool.reprapfirmware.org/Start

    Going from 1.19 will take a bit of manual work to get updated and it may be easiest to use bossa to flash the latest firmware, but do spend some time looking at the update docs. Once you get to firmware 2.0 the update procedure gets much much easier and you only need to upload a single zip file through the web interface.



  • I'd vote mechanical too, but I'm pretty sure I have managed to partially demagnetise a micro stepper before. I did this by running 1200mA through one designed for 250mA, and it got so hot doing this that the mounts (CF-nylon) also melted. Not sure how you'd do this on a duet with anything other than a microstepper though.

    I'm guessing 120 degrees is F not C, but have they possibly overheated at any time in the past?


  • Moderator

    @oliverracing said in 3500 hours on my printer now skipping steps like crazy:

    Not sure how you'd do this on a duet with anything other than a microstepper though.

    Someone recently used a Duet 3 to put 3A through a 1.2A stepper for a long enough time to cook it. A bit harder with the Duet 2.



  • I believe stepper motors are usually rated for around 80C and start losing magnetic properties (i.e. reduced output power) when you go much beyond that.
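
To put the two temperatures in this discussion on the same scale, here is a small Python sketch. It assumes the 120 degree reading reported earlier is Fahrenheit (which the original poster confirms further down) and takes the roughly 80C figure from the post above at face value; the helper name is hypothetical.

```python
def fahrenheit_to_celsius(temp_f: float) -> float:
    """Convert a temperature from degrees Fahrenheit to degrees Celsius."""
    return (temp_f - 32.0) * 5.0 / 9.0


motor_temp_c = fahrenheit_to_celsius(120.0)   # motor case temperature from the thread
assumed_rating_c = 80.0                       # figure quoted in the post, not a datasheet value

print(f"motor temperature: {motor_temp_c:.1f} C")                      # 48.9 C
print(f"margin to assumed rating: {assumed_rating_c - motor_temp_c:.1f} C")  # 31.1 C
```

A touch measurement on the motor case says nothing about past overheating events, so this only shows that the current reading is well below the quoted figure.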



  • Wow thanks again guys. Lots of good help here.

    I will update the firmware to 2.05.1 as suggested by engikeneer. But I might not be able to for a day or two. I will follow the guides carefully.

    I would like to put a camera on it and recreate the issue to show you guys what's going on. Will work on that before the firmware change.

    Yes, motors at 120 F. I can't say that I have ever noticed the motors getting hotter. I have been warned (as I said before) of high driver temps, but not during this loss-of-steps issue.
    Motors on X, Y and Z are 17HS24-2104S and they call for 2.1 A current.

    Just to shine more light on the mechanics. When the print is running I can push on the effector with at least several pounds and it doesn't miss a beat. So I'm not running on any ragged edges. I do notice some belt stretch on my top belt so I will change them both out in the next week or so but I'm quite convinced that is not causing the skipped steps.

    Here is the config file. It has not changed for quite some time.

    ; Configuration file for Duet WiFi (firmware version 1.17 to 1.19)
    ; executed by the firmware on start-up
    ;
    ; generated by RepRapFirmware Configuration Tool on Sun Nov 12 2017 16:20:08 GMT-0700 (Mountain Standard Time)

    ; General preferences

    ;Set Password
    M551 PALLCAPSNOSPACES

    M111 S0 ; Debugging off
    G21 ; Work in millimetres
    G90 ; Send absolute coordinates...
    M83 ; ...but relative extruder moves
    M555 P2 ; Set firmware compatibility to look like Marlin

    M667 S1 ; Select CoreXY mode
    M208 X0 Y0 Z0 S1 ; Set axis minima
    M208 X416 Y420 Z360 S0 ; Set axis maxima

    ; Endstops
    M574 X1 Y1 Z2 S1 ; Define active high microswitches
    M558 P4 X0 Y0 Z0 H1 F200 T9000 ; Set Z probe type to unmodulated, the axes for which it is used and the probe + travel speeds
    G31 P600 X0 Y0 Z2.5 ; Set Z probe trigger value, offset and trigger height
    M557 X15:385 Y15:385 S20 ; Define mesh grid

    ; Drives
    M569 P0 S1 ; Drive 0 goes forwards
    M569 P1 S1 ; Drive 1 goes forwards
    M569 P2 S0 ; Drive 2 goes backwards
    M569 P3 S0 ; Drive 3 goes backwards
    M350 X16 Y16 Z16 E16 I1 ; Configure microstepping with interpolation
    M92 X80.16032 Y80.24072 Z2400 E408.173; Set steps per mm
    M566 X900 Y900 Z8 E120 ; Set maximum instantaneous speed changes (mm/min)
    M203 X25000 Y25000 Z200 E1200 ; Set maximum speeds (mm/min)
    M201 X1500 Y1500 Z60 E250 ; Set accelerations (mm/s^2)
    M906 X1900 Y1900 Z1500 E1200 I30 ; Set motor currents (mA) and motor idle factor in per cent
    M84 S30 ; Set idle timeout

    ; Heaters
    M143 S290 ; Set maximum heater temperature to 290C
    M301 H0 S1.00 P10 I0.1 D200 T0.4 W180 B30 ; Use PID on bed heater (may require further tuning)
    M305 P0 T100000 B4138 C0 R4700 ; Set thermistor + ADC parameters for heater 0
    M305 P1 T100000 B4138 C0 R4700 ; Set thermistor + ADC parameters for heater 1

    ; Tools
    M563 P1 D0 H1 ; Define tool 0
    G10 P1 X0 Y0 Z0 ; Set tool 0 axis offsets
    G10 P1 R0 S0 ; Set initial tool 0 active and standby temperatures to 0C
    M591 D0 P1 C3 S1; Turn on filament monitoring

    ; Network
    M550 PDuet Wifi ; Set machine name
    M552 S1 ; Enable network
    M587 S"Airius Fans" P"another change of fans" ; Configure access point. You can delete this line once connected
    M586 P0 S1 ; Enable HTTP
    M586 P1 S0 ; Disable FTP
    M586 P2 S0 ; Disable Telnet

    ; Fans
    M106 P0 S0.3 I0 F500 H-1 ; Set fan 0 value, PWM signal inversion and frequency. Thermostatic control is turned off
    M106 P1 S1 I0 F500 H1 T45 ; Set fan 1 value, PWM signal inversion and frequency. Thermostatic control is turned on
    M106 P2 S1 I0 F500 H1 T45 ; Set fan 2 value, PWM signal inversion and frequency. Thermostatic control is turned on

    ; Custom settings are not configured



  • @gforce one thing that jumps out there is your max speed of 25000mm/min (that's over 400mm/s). I would be very impressed if your printer could go that fast! I'm guessing your slicer is limiting the speed, but could it be that you've changed the limit in your slicer and it now gets into the higher speeds where torque drops off? Sounds unlikely from what you've described, but I thought it worth mentioning.



  • @engikeneer Good observation! Yes I limit in my slicer to 65mm/sec for most top speed prints (16mm/sec in the z). I suppose I could change that in my next config when I update the firmware.
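
The two speeds being compared here are in different units: the M203 limit in config.g is in mm/min, while the slicer setting is in mm/s. A minimal Python sketch of the conversion, using only the numbers quoted in the thread (the function names are hypothetical):

```python
def mm_per_min_to_mm_per_s(feed_mm_min: float) -> float:
    """Convert a feed rate in mm/min (as used by M203) to mm/s."""
    return feed_mm_min / 60.0


def mm_per_s_to_mm_per_min(speed_mm_s: float) -> float:
    """Convert a slicer-style speed in mm/s to mm/min."""
    return speed_mm_s * 60.0


firmware_xy_limit_mm_s = mm_per_min_to_mm_per_s(25000)   # from M203 X25000 Y25000
slicer_xy_speed_mm_min = mm_per_s_to_mm_per_min(65)       # slicer top speed from the reply

print(f"firmware X/Y limit: {firmware_xy_limit_mm_s:.1f} mm/s")     # 416.7 mm/s
print(f"slicer top speed:   {slicer_xy_speed_mm_min:.0f} mm/min")   # 3900 mm/min
```

So the slicer's 65 mm/s sits far below the configured firmware ceiling, which is consistent with the reply that the slicer, not M203, is what limits print speed on this machine.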


  • administrators

    Check the M122 report to make sure that the VIN voltage is stable when the machine is skipping steps. Problems on Duet 2 powered machines that start after months or years of use are sometimes caused by either the PSU failing or the VIN terminal block connections no longer being sound.



  • @dc42 Very good info! I will check that out. Thank you very much.



  • Well I shut down the printer for a few days until I had time to try to recreate the problem.
    After booting it up this morning I am getting a recurring Error: Over temperature shutdown on drivers 3. I haven't even turned anything on or done anything, and the board is cold with a fan blowing on it.

    I think this thing is shot. Ordered a new one from Filastruder who I bought from back in 2017. I will report back my findings.



  • @gforce said in 3500 hours on my printer now skipping steps like crazy:

    M305 P0 T100000 B4138 C0 R4700 ; Set thermistor + ADC parameters for heater 0
    M305 P1 T100000 B4138 C0 R4700 ; Set thermistor + ADC parameters for heater 1

    Also, your thermistor settings look wrong: a B value of 4138 is most likely incorrect for your thermistors.



  • Hi guys,

    I got a new Duet WiFi from Filastruder and set it up with the latest firmware based on my old settings. Printing has been flawless. I put in another 15 hours or so of printing and had no missed steps at all.

    So apparently it was the old Duet board. I'm happy with that. My biggest fear was that I would replace the board and find out that wasn't the problem.

    I'm also pleased with the external antenna version that I got vs the onboard antenna version I had from 2017. My printer is metal and I made the mistake of buying the internal one. My connection to the network has been much more reliable. I'm a happy printer.

    Thanks again for all the ideas and feedback you guys gave me. Hopefully this thread will be of help to someone in the future.

    Here is to another 3000+ hours!

    20200917_102204.jpg

    20200917_101905.jpg


  • Moderator

    That looks like a solid workhorse. Glad the new board is working well.

