Duet sometimes really slow? - I2C error or?


  • administrators

    @martin1454 said in Duet sometimes really slow? - I2C error or?:

    @dc42 said in Duet sometimes really slow? - I2C error or?:

    Ian has shown that when the problem occurs, the I2C subsystem can be recovered without a hardware reset. So recovering from this error should be possible.

    Well, a power cycle is a hardware reset even if it is initiated from SW, I think? Or was there any other SW way of recovering besides a power cycle?

    There is another thread which has gone OT to be about this same issue. In that thread, Ian said at https://forum.duet3d.com/post/9349:

    In case anyone is following this thread but not one of the other I2C error related threads, I can now confirm that running M199 to reset the firmware fixes the issue so no need to cycle the power. At least that's how it is for me.

    I can also confirm that when running M115 immediately after M999, it reports that the Duex5 is present. Since I had the error and ran M199 to restart the firmware, without powering down, I've re-homed the printer and it is now happily printing away (at least for now).

    I am assuming that Ian meant M999, not M199.



  • @dc42 ah okay - Will try a M999 next time



  • @dc42 for whatever it's worth, whenever I've encountered this behavior a reboot via the stop button on the screen doesn't help.
    Also, I never ran into these slow behaviors until upgrading to 2.05/7. I will try the code base from prior to the code rewrite. 07 seems to be making it worse, where the end stops just stop working instead of just being slow. Also, I can't see how I can make my ground connection any better: it is literally as short a run as possible, and I can't fit a thicker set of wires into a ferrule (that would fit into the terminal block) than what's on there now.



  • @dc42 said in Duet sometimes really slow? - I2C error or?:

    I am assuming that Ian meant M999, not M199.

    Ooops - yes indeed I did mean M999. I managed to type M999 correctly one time out of three ☺



  • @dc42

    @dc42 said in Duet sometimes really slow? - I2C error or?:

    Yes I expect the use of those chips would help. But it would require a redesign of both the Duet and the DueX, and new Duets wouldn't be compatible with older DueX boards, or vice versa.
    Two jumpers on Duet and Duex that could be used to bypass the PCA chip would allow compatibility with older hardware. I know this adds extra complexity, but it may be worth it.



  • I might be homing in on a way to provoke this problem but it's difficult. For the third day running I've been able to get it to happen at exactly the same point.

    The sequence is as my post above but I have to start with the machine powered off and it has to have been powered off over night. It only happens once a day. If I power down the machine and start the sequence again, it won't misbehave a second time but for 3 days running I've been able to do exactly the same sequence and get exactly the same problem at exactly the same point in the sequence. I've had my suspicions that it seems more likely to happen after the machine has been powered down for a considerable time but I have no idea why that should be. Capacitance that takes a long time to decay?? Not my area of expertise.

    For 3 days running, the problem has started at the same point in the second iteration of my homeall file. So to reiterate, the sequence is as follows:

    1. Turn on machine and connect to DWC (using Firefox)
    2. Run home all through DWC
    3. Drop bed 100mm through DWC
    4. Heat hot end to 190 deg C
    5. Extrude 300mm of filament through extruder 0 at 5mm/sec using DWC (as if loading a new reel of filament).
    6. Retract 1mm at 5mm /sec through DWC
    7. Run home all again through DWC.
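For anyone who would rather script the steps above than click through DWC, here is a minimal Python sketch that only builds the G-code lines for steps 2 to 7. It does not talk to the printer. The helper name `repro_sequence` is hypothetical, and the command choices (G28 for home all, T0 and M109 for heating, M83 for relative extrusion, F600 for the bed move) are my assumptions rather than what DWC sends when the buttons are pressed.

```python
def repro_sequence(bed_drop_mm=100, hotend_c=190, extrude_mm=300,
                   retract_mm=1, speed_mm_s=5):
    """Return G-code lines approximating steps 2-7 (hypothetical helper)."""
    feed = speed_mm_s * 60  # G-code feed rates (F) are in mm/min
    return [
        "G28",                          # step 2: home all
        "G91",                          # relative moves
        f"G1 Z{bed_drop_mm} F600",      # step 3: drop bed (positive Z lowers the bed here)
        "G90",                          # back to absolute moves
        "T0",                           # select a tool (assumed: tool 0)
        f"M109 S{hotend_c}",            # step 4: heat hot end and wait
        "M83",                          # relative extrusion (assumed)
        f"G1 E{extrude_mm} F{feed}",    # step 5: extrude
        f"G1 E-{retract_mm} F{feed}",   # step 6: retract
        "G28",                          # step 7: home all again
    ]


if __name__ == "__main__":
    print("\n".join(repro_sequence()))
```

The 5mm/sec from the steps becomes F300 because feed rates are given per minute.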

    My home all file is pretty complicated. Here it is in its entirety, with the point where the I2C errors (or at least the pauses between moves) occur marked.

    T0; select a tool - any one will do
    M104 S140; heat to 140 but don't wait

    ;*****Home XYUV (lower 2 gantries)

    M584 X0 U3 Y1 V4 P5; temporarily map drives to U and V axes

    M906 X400 U400 Y400 V400 Z1200 ; reduce motor currents

    G91 ; set to use relative coordinates

    G1 Z5 F600 ; move bed down 5 mm

    G1 X-380 U-380 Y-380 V-380 F4800 S1; move all 4 axes fairly quickly until one or other triggers a switch

    G1 X-380 U-380 F4800 S1; now move just X and U fairly quickly left until one or other triggers a switch

    G1 X-380 S1; coarse home X
    G1 U-380 S1; coarse home U

    G1 X10 U10 F600 ; Go back a few mm

    G1 X-380 U-380 F360 S1; Move slowly to X and U axis endstops once more and stop when one triggers

    G1 X-380 F360 S1 ; fine home X
    G1 U-380 F360 S1 ; fine home U

    G1 Y-380 V-380 F4800 S1; now move Y and V fairly quickly until one or other triggers a switch

    G1 Y-380 S1; coarse home Y
    G1 V-380 S1; coarse home V

    G1 Y10 V10 F600; Go back a few mm

    G1 Y-380 V-380 F360 S1; Move slowly to Y and V axis endstops once more and stop when one triggers

    G1 Y-380 F360 S1 ; fine home Y
    G1 V-380 F360 S1 ; fine home V

    ;****Now home upper Gantry

    M584 X6 Y9 ; map upper motors to X and Y
    M574 X1 S1 C5 ; map end stop 5(E2) to X axis
    M574 Y1 S1 C6 ; map end stop 6 (E3) to Y axis

    G1 X-380 Y-380 F4800 S1; move X and Y fairly quickly until one or other switches triggers
    G1 X-380 F4800 S1 ; coarse home X
    G1 Y-380 F4800 S1 ; coarse home Y

    G1 X10 Y10 F600 ; back off a few mm

    G1 X-380 F360 S1 ; fine home X
    G1 Y-380 F360 S1 ; fine home Y

    M574 X1 S1 C0 ; put X axis end stop switch back to standard mapping
    M574 Y1 S1 C1 ; put Y axis end stop switch back to standard mapping

    M584 X0:3:6 Y1:4:9 Z2 U10 V11 E5:7:8 P3; Put axes motors back to standard configuration

    ;******Now home Z

    G90; set to absolute coordinates

    G1 X180 Y180 F12000; move to more or less the centre of the bed

    M109 S140 ; continue heating hot end to 140 but this time wait

    ;change to faster probing speed
    M558 F450
    G30 ; FAST home Z using values from G31

    *********This is the point where pauses due to I2C errors become apparent

    G91 ;relative
    G1 Z5 F300 ; lower bed
    G90 ;absolute

    ;change back to slower probing speed
    M558 F180
    G30 ; SLOW home Z

    G91 ;relative
    G1 Z5 F300
    G90 ; back to absolute

    M906 X1800 U1800 Y1800 V1800 Z1800 ; set motor currents back to defaults

    M104 S0; set hot end temp back to zero

    End of home all file.

    I'll try the entire sequence of power up, home, drop bed, extrude and repeat home again tomorrow to see if I can provoke it for the 4th day in succession.


  • administrators

    That's interesting.

    I tried to reproduce it yesterday, using separate ground wires from the PSU to the Duet and DueX, three Nema 23 motors running at 2A moving continuously driven from the Duet, and a diode connected between Fan0- and E2 endstop stop pin, set to 10Hz frequency, so that the E2 endstop input toggles at 10Hz in order to force the Duet to read the status via I2C frequently. The first time I tried this, I got an I2C lockup after 9 minutes. But I was not able to reproduce it again.

    I have reviewed the I2C code and made some changes. In particular, when an I2C error is detected, I now reset the I2C controller on the Duet and retry. Up to 3 tries are done. Each I2C reset is recorded and the reset count is included in the M122 log along with the other I2C stats. Also I have made a change that I planned a long time ago, which is to use a separate RTOS task to monitor the DueX5 state and do the I2C transactions when it changes. This should substantially reduce the latency of the endstop inputs on the DueX.
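To make the retry behaviour described above concrete, here is a minimal Python sketch of "reset the controller and retry, up to 3 tries, counting resets". It is not the firmware code (RepRapFirmware is not written in Python). The names `I2CBus`, `I2CError`, `transact` and `reset_controller` are hypothetical, and the sketch assumes no reset is done after the final failed try.

```python
class I2CError(Exception):
    """Raised by a transaction that fails (hypothetical error type)."""


class I2CBus:
    MAX_TRIES = 3  # "Up to 3 tries are done"

    def __init__(self, transact, reset_controller):
        self._transact = transact      # callable doing one I2C transaction
        self._reset = reset_controller  # callable resetting the I2C controller
        self.errors = 0                # failed transactions seen
        self.resets = 0                # controller resets, as reported in the stats

    def transfer(self, *args):
        for attempt in range(self.MAX_TRIES):
            try:
                return self._transact(*args)
            except I2CError:
                self.errors += 1
                if attempt + 1 == self.MAX_TRIES:
                    raise  # out of tries: let the caller see the error
                self._reset()  # recover the controller before retrying
                self.resets += 1
```

With this shape, a transaction that fails twice and then succeeds leaves the reset count at 2, which is the kind of number one would hope to see reflected in the M122 log.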

    I intend to release this in firmware 2.03RC2 later today. Ian, would you be able to upgrade to this release and see if you are able to reproduce the problem on it?



  • @dc42 said in Duet sometimes really slow? - I2C error or?:

    I intend to release this in firmware 2.03RC2 later today. Ian, would you be able to upgrade to this release and see if you are able to reproduce the problem on it?

    Yes of course.

    Interestingly, I repeated the exact same sequence for the 4th day running and got exactly the same result. That is to say, pauses between moves that occurred in exactly the same part of the second home all sequence. As before, an M122 showed I2C errors, which were cleared with a subsequent M999. I have no idea why I can't reproduce this unless the printer has been powered down over night. I've switched the printer off and will try again later today to see if a 1 hour or 2 hour power down will provoke it.

    That's 4 days running that I've had exactly the same thing happen, so at least it looks like we are getting close to having a method that will provoke the problem with some degree of confidence, even if we don't know how or what part of the sequence is responsible.

    This kind of reminds me of a problem I had in a previous life with a customer who had a V12 E-type Jaguar. This thing would break down, but only when he had driven from Luton to the Severn Bridge on the Welsh border and stopped to pay the toll. I got there in the end but it was a bitch to diagnose. ☺



  • @dc42 David, can you pm or email me when you have the release ready so that I don't miss it.



  • @deckingman said in Duet sometimes really slow? - I2C error or?:

    I have no idea why I can't reproduce this unless the printer has been powered down over night. I've switched the printer off and will try again later today to see if a 1 hour or 2 hour power down will provoke it.

    When it happens to me, 4/5 of the time it is upon power on, and already on the first homing move it is pausing between movements. 1/5 of the time it happens mid print/mid sequence.


  • administrators



  • @dc42

    OK. Just tried it and had no problem but...........

    1. The printer had only been off for about 3 to 4 hours rather than over night. I'll test again tomorrow morning.

    2. More importantly, I was using M574 to re-map end stops for the upper load balancing gantry. I ran the exact same home all macro, but it crashed the upper gantry and obviously the behaviour is different now that re-mapping end stops has been withdrawn. So I can't completely replicate the sequence of events that had proven to provoke the problem repeatedly over 4 days.

    To run a print like this, I'd need to revert to a configuration that doesn't use the upper gantry or switch to RRF 3.0. Please advise.

    Cheers


  • administrators

    @deckingman, thanks for trying it. RRF3 doesn't yet incorporate these changes, and won't until sometime later this week or next week. So for now, reverting your configuration is the only option.



  • @dc42 OK. In the interests of keeping everything as consistent as possible, I'll just slip the belts off the upper motors. Then I won't be introducing other variables, as might be the case if I change the configuration. We still have the situation where the sequence of events that provoked the problem won't be quite the same, so it introduces an element of doubt as to the effectiveness of the solution.

    For info, I ran M122 and noticed this which is new:

    Tasks: NETWORK(ready,660) HEAT(blocked,1236) DUEX(suspended,156) MAIN(running,4264) IDLE(ready,160)

    Is there anything else you'd like me to check?
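If anyone wants to log that Tasks line over several days, here is a minimal Python sketch that splits it into task name, state and the number in brackets. The function name `parse_tasks` is hypothetical, and the sketch makes no claim about what the number means; it only assumes the `NAME(state,number)` layout seen in the line quoted above.

```python
import re


def parse_tasks(line):
    """Map task name -> (state, number) for an M122 'Tasks:' line."""
    # Each entry looks like NAME(state,number); "Tasks:" itself never matches.
    return {
        name: (state, int(number))
        for name, state, number in re.findall(r"(\w+)\((\w+),(\d+)\)", line)
    }


if __name__ == "__main__":
    sample = ("Tasks: NETWORK(ready,660) HEAT(blocked,1236) "
              "DUEX(suspended,156) MAIN(running,4264) IDLE(ready,160)")
    for task, (state, number) in parse_tasks(sample).items():
        print(task, state, number)
```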


  • administrators

    There is a new field "resets" in the I2C stats line.
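For keeping an eye on that field between runs, here is a minimal Python sketch that pulls the counters out of a comma-separated stats line. The exact wording of the I2C stats line is not quoted in this thread, so the sample layout used in the usage note ("name value, name value, ...") is an assumption, and `parse_i2c_stats` is a hypothetical helper name.

```python
import re


def parse_i2c_stats(line):
    """Return {counter name: value} for a 'name value, name value' line."""
    stats = {}
    for item in line.split(","):
        # Each item is expected to end in an integer, e.g. "resets 0".
        match = re.fullmatch(r"(.+?)\s+(\d+)", item.strip())
        if match:
            stats[match.group(1)] = int(match.group(2))
    return stats
```

Under that assumed layout, `parse_i2c_stats("send timeouts 0, resets 2")["resets"]` gives 2, and a nonzero value would mean the firmware had to reset the I2C controller at least once.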



  • @dc42 Yup. That was reported as zero in this instance. I'll try it (the sequence) again tomorrow morning.



  • I will update to this release tonight, and will report back. The behavior of the homing sequence is identical to my issue.



  • I ran "the sequence" again today with this new RC firmware with no I2C issues but......

    1. The sequence isn't quite the same because I can't home my 3rd gantry (the homeall file is the same but the end stop mapping is disabled in this firmware). I've just taken the belts off the upper gantry motor pulleys.

    2. During the first homing sequence I had a report - "Error: over temperature shutdown reported by driver(s) 8." I've never seen this before and I don't think it's "real". Driver 8 is the 3rd extruder drive so that motor hadn't even been energised.

    For info, running M122 after "the sequence" shows no I2C errors and resets as being zero.

    HTH and unless I hear otherwise, I'll run it again tomorrow without making any changes.



  • Quick update. I had to print something in a hurry. The print went well, no sign of any over temperature errors so hopefully that was a one off, unexplainable glitch. I wasn't expecting to have any I2C errors and that was what happened. M122 shows no I2C resets either. The printer is now going into its over night hibernation ready for tomorrow's home, extrude, home sequence.


  • administrators

    @deckingman, thanks for the update.

    To be clear, what I would like to establish is:

    1. Whether the original fault is fixed (with or without a nonzero i2c reset count). The original fault is that i2c communication breaks down. The symptom of the printer going slow was a side effect, which I don't expect to be present in the new release even if i2c comms does break down. To test whether i2c communication is working, try changing the speed of a fan connected to the DueX.
    2. Whether or not the new code that fetches the states of the DueX endstop inputs is reliable. So after a long print, please use M119 or the Machine Properties page of the old DWC to read the endstop states as you operate the switches.

 
