SOLVED Duet 2 random loss of response (USB, PanelDue, Network)
I'm having a major problem with the Duet 2 controller on one of my machines.
After a time (between 30 mins and 8+ hours) the controller no longer responds to any commands, whether sent through PanelDue, a direct USB connection, or the web interface. If a job has already started it will continue to run fine, but I can't input any commands (pause/resume) and the PanelDue values do not update. This happens even if the machine idles with no input whatsoever.
I do see a WiFi error message in the console which corresponds to the loss of control:
WiFi reported error: Lost connection, auto reconnecting
The board has the VIN, 3.3V, and 5V LEDs lit OK. It runs as normal after a power reset (until the problem reoccurs).
Setup is pretty standard, though I do have the Duet servo breakout board on the machine. WiFi version is 1.23. Firmware version is 2.03 stable. I also tried 2.04RC3 on the machine with no change. I replaced both the Duet and the PanelDue, but there was no change in behaviour.
It's hard to troubleshoot, as I can't even get an M122 dump once the controls stop responding.
Any ideas or ways to get more info when this happens again?
Sounds similar to problems in this thread: https://forum.duet3d.com/topic/10164/duet-wifi-lost-web-connection
Most likely the output buffers are full and not releasing.
To diagnose, follow @dc42's advice from that thread if possible:
Next time this happens, please try closing the browser tab that DWC is open in. Then wait more than 10 seconds, and after that either use YAT to send M122 and check whether the buffers have been released, or try reloading DWC. What should happen is that if DWC doesn't receive any traffic from an IP address in 10 seconds, it should time out that client and release any associated buffers.
In the M122 response, "Used output buffers" is the 4th line down from the top.
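If you capture the M122 output to a text file (from YAT, for example), a small script can pull out the buffer figures so you can compare snapshots over time. This is only a rough sketch: it assumes the line looks like "Used output buffers: 3 of 20 (11 max)", which may differ between firmware versions, and the names parse_output_buffers and buffers_nearly_full and the headroom threshold are made up for illustration.

```python
import re
from typing import NamedTuple, Optional


class BufferUsage(NamedTuple):
    used: int   # buffers in use when M122 was run
    total: int  # total buffers available
    peak: int   # highest usage reported in the "(N max)" field


# Assumed line shape: "Used output buffers: 3 of 20 (11 max)".
# Adjust the pattern if your firmware prints it differently.
_PATTERN = re.compile(
    r"Used output buffers:\s*(\d+)\s+of\s+(\d+)\s*\((\d+)\s+max\)"
)


def parse_output_buffers(m122_text: str) -> Optional[BufferUsage]:
    """Return the buffer figures from an M122 dump, or None if not found."""
    match = _PATTERN.search(m122_text)
    if match is None:
        return None
    used, total, peak = (int(group) for group in match.groups())
    return BufferUsage(used, total, peak)


def buffers_nearly_full(usage: BufferUsage, headroom: int = 2) -> bool:
    """True when no more than `headroom` buffers remain free.

    The default headroom of 2 is an arbitrary illustrative value.
    """
    return usage.total - usage.used <= headroom
```

Run parse_output_buffers on a dump taken before and after closing the DWC tab; if the "used" figure stays close to the total after the 10 second timeout, the buffers are probably not being released.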
No good. It happened again this morning with no web interface open. I still can't connect via a serial terminal or the web interface.
I will try disabling the WiFi chip and see if it re-occurs, as that's the error I'm getting when it happens.
Any other ideas for getting useful debugging / troubleshooting output?
Others have reported failing power supplies emitting RF interference, which causes repeated disconnections and reconnections.
Well, disabling the network chip seems to resolve the problem. The machine ran all weekend with no loss of control after M552 S-1.
You may be on to something re: power supply or similar. I have several of the same printer built in exactly the same manner, and the problem doesn't appear on any of the others, yet I've now run into it twice on this machine with different Duets installed. The root cause could well be something external to the Duet (the power supply on this specific machine?). Will investigate further.
Thanks for your suggestions thus far.
For future reference, this issue appears to be resolved in RRF 2.04RC4, which changes the behavior of output buffers.