Duet2 connection interrupted: Let's get to the bottom of this
-
For several years, I have been plagued on and off by frequent "Connection interrupted, attempting to reconnect... HTTP request timed out" errors. Sometimes I have no issues for months, and then the trouble seems non-stop. Whenever a disconnect happens, the Duet cannot be pinged from any other device on my network, so my first guess is that the problem is on the Duet's side.
I have seen this or similar issues pop up on the forums over the years, but they never really seem to have been resolved, or they just went away without a real resolution. So I want to get to the bottom of this.
I am running a Duet 2 WiFi (rrf 3.3) and hit me with whatever info/details you need or things to do.
Edit for more details:
- Fritz!Box 7590 router + Fritz!Repeater 2400 (both broadcast a separate 2.4 GHz network)
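To pin down exactly when the board drops off the network, the ping check described above can be logged with timestamps. The following is only a minimal sketch: the function names are made up for illustration, the IP address is the one that appears in the M122 output later in this thread, and it assumes a system `ping` binary that accepts `-c 1` (or `-n 1` on Windows). Exit codes of `ping` differ between platforms, so treat the result as a hint rather than proof.

```python
import platform
import subprocess
import time
from datetime import datetime


def ping_once(host, timeout_s=2.0):
    """Return True if a single ping to host exits successfully (assumes a system ping binary)."""
    count_flag = "-n" if platform.system() == "Windows" else "-c"
    try:
        result = subprocess.run(
            ["ping", count_flag, "1", host],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return False
    return result.returncode == 0


def find_outages(samples):
    """Collapse (timestamp, reachable) samples into (start, end) outage spans."""
    outages = []
    start = None
    for ts, ok in samples:
        if not ok and start is None:
            start = ts  # first failed sample opens an outage
        elif ok and start is not None:
            outages.append((start, ts))  # first good sample closes it
            start = None
    if start is not None:
        outages.append((start, None))  # still unreachable at the end of the log
    return outages


def monitor(host, interval_s=1.0):
    """Print a timestamped line whenever reachability changes; stop with Ctrl+C."""
    last = None
    while True:
        ok = ping_once(host)
        if ok != last:
            stamp = datetime.now().isoformat(timespec="seconds")
            print(f"{stamp} {host} {'up' if ok else 'DOWN'}")
            last = ok
        time.sleep(interval_s)


# Example (runs until interrupted):
# monitor("192.168.178.30")
```

Running `monitor` from a second machine on the same network gives timestamps that can be lined up with the DWC console messages and any Wireshark capture.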
-
Failing wifi module perhaps?
Interference?
What's your signal strength like?
Distance to router?
What kind of router?
Have you tried a different SD card?
-
@nxt-1 said in Duet2 connection interrupted: Let's get to the bottom of this:
I am running a Duet 2 WiFi (rrf 3.3) and hit me with whatever info/details you need or things to do.
I had to install a dedicated WiFi AP in the same room as the Duet 2 WiFi boards (the ones with the built-in antenna) to get a stable connection.
Frederick
-
@phaedrux said in Duet2 connection interrupted: Let's get to the bottom of this:
Failing wifi module perhaps?
Is this a known issue? And even more important, is there a way to verify whether or not it is failing?
Interference?
What's your signal strength like?
Distance to router?
@fcwilt
The main router is about 5 m away with some walls in between. The mesh repeater is 3 m away with uninterrupted LoS. An RSSI of -51 dBm seems plenty to me.
=== Network ===
Slowest loop: 204.58ms; fastest: 0.08ms
Responder states: HTTP(0) HTTP(0) HTTP(0) HTTP(0) FTP(0) Telnet(0), 0 sessions
HTTP sessions: 1 of 8
- WiFi -
Network state is active
WiFi module is connected to access point
Failed messages: pending 0, notready 0, noresp 1
WiFi firmware version 1.26
WiFi MAC address a0:20:a6:16:ec:ed
WiFi Vcc 3.36, reset reason Turned on by main processor
WiFi flash size 4194304, free heap 25032
WiFi IP address 192.168.178.30
WiFi signal strength -51dBm, mode 802.11n, reconnections 0, sleep mode modem
Clock register 00002002
Socket states: 0 0 0 0 0 0 0 0
What kind of router?
I updated the original post with this info.
Have you tried a different SD card?
Would this have an influence? I'll try one if you want.
-
@nxt-1
I had this issue too, but only ONE time so far (knock on wood!). My signal should be very good, and when it happened all the other devices connected to the router kept working properly. After a reset everything worked again.
https://forum.duet3d.com/topic/24348/wifi-reported-error-lost-connection-auto-reconnecting
But I'll follow this thread with interest!
-
@nxt-1 I'm not sure how easy this will be for you to test, but with the LPC/STM32 port we had a user that was using a mesh system. Occasionally the esp8266 would connect to the "wrong" AP and so would have a very poor signal leading to connection problems. I fixed this in our version of the esp8266 firmware by making a note of the signal strengths associated with each access point and using the MAC address (or BSSID) of the access point to force the connection to be to the one with the highest signal (in theory the esp8266 should do this automatically, but that did not always seem to work). This was only an issue when using a setup in which all of the access points in the mesh used the same SSID.
Does your mesh admin interface allow you to check what devices are connected to the various access points? If so that may be a way to see if RRF has connected to the "wrong" one?
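The "pick the strongest access point by BSSID" idea described above can be sketched as follows. This is not the actual esp8266 firmware code; it is a small Python illustration in which the `ScanResult` type, the `strongest_ap` function and the example BSSIDs and SSID are all hypothetical.

```python
from collections import namedtuple

# One entry per access point seen in a scan (hypothetical structure).
ScanResult = namedtuple("ScanResult", "ssid bssid rssi_dbm")


def strongest_ap(scan_results, ssid):
    """Return the scan entry with the highest RSSI for the given SSID, or None."""
    candidates = [r for r in scan_results if r.ssid == ssid]
    if not candidates:
        return None
    # RSSI is negative dBm, so the value closest to zero is the strongest signal.
    return max(candidates, key=lambda r: r.rssi_dbm)


# Example: two mesh nodes sharing one SSID, plus an unrelated network.
scan = [
    ScanResult("home-mesh", "aa:bb:cc:00:00:01", -78),
    ScanResult("home-mesh", "aa:bb:cc:00:00:02", -51),
    ScanResult("neighbour", "aa:bb:cc:00:00:03", -40),
]
best = strongest_ap(scan, "home-mesh")
```

The client would then connect to `best.bssid` explicitly instead of letting the module pick any access point that advertises the SSID.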
-
@gloomyandy
In some mesh systems you have the possibility to bind a client to a dedicated mesh node. The Asus ZenWifi AX has this function, but that may not be relevant for this thread.
/Fredrik
-
@gloomyandy @Falkia I have both the main AP and the repeater broadcasting their own ssid at the moment, so I can force connection to the AP by removing the main ssid from the duet's networks.
Update, I just did that, so now only the ssid broadcast by the repeater is known to the duet and the issue persists.
-
@nxt-1 Oh well, I guess it is good to eliminate things. I'm pretty sure that the guy who had the problem was using a Google Nest mesh system, which I think uses the same SSID for all of the access points.
-
This may possibly be related to https://forum.duet3d.com/topic/23701/mini-5-wifi-connection-unstable/34?_=1625331363732 which is being investigated.
-
@dc42 said in Duet2 connection interrupted: Let's get to the bottom of this:
This may possibly be related to https://forum.duet3d.com/topic/23701/mini-5-wifi-connection-unstable/34?_=1625331363732 which is being investigated.
You might very well be right; at least the symptoms described seem quite similar, if not exactly the same.
-
I have the same issue (had it ever since the first Duet). Things will go just fine for the longest time then all of a sudden I get repeated interruptions. Then just like it started, the issue magically goes away.
I was unable to make heads or tails of it and eventually gave up. Now, if things act up, I go do something else.
It is not related to signal strength.
Latest episode happened yesterday. For about 5 minutes I was unable to reach the printer consistently and then all was fine again. I did not reboot anything to make things work.
-
@jens55 said in Duet2 connection interrupted: Let's get to the bottom of this:
Latest episode happened yesterday. For about 5 minutes I was unable to reach the printer consistently and then all was fine again. I did not reboot anything to make things work.
Was a microwave oven turned on for those 5 minutes by any chance?
-
@nxt-1 Would you be able to collect a wireshark trace?
-
@phaedrux said in Duet2 connection interrupted: Let's get to the bottom of this:
@nxt-1 Would you be able to collect a wireshark trace?
Sure, whenever another episode happens I will capture it. I assume a trace with a filter for just the Duet IP will suffice?
-
@nxt-1 Yes, the part where DWC loses the connection to the Duet should suffice. On another note, there is now a v1.26 build with extra debugging available. If you want to give it a try and a connection drop occurs, please send
M111 P14 S1
followed by
M122
and share the full output here. It may help us further isolate the underlying problem.
-
@dc42 said in Duet2 connection interrupted: Let's get to the bottom of this:
@jens55 said in Duet2 connection interrupted: Let's get to the bottom of this:
Latest episode happened yesterday. For about 5 minutes I was unable to reach the printer consistently and then all was fine again. I did not reboot anything to make things work.
Was a microwave oven turned on for those 5 minutes by any chance?
No
-
@phaedrux said in Duet2 connection interrupted: Let's get to the bottom of this:
@nxt-1 Would you be able to collect a wireshark trace?
1.pcapng.c
2.pcapng.c
3FullDataset.pcapng.c
Here you go. The .c extension obviously needs to be removed; the forum doesn't allow .pcapng files. The first two datasets were started when I noticed warnings popping up in DWC, so they are probably missing the first pieces of vital information. The 3rd set contains everything from start to finish.
Note: all these traces were taken without the new v1.26 debug firmware suggested by @chrishamm. I will install that now and report back.
-
@chrishamm said in Duet2 connection interrupted: Let's get to the bottom of this:
@nxt-1 Yes, the part where DWC loses the connection to the Duet should suffice. On another note, there is now a v1.26 build with extra debugging available. If you want to give it a try and a connection drop occurs, please send
M111 P14 S1
followed by
M122
and share the full output here. It may help us further isolate the underlying problem.
As requested: as soon as a "Connection interrupted" warning popped up, I sent
M111 P14 S1
followed by
M122
The console log (newest entry first):
7/30/2021, 11:16:00 AM M122
=== Diagnostics ===
RepRapFirmware for Duet 2 WiFi/Ethernet version 3.3 (2021-06-15 21:44:54) running on Duet WiFi 1.0 or 1.01
Board ID: 08D6M-91AST-L23S4-7JTD0-3S86K-1NX1K
Used output buffers: 3 of 24 (24 max)
=== RTOS ===
Static ram: 23876
Dynamic ram: 74668 of which 24 recycled
Never used RAM 15848, free system stack 94 words
Tasks: NETWORK(ready,91.8%,143) HEAT(delaying,0.9%,330) Move(notifyWait,2.4%,301) MAIN(running,4.5%,128) IDLE(ready,0.4%,29), total 100.0%
Owned mutexes: WiFi(NETWORK)
=== Platform ===
Last reset 116:49:31 ago, cause: power up
Last software reset at 2021-07-25 14:11, reason: User, GCodes spinning, available RAM 15848, slot 2
Software reset code 0x0003 HFSR 0x00000000 CFSR 0x00000000 ICSR 0x0041f000 BFAR 0xe000ed38 SP 0x00000000 Task MAIN Freestk 0 n/a
Error status: 0x14
Aux0 errors 0,0,0
Step timer max interval 0
MCU temperature: min 30.9, current 36.0, max 36.9
Supply voltage: min 24.1, current 24.2, max 24.5, under voltage events: 0, over voltage events: 0, power good: yes
Heap OK, handles allocated/used 0/0, heap memory allocated/used/recyclable 0/0/0, gc cycles 0
Driver 0: position 91505, standstill, SG min/max not available
Driver 1: position 73958, standstill, SG min/max not available
Driver 2: position 73795, standstill, SG min/max not available
Driver 3: position 0, standstill, SG min/max not available
Driver 4: position 0, standstill, SG min/max not available
Driver 5: position 0
Driver 6: position 0
Driver 7: position 0
Driver 8: position 0
Driver 9: position 0
Driver 10: position 0
Driver 11: position 0
Date/time: 2021-07-30 11:15:41
Cache data hit count 4294967295
Slowest loop: 88.64ms; fastest: 0.15ms
I2C nak errors 0, send timeouts 0, receive timeouts 0, finishTimeouts 0, resets 0
=== Storage ===
Free file entries: 10
SD card 0 detected, interface speed: 20.0MBytes/sec
SD card longest read time 1.5ms, write time 87.0ms, max retries 0
=== Move ===
DMs created 83, maxWait 0ms, bed compensation in use: mesh, comp offset 0.000
=== MainDDARing ===
Scheduled moves 68595, completed moves 68595, hiccups 0, stepErrors 0, LaErrors 0, Underruns [0, 0, 0], CDDA state -1
=== AuxDDARing ===
Scheduled moves 0, completed moves 0, hiccups 0, stepErrors 0, LaErrors 0, Underruns [0, 0, 0], CDDA state -1
=== Heat ===
Bed heaters = 0 -1 -1 -1, chamberHeaters = -1 -1 -1 -1
=== GCodes ===
Segments left: 0
Movement lock held by null
HTTP is idle in state(s) 0
Telnet is idle in state(s) 0
File is idle in state(s) 0
USB is idle in state(s) 0
Aux is idle in state(s) 0
Trigger is idle in state(s) 0
Queue is idle in state(s) 0
LCD is idle in state(s) 0
Daemon is idle in state(s) 0
Autopause is idle in state(s) 0
Code queue is empty.
=== Network ===
Slowest loop: 1927.81ms; fastest: 0.00ms
Responder states: HTTP(0) HTTP(0) HTTP(0) HTTP(0) FTP(0) Telnet(0), 0 sessions
HTTP sessions: 1 of 8
- WiFi -
Network state is active
WiFi module is connected to access point
Failed messages: pending 0, notready 0, noresp 0
WiFi firmware version 1.26-D
WiFi MAC address a0:20:a6:16:ec:ed
WiFi Vcc 3.36, reset reason Turned on by main processor
WiFi flash size 4194304, free heap 24360
WiFi IP address 192.168.178.30
WiFi signal strength -50dBm, mode 802.11n, reconnections 0, sleep mode modem
Clock register 00002002
Socket states: 0 0 0 0 0 0 0 0
7/30/2021, 11:15:47 AM M111 P14 S1
Debugging enabled for modules: WiFi(14 - 0xffffffff)
Debugging disabled for modules: Platform(0) Network(1) Webserver(2) GCodes(3) Move(4) Heat(5) DDA(6) Roland(7) Scanner(8) PrintMonitor(9) Storage(10) PortControl(11) DuetExpansion(12) FilamentSensors(13) Display(15) LinuxInterface(16) CAN(17)
7/30/2021, 11:15:36 AM Connection established
7/30/2021, 11:15:31 AM Connection interrupted, attempting to reconnect... HTTP request timed out
EDIT: After a few more disconnects, another M122 in case it is useful:
M122
=== Diagnostics ===
RepRapFirmware for Duet 2 WiFi/Ethernet version 3.3 (2021-06-15 21:44:54) running on Duet WiFi 1.0 or 1.01
Board ID: 08D6M-91AST-L23S4-7JTD0-3S86K-1NX1K
Used output buffers: 3 of 24 (24 max)
=== RTOS ===
Static ram: 23876
Dynamic ram: 74668 of which 24 recycled
Never used RAM 15848, free system stack 94 words
Tasks: NETWORK(ready,13.8%,143) HEAT(delaying,0.0%,330) Move(notifyWait,0.1%,301) MAIN(running,86.1%,128) IDLE(ready,0.0%,29), total 100.0%
Owned mutexes: WiFi(NETWORK)
=== Platform ===
Last reset 117:01:45 ago, cause: power up
Last software reset at 2021-07-25 14:11, reason: User, GCodes spinning, available RAM 15848, slot 2
Software reset code 0x0003 HFSR 0x00000000 CFSR 0x00000000 ICSR 0x0041f000 BFAR 0xe000ed38 SP 0x00000000 Task MAIN Freestk 0 n/a
Error status: 0x14
Aux0 errors 0,0,0
Step timer max interval 0
MCU temperature: min 35.6, current 36.3, max 37.0
Supply voltage: min 24.1, current 24.2, max 24.5, under voltage events: 0, over voltage events: 0, power good: yes
Heap OK, handles allocated/used 0/0, heap memory allocated/used/recyclable 0/0/0, gc cycles 0
Driver 0: position 91505, standstill, SG min/max not available
Driver 1: position 73958, standstill, SG min/max not available
Driver 2: position 73795, standstill, SG min/max not available
Driver 3: position 0, standstill, SG min/max not available
Driver 4: position 0, standstill, SG min/max not available
Driver 5: position 0
Driver 6: position 0
Driver 7: position 0
Driver 8: position 0
Driver 9: position 0
Driver 10: position 0
Driver 11: position 0
Date/time: 2021-07-30 11:27:55
Cache data hit count 4294967295
Slowest loop: 5.81ms; fastest: 0.17ms
I2C nak errors 0, send timeouts 0, receive timeouts 0, finishTimeouts 0, resets 0
=== Storage ===
Free file entries: 10
SD card 0 detected, interface speed: 20.0MBytes/sec
SD card longest read time 1.3ms, write time 0.0ms, max retries 0
=== Move ===
DMs created 83, maxWait 0ms, bed compensation in use: mesh, comp offset 0.000
=== MainDDARing ===
Scheduled moves 68595, completed moves 68595, hiccups 0, stepErrors 0, LaErrors 0, Underruns [0, 0, 0], CDDA state -1
=== AuxDDARing ===
Scheduled moves 0, completed moves 0, hiccups 0, stepErrors 0, LaErrors 0, Underruns [0, 0, 0], CDDA state -1
=== Heat ===
Bed heaters = 0 -1 -1 -1, chamberHeaters = -1 -1 -1 -1
=== GCodes ===
Segments left: 0
Movement lock held by null
HTTP is idle in state(s) 0
Telnet is idle in state(s) 0
File is idle in state(s) 0
USB is idle in state(s) 0
Aux is idle in state(s) 0
Trigger is idle in state(s) 0
Queue is idle in state(s) 0
LCD is idle in state(s) 0
Daemon is idle in state(s) 0
Autopause is idle in state(s) 0
Code queue is empty.
=== Network ===
Slowest loop: 203.84ms; fastest: 0.08ms
Responder states: HTTP(0) HTTP(0) HTTP(0) HTTP(0) FTP(0) Telnet(0), 0 sessions
HTTP sessions: 1 of 8
- WiFi -
Network state is active
WiFi module is connected to access point
Failed messages: pending 0, notready 0, noresp 1
WiFi firmware version 1.26-D
WiFi MAC address a0:20:a6:16:ec:ed
WiFi Vcc 3.36, reset reason Turned on by main processor
WiFi flash size 4194304, free heap 24360
WiFi IP address 192.168.178.30
WiFi signal strength -51dBm, mode 802.11n, reconnections 0, sleep mode modem
Clock register 00002002
Socket states: 0 0 0 0 0 0 0 0
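When comparing several M122 reports, it can help to pull out just the WiFi-related numbers (for example, `noresp` went from 0 to 1 between the two reports above). The following is a rough sketch, not an official tool: the function name `parse_wifi_stats` and the dictionary keys are made up, and the regular expressions only assume the field wording shown in the reports in this thread.

```python
import re


def parse_wifi_stats(m122_text):
    """Extract a few WiFi fields from M122 output; missing fields are simply omitted."""
    stats = {}
    m = re.search(r"Failed messages: pending (\d+), notready (\d+), noresp (\d+)", m122_text)
    if m:
        stats["pending"], stats["notready"], stats["noresp"] = map(int, m.groups())
    m = re.search(r"WiFi signal strength (-?\d+)dBm", m122_text)
    if m:
        stats["rssi_dbm"] = int(m.group(1))
    m = re.search(r"reconnections (\d+)", m122_text)
    if m:
        stats["reconnections"] = int(m.group(1))
    m = re.search(r"WiFi firmware version (\S+)", m122_text)
    if m:
        stats["wifi_firmware"] = m.group(1)
    return stats


# Example using lines copied from the second report above.
sample = (
    "Failed messages: pending 0, notready 0, noresp 1\n"
    "WiFi firmware version 1.26-D\n"
    "WiFi signal strength -51dBm, mode 802.11n, reconnections 0, sleep mode modem\n"
)
stats = parse_wifi_stats(sample)
```

Because the patterns do not depend on line breaks, the same function should also work on output that was pasted as one long line.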
-
@nxt-1 I'm sorry I forgot to mention it but I'll need the M122 output from a USB console, else the useful stats from the debug build aren't printed. I'll have a look at your Wireshark dumps nevertheless.
PS: Your Wireshark dumps show that the WiFi module sometimes doesn't seem to receive incoming packets. I suspect we could confirm this by looking at the LWIP stats reported over USB.