WiFi disconnect errors
-
David, testing DuetWifiServer 1.20b8:
- A M122 diagnostic report produced when WiFi was connected
[c]Connecting…
Printer is now online.
M122
SENDING:M122
=== Diagnostics ===
Used output buffers: 2 of 32 (9 max)
=== Platform ===
RepRapFirmware for Duet WiFi version 1.20beta7 running on Duet WiFi 1.0
Board ID: 08DDM-9FAM2-LW4SD-6JTF0-3S86N-TLWZX
Static ram used: 15472
Dynamic ram used: 99192
Recycled dynamic ram: 4120
Stack ram used: 3992 current, 9196 maximum
Never used ram: 3092
Last reset 00:54:35 ago, cause: reset button or watchdog
Last software reset reason: User, spinning module GCodes, available RAM 7296 bytes (slot 3)
Software reset code 0x0003, HFSR 0x00000000, CFSR 0x00000000, ICSR 0x00400000, BFAR 0xe000ed38, SP 0xffffffff
Error status: 0
[ERROR] Error status: 0Free file entries: 9
SD card 0 detected, interface speed: 20.0MBytes/sec
SD card longest block write time: 14.6ms
MCU temperature: min 29.0, current 33.5, max 34.6
Supply voltage: min 12.2, current 12.7, max 12.9, under voltage events: 0, over voltage events: 0
Driver 0: ok
Driver 1: ok
Driver 2: ok
Driver 3: ok
Driver 4: standstill
Date/time: 2017-11-14 20:25:36
Cache data hit count 4294967295
Slowest main loop (seconds): 2.192716; fastest: 0.000042
=== Move ===
MaxReps: 5, StepErrors: 0, FreeDm: 123, MinFreeDm 120, MaxWait: 4216772852ms, Underruns: 0, 0
Scheduled moves: 32211, completed moves: 32181
Bed compensation in use: none
Bed probe heights: -0.042 0.010 -0.062 -0.060 -0.063
=== Heat ===
Bed heater = 0, chamber heater = -1
Heater 0 is on, I-accum = 0.3
Heater 1 is on, I-accum = 0.5
=== GCodes ===
Segments left: 1
Stack records: 2 allocated, 0 in use
Movement lock held by file
http is idle in state(s) 0
telnet is idle in state(s) 0
file is doing "G1 X9.258 Y-3.758 F1839" in state(s) 0
serial is ready with "M122" in state(s) 0
aux is idle in state(s) 0
daemon is idle in state(s) 0
queue is idle in state(s) 0
autopause is idle in state(s) 0
Code queue is empty.
Network state is running
WiFi module is connected to access point
Failed messages: pending 0, notready 0, noresp 0
WiFi firmware version 1.20b8
WiFi MAC address 5c:cf:7f:ef:51:6f
WiFi Vcc 3.38, reset reason Turned on by main processor
WiFi flash size 4194304, free heap 29096
WiFi IP address 10.0.1.161
WiFi signal strength -39dBm, reconnections 0, sleep mode modem
HTTP sessions: 1 of 8
Socket states: 0 0 0 0 0 0 0 0
Responder states: HTTP(0) HTTP(0) HTTP(0) HTTP(0) FTP(0) Telnet(0)
[/c]
- The make and model of your WiFi router
Tenda AC18
- Does the problem occur mainly (or only) during printing, or does it occur equally frequently when the printer is idle?
Both (from past experience)
- When the problem occurs, are you able to reconnect by pressing the Connect button in DWC?
Not this time
- If you are unable to connect, are you still able to ping the IP address of the Duet WiFi?
Could not ping the IP address
- Anything else that you think may be correlated with disconnections occurring
I had 24 hours of successful connection with DuetWifiServer 1.20b8-nosleep prior to this test. It took me 3 tries to upload the DuetWifiServer 1.20b8 firmware. I renamed the file, then uploaded it via DWC. It "worked", but then M552 sent via pronterface would only report "wifi module starting." It took two consecutive runs of M997 S1 to get it fully working. I then started homing, running G32 and preheating. After about 15 minutes of that, I started a print. It disconnected 27 minutes after the print started, for a total connection time of about 40 minutes.
-
Ok, a new problem….
I reverted to WifiServer 1.20b8-nosleep since it had previously been working great. It took 2 installs to get it to work. Since then I've had 3 disconnects in the last 2 hours. These were different though.
The first seemed harmless: it disconnected, I just clicked Connect in DWC, and everything was up and running again. No problem.
The other two were different. Both times they reported a new AJAX error that gave no reason. It didn't have the normal timeout. Image attached. Both times, I could not connect to the Duet via USB with pronterface. It was as if the whole board was locked up or frozen. A power cycle resolved it the first time, but then it disconnected again almost immediately. The second time I just let it sit, and after about 10 minutes I was able to simply click Connect in DWC again.
UPDATE: APPARENTLY I HAVEN'T SLEPT ENOUGH. I WAS TRYING TO PRINT A .STL FILE. INTERESTINGLY THIS CREATES AN IMMEDIATE DISCONNECT AND LOCKS UP THE BOARD.
David, any way to adjust the firmware to prompt an error message rather than locking up the board for those times when we try to print an STL?
-
David, I'm not sure if that's the same problem or not, but here is what I'm observing:
WiFi connection works most of the time and signal strength looks OK (between -60dBm and -50dBm), but sometimes it just fails with an AJAX error and doesn't come back until I reset the WiFi module with M552 S0 followed by M552 S1.
It has always been this way, and I thought it was just a signal level issue. I tried two different routers (a no-name one supplied by my provider and an Apple TimeCapsule). The strange part, which I noticed only recently, is that when it fails it is still shown as connected in the WiFi router web UI. What's more, if I connect over USB and issue M122, it shows WiFi as connected, with good signal strength and a valid IP address. That makes me think that, at least in some cases, what looks like a WiFi connectivity issue is not related to connectivity. I believe it might be the web server firmware crashing or hanging.
It happens pretty often, and I'm happy to provide further diagnostic data; just tell me exactly what I need to do.
-
zlowred, please upgrade to the just-released DuetWiFiFirmware 1.20beta8 and DuetWiFiServer 1.20beta9. Connect a PC via USB and send M111 S1 P14 to enable the new WiFi debugging feature. You will get a few WiFi debug messages during startup and initial connection to your router, but there should be no more after that. If the disconnect occurs again, look at the console on the PC to see whether any more debug messages have been received, and report what you find.
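Since the disconnect may happen while nobody is watching the console, it can help to save the console output with a timestamp on every line. Below is a minimal Python sketch of that idea; the function name `log_with_timestamps` and its parameters are hypothetical and not part of any Duet tool, and how the USB serial port is opened is deliberately left out because it depends on your PC.

```python
from datetime import datetime

def log_with_timestamps(source, sink, now=datetime.now):
    """Copy each non-empty line from source to sink, prefixed with the read time.

    source: any iterable of text lines (for example a text stream wrapping
            the USB serial port; opening it is outside this sketch).
    sink:   any writable text file object.
    now:    clock function, replaceable for testing.
    Returns the number of lines written.
    """
    count = 0
    for line in source:
        text = line.rstrip("\r\n")
        if not text:
            continue  # skip completely blank lines
        sink.write(f"{now().isoformat(timespec='seconds')} {text}\n")
        sink.flush()  # keep the file current in case the PC sleeps or crashes
        count += 1
    return count
```

With timestamps in the saved file, a debug message can be matched against the time at which DWC reported the disconnect.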
-
That means to keep USB connected all the time, right? (no problems with that, just to confirm)
Added: maybe it would be better to keep the debug log somewhere on the flash card, so that nothing is lost if, for example, the PC goes to sleep.
-
I think I've sent my Duet3D WiFi to England, because of symptoms very similar to those described …
I hope not....
-
I think I've sent my Duet3D WiFi to England, because of symptoms very similar to those described …
I hope not....
Yours was different, it wouldn't upload new wifi firmware.
-
i hope….
-
So I was able to reproduce the disconnect. Below is the full log. Note that I temporarily disabled M552 S1 in config.g so that the full log is captured.
I issued M122 after the disconnect in case it contains something useful.
RepRapFirmware for Duet WiFi Version 1.20beta8 dated 2017-11-17
Executing config.g...HTTP is enabled on port 80
FTP is disabled
TELNET is disabled
Done!
Network disabled.
RepRapFirmware for Duet WiFi is up and running.
M111 S1 P14
Debugging enabled for modules: WiFi(14)
Debugging disabled for modules: Platform(0) Network(1) Webserver(2) GCodes(3) Move(4) Heat(5) DDA(6) Roland(7) Scanner(8) PrintMonitor(9) Storage(10) PortControl(11) DuetExpansion(12) FilamentSensors(13)
ok
M552 S1
ok
WiFi:
WiFi: ets Jan 8 2013,rst cause:2, boot mode:(3,7)
WiFi:
WiFi: load 0x4010f000, len 1384, room 16
WiFi: tail 8
WiFi: chksum 0x2d
WiFi: csum 0x2d
WiFi: v00007fff
WiFi: ~ld
WiFi module started
WiFi: mode : sta(ec:fa:bc:02:1e:41)
WiFi: add if0
WiFi: scandone
WiFi: sleep enable,type: 2
WiFi: scandone
WiFi: state: 0 -> 2 (b0)
WiFi: state: 2 -> 3 (0)
WiFi: state: 3 -> 5 (10)
WiFi: add 0
WiFi: aid 3
WiFi: cnt
WiFi:
WiFi: connected with Lrrr, channel 1
WiFi: dhcp client start...
Wifi module is connected to access point Lrrr, IP address 192.168.1.102
WiFi: ip:192.168.1.102,mask:255.255.255.0,gw:192.168.1.1
WiFi: pm open,type:2 0
WiFi: p->ref == 1
WiFi: p->ref == 1
WiFi: p->ref == 1
WiFi: p->ref == 1
WiFi: p->ref == 1
WiFi: p->ref == 1
WiFi: p->ref == 1
WiFi: p->ref == 1
WiFi: p->ref == 1
WiFi: p->ref == 1
M122
=== Diagnostics ===
Used output buffers: 1 of 32 (8 max)
=== Platform ===
RepRapFirmware for Duet WiFi version 1.20beta8 running on Duet WiFi 1.0
Board ID: 08DDM-9FAM2-LW4S8-6JTDD-3SJ6P-9MXBY
Static ram used: 15488
Dynamic ram used: 99624
Recycled dynamic WiFi: ram: 3672
Stack ram used: 4328 current, 5324 maximum
NeveWiFi: LINK r used ram: 6964
Last reset 00:02:07 ago, cause: power up
Last software reset reason: User, spinning module GCodes, available RAM 6960 bytes (slot 4)
Software reset code 0x0003, HFSR 0x00000000, CFSRWiFi: xmit: 0 0x00000000, ICSR 0x00400000, BFAR 0xe000ed38, SP 0xffffffff
Error status: 0
Free file entries:WiFi: recv: 0 10
SD card 0 detected, interface speed: 20.0MBytes/sec
SD card longest block write time: 0.0ms
MCU temperature: min 31.5, current 34.8, max 35.0
Supply voltage:WiFi: fw: 0 min 0.3, current 0.5, max 0.5, under voltage events: 0, over voltage events: 0
Driver 0: ok
Driver 1: ok
Driver 2: ok
Driver 3: okWiFi: drop: 0
Driver 4: ok
Date/time: 2017-11-18 11:55:08
Cache data hit count 436514026
Slowest main loop (seconds): 0.099160; fastest: 0WiFi: chkerr: 0 .000034
=== Move ===
MaxReps: 0, StepErrors: 0, FreeDm: 240, MinFreeDm 240, MaxWait: 0ms, Underruns: 0, 0
Scheduled moves: 0, completed moves: 0
Bed compensation iWiFi: lenerr: 0 n use: none
Bed probe heights: 0.000 0.000 0.000 0.000 0.000
=== Heat ===
Bed heater = 0, chaWiFi: memerr: 0 mber heater = -1
Heater 1 is on, I-accum = 0.0
=== GCodes ===
Segments left: 0
Stack records: 2 allocated, 0 in use
Movement lock held by null
http is idle in state(s) 0
telnet is idle in stateWiFi: rterr: 0 (s) 0
file is idle in state(s) 0
serial is ready with "M122" inWiFi: proterr: 0 state(s) 0
aux is idle in state(s) 0
daemon is idle in state(s) 0
queue is idle in state(s) 0
autopause is idle in state(s) 0
Code queue is empty.
Network state is running
WiFi module is connected to access point
Failed mWiFi: opterr: 0 essages: pending 0, notready 0, nWiFi: err: 0 oresp 0
WiFi firmware version 1.20b9
WiFi MAC address ec:fa:bc:02:1e:41
WiFi Vcc 3.35, reset reason Turned on by main processor
WiFi flash size 4194304, free heap 25224
WiFi IP address 192.168.1.102
WiFi signal strength -58dBm, reconnections 0, sleep mode WiFi: cachehit: 0 modem
HTTP sessions: 1 of 8
Socket states: 0 0 0 0 0 0 0 0
Responder states: HTTP(0) HTTP(0) HTTP(0) HTTP(0) FTP(0) Telnet(0)
ok
WiFi:
WiFi: ETHARP
WiFi: xmit: 4
WiFi: recv: 2
WiFi: fw: 0
WiFi: drop: 39
WiFi: chkerr: 0
WiFi: lenerr: 0
WiFi: memerr: 0
WiFi: rterr: 0
WiFi: proterr: 39
WiFi: opterr: 0
WiFi: err: 0
WiFi: cachehit: 1605
WiFi:
WiFi: IP
WiFi: xmit: 1614
WiFi: recv: 1892
WiFi: fw: 0
WiFi: drop: 1
WiFi: chkerr: 1
WiFi: lenerr: 0
WiFi: memerr: 0
WiFi: rterr: 0
WiFi: proterr: 0
WiFi: opterr: 0
WiFi: err: 0
WiFi: cachehit: 0
WiFi:
WiFi: IGMP
WiFi: xmit: 4
WiFi: recv: 1
WiFi: drop: 0
WiFi: chkerr: 0
WiFi: lenerr: 0
WiFi: memerr: 0
WiFi: proterr: 0
WiFi: rx_v1: 0
WiFi: rx_group: 0
WiFi: rx_general: 1
WiFi: rx_rmit: 1174
WiFi: recv: 1831
WiFi: fw: 0
WiFi: drop: 0
WiFi: chkerr: 0
WiFi: lenerr: 0
WiFi: memerr: 0
WiFi: rterr: 0
WiFi: proterr: 0
WiFi: opterr: 0
WiFi: err: 0
WiFi: cachehit: 0
The "WiFi: p->ref == 1" messages appeared every few seconds or so, and the "AJAX error" appeared after the last "WiFi: p->ref == 1" message. I waited another 10 minutes, but didn't see any more of these messages after the "AJAX error".
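The counter dump at the end of the log is easier to scan if only the nonzero drop/error counters are pulled out per section. Here is a rough Python sketch, assuming each "WiFi:" debug message has first been put on its own line, that a bare upper-case word such as ETHARP or IP starts a section, and that the counters of interest are `drop` and the names ending in `err`. The function name is hypothetical and the grouping is my reading of the log, not something confirmed by the firmware.

```python
import re

def nonzero_error_counters(lines):
    """Return {section: {counter: value}} for nonzero drop/error counters.

    Only lines of the form 'WiFi: name: number' are treated as counters;
    everything else (startup messages, p->ref messages, M122 output) is ignored.
    """
    section = None
    report = {}
    for line in lines:
        body = line.strip()
        if not body.startswith("WiFi:"):
            continue
        body = body[len("WiFi:"):].strip()
        match = re.fullmatch(r"(\w+): (\d+)", body)
        if match is None:
            # Assumption: a bare upper-case word is a section header
            if re.fullmatch(r"[A-Z]+", body):
                section = body
            continue
        name, value = match.group(1), int(match.group(2))
        is_error = name == "drop" or name.endswith("err")
        if section is not None and is_error and value != 0:
            report.setdefault(section, {})[name] = value
    return report
```

Applied to the dump in this post, this would single out the ETHARP drop/proterr counters and the IP drop/chkerr counters, which are the only nonzero error values shown.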
-
Thanks. Did the p->ref==1 messages start appearing as soon as you connected, or not until shortly before the disconnection message?
-
The first message came seconds after I connected. The others came at irregular intervals, anywhere between a few seconds and around a minute apart.
In the meantime, I've replaced the WiFi module with an ESP-07S with an external antenna, and it didn't change anything, i.e. the behaviour is exactly the same.
-
I've never seen those p->ref==1 messages on my test systems. I suspect they are related to the disconnections. Please do a few more tests, to establish how many p->ref messages you get, before they start appearing every few seconds and the disconnection occurs.
Edit: those p->ref == 1 messages indicate an assertion failure within the TCP/IP stack, so they are definitely indicative of something going wrong.
-
Sure, I'll collect more data and post it here
In the meantime, I can build the firmware myself, so if you need me to build it with some additional debug flags, or to quickly test some changes without releasing a new beta version, feel free to ask.
-
Thanks. Those debug messages come from the firmware on the wifi module, so it's that firmware that is likely to contain the fault.
-
Please can anyone else who is still getting WiFi disconnections even though the RSSI is good do the following:
1. Upgrade to the just-released DuetWiFiFirmware 1.20beta8 and DuetWiFiServer 1.20beta9.
2. Connect a PC via USB and send M111 S1 P14 to enable the new WiFi debugging feature. If you have already started wifi, send M552 S-1 and then M552 S1 to restart it. You will get some WiFi debug messages during startup and connection to your router, similar to the following:
WiFi:
WiFi: ets Jan 8 2013,rst cause:2, boot mode:(3,6)
WiFi:
WiFi: load 0x4010f000, len 1384, room 16
WiFi: tail 8
WiFi: chksum 0x2d
WiFi: csum 0x2d
WiFi: v00007fff
WiFi: ~ld
WiFi module started
WiFi: mode : sta(a0:20:a6:19:28:23)
WiFi: add if0
WiFi: scandone
WiFi: sleep enable,type: 2
WiFi: scandone
WiFi: state: 0 -> 2 (b0)
WiFi: state: 2 -> 3 (0)
WiFi: state: 3 -> 5 (10)
WiFi: add 0
WiFi: aid 2
WiFi: cnt
WiFi:
WiFi: connected with ********, channel 6
WiFi: dhcp client start…
Wifi module is connected to access point ********* IP address 192.168.1.123
WiFi: ip:192.168.1.123,mask:255.255.255.0,gw:192.168.1.254
WiFi: pm open,type:2 0
3. Load DuetWebControl in your browser.
4. If/when the wifi disconnects unexpectedly, look at the console on the PC to see whether any more debug messages have been displayed, and report what you find.
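For step 4, the interesting part of a saved console log is whatever the WiFi module printed after the startup sequence finished. A minimal Python sketch follows, assuming (from the example output above) that the "WiFi: pm open" line is the last startup message; the function name and the `marker` parameter are hypothetical.

```python
def messages_after_connect(lines, marker="WiFi: pm open"):
    """Return the non-empty 'WiFi:' debug lines seen after the most recent marker.

    If WiFi was restarted during the capture, only messages following the
    last startup sequence are returned. Returns [] if the marker never appears.
    """
    after = []
    connected = False
    for line in lines:
        text = line.strip()
        if text.startswith(marker):
            connected = True
            after = []  # a new connection started; discard earlier messages
            continue
        if connected and text.startswith("WiFi:") and text != "WiFi:":
            after.append(text)
    return after
```

An empty result means no further debug messages were received after connection, which is the expected behaviour described in step 2.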
-
Hi David,
So I was able to reproduce it a few more times. Here are my observations:
- (obvious one) I didn't see any debug messages, including p->ref==1, while the DWC page was not open
- these p->ref==1 messages always (or most of the time) come in pairs
- there are no regular intervals between these messages; sometimes I see a few per second, sometimes I don't see any for 10-15 minutes
- they do not become more frequent before the disconnect
- I feel that the messages come much more often when the printer is printing (compared to idle). I suspect it may be related to timing, e.g. the CPU is busy with something and can't send some command or data to the WiFi module in time. Or maybe noise on the power line due to heater PWM. My power supply is able to provide enough current, so it's not under-voltage, but maybe some noise…
I'll post here if I get more data
-
Thanks. Please can you install this DuetWiFiServer.bin file twice (the second time you can just send M997 S1 to install it again from the existing SD card file). It will provide a slightly more detailed debug message.
-
Sure, will do. I wonder if there is anything of interest to you in the debug messages shown when I send M552 S0 after such a disconnect:
WiFi: state: 5 -> 0 (0)
WiFi: rm 0
WiFi: pm close 7
WiFi: tcp_pcb_purge: pcb->state == SYN_RCVD but tcp_listen_pcbs is NULL
Wifi module is idle
WiFi: tcp_pcb_purge: pcb->state == SYN_RCVD but tcp_listen_pcbs is NULL
WiFi: tcp_pcb_purge: pcb->state == SYN_RCVD but tcp_listen_pcbs is NULL
WiFi: del if0
WiFi: usl
WiFi: mode : null
Not sure if these pcb->state checks are ok
-
Yes, those are interesting too, in particular the tcp_pcb_purge messages.
-
I ran an additional M997 S1 and then reproduced the disconnect once again, but it didn't produce any additional messages in the log.