Wifi 2.1beta6 from 3.5.0-rc.2/3 still disconnecting
-
@droftarts Sure, the services are enabled; you would get a "connection refused" if they were not. And they work after the reboot. I wanted to indicate that the WiFi module seems to get stuck when this failed state happens, and the "turn it off and on again" drill did not bring it back to normal operation. It is better (pingable again) but not fully working. I have often seen that behaviour on Un*x or Linux systems when the OS did not have enough resources for the daemon behind the port: the TCP handshake worked, but the process behind it was unable to respond.
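The difference between "connection refused" and "handshake works but nothing answers" can be checked from a PC. The following is only a minimal sketch in Python, not a Duet tool: the function name `probe_tcp_service`, the default HTTP request and the result labels are all made up for illustration.

```python
import socket


def probe_tcp_service(host, port, request=b"GET / HTTP/1.0\r\n\r\n", timeout=3.0):
    """Classify a TCP service (hypothetical helper, labels are illustrative).

    Returns one of: 'refused', 'unreachable', 'silent', 'reset',
    'closed', 'responding'.
    """
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except ConnectionRefusedError:
        # Nothing is listening on the port, e.g. the service is disabled.
        return "refused"
    except OSError:
        # Connect timed out, no route to host, and so on.
        return "unreachable"

    with sock:
        sock.settimeout(timeout)
        try:
            sock.sendall(request)
            data = sock.recv(1)
        except socket.timeout:
            # The TCP handshake completed, but the process behind the
            # port never answered within the timeout.
            return "silent"
        except ConnectionResetError:
            return "reset"
        if not data:
            # The peer closed the connection without sending anything.
            return "closed"
        return "responding"
```

For example, `probe_tcp_service("192.168.1.50", 80)` (the address is a placeholder) would return "silent" for the stuck state described above and "refused" if HTTP were disabled.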
Here is the current config.g. (Kindly ignore the typos, I'm German)
; Hardware: Duet Mini 5+
; Toolboard 1.1 LC
; Stepper XY = LDO 0,9° 2Amax LDO-42STH40-2004MAC
; Stepper Z = LDO 1,8° 2Amax LDO-42STH48-2004AC
; Stepper E = LDO 1,8 1Amax LDO-42STH20-1004ASH

; Enable network
if {network.interfaces[0].type = "ethernet"}
  M552 P0.0.0.0 S1
else
  M552 S1

; Network
M586 P0 S1 ; enable HTTP
M586 P1 S1 ; enable FTP
M586 P2 S1 ; enable Telnet

G90 ; send absolute coordinates...
M83 ; ...but relative extruder moves
M550 P"v2" ; set printer name
M669 K1 ; 1=select CoreXY mode 0=Cadasian

;; Helpful Toolboards commands
; M115 B121 ; Show board 121
; M997 B121 ; Update tool 121
; M122 B121 ; Detailed status of toolboard

G4 S1 ; wait 1s for expansion boards to start

;;; Drives
;X
M569 P0.2 S1 D3 ; physical drive 0.2 goes forward
M584 X0.2 ; Map the stepper to X
;Y
M569 P0.1 S0 D3 ; physical drive 0.1 goes backward
M584 Y0.1 ; Map the stepper to Y
;; Z
; - front left
M569 P0.5 S1 D3 ; physical drive 0.5 goes forward
; - front right
M569 P0.6 S0 D3 ; physical drive 0.6 goes backward
; - back right
M569 P0.0 S1 D3 ; physical drive 0.0 goes forward
; - back left
M569 P0.4 S0 D3 ; physical drive 0.4 goes backward
M584 Z0.5:0.4:0.0:0.6 ; Mapping
;; E
M569 P121.0 S0 D3 ; Extruder stepper goes backward
M584 E121.0 ; Map the E stepper to E

; Stepper settings
M350 X16 Y16 Z16 E32 I1 ; configure microstepping with interpolation
M92 X160 Y160 Z400 E823 ; set steps per mm (800 from manuall, measured 823
M98 P"/macros/print_scripts/speed_printing.g" ; Accelerations and speed
M906 X1400 Y1400 Z1000 E700 I30 ; set motor currents (mA) and motor idle factor in per cent (E stepper max 1A)
M84 S120 ; Idle timeout

; Axis Limits
M208 X0 Y0 Z0 S1 ; set axis minima
M208 X250 Y258 Z210 S0 ; set axis maxima

;; Endstops -- Display status with: M119
M574 Y2 S1 P"0.io5.in" ; Y
M574 X2 S1 P"!0.io6.in" ; X
M574 Z0 P"nil" ; No endstop we have the switch and a probe
M574 Z1 S2 ; configure Z-probe endstop for low end on Z

; Z probe
M98 P"/macros/print_scripts/activate_z_probe.g"

; Z-level settings
;M671 X-75:-75:288:289 Y0:320:320:0 S20 ; Define Z belts locations (Front_Left, Back_Left, Back_Right, Front_Right)
;M671 X-75:-75:288:289 Y0:328:328:0 S20 ; Define Z belts locations (Front_Left, Back_Left, Back_Right, Front_Right)
M671 X-75:-75:288:289 Y0:358:358:0 S20 ; Define Z belts locations (Front_Left, Back_Left, Back_Right, Front_Right)

;; Define the mesh
;M557 X5:245 Y22:245 S35 ; spacing
;M557 X5:245 Y22:245 P9 ; grid (points per axis)
M557 X5:245 Y22:220 P9 ; grid (points per axis)

;; Heaters :: Tune with: M303 H0 S110
; Bed
M308 S0 P"0.temp0" Y"thermistor" A"Bed" T100000 B4138 ; configure sensor 0 as thermistor on pin temp0
M950 H0 C"out5+out6" T0 Q10 ; create bed heater outputs for both SSRs on out0 and map it to sensor 0
M307 H0 B0 S1.00 ; disable bang-bang mode for the bed heater and set PWM limit
M140 H0 ; map heated bed to heater 0
M143 H0 S120 ; set temperature limit for heater 0 to 120C

;; Bed Corner temp sensor (2=Orange, 3=Brown, 4=Green, 5=Yellow, 6=Purple 7=Black, )
; Configure Bed corner temp sensor as thermistor on pin temp2
M308 S5 P"0.temp2" Y"thermistor" A"Bed-Corner" T100000 B4138

; Hotend
; Tune in with: M303 H1 S270 (270=Temp) (M500 to save)
; Show current settings M307 H1
;M308 S1 P"121.temp0" Y"thermistor" A"Hotend" T500000 B4702 C1.171057e-7 ; configure sensor 1 as thermistor on pin temp1 Mosquito
;M308 S1 P"121.temp0" A"Hotend" Y"thermistor" T100000 B4725 C7.06e-8 ; define E0 temperature sensor Rapido Argo
M308 S1 P"121.temp0" A"Hotend" Y"thermistor" T100000 B4725 C7.060000e-8 ; define E0 temperature sensor e3d revo
M950 H1 C"121.out0" T1 ; create nozzle heater output on 0.out3 and map it to sensor 1
M143 H1 S300 ; set temperature limit for heater 1 to 300C

;; Fans
; Fan for the printed part:
M950 F0 C"121.out1" Q500 ; create fan 0 on pin 0.out9 and set its frequency
M106 P0 S0 H-1 C"Part" ; set fan 0 value. Thermostatic control is turned off
; Fan for the Hotend:
M950 F1 C"121.out2" Q500 ; create fan 1 on pin 0.out9 and set its frequency
M106 P1 S1 H1 T45 C"Hotend" ; P="set fan 1" S="value" H="Thermostatic control Heater No." T=" is turned on at 45°C"

;; Tool
M563 P0 S"Tool" D0 H1 F0 ; define tool
G10 P0 X0 Y0 Z0 ; set tool 0 axis offsets
G10 P0 R0 S0 ; set initial tool 0 active and standby temperatures to 0C

; Filament sensor : Status M591 D0
;M591 D0 P7 C"io4.in" L7 R50:150 E5 S0 ;pulse, disabled, 7 mm/pulse, measure every 22 sec, minimum 50 maximum 250, S1 = Enabled S0 = Disabled
;M591 D0 P1 C"io4.in" S1
M950 J3 C"!io4.in" ; Create a trigger on io4.in (NC)
M581 P3 T3 S0 R1 ; R1=Trigger only while printing

;; Chamber temp sensor
M308 S4 P"0.temp1" Y"thermistor" A"Chamber" T100000 B4138 ; configure Chamber temp sensor as thermistor on pin temp1

;; Input Shaping
; Accelerometer https://duet3d.dozuki.com/Wiki/Input_shaping
M955 P121.0 I05 ; specify orientation of accelerometer on Toolboard 1LC with CAN address 121
; Input Shaping
;M593 P"zvd" F40.5 ; use ZVD input shaping to cancel ringing at 40.5Hz
;M593 P"none" ; disable input shaping
;M593 P"custom" H0.4:0.7 T0.0135:0.0135 ; use custom input shaping

; PA https://duet3d.dozuki.com/Wiki/Pressure_advance
M572 D0 S0.025

;;;;;;;;;;;; Setup Only
;M564 S0 H0 ; Allow movement over the endstops
;M302 P1 ; allow cold extrusion
;M302 S1 ; deny cold extrusion
;;;;;;;;;;;; Setup Only END

;; Case Cooling
; Temps
M308 S9 P"mcu-temp" Y"mcu-temp" A"Mainboard" ; define sensor 9 to be mcu temperature
; Case Fans
M950 F3 C"!0.out3" Q50 ; Fan on out3 ground on top pin, plus on 3rd pin from top (V_OUTLC1)
M106 P3 C"Base" S120 ; Setup the FAN and slow it down
; Nevermore
m950 F4 C"0.out0" Q50
m106 P4 C"Nevermore" S0

; Define the LED stripe and turn it off
M950 F5 C"0.out1" Q100 ; LED on out1
M106 P5 C"LED" S0 ; Make sure that the LEDs are off

; Trigger on the toolboard
;#M950 J5 C"^121.button0"
;#M581 P4 T5 S0
;########################################
M950 J1 C"^0.io1.in"
M581 P1 T2 S0

;M572 D0 S0.037 ; Set preasure Advance Gemessen
M501 ; Load config-override.g

;; Serial interface
; Duet
M575 P1 S1 B57600
;;;;;; Old Display
;M575 P1 B115200 S1
;; Mini 12864
;M918 P2
;M918 P2 E4 R3 C100
;M150 X2 R255 U255 B255 S3 ; set all 3 LEDs to white
;M150 X2 R0 U255 B0 S3 ; set all 3 LEDs to red

T0 ; Select the tool 0 as default

; Make sure that all heaters are off
M104 S0 ; Extruder temp to 0
M568 P0 A0 ; Extruder heater off
M140 S0 ; Set the bed temp to 0
M140 S-276 ; Bed heater off

; Some variables for later
global tool_temp_initial=0
global bed_temp_initial=0
global debug=false

; AutoZ
global klicky_home=true
global qgl_done=false
global nozzle_cleaned=false
global Zswitch_homed=false
global probetype="euclid"
global clickystatus = "none"
global probe_settingsH=10
global probe_settingsA=1
global autoz_temp2=20

# Stealthburner LEDs:
global sb_leds="n-off"
M98 P"/macros/sb_leds/sb_leds.g"
set global.sb_leds="hot"
set global.sb_leds="n-off"
;set global.sb_logo="red"
;set global.sb_leds="n-off"
;global sb_nozzle="off"

; M307 H0 R0.327 C227.635:227.635 D5.48 S1.00 V24.4 B0 I0 ; R altered for a firmware bug
; EOF
-
@Chriss Hello, I might have a fix for this issue. But as you can guess, this issue seems to be intermittent and highly network dependent. So I'd like your help in order to verify it really works.
Are you able to set up two boards:
- One board has 2.1beta6 and 3.5rc3 from the release https://github.com/Duet3D/RepRapFirmware/releases/tag/3.5.0-rc.3
- The other board also has 3.5rc3 from the release, but has experimental wifi server firmware: https://drive.google.com/file/d/1NgssWNSS3xGL99hWwfgYbY5YVXF-f4jH/view?usp=drive_link.
Please verify the versions are correct for each board; for the first board it should say "2.1beta6" and on the second one it should be "2.1beta7".
The idea is simple - to run and use these boards normally and see whether the 2.1beta7 board has none of the disconnections you previously encountered, compared to the 2.1beta6 one.
Sorry by the way for the delay on this issue. I was sick for the last few days (from last week) and was only able to resume work yesterday.
-
Glad to hear that you are recovered from your sick leave. I was on vacation in the meantime so I was not "waiting" for a reply.
Please give me some days to test with the new WiFi firmware. Do you remember that I had another printer with RC3?
2: RC3 without problem, without PanelDue and not printing (VCast)
That one has been printing since yesterday and has shown the same problem since this morning. I think that I will upgrade this printer to beta7 first. I will use that printer more frequently in the next days; please let me know if you want me to stick with the other printer, which had the problem first.
I have to admit that I'm more than happy that both printers with beta6 have the same problem now; I was a bit concerned about the observation that one encountered the problem while the other one was fine.
Cheers, Chriss
-
@rechrtb I have no access to the file. I requested it a minute ago.
-
@Chriss Granted you access to the file. Tell me if you still have problems accessing it.
Regarding your question, I would advise to put beta7 on the board on which the issue seems to manifest most often.
I would also advise putting the two boards near each other if you can, so that they roughly get the same wifi signal strength, same wifi devices in proximity, etc. I recommend moving the beta6 board to the beta7 board location (again, because this might be a 'goldilocks' location w/ respect to the access point for the issue to manifest more frequently).
-
Cheers, I have the file. It seems to me that the printer I use is the one getting the error... Let me see... I am printing on both at the moment. I will wait till tomorrow and I hope that one of them will be in the failed state then. That will be the chosen one.
The printers stand next to each other and the AP is in the same room, about 5m away.
Do you want me to do the drill via the serial interface on the board with the new firmware too? Or do I only need to do it if the new firmware has the same problem? (And I hope that this is purely hypothetical because you found the problem and fixed it!)
Cheers, Chriss
-
Do you want me to do the drill via the serial interface on the board with the new firmware too?
For now, not yet. Only when beta7 also displays the same issues.
-
@rechrtb OK, cool for me... My apologies, it took me almost a day to get the printer back into the failed state. Just to make sure that the problem is still present after the very latest reboot of my WiFi infra.
So we have:
now. I will print for a while now, but I can not tell you: Yes it is working now.
Simply because the problem does not show up frequently. I had the impression that it happens at least once in 24h, but the very last issue came up after more than 30h. So when could we say "Yes, solved" then?
Do you want to tell us how you fixed it? Was it by doing a full reset of the WiFi module after a lost connection? Or was there a real issue with the board firmware? (I'm just curious because the disconnects are not new; the failure to recover was new.)
I will update the thread as soon as the problem is back or I will ping you on Friday or Monday when I have the feeling that the problem is gone for good.
Cheers, Chriss
-
@Chriss said in Wifi 2.1beta6 from 3.5.0-rc.2/3 still disconnecting:
Do you want to tell us how you fixed it? Was it by doing a full reset of the WiFi module after a lost connection? Or was there a real issue with the board firmware? (I'm just curious because the disconnects are not new; the failure to recover was new.)
From a conversation we had with @rechrtb
Ok, I think I may have found a fix to the issue. The reason I say 'may' is because as you might imagine, this issue seems to be very intermittent - I have only been able to reproduce it that one time last week.
But in trying to debug this problem, I inserted a bunch of debug printf's, after which I was able to get the same symptoms, namely:
- board seems to disconnect and never reconnect again - unless module is disabled and re-enabled
- module led is still on, but board is unpingable
- repeating "responsebusy" and "bad recv status size"
Ok, so the issue I found is that one of the tasks block indefinitely on https://github.com/Duet3D/WiFiSocketServerRTOS/blob/dev/src/Connection.cpp#L694.
Inserting the printf's must have slowed things down enough to simulate the connectionQueue being backed up. There is supposed to be a task that consumes events from this queue, but since this callback occurs on the lwip task - if that consumer task calls an lwip function, it might also lock up.
Increasing the queue size seems to have alleviated the issue. I have set the size to MaxConnections * 3, since there are three types of connection events that can be enqueued in connectionQueue: Accept, Close and Terminate.
That said, it is probably still needed to verify if this is the issue @Chriss encountered. Since they have multiple boards, I'll probably advise them to load this firmware onto one of the boards, while the other retains the current firmware - to see if the 'fixed' version has reduced occurrences of the disconnects.
Though long term, I'll probably think of potentially better ways to refactor this part of the code.
The relevant fix is here: https://github.com/Duet3D/WiFiSocketServerRTOS/commit/0f8bdc18f2968ee357cdb09d1319590abb7cdd08
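The queue sizing argument can be illustrated with a small simulation. This is only a sketch in Python, not the firmware's FreeRTOS code: `MAX_CONNECTIONS = 8` and the function name are assumptions for illustration, and it assumes the worst case where the consumer task drains nothing and every connection has one event of each type pending.

```python
import queue

MAX_CONNECTIONS = 8  # hypothetical value, not taken from the firmware
EVENT_TYPES = ("accept", "close", "terminate")


def count_blocked_posts(capacity, max_connections=MAX_CONNECTIONS):
    """Count posts that would block on a bounded queue of the given capacity.

    Worst-case assumption: the consumer is stalled, and each connection
    posts one event of every type.
    """
    q = queue.Queue(maxsize=capacity)
    blocked = 0
    for conn in range(max_connections):
        for event in EVENT_TYPES:
            try:
                # put_nowait raises queue.Full where a blocking post
                # would wait indefinitely for free space.
                q.put_nowait((conn, event))
            except queue.Full:
                blocked += 1
    return blocked


print(count_blocked_posts(MAX_CONNECTIONS))      # queue sized to MaxConnections
print(count_blocked_posts(MAX_CONNECTIONS * 3))  # queue sized to MaxConnections * 3
```

Under these assumptions a queue of MaxConnections entries overflows during the burst, while MaxConnections * 3 holds every pending event, so the posting side never has to wait.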
Ian
-
@Chriss Hello, as in @droftarts' response, I might have managed to re-create the instance in which the WiFi module firmware locks up - which is consistent with the symptoms you described. With the fix, I wasn't able to recreate the lock-up 'artificially' anymore. Now we wait to confirm that the lock-up does not recur 'naturally' either.
As it is a very intermittent issue, it's hard to say when we can call the issue fixed. It has to be a long-term test, but if the beta6 board encounters the issue and the beta7 board still does not, I think that is a good sign.
-
Thanks for the information, I appreciate that very much.
I was very busy during the last days, so I skipped telling you "Yes, it still works" every day.
What I can tell so far is that the error is gone. I have been printing a lot with the beta7 board since the upgrade (many prints under 30 minutes) and the WiFi connection was very stable. I would vote with a "thumbs up" and say that the problem is gone.
I work with my beta6 board a lot at the moment (IDEX setup is a pain) and I saw the problem there twice since yesterday evening.
I guess you guys want to close the case now and release beta7 officially. Thank you very much for your good support, I felt very comfy during the process. And I'm more than happy that it was not a stupid wrong config on my side this time.
Cheers, Chriss
-
@Chriss I'm not quite sure why this thread is in the STM category. I'll move it to the beta firmware category.
Ian
-
@droftarts Hahaha... I started it in the Beta, somebody moved it to here.
-
@droftarts Maybe my bad then. I only remember that one of my threads was moved; maybe it was another one.
Thank you very much! Glad that we found it, that it is stable now, and that all of us can concentrate on other things.
-
@rechrtb Could you give me the 2.1 beta7, please? I have terrible issues with my connection. It is so bad that I have to restart the printer every time.
-
@jensus11 here's a copy. DuetWiFiServer_beta7.bin
-
Thanks, for the first time it looks really better.