web died after successful print



  • DWC 3.1.1, RRF 3.1.1, Duet 2 Ethernet

    The printer was idle (powered on but not printing) for a few days.

    I started a rather short print.

    Before the print ended, I changed the bed temperature to 0 from DWC.

    The print ended, I removed the part and went to install it (~5 min). When I returned to the computer, DWC was trying to reconnect; I did a refresh and RRF was inaccessible.

    I attached USB and ran M122:

    === Diagnostics ===
    RepRapFirmware for Duet 2 WiFi/Ethernet version 3.1.1 running on Duet Ethernet 1.02 or later
    Used output buffers: 1 of 24 (20 max)
    === RTOS ===
    Static ram: 27980
    Dynamic ram: 93984 of which 208 recycled
    Exception stack ram used: 552
    Never used ram: 8348
    Tasks: NETWORK(ready,268) HEAT(blocked,1224) MAIN(running,1848) IDLE(ready,80)
    Owned mutexes:
    === Platform ===
    Last reset 131:48:48 ago, cause: power up
    Last software reset at 2020-06-28 02:03, reason: User, spinning module GCodes, available RAM 8604 bytes (slot 0)
    Software reset code 0x0003 HFSR 0x00000000 CFSR 0x00000000 ICSR 0x0441f000 BFAR 0xe000ed38 SP 0xffffffff Task MAIN
    Error status: 10
    MCU temperature: min 22.8, current 33.4, max 39.1
    Supply voltage: min 23.4, current 24.1, max 24.3, under voltage events: 0, over voltage events: 0, power good: yes
    Driver 0: standstill, SG min/max 0/379
    Driver 1: standstill, SG min/max 0/422
    Driver 2: standstill, SG min/max 0/432
    Driver 3: standstill, SG min/max 0/1023
    Driver 4: standstill, SG min/max not available
    Date/time: 2020-07-04 10:37:54
    Cache data hit count 4294967295
    Slowest loop: 217.35ms; fastest: 0.13ms
    I2C nak errors 0, send timeouts 0, receive timeouts 0, finishTimeouts 0, resets 0
    === Storage ===
    Free file entries: 10
    SD card 0 detected, interface speed: 20.0MBytes/sec
    SD card longest read time 6.5ms, write time 630.2ms, max retries 0
    === Move ===
    Hiccups: 0(0), FreeDm: 169, MinFreeDm: 121, MaxWait: 437300462ms
    Bed compensation in use: mesh, comp offset 0.000
    === MainDDARing ===
    Scheduled moves: 18152, completed moves: 18152, StepErrors: 0, LaErrors: 0, Underruns: 0, 0  CDDA state: -1
    === AuxDDARing ===
    Scheduled moves: 0, completed moves: 0, StepErrors: 0, LaErrors: 0, Underruns: 0, 0  CDDA state: -1
    === Heat ===
    Bed heaters = 0 -1 -1 -1, chamberHeaters = -1 -1 -1 -1
    Heater 0 is on, I-accum = 0.0
    Heater 1 is on, I-accum = 0.8
    === GCodes ===
    Segments left: 0
    Movement lock held by null
    HTTP is idle in state(s) 0
    Telnet is idle in state(s) 0
    File is idle in state(s) 0
    USB is ready with "M122" in state(s) 0
    Aux is idle in state(s) 0
    Trigger is idle in state(s) 0
    Queue is idle in state(s) 0
    Daemon is idle in state(s) 0
    Autopause is idle in state(s) 0
    Code queue is empty.
    === Network ===
    Slowest loop: 631.82ms; fastest: 0.02ms
    Responder states: HTTP(0) HTTP(0) HTTP(0) HTTP(0) FTP(0) Telnet(0), 0 sessions
    HTTP sessions: 0 of 8
    Interface state active, link 100Mbps full duplex
    === Filament sensors ===
    Extruder 0 sensor: ok
    ok
    
    

    the gcode that printed
    Body1.gcode



  • @arhi please tell us what is in print_start.g and print_stop.g; the two macros are called in the gcode file.



  • print_start

    ; no volumetric extrusion
    M200 D0
    
    ; relative extruder distance
    M83
    
    ; do not home if already homed
    if !move.axes[0].homed || !move.axes[1].homed || !move.axes[2].homed
      G28
    
    ; do the wipe 
    M98 P"wipe.g"
    G92 E0
    G1 E30 F200
    
    ; use MESH compensation
    G29S1
    
    ; all currents to 100%
    M913 X100 Y100 Z100 
    
    ; load default jerk, speed and acceleration values
    M98 P"cfg_jerkspeedaccel.g"
    
    ; reset baby steps
    M290 R0 S0
    
    ; reset speed overrides
    M221 S100
    M220 S100 
    
    ; Dynamic Acceleration Adjustment
    ; M593 Fxxx
    
    ; Pressure Advance
    ; M572 D0 S0.042
    M572 D0 S0.03
    
    ;M118P0S"MESSAGE: PRINT STARTE
    

    print_stop

    M104 S0 ; turn off extruder
    M140 S0 ; turn off bed
    M106 S0 ; turn off fan
    
    M913 X20 Y20 Z25
    G91
    G0Z10
    G90
    M913 X100 Y100 Z100
    
    M98 P"wipe.g"
    
    ;M118P0S"MESSAGE: PRINT FINISHED"
    


  • @arhi my idea was that something in the macro files is interrupting the connection. Besides G29S1 and G0Z10 I see nothing strange, but they call two other macros, which you should check as well.

    Does your internet router have log entries? Did the router disconnect from the internet, change IP addresses, trigger intruder detection, do a firmware update with a reboot, or anything else that would disconnect your Duet?
    (AP removed, not Wifi board)



  • @JoergS5 said in web died after successful print:

    @arhi my idea was that something in the macro files is interrupting the connection. Besides G29S1 and G0Z10 I see nothing strange, but they call two other macros, which you should check as well.

    Note that this setup has been working for a while now and many prints have finished OK. This is the third time this has happened; both previous times the printer had been idle for a few days and the web interface died after the print (so after the first print following a long idle time). After a reboot I made tens of prints without a problem. It looks like the problem only occurs when the printer has been on and idle for 3+ days and then I start a print.

    Does your internet router have log entries? Did the router disconnect from the internet, change IP addresses, trigger intruder detection, do a firmware update with a reboot, or anything else that would disconnect your Duet?

    Yes, I have logs; no, it did not ask for an IP again, so I assume the network stack did not restart. The printer does not see the internet: it is assigned a static IP by the DHCP server, which puts it on the LAN with no internet access. I don't trust my IoT devices to see the WAN, only the LAN. Also, the Duet is not "connected" to the router; it is connected to a managed switch, which is connected to another managed switch, which is connected to the router, so a reboot of the router would not be seen by the printer in any way.

    If you run Duet in Access Point mode

    it is duet2ETHERNET 🙂

    the only devices in my house on wifi are phones and tablets and those are on a separate untrusted network and those devices have to connect to VPN to be able to see anything



  • @arhi said in web died after successful print:

    it is duet2ETHERNET

    ok, then no WiFi problem 😉

    The reset reason "spinning module" has sometimes turned out to be a problem with the SD card.

    One possibility is to set a higher debug level for analyzing.



  • @JoergS5 said in web died after successful print:

    One possibility is to set a higher debug level for analyzing, but I have not done it yet, so I cannot help how to do it in detail. One can set debug levels for specific modules.

    Adding debug levels is not a problem; the problem is that I can't reproduce this easily. The last two times it happened I added debug + USB but could not reproduce the problem.



  • @arhi you could try a keep alive program (etc....)

    (wrong thought Maybe the httpsessions were full (8 of 8 used) removed)
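
    For reference, a keep-alive check of the kind suggested here could look like the minimal sketch below. It is only an illustration: the function names `http_ok` and `watch` are made up for this example, and the URL in the usage comment assumes the printer answers on the `rr_status` HTTP request at the hostname mentioned elsewhere in this thread.

```python
import time
import urllib.request


def http_ok(url, timeout=5.0):
    """Return True if url answers with HTTP 200 within timeout seconds."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # URLError, socket timeouts and refused connections are OSError subclasses.
        return False


def watch(check, attempts, interval_s, sleep=time.sleep, clock=time.strftime):
    """Call check() `attempts` times and return a list of (timestamp, ok) tuples."""
    results = []
    for i in range(attempts):
        results.append((clock("%Y-%m-%d %H:%M:%S"), check()))
        if i < attempts - 1:
            sleep(interval_s)  # wait between probes, but not after the last one
    return results


# Hypothetical usage (assumed URL, adjust to your own printer):
# url = "http://ender5.local.lan/rr_status?type=1"
# for stamp, ok in watch(lambda: http_ok(url), attempts=3, interval_s=10):
#     print(stamp, "up" if ok else "DOWN")
```

    Logging a timestamp with every probe makes it possible to see afterwards when the web interface stopped answering relative to the end of a print.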



  • @arhi what is interesting is the data cache hit count. The reported value, 4294967295, is exactly the maximum of a 32-bit unsigned integer, so the counter may have reached its upper limit.
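
    A quick check of that number, as a minimal sketch (the variable names are just for illustration):

```python
# Value copied from the M122 line "Cache data hit count 4294967295".
cache_hit_count = 4294967295

# Largest value an unsigned 32-bit integer can hold.
UINT32_MAX = 2**32 - 1

is_saturated = cache_hit_count == UINT32_MAX
print(is_saturated, hex(cache_hit_count))  # True 0xffffffff
```

    This only shows that the counter sits at the 32-bit maximum; it does not say whether that has anything to do with the web interface dying.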



  • @JoergS5 said in web died after successful print:

    @arhi you could try a keep alive program to check whether it is a disconnect problem after long inactivity.

    It is not. The long inactivity was before the print, web opened ok and during the print there was activity and it stopped working immediately after the print finished. So no "timeout" to speak of.

    You could check M122 before crash, maybe the httpsessions were full (8 of 8 used).

    If I can reproduce the problem on demand it would make sense but I can't. All three times it happened the printer was not in use for 3-4 days.

    Where do you see 8 out of 8 ?!?!

    Responder states: HTTP(0) HTTP(0) HTTP(0) HTTP(0) FTP(0) Telnet(0), 0 sessions
    HTTP sessions: 0 of 8
    

    ??

    Do you use additional programs to access the Duet, like monitoring programs, which may use connections?

    No program other than web accessed duet in the times the problem happened (normally I do, but the rpi4 that hosts those programs is offline these days, so no)



  • @arhi said in web died after successful print:

    Where do you see 8 out of 8 ?!?!

    Please ignore, this was wrong.

    One thing to check is wipe.g



  • @arhi said in web died after successful print:

    [...]
    No program other than web accessed duet in the times the problem happened [...]

    This got me thinking. What if there is some ping or other mechanism sent to the Duet automatically by some system on your network?

    I wonder if mDNS is playing a role. It was disabled on legacy Duets for a similar reason.

    From the RRF2 Whats_new doc:

    [...]

    [...]



  • @JoergS5 the wipe has nothing to do with it but here it is

    if !move.axes[0].homed || !move.axes[1].homed 
      echo "X and Y axes not homed, aborting the wipe"
      M99
    
    if state.currentTool < 0
      echo "No tool loaded, aborting the wipe"
      M99
    
    if heat.heaters[tools[state.currentTool].heaters[0]].current < 200
      echo "Extruder too cold, no point wiping, aborting the wipe"
      M99
    
    ; Drop all motor currents down
    M400
    M913 X30 Y30 Z25
    
    M83 
    
    ; -135,       -116,             115,   125,
    ;     ,104....++++++++++++++++++++++....
    ;         ....++++++++++++++++++++++....
    ;         ....++++++++++++++++++++++....
    ;         ....++++++++++++++++++++++....
    ;     ,63 ..|.++++++++++++++++++++++....
    ;         ..|.++++++++++++++++++++++....
    ;         ..|.++++++++++++++++++++++....
    ;     ,30 ..|.++++++++++++++++++++++....
    ;         ....++++++++++++++++++++++....
    ;         ....++++++++++++++++++++++....
    ;         ....++++++++++++++++++++++....
    ;    ,-112....++++++++++++++++++++++....
    ;         ..............................
    ;         ..............................
    ;    ,-121..............................
    
    
    G0 X-115 Y65  F9000
    
    while true
      G0 X-135 Y{65 - iterations * 3} F7000 
      G0 X-115 Y{65 - iterations * 4} F5000
      if iterations == 6
        G1 E-3 F3000
      if iterations == 8
        break
    
    M98 P"park.g"
    
    ; Return all motor currents to 100%
    M400
    M913 X100 Y100 Z100
    

    and the park is

    ; drop motor currents
    M913 X30 Y30 Z25
    ; go to park position
    
    G0 X-135 Y40 
    
    ; restore motor currents
    M913 X100 Y100 Z100
    

    as I said, the code itself works ok.. and after everything is finished paneldue and usb work ok, just web is dead



  • @bot said in web died after successful print:

    @arhi said in web died after successful print:

    [...]
    No program other than web accessed duet in the times the problem happened [...]

    This got me thinking. What if there is some ping or other mechanism sent to the Duet automatically by some system on your network?

    But the web interface itself is pinging it non-stop (fetching the model JSON, displaying temperatures, etc.).

    I wonder if mDNS is playing a role. It was disabled on legacy Duets for a similar reason.

    Dunno, I don't use mDNS because I had issues with it on Windows. My router is configured to add a local DNS entry under .local.lan or .local.wifi for every DHCP lease, so I use ender5.local.lan to access the printer: local DNS only, no mDNS at all.

    Yeah, but there is no reboot here; the Duet runs OK. This time the web interface died after the print but USB and PanelDue worked OK. Last time it died in the middle of the print, but the print finished OK and PanelDue was OK; I just could not connect via web. Later I ran with debug enabled for a few modules but could not reproduce the problem. Today it happened again. The only similarity is that the printer had been idle for a few days (not printing, though I did access it via web to check some config details, and the tab with DWC was open most of the time).



  • @arhi Error status 10 could mean that you have a recursion somewhere.

    Error status 10 is the sum of 0x02 and 0x08 according to
    https://duet3d.dozuki.com/Wiki/Error_codes_and_software_reset_codes
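
    The decomposition can be checked with a few lines, as a minimal sketch. It assumes, as this post does, that the 10 printed by M122 is a decimal number; the helper name `set_bits` is made up for this example, and the meaning of each bit has to be looked up on the linked page.

```python
def set_bits(value):
    """Return the individual bit masks that are set in value, lowest first."""
    return [1 << i for i in range(value.bit_length()) if value & (1 << i)]


error_status = 10  # from the M122 line "Error status: 10", read as decimal
masks = set_bits(error_status)
print([hex(m) for m in masks])  # ['0x2', '0x8']
```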



  • @JoergS5 said in web died after successful print:

    @arhi Could be that you have a recursion somewhere

    In that case, it would die much faster; but no, no recursion here. I do call the same file from multiple places, but never recursively.



  • (deleted, not relevant)



  • @JoergS5 said in web died after successful print:

    @arhi but some overflow is the reset reason of your last reset. Maybe the cache overflow above, or another overflow. For overflow reasons, I only found the recursion as a cause, but maybe you're the first with your overflow reason.

    Well, the board did not reset! As I wrote, only the HTTP module or NET module died (it did not restart, it stayed dead); everything else continued to work OK as if nothing had happened: it continued printing, parsing G-code...



  • @arhi The M122 reset reason is the reason of the reset 131 hours ago. The error 10 could be related to this access error.



  • @JoergS5 yup, that M122 happened before the ~130 hours of inactivity
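
    The timing can be worked out from the M122 output above, as a minimal sketch. It assumes that "Date/time" and "Last software reset at" come from the same clock, and the variable names are just for illustration.

```python
from datetime import datetime, timedelta

# "Date/time: 2020-07-04 10:37:54" from the M122 report.
report_time = datetime(2020, 7, 4, 10, 37, 54)

# "Last reset 131:48:48 ago, cause: power up".
uptime = timedelta(hours=131, minutes=48, seconds=48)

# "Last software reset at 2020-06-28 02:03".
software_reset = datetime(2020, 6, 28, 2, 3)

power_up = report_time - uptime
print(power_up)                   # 2020-06-28 22:49:06
print(software_reset < power_up)  # True
```

    Under that assumption the recorded software reset predates the last power-up, so it belongs to an earlier session and not to the moment the web interface died.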





  • here it comes again, ~12 hours of inactivity, dead 5 minutes after the print finished ?!?!

    ce47ea7e-a79b-40ff-a3ed-61c85795abcd-image.png



  • @arhi what does M122 show, please?



  • @JoergS5 I didn't run M122; I just rebooted it, turned on debugging for net, web and a few other modules, turned on logging on the serial port, and let it run so it catches a log if this happens again. Even dc42 did not find anything useful in the M122 last time, so I'm collecting debug data now.



  • @arhi ok, hope you find something. Good luck!

