3HC board disappearing from operation and diag



  • I tried to search the forums here, but couldn't find anything similar.

    I've been having an issue with my Duet3 with a 3HC board.

    I have two things connected to the daughter board now - the extruder motor and the part cooling fan. Everything else is on the main board.

    The problem is that at some point during a print, the 3HC board stops communicating and any commands sent to it are lost. It's strange to me, since one print ran perfectly fine, but a second one (same gcode), started almost immediately afterwards, died mid-print.

    When I send M122 B1 to check whether the board is there, I get a timeout.
    If I reboot the printer, M122 B1 shows details about the board and the firmware (it has to be a full power cycle; an emergency stop doesn't fix the issue).
    I tried changing the CAN address, with the same result.

    Has anyone else seen this? I swapped the CAN cable (for an off-the-shelf ADSL2+ 4-wire cable), so the cable is not the issue.

    Clean diagnostics for both boards are below. Further down is the main board's diagnostics only, taken after an emergency stop when I noticed the problem.

    m122 B0
    === Diagnostics ===
    RepRapFirmware for Duet 3 MB6HC version 3.1.1 running on Duet 3 MB6HC v0.6 or 1.0 (standalone mode)
    Board ID: 08DJM-956L2-G43S4-6JKDA-3SJ6T-1B6GH
    Used output buffers: 1 of 40 (13 max)
    === RTOS ===
    Static ram: 154604
    Dynamic ram: 163228 of which 44 recycled
    Exception stack ram used: 304
    Never used ram: 75036
    Tasks: NETWORK(ready,356) ETHERNET(blocked,436) SENSORS(blocked,200) HEAT(blocked,1188) CanReceiv(suspended,3424) CanSender(suspended,1488) CanClock(blocked,1436) TMC(blocked,204) MAIN(running,4456) IDLE(ready,76)
    Owned mutexes:
    === Platform ===
    Last reset 00:04:11 ago, cause: power up
    Last software reset at 2020-07-18 23:55, reason: User, spinning module GCodes, available RAM 74516 bytes (slot 0)
    Software reset code 0x0003 HFSR 0x00000000 CFSR 0x00000000 ICSR 0x0444a000 BFAR 0x00000000 SP 0xffffffff Task MAIN
    Error status: 0
    MCU temperature: min 48.0, current 50.8, max 52.2
    Supply voltage: min 24.0, current 24.0, max 24.1, under voltage events: 0, over voltage events: 0, power good: yes
    12V rail voltage: min 12.1, current 12.2, max 12.2, under voltage events: 0
    Driver 0: standstill, reads 56868, writes 14 timeouts 0, SG min/max 0/0
    Driver 1: standstill, reads 56868, writes 14 timeouts 0, SG min/max 0/0
    Driver 2: standstill, reads 56869, writes 14 timeouts 0, SG min/max 0/0
    Driver 3: standstill, reads 56870, writes 14 timeouts 0, SG min/max 0/0
    Driver 4: standstill, reads 56870, writes 14 timeouts 0, SG min/max 0/0
    Driver 5: standstill, reads 56871, writes 14 timeouts 0, SG min/max 0/0
    Date/time: 2020-07-19 00:43:42
    Slowest loop: 130.47ms; fastest: 0.21ms
    === Storage ===
    Free file entries: 10
    SD card 0 detected, interface speed: 25.0MBytes/sec
    SD card longest read time 3.2ms, write time 1.7ms, max retries 0
    === Move ===
    Hiccups: 0(0), FreeDm: 375, MinFreeDm: 375, MaxWait: 0ms
    Bed compensation in use: none, comp offset 0.000
    === MainDDARing ===
    Scheduled moves: 0, completed moves: 0, StepErrors: 0, LaErrors: 0, Underruns: 0, 0  CDDA state: -1
    === AuxDDARing ===
    Scheduled moves: 0, completed moves: 0, StepErrors: 0, LaErrors: 0, Underruns: 0, 0  CDDA state: -1
    === Heat ===
    Bed heaters = 0 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1, chamberHeaters = -1 -1 -1 -1
    === GCodes ===
    Segments left: 0
    Movement lock held by null
    HTTP is ready with "m122 B0" in state(s) 0
    Telnet is idle in state(s) 0
    File is idle in state(s) 0
    USB is idle in state(s) 0
    Aux is idle in state(s) 0
    Trigger is idle in state(s) 0
    Queue is idle in state(s) 0
    LCD is idle in state(s) 0
    SBC is idle in state(s) 0
    Daemon is idle in state(s) 0
    Aux2 is idle in state(s) 0
    Autopause is idle in state(s) 0
    Code queue is empty.
    === Network ===
    Slowest loop: 29.46ms; fastest: 0.03ms
    Responder states: HTTP(0) HTTP(0) HTTP(0) HTTP(0) HTTP(0) HTTP(0) FTP(0) Telnet(0), 0 sessions Telnet(0), 0 sessions
    HTTP sessions: 1 of 8
    - Ethernet -
    State: active
    Error counts: 0 0 0 0 0
    Socket states: 2 5 2 2 2 0 0 0
    === CAN ===
    Messages sent 1016, longest wait 120ms for type 6024
    === Linux interface ===
    State: 0, failed transfers: 0
    Last transfer: 251292ms ago
    RX/TX seq numbers: 0/1
    SPI underruns 0, overruns 0
    Number of disconnects: 0
    Buffer RX/TX: 0/0-0
    
    M122 B2
    Diagnostics for board 2:
    Board EXP3HC firmware 3.1.0 (2020-05-15b1)
    Never used RAM 163.4Kb, max stack 296b
    HEAT 1092 CanAsync 1452 CanRecv 1420 TMC 156 AIN 524 MAIN 2208
    Last reset 00:00:29 ago, cause: power up
    Driver 0: standstill, reads 2245, writes 14 timeouts 0, SG min/max 0/0
    Driver 1: standstill, reads 2250, writes 11 timeouts 0, SG min/max 0/0
    Driver 2: standstill, reads 2252, writes 11 timeouts 0, SG min/max 0/0
    Moves scheduled 0, completed 0, hiccups 0
    VIN: 24.1V, V12: 12.1V
    MCU temperature: min 45.7C, current 45.9C, max 45.9C
    Ticks since heat task active 40, ADC conversions started 29534, completed 29533, timed out 0
    Last sensors broadcast 00000000 found 0 43 ticks ago
    Free CAN buffers: 36
    NVM user row de9a9239 aeecffb1 ffffffff ffffffff
    

    Main board diag after noticing 3HC is not responding.

    m122 B0
    === Diagnostics ===
    RepRapFirmware for Duet 3 MB6HC version 3.1.1 running on Duet 3 MB6HC v0.6 or 1.0 (standalone mode)
    Board ID: 08DJM-956L2-G43S4-6JKDA-3SJ6T-1B6GH
    Used output buffers: 1 of 40 (18 max)
    === RTOS ===
    Static ram: 154604
    Dynamic ram: 163016 of which 88 recycled
    Exception stack ram used: 272
    Never used ram: 75236
    Tasks: NETWORK(ready,244) ETHERNET(blocked,436) SENSORS(blocked,200) HEAT(blocked,1188) CanReceiv(suspended,3820) CanSender(suspended,1488) CanClock(blocked,1424) TMC(blocked,192) MAIN(running,4528) IDLE(ready,76)
    Owned mutexes:
    === Platform ===
    Last reset 00:07:03 ago, cause: software
    Last software reset at 2020-07-18 23:55, reason: User, spinning module GCodes, available RAM 74516 bytes (slot 0)
    Software reset code 0x0003 HFSR 0x00000000 CFSR 0x00000000 ICSR 0x0444a000 BFAR 0x00000000 SP 0xffffffff Task MAIN
    Error status: 0
    MCU temperature: min 53.2, current 53.3, max 55.5
    Supply voltage: min 24.0, current 24.0, max 24.1, under voltage events: 0, over voltage events: 0, power good: yes
    12V rail voltage: min 12.1, current 12.2, max 12.2, under voltage events: 0
    Driver 0: standstill, reads 53263, writes 14 timeouts 0, SG min/max 0/0
    Driver 1: standstill, reads 53263, writes 14 timeouts 0, SG min/max 0/0
    Driver 2: standstill, reads 53264, writes 14 timeouts 0, SG min/max 0/0
    Driver 3: standstill, reads 53265, writes 14 timeouts 0, SG min/max 0/0
    Driver 4: standstill, reads 53265, writes 14 timeouts 0, SG min/max 0/0
    Driver 5: standstill, reads 53266, writes 14 timeouts 0, SG min/max 0/0
    Date/time: 2020-07-19 00:35:07
    Slowest loop: 1050.57ms; fastest: 0.20ms
    === Storage ===
    Free file entries: 10
    SD card 0 detected, interface speed: 25.0MBytes/sec
    SD card longest read time 3.7ms, write time 12.1ms, max retries 0
    === Move ===
    Hiccups: 0(0), FreeDm: 375, MinFreeDm: 375, MaxWait: 0ms
    Bed compensation in use: none, comp offset 0.000
    === MainDDARing ===
    Scheduled moves: 0, completed moves: 0, StepErrors: 0, LaErrors: 0, Underruns: 0, 0  CDDA state: -1
    === AuxDDARing ===
    Scheduled moves: 0, completed moves: 0, StepErrors: 0, LaErrors: 0, Underruns: 0, 0  CDDA state: -1
    === Heat ===
    Bed heaters = 0 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1, chamberHeaters = -1 -1 -1 -1
    === GCodes ===
    Segments left: 0
    Movement lock held by null
    HTTP is ready with "m122 B0" in state(s) 0
    Telnet is idle in state(s) 0
    File is idle in state(s) 0
    USB is idle in state(s) 0
    Aux is idle in state(s) 0
    Trigger is idle in state(s) 0
    Queue is idle in state(s) 0
    LCD is idle in state(s) 0
    SBC is idle in state(s) 0
    Daemon is idle in state(s) 0
    Aux2 is idle in state(s) 0
    Autopause is idle in state(s) 0
    Code queue is empty.
    === Network ===
    Slowest loop: 115.99ms; fastest: 0.03ms
    Responder states: HTTP(0) HTTP(0) HTTP(0) HTTP(0) HTTP(0) HTTP(0) FTP(0) Telnet(0), 0 sessions Telnet(0), 0 sessions
    HTTP sessions: 1 of 8
    - Ethernet -
    State: active
    Error counts: 0 0 0 0 0
    Socket states: 5 2 2 2 2 0 0 0
    === CAN ===
    Messages sent 1693, longest wait 0ms for type 0
    === Linux interface ===
    State: 0, failed transfers: 0
    Last transfer: 423605ms ago
    RX/TX seq numbers: 0/1
    SPI underruns 0, overruns 0
    Number of disconnects: 0
    Buffer RX/TX: 0/0-0
    

    Response when I tried to diag the 3HC board:

    m122 B2
    Error: M122: Response timeout: CAN addr 2, req type 6024, RID=13
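
When watching for this failure from a script, the timeout reply above can be recognised by its fixed wording. This is only a minimal sketch, assuming the console replies are captured as plain text; the function name `parse_can_timeout` and the returned dictionary keys are hypothetical and not part of RepRapFirmware.

```python
import re

# Pattern modelled on the exact error line quoted in this thread.
TIMEOUT_RE = re.compile(
    r"Error: M122: Response timeout: CAN addr (\d+), req type (\d+), RID=(\d+)"
)

def parse_can_timeout(line):
    """Return the fields of a CAN response-timeout line, or None if the
    line is not a timeout error (hypothetical helper for illustration)."""
    match = TIMEOUT_RE.search(line)
    if match is None:
        return None
    can_addr, req_type, rid = (int(group) for group in match.groups())
    return {"can_addr": can_addr, "req_type": req_type, "rid": rid}

if __name__ == "__main__":
    reply = "Error: M122: Response timeout: CAN addr 2, req type 6024, RID=13"
    print(parse_can_timeout(reply))
```

A normal diagnostics line such as the `Board EXP3HC firmware ...` line yields `None`, so the same check can be run over every reply.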
    


  • Weird stuff. I found a beta posted in another thread here.
    Seems to have helped 😮 Full day of printing and no communication interruptions thus far.

    @dc42 said in Hiccups on 3HC expansion board:

    I've put a binary of the new EXP3HC firmware at https://www.dropbox.com/s/7rqp6pul9ip3yam/Duet3Firmware_EXP3HC.bin?dl=0. Please monitor it carefully when using it, it's had only a little testing.


  •

    @pkos said in 3HC board disappearing from operation and diag:

    Weird stuff. I found a beta posted in another thread here.
    Seems to have helped 😮 Full day of printing and no communication interruptions thus far.

    @dc42 said in Hiccups on 3HC expansion board:

    I've put a binary of the new EXP3HC firmware at https://www.dropbox.com/s/7rqp6pul9ip3yam/Duet3Firmware_EXP3HC.bin?dl=0. Please monitor it carefully when using it, it's had only a little testing.

    Thanks, let me know if it continues to be reliable. I found that in the original firmware we were running one of the internal PLLs out of specification, which could perhaps lead to issues when the chip gets hotter. The beta firmware fixes that.



  • Will do.

    But seeing you speak about temps... that might actually be it too.
    I moved the printer to a different spot today, where it's not that hot.
    MCU temps went down by over 10C from what I see in DWC (at least for the main board). The 3HC seems to stay at around the same temperature all the time, around 46C, while the main board is now around 35C.
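
To compare board temperatures over time rather than reading them off DWC, the `MCU temperature` line can be pulled out of saved M122 output. The following is a rough sketch, assuming the two line formats shown in the diagnostics earlier in this thread (the main board prints bare numbers, the 3HC appends a `C`); the helper name `parse_mcu_temperature` is made up for illustration.

```python
import re

# Matches both formats seen in this thread:
#   "MCU temperature: min 48.0, current 50.8, max 52.2"      (main board)
#   "MCU temperature: min 45.7C, current 45.9C, max 45.9C"   (3HC)
MCU_TEMP_RE = re.compile(
    r"MCU temperature: min ([\d.]+)C?, current ([\d.]+)C?, max ([\d.]+)C?"
)

def parse_mcu_temperature(diag_text):
    """Return (min, current, max) in degrees C from M122 output text,
    or None if no MCU temperature line is present."""
    match = MCU_TEMP_RE.search(diag_text)
    if match is None:
        return None
    return tuple(float(value) for value in match.groups())

if __name__ == "__main__":
    main_board = "MCU temperature: min 48.0, current 50.8, max 52.2"
    expansion = "MCU temperature: min 45.7C, current 45.9C, max 45.9C"
    print(parse_mcu_temperature(main_board))
    print(parse_mcu_temperature(expansion))
```

Logging these values alongside any timeout errors would show whether dropouts line up with higher MCU temperatures.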



  • So far so good. Printer has been printing happily over the past couple days.
    Today, the board got a bit hotter (about 52C on main board), but no interruptions observed. It kept printing.

    The one thing I would love right now is to be able to have fans hooked up to the expansion board react to a temp sensor on the main board, but I understand that will come in the future.

