Toolboard 1LC heater faults at higher temps
-
I have a Duet3+SBC installed in an Aon M2 running RRF3.2 with 1LC toolboards on each X carriage. Up until this point I had been printing lower-temp materials (240-260C); however, I'm now running some material at 275C, and the 1LCs have been triggering heater faults a couple of hours into the builds.
The 1LCs are necessarily in the heated chamber, but we had been running the chamber at 50C and bed around 70C without issues. The step up from 260 to 275 seems to have been the tipping point.
Here are the diags from the toolboards, at this stage board 21 had a heater fault:
2/17/2021, 8:58:47 PM M122 B20
Diagnostics for board 20:
Duet TOOL1LC firmware version 3.2 (2021-01-05)
Bootloader ID: not available
Never used RAM 3600, free system stack 22 words
HEAT 86 CanAsync 85 CanRecv 83 TMC 54 MAIN 218 AIN 64
Last reset 12:18:54 ago, cause: software
Last software reset data not available
Driver 0: position 56823563, 675.0 steps/mm, standstill, SG min/max 0/148, read errors 0, write errors 1, ifcnt 99, reads 16102, writes 21, timeouts 0, DMA errors 0
Moves scheduled 403829, completed 403829, in progress 0, hiccups 0
No step interrupt scheduled
VIN: 23.8V
MCU temperature: min 40.2C, current 65.0C, max 73.8C
Ticks since heat task active 132, ADC conversions started 44158069, completed 44158067, timed out 0
Last sensors broadcast 0x00000002 found 1 135 ticks ago, loop time 0
CAN messages queued 554665, send timeouts 0, received 980476, lost 0, free buffers 36
=== Filament sensors ===
Interrupt 5 to 44us, poll 9 to 793us
Driver 0: pos 118.12, errs: frame 12 parity 0 ovrun 0 pol 0 ovdue 0
2/17/2021, 8:57:53 PM M122 B21
Diagnostics for board 21:
Duet TOOL1LC firmware version 3.2 (2021-01-05)
Bootloader ID: not available
Never used RAM 3600, free system stack 22 words
HEAT 86 CanAsync 85 CanRecv 83 TMC 54 MAIN 208 AIN 64
Last reset 12:18:00 ago, cause: software
Last software reset data not available
Driver 0: position 56943713, 675.0 steps/mm, standstill, SG min/max 0/124, read errors 0, write errors 1, ifcnt 102, reads 54571, writes 22, timeouts 0, DMA errors 0
Moves scheduled 403818, completed 403818, in progress 0, hiccups 0
No step interrupt scheduled
VIN: 23.9V
MCU temperature: min 38.5C, current 62.4C, max 69.9C
Ticks since heat task active 249, ADC conversions started 44104626, completed 44104625, timed out 0
Last sensors broadcast 0x00000004 found 1 3 ticks ago, loop time 0
CAN messages queued 553745, send timeouts 0, received 979676, lost 0, free buffers 36
=== Filament sensors ===
Interrupt 5 to 36us, poll 8 to 813us
Driver 0: pos 356.13, errs: frame 6 parity 0 ovrun 0 pol 0 ovdue 0
Note that none of the heater faults ever trigger an error on the DWC or pause the print job. I see those MCU temps are creeping up towards the 80+ range that could potentially be an issue, is that the likely cause?
The hotend setup is 24V/60W heaters w/ PT1000 RTDs. I have liquid cooling on the carriages and a spare fan output on the 1LCs, I was just hoping to not need to engineer the board cooling just yet. Both space and time are tight. I went through all the manual heater tuning prior to this last build to make sure there weren't some temp excursions happening that I wasn't seeing on the graph, the tuning seems ok and stable.
Let me know if overheating is the likely culprit, or if something else is amiss. It certainly seems like the desired behavior would be for a 1LC heater fault to pause the build.
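For anyone comparing several M122 dumps like the ones above, the MCU temperature lines can be pulled out with a few lines of Python. This is only a rough sketch: the function names and the 80C threshold are assumptions for illustration (the threshold is taken from the "80+ range" mentioned in this post, not from any Duet specification), and the pattern is based solely on the line format shown in these dumps.

```python
import re

# Matches toolboard lines ("min 40.2C, current 65.0C, max 73.8C")
# and mainboard lines ("min 43.1, current 43.2, max 43.3").
MCU_TEMP_RE = re.compile(
    r"MCU temperature: min (-?[\d.]+)C?, current (-?[\d.]+)C?, max (-?[\d.]+)C?"
)


def parse_mcu_temps(m122_text):
    """Return a list of (min, current, max) tuples in degrees C, one per match."""
    return [
        tuple(float(value) for value in match.groups())
        for match in MCU_TEMP_RE.finditer(m122_text)
    ]


def readings_over_limit(m122_text, limit_c=80.0):
    """Return the readings whose recorded maximum reached limit_c.

    limit_c is an assumed threshold chosen for illustration only.
    """
    return [temps for temps in parse_mcu_temps(m122_text) if temps[2] >= limit_c]
```

Feeding the pasted diagnostics text to `readings_over_limit` returns any board readings whose recorded maximum reached the chosen limit.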
-
When you say heater fault, do you mean that you're concerned the 1LC is overheating, rather than that your hotend has gone into a heater fault condition? If the hotend had faulted, there would be an error message in the console.
@mct82 said in Toolboard 1LC heater faults at higher temps:
the 1LCs have been triggering heater faults a couple hours into the builds.
What exactly is happening?
-
Sorry for the confusion, the hotend(s) are going into a heater fault condition...as in: press the heater name on I/F, wait for warning timer, reset. The lack of a msg in the console is partly why I found this issue concerning.
This and previous jobs have been running on T2, which is a duplication mode tool using both carriages. Config.g attached.
-
The issue has just recurred, this time running with no chamber heat and a lower bed temp. The max MCU temps are lower, but both hotend heaters showed a status of "FAULT". Based on the print failures, they did not fault simultaneously (~1hr apart).
The diagnostics look like this:
2/18/2021, 3:05:15 AM M122 B21
Diagnostics for board 21:
Duet TOOL1LC firmware version 3.2 (2021-01-05)
Bootloader ID: not available
Never used RAM 3648, free system stack 24 words
HEAT 86 CanAsync 89 CanRecv 83 TMC 54 MAIN 218 AIN 64
Last reset 04:59:48 ago, cause: software
Last software reset data not available
Driver 0: position 26542684, 675.0 steps/mm, standstill, SG min/max 0/176, read errors 0, write errors 1, ifcnt 120, reads 15580, writes 17, timeouts 0, DMA errors 0
Moves scheduled 182892, completed 182892, in progress 0, hiccups 0
No step interrupt scheduled
VIN: 24.0V
MCU temperature: min 46.8C, current 47.5C, max 59.2C
Ticks since heat task active 102, ADC conversions started 17916430, completed 17916429, timed out 0
Last sensors broadcast 0x00000004 found 1 105 ticks ago, loop time 0
CAN messages queued 224907, send timeouts 0, received 416810, lost 0, free buffers 36
=== Filament sensors ===
Interrupt 5 to 33us, poll 8 to 764us
Driver 0: pos 30.94, errs: frame 0 parity 0 ovrun 0 pol 0 ovdue 0
2/18/2021, 3:04:54 AM M122 B20
Diagnostics for board 20:
Duet TOOL1LC firmware version 3.2 (2021-01-05)
Bootloader ID: not available
Never used RAM 3648, free system stack 22 words
HEAT 86 CanAsync 89 CanRecv 83 TMC 54 MAIN 216 AIN 64
Last reset 04:59:28 ago, cause: software
Last software reset data not available
Driver 0: position 26552134, 675.0 steps/mm, standstill, SG min/max 0/114, read errors 0, write errors 0, ifcnt 118, reads 3506, writes 1, timeouts 0, DMA errors 0
Moves scheduled 182901, completed 182901, in progress 0, hiccups 0
No step interrupt scheduled
VIN: 24.5V
MCU temperature: min 47.7C, current 47.7C, max 61.1C
Ticks since heat task active 41, ADC conversions started 17896490, completed 17896489, timed out 0
Last sensors broadcast 0x00000002 found 1 44 ticks ago, loop time 0
CAN messages queued 145963, send timeouts 0, received 290355, lost 0, free buffers 36
=== Filament sensors ===
Interrupt 5 to 33us, poll 9 to 685us
Driver 0: pos 348.05, errs: frame 59 parity 0 ovrun 0 pol 0 ovdue 0
M122 B0
=== Diagnostics ===
RepRapFirmware for Duet 3 MB6HC version 3.2 running on Duet 3 MB6HC v1.01 or later (SBC mode)
Board ID: 08DJM-956L2-G43S8-6JKD0-3S46L-9U2LD
Used output buffers: 1 of 40 (15 max)
=== RTOS ===
Static ram: 149788
Dynamic ram: 64972 of which 164 recycled
Never used RAM 143908, free system stack 122 words
Tasks: Linux(ready,77) HEAT(blocked,271) CanReceiv(blocked,809) CanSender(blocked,335) CanClock(blocked,352) TMC(blocked,19) MAIN(running,671) IDLE(ready,19)
Owned mutexes: HTTP(MAIN)
=== Platform ===
Last reset 05:03:23 ago, cause: software
Last software reset at 2021-02-18 03:05, reason: User, GCodes spinning, available RAM 143948, slot 0
Software reset code 0x0003 HFSR 0x00000000 CFSR 0x00000000 ICSR 0x00400000 BFAR 0x00000000 SP 0x00000000 Task Linu Freestk 0 n/a
Error status: 0x00
Aux0 errors 0,0,0
Aux1 errors 0,0,0
MCU temperature: min 43.1, current 43.2, max 43.3
Supply voltage: min 32.2, current 32.2, max 32.2, under voltage events: 0, over voltage events: 0, power good: yes
12V rail voltage: min 12.1, current 12.2, max 12.2, under voltage events: 0
Driver 0: position -6560, standstill, reads 14561, writes 0 timeouts 0, SG min/max not available
Driver 1: position -3440, standstill, reads 14561, writes 0 timeouts 0, SG min/max not available
Driver 2: position 5767, standstill, reads 14561, writes 0 timeouts 0, SG min/max not available
Driver 3: position 42640, standstill, reads 14561, writes 0 timeouts 0, SG min/max not available
Driver 4: position 0, standstill, reads 14561, writes 0 timeouts 0, SG min/max not available
Driver 5: position 0, standstill, reads 14561, writes 0 timeouts 0, SG min/max not available
Date/time: 2021-02-18 08:08:47
Slowest loop: 0.28ms; fastest: 0.06ms
=== Storage ===
Free file entries: 10
SD card 0 not detected, interface speed: 37.5MBytes/sec
SD card longest read time 0.0ms, write time 0.0ms, max retries 0
=== Move ===
DMs created 125, maxWait 0ms, bed compensation in use: mesh, comp offset 0.000
=== MainDDARing ===
Scheduled moves 185922, completed moves 185922, hiccups 0, stepErrors 0, LaErrors 0, Underruns [0, 0, 0], CDDA state -1
=== AuxDDARing ===
Scheduled moves 0, completed moves 0, hiccups 0, stepErrors 0, LaErrors 0, Underruns [0, 0, 0], CDDA state -1
=== Heat ===
Bed heaters = 0 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1, chamberHeaters = 3 -1 -1 -1
Heater 0 is on, I-accum = 0.2
=== GCodes ===
Segments left: 0
Movement lock held by null
HTTP* is doing "M122 B0" in state(s) 0
Telnet is idle in state(s) 0
File* is idle in state(s) 0
USB is idle in state(s) 0
Aux is idle in state(s) 0
Trigger* is idle in state(s) 0
Queue* is idle in state(s) 0
LCD is idle in state(s) 0
SBC is idle in state(s) 0
Daemon is idle in state(s) 0
Aux2 is idle in state(s) 0
Autopause* is idle in state(s) 0
Code queue is empty.
=== Filament sensors ===
Extruder 0: no data received
Extruder 1: no data received
=== CAN ===
Messages queued 11, send timeouts 0, received 68, lost 0, longest wait 0ms for reply type 0, free buffers 48
=== SBC interface ===
State: 4, failed transfers: 0
Last transfer: 1ms ago
RX/TX seq numbers: 42120/42120
SPI underruns 0, overruns 0
Number of disconnects: 0, IAP RAM available 0x2c8a8
Buffer RX/TX: 0/0-0
=== Duet Control Server ===
Duet Control Server v3.2.0
Code buffer space: 4096
Configured SPI speed: 8000000 Hz
Full transfers per second: 35.97
Maximum length of RX/TX data transfers: 3812/1684
-
Thanks for your report.
I already have an issue logged that a heater fault on a tool or expansion board does not pause the print or produce a message. This is scheduled to be fixed in release 3.3.
As to why the heater fault occurred, did you notice any temperature fluctuations shortly before the heater faulted?
-
@dc42 Negative. It's been difficult to catch, but I did see a trace on the temp graph that just started rolling down from a definite fault point. Prior to that the trace was within the normal +/-1C.
I don't think 275C should be anywhere near the limit for these PT1000s, but these are both new and from the same batch.
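As a sanity check on the sensors, the expected resistance of a PT1000 at a given temperature can be worked out and compared against a multimeter reading. The sketch below assumes the sensors follow the standard IEC 60751 curve (R0 = 1000 ohm, the usual A and B coefficients for temperatures at or above 0C); that is an assumption about these particular RTDs, not something confirmed in this thread, and the function names are purely illustrative.

```python
import math

# Assumed IEC 60751 coefficients for a standard platinum RTD (valid for T >= 0 C).
R0 = 1000.0      # ohms at 0 C for a PT1000
A = 3.9083e-3
B = -5.775e-7


def pt1000_resistance(temp_c):
    """Expected resistance in ohms at temp_c (degrees C, must be >= 0)."""
    if temp_c < 0:
        raise ValueError("this simplified formula only covers temp_c >= 0")
    return R0 * (1 + A * temp_c + B * temp_c ** 2)


def pt1000_temperature(resistance_ohm):
    """Invert the quadratic above to get degrees C from a measured resistance."""
    discriminant = A * A - 4 * B * (1 - resistance_ohm / R0)
    return (-A + math.sqrt(discriminant)) / (2 * B)


if __name__ == "__main__":
    for t in (260, 275):
        print(f"{t} C -> {pt1000_resistance(t):.1f} ohm")
```

A sensor that reads far from the computed value at a known temperature, or that drifts as it heats, would be a candidate for the kind of fault described here.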
-
The heater faults appear to have been resolved by switching back to known-good PT1000 sensors. I guess I will look into wtf is going on with these new RTDs.
Since the failure-to-pause issue is already a known issue, I'll mark this as resolved. Thanks