Mystery Help: 4 TC lines all error at the same time!
-
So I’m a bit stumped on this one - anyone up for a mystery?
On a recent test, I errored out with all 4 of my thermocouples jumping to 2000 deg at the same time. Two of them are fully shielded, two are partially shielded with an unshielded extension. 2 are on the main Duet 3 6HC, 2 are on an expansion board. All 4 went to 2000C nearly instantly. The message I received was:
“Heater 0 fault : unable to read sensor : sensor short to other wiring”
I suspect there was an issue with an SSR possibly shorting to ground, or one of the thermocouples shorting to the TC sheathing (also GND).
Any ideas?
-
@TLAS said in Mystery Help: 4 TC lines all error at the same time!:
Any ideas?
Take the error at its word and start checking for shorts.
-
@Phaedrux
A short that doesn’t exist except at higher temperatures with a heater powered on…? It’s not a normal wire short - the system didn’t move at all and returned to normal once the heater faulted out. Hence the thought that it must have been a short to ground, which could occur in insulated TCs as a function of temperature. Enough EMI could have also put a sufficient signal in the ground plane to simulate a short to another wire.
It’s still super weird that all 4 tanked as a result. Probably something in the code where it tries to read each one and, when one fails, stops reading all TC lines (thermistor lines were fine during this period BTW).
I’m convincing myself to go download the code to search for the exact error and the behavior that triggers it - likely an exact error state of the MAX TC chip.
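For reference, here is a minimal Python sketch of how the fault status byte of a MAX31856 decodes, assuming that is the chip fitted (the thread has not confirmed this yet). The bit meanings follow the MAX31856 datasheet's fault status register; the function and table names are hypothetical and are not taken from RepRapFirmware.

```python
# Bit masks for the MAX31856 fault status register (address 0x0F),
# per the datasheet. Names below are illustrative, not firmware identifiers.
FAULT_BITS = {
    0x80: "cold-junction out of range",
    0x40: "thermocouple out of range",
    0x20: "cold-junction high threshold",
    0x10: "cold-junction low threshold",
    0x08: "thermocouple high threshold",
    0x04: "thermocouple low threshold",
    0x02: "over/under voltage on input",
    0x01: "open circuit",
}


def decode_fault_status(sr: int) -> list[str]:
    """Return the names of all fault bits set in the 8-bit status value."""
    return [name for mask, name in FAULT_BITS.items() if sr & mask]
```

A status byte of 0x02 would decode to the over/under voltage fault alone, which is the condition that a stray voltage on a TC input would be expected to raise on this chip.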
-
@dc42 I think I found an error, although I can’t find the exact spot in the code. Great job with documentation and code organization, it led me to find the issue more quickly.
As a disclaimer, I’m acting under the assumption that having 4 thermocouples all short at the exact same time on two separate boards and in different physical locations is highly improbable. If I had an inadvertent short to ground where the ground reference voltage was significantly altered, I would have expected to see a LOT more warnings everywhere.
Summary: I likely triggered an over/under voltage event on a single thermocouple, possibly due to an unexpected contact with ground and some EMF transfer to the ground plane near the TC (leading theory). All 4 thermocouples went to 2000C for about 10 seconds; the 4 thermistors on the mainboard were unaffected. 2 of the thermocouples are on the Duet 6HC, 2 are on an expansion board. I suspect that there may be a logic/loop error where the sensor address is not updated after fault detection, or the badTemperatureCount is applied globally while the single sensor fault exists. Beyond not being able to identify the single thermocouple causing the issue, I’m worried about other heaters that may be operating at the time.
I suspect that to replicate this issue, you just apply a voltage slightly above VDD to a MAX daughter board with multiple TC sensors in operation.
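To make the suspected distinction concrete, here is a minimal Python sketch of per-sensor bad-reading counting, where one faulting sensor cannot push the others over the threshold. This is not RepRapFirmware code and says nothing about how the firmware actually behaves; the class name, method names, and the threshold of 3 are all hypothetical.

```python
MAX_BAD_READINGS = 3  # hypothetical threshold, not a firmware value


class SensorMonitor:
    """Tracks consecutive bad readings separately for each sensor."""

    def __init__(self, sensor_ids):
        self.bad_counts = {sid: 0 for sid in sensor_ids}

    def record(self, sensor_id, reading_ok):
        """Update one sensor's count; return True if that sensor is now faulted."""
        if reading_ok:
            self.bad_counts[sensor_id] = 0
        else:
            self.bad_counts[sensor_id] += 1
        return self.bad_counts[sensor_id] >= MAX_BAD_READINGS

    def faulted(self):
        """List the sensors whose own count has reached the threshold."""
        return [sid for sid, n in self.bad_counts.items() if n >= MAX_BAD_READINGS]
```

With a shared counter instead of the per-sensor dictionary, three bad readings from one thermocouple would flag every sensor at once, which is the kind of behaviour being speculated about here.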
-
@TLAS do you have the MAX31856 thermocouple boards, or the older MAX31855 boards? The newer boards have one LED per channel that lights up when the chip has sensed an error condition and disconnected the inputs. The error condition is reset by removing and reapplying power.
My guess is that you had an electrostatic discharge into one of the thermocouple wires, and this caused all four chips to go into the error state. To avoid this in future, ensure that there is electrical continuity (either directly or through a resistor) between the hot end metalwork and the ground side of Duet VIN.
-
@dc42
Very interesting…I had some video of the event I looked back at, your theory is definitely close. The board had both thermocouple LEDs blinking before the event (when only a single sensor was bouncing to 2000C). When the event happened, both LEDs stayed red for the outage of 10 seconds before reverting back to normal.
I’ll double check ground paths again, but what makes this interesting is the fact that nearly everything in the system is metal and grounded with insulated thermocouples. I’m working with some high powered heaters that may create some substantial EMI noise. I’ll set up a test the next chance I get to capture noise on the metal elements and see what happens.
Appreciate the insight.
-
@dc42
Following up on this - there is indeed a short somewhere (matching the exact over/under voltage error) and it happens at higher temperatures. It’s a VERY weird short though - I measured about 13V AC with an oscilloscope that suddenly appears (and I can hear a faint buzzing) when it triggers as well. I disconnected all sensors but one and I’m seeing the AC on an UNCONNECTED thermocouple line.
My leading theory is that the AC is shorting somewhere to the 3.3V line, but the various capacitors are reducing the impact to manageable voltages in most places. On this particular circuit, something is making its way through and is visible only in the thermocouple readings.
Have you seen anything similar? Full disclosure: I’m routing this through a custom interconnect board, so the likelihood that the short is somewhere else and I’m seeing EMI-induced voltages is not 0.
I’m going to redo the entire set of wiring in the related circuits with robust shielding everywhere. That should fix whatever is causing this weird error even if I can’t find the exact point of input.
-
@TLAS where did you measure 13V AC?
-
@dc42
Between the TC input (when disconnected) and the chassis ground. I’m still scratching my head over it.