Unsolved Duet3D MB6HC v1.02 flash memory corruption issue
-
Hi all,
We have a problem with the flash memory of MB6HC v1.02 Duet boards getting corrupted from time to time. The only way to restore a board is to re-flash the firmware, either via BOSSA/USB or OpenOCD/SWD. The firmware version we are using is v3.4.5. The MB6HC is powered by the SBC (JP9 jumper set to 5V_SBC).
So far, all affected boards are the newest revision of the Duet board, v1.02. For the older boards (v1.00/v1.01) we used firmware v3.3.0, but as we had issues with this firmware on the newer boards (v1.02) we switched to v3.4.5.
Checking the schematics, we saw that there have been some changes to the power supply / buck converters of the board between revisions 1.01 and 1.02. We are not sure whether this may cause the issue in some way. We measured the 3.3V rail and it drops slightly when VIN is cut off, which we did not expect as the 5V is set to be supplied by the SBC.
Our device is a laser cutter. Any time the device is opened we turn off the 24V power supply (VIN) connected to the Duet board. If the device is opened while executing G-code, we send an M999 to the Duet board to stop the device immediately. VIN may drop at different rates depending on which stepper motors are currently moving, or on whether the compressor is turned on while the laser source is active.
Any ideas about where this issue may come from are very welcome, as we have almost two hundred devices distributed globally.
Kind regards,
Andreas
-
@andiwinter how do you know the flash memory is getting corrupted? Does the Status LED flash an error code?
-
@dc42 On the first occurrences our guess was that the Duet boards had somehow been damaged: the Duet Web UI could not connect to the board, and the log journal of the DuetControlServer service running on the SBC showed that it was restarting continuously, reporting communication issues with the Duet board. We sent our customers replacement boards, and with the new boards the issue was gone. If customers could not replace the board themselves, we exchanged the complete device. As the replaced Duet boards piled up, our electronics department had a look at them. There was one board with a diode soldered in the wrong orientation, but on all other boards no fault could be found, and the boards could be brought back to life by flashing the firmware via BOSSA/USB.
Currently I have no Duet board available in the 'corrupted' state, as all have been re-flashed, but as far as I remember the status LED was not flashing at all. I think it should normally flash with a period of 1 second.
As a workaround, to be able to flash a 'corrupted' board remotely, we have assembled newer devices with the SBC connected to the SWD pins of the Duet board, so we can reset and flash the board via OpenOCD directly from the SBC.
-
@dc42 Is there any recommendation for what we can do or test in case of another occurrence, to narrow down the issue? The Duet boards in our older machines are connected to the SBC only via the SPI interface, whereas newer machines also have the SWD interface connected and OpenOCD installed on the SBC.
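One thing that could be tried on the machines with SWD connected is to save the flash contents of a non-booting board before re-flashing it, so the dump can be compared with the firmware image afterwards. The following is only a sketch under assumptions: it presumes an OpenOCD configuration file named ocd.cfg like the one posted further down in this thread, RESET on GPIO18, and a flash size of 1 MiB for the ATSAME70Q20. The dump file name and the DRY_RUN switch are made up for this example; DRY_RUN defaults to 1 so the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: capture the flash of a non-booting board before re-flashing it.
# Assumptions (not confirmed in this thread): 1 MiB flash size, the dump file
# name, and the DRY_RUN switch, which defaults to 1 so no hardware is touched.
FIRMWARE="${FIRMWARE:-Duet3Firmware_MB6HC.bin}"
DUMP="${DUMP:-flash_dump.bin}"
FLASH_BASE=0x00400000   # flash start address, as used for flashing in this thread
FLASH_SIZE=0x100000     # assumption: 1 MiB on the ATSAME70Q20
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# "shutdown" makes OpenOCD exit once the dump has been written
DUMP_CMDS="init; halt; dump_image $DUMP $FLASH_BASE $FLASH_SIZE; shutdown"

run raspi-gpio set 18 dl
run openocd -f ocd.cfg -c "$DUMP_CMDS"
run raspi-gpio set 18 dh

# List the first differing bytes (offset, dump byte, firmware byte in octal)
if [ -f "$DUMP" ] && [ -f "$FIRMWARE" ]; then
    cmp -l "$DUMP" "$FIRMWARE" | head -n 20
fi
```

Run with DRY_RUN=0 to actually talk to the board. Where the differences start (for example, at the very beginning of the image or in a single page) might help to tell a partially erased flash from random corruption.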
-
@andiwinter sorry to barge in on your issue, but do you have any documentation on using OpenOCD to flash the Duet 6HC that you'd be willing to share?
-
@oliof I have no public document, only one for internal use, and it is still not very advanced, but it works for our configuration and purpose. Some settings depend on which type of SBC you are using (in our case a CM4 module on an IO board) and on which GPIO pins are free on the SBC. The SWD pins SWDIO, SWCLK, RESET and GND need to be connected to the SBC, and the GPIOs need to be configured as outputs in /boot/config.txt.
This is the sequence of commands I use for flashing the firmware via the SWD interface (GPIO18 is connected to RESET in our case):
raspi-gpio set 18 dl
openocd -f ocd.cfg -c "init; halt; flash protect 0 0 last off"
openocd -f ocd.cfg -c "init; halt; flash write_image erase Duet3Firmware_MB6HC.bin 0x00400000"
openocd -f ocd.cfg -c "init; halt; reset run"
raspi-gpio set 18 dh
Each openocd command holds the connection and needs to be terminated with CTRL+C after it has done its task. I have not yet found a way to do this all in one line.
content of ocd.cfg:
adapter driver bcm2835gpio
bcm2835gpio peripheral_base 0xFE000000
bcm2835gpio speed_coeffs 236181 60
adapter gpio swclk 23
adapter gpio swdio 24
adapter gpio srst 18
transport select swd
set CHIPNAME atsame70q20
source [find target/atsamv.cfg]
reset_config srst_only srst_push_pull
adapter speed 1000
peripheral_base and speed_coeffs depend on which SBC you have.
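The three openocd calls above can probably be combined into a single invocation by chaining the commands and ending with OpenOCD's shutdown command, which makes it exit by itself instead of waiting for CTRL+C. This is a minimal sketch, not a tested procedure: it assumes the ocd.cfg above and the firmware file are in the current directory and that RESET is on GPIO18. The DRY_RUN switch is made up for this example and defaults to 1, so the script only prints what it would run.

```shell
#!/bin/sh
# Sketch: run the whole SWD flashing sequence in one OpenOCD invocation.
# Assumes ocd.cfg and the firmware file are in the current directory.
# DRY_RUN is a hypothetical switch for this example; 1 = only print commands.
FIRMWARE="${FIRMWARE:-Duet3Firmware_MB6HC.bin}"
FLASH_BASE=0x00400000
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Same steps as the three separate calls, then "shutdown" so OpenOCD exits
OCD_CMDS="init; halt; flash protect 0 0 last off; flash write_image erase $FIRMWARE $FLASH_BASE; reset run; shutdown"

run raspi-gpio set 18 dl
run openocd -f ocd.cfg -c "$OCD_CMDS"
run raspi-gpio set 18 dh
```

Run with DRY_RUN=0 to actually flash. Because OpenOCD terminates on its own, the script could also be started over SSH or from a maintenance service without an interactive terminal.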
-
@andiwinter how do you cut the 24V power, and how do you restore it afterwards? If you cut and restore it using a relay in the VIN line, then cutting the power should not cause an issue. However, if you restore the power using that relay, there will be a large surge current to recharge the capacitors on the 6HC board. This is most likely causing the problem. See https://docs.duet3d.com/en/User_manual/Connecting_hardware/Power_choosing#inrush-current.