Mid print hang 3.4.0b2, duet3, sbc
-
The SBC doesn't seem to be losing power or restarting when this happens, as I maintained a putty connection through the most recent problem.
m122 immediately after stopping:
m122
=== Diagnostics ===
RepRapFirmware for Duet 3 MB6HC version 3.4.0beta2 (2021-08-03 12:42:33) running on Duet 3 MB6HC v1.01 or later (SBC mode)
Board ID: 08DJM-956BA-NA3TJ-6JTDD-3S06N-KU8LS
Used output buffers: 1 of 40 (26 max)
=== RTOS ===
Static ram: 151128
Dynamic ram: 62208 of which 216 recycled
Never used RAM 134000, free system stack 127 words
Tasks: SBC(ready,3.9%,310) HEAT(notifyWait,0.1%,326) Move(notifyWait,1.8%,262) CanReceiv(notifyWait,0.0%,943) CanSender(notifyWait,0.0%,361) CanClock(delaying,0.1%,334) TMC(notifyWait,58.3%,59) MAIN(running,35.7%,922) IDLE(ready,0.2%,29), total 100.0%
Owned mutexes: HTTP(MAIN)
=== Platform ===
Last reset 01:54:18 ago, cause: software
Last software reset at 2021-08-19 13:40, reason: User, Platform spinning, available RAM 134072, slot 0
Software reset code 0x0000 HFSR 0x00000000 CFSR 0x00000000 ICSR 0x00400000 BFAR 0x00000000 SP 0x00000000 Task SBC Freestk 0 n/a
Error status: 0x00
Aux0 errors 0,0,0
Step timer max interval 136
MCU temperature: min 26.4, current 27.6, max 37.1
Supply voltage: min 23.9, current 24.0, max 24.1, under voltage events: 0, over voltage events: 0, power good: yes
12V rail voltage: min 12.0, current 12.1, max 12.1, under voltage events: 0
Heap OK, handles allocated/used 99/0, heap memory allocated/used/recyclable 2048/36/36, gc cycles 0
Driver 0: position 46018, standstill, reads 35862, writes 17 timeouts 0, SG min/max 0/927
Driver 1: position -5508, standstill, reads 35862, writes 17 timeouts 0, SG min/max 0/905
Driver 2: position 480, standstill, reads 35862, writes 17 timeouts 0, SG min/max 0/1016
Driver 3: position 0, standstill, reads 35862, writes 17 timeouts 0, SG min/max 0/188
Driver 4: position 0, standstill, reads 35863, writes 17 timeouts 0, SG min/max 0/1023
Driver 5: position 0, standstill, reads 35863, writes 17 timeouts 0, SG min/max 0/452
Date/time: 2021-08-19 15:34:34
Slowest loop: 66.88ms; fastest: 0.04ms
=== Storage ===
Free file entries: 10
SD card 0 not detected, interface speed: 37.5MBytes/sec
SD card longest read time 0.0ms, write time 0.0ms, max retries 0
=== Move ===
DMs created 125, segments created 11, maxWait 22003ms, bed compensation in use: none, comp offset 0.000
=== MainDDARing ===
Scheduled moves 2729, completed moves 2729, hiccups 0, stepErrors 0, LaErrors 0, Underruns [0, 0, 5], CDDA state -1
=== AuxDDARing ===
Scheduled moves 0, completed moves 0, hiccups 0, stepErrors 0, LaErrors 0, Underruns [0, 0, 0], CDDA state -1
=== Heat ===
Bed heaters = 0 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1, chamberHeaters = -1 -1 -1 -1
=== GCodes ===
Segments left: 0
Movement lock held by null
HTTP* is doing "M122" in state(s) 0
Telnet is idle in state(s) 0
File* is idle in state(s) 0
USB is idle in state(s) 0
Aux is idle in state(s) 0
Trigger* is idle in state(s) 0
Queue* is idle in state(s) 0
LCD is idle in state(s) 0
SBC is idle in state(s) 0
Daemon is idle in state(s) 0
Aux2 is idle in state(s) 0
Autopause is idle in state(s) 0
Code queue is empty
=== CAN ===
Messages queued 61729, received 0, lost 0, longest wait 0ms for reply type 0, peak Tx sync delay 0, free buffers 49 (min 49), ts 34291/0/0
Tx timeouts 0,22,34290,0,0,27414 last cancelled message type 30 dest 127
=== SBC interface ===
State: 4, failed transfers: 2, checksum errors: 398
Last transfer: 2ms ago
RX/TX seq numbers: 64933/1590
SPI underruns 415, overruns 20
Disconnects: 4, timeouts: 0, IAP RAM available 0x2c690
Buffer RX/TX: 0/0-0
=== Duet Control Server ===
Duet Control Server v3.4-b2
Code buffer space: 4096
Configured SPI speed: 8000000Hz
Full transfers per second: 39.16, max wait times: 54.4ms/9.9ms
Codes per second: 0.85
Maximum length of RX/TX data transfers: 2924/1044
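For anyone comparing reports before and after a wiring change, the SPI link counters in the SBC interface section of the M122 output above can be pulled out with a few lines of Python. This is only a rough sketch: the function name `parse_sbc_stats` and the dictionary keys are made up for illustration, and the regular expressions assume the report is worded exactly as in the output posted here.

```python
import re

# Sample taken from the "SBC interface" section of the M122 report in this thread.
SAMPLE = (
    "=== SBC interface === State: 4, failed transfers: 2, checksum errors: 398 "
    "Last transfer: 2ms ago RX/TX seq numbers: 64933/1590 "
    "SPI underruns 415, overruns 20 Disconnects: 4, timeouts: 0, "
    "IAP RAM available 0x2c690 Buffer RX/TX: 0/0-0 === Duet Control Server ==="
)

HEADER = "=== SBC interface ==="

# Hypothetical key names; the patterns assume the wording shown in this thread.
PATTERNS = {
    "failed_transfers": r"failed transfers:\s+(\d+)",
    "checksum_errors": r"checksum errors:\s+(\d+)",
    "spi_underruns": r"SPI underruns\s+(\d+)",
    "spi_overruns": r"overruns\s+(\d+)",
    "disconnects": r"Disconnects:\s+(\d+)",
    "timeouts": r"timeouts:\s+(\d+)",
}


def parse_sbc_stats(m122_text):
    """Return the SPI link counters found in the SBC interface section."""
    start = m122_text.find(HEADER)
    if start < 0:
        return {}
    body_start = start + len(HEADER)
    # The section ends where the next "=== ... ===" header begins, if any.
    end = m122_text.find("===", body_start)
    section = m122_text[body_start:end] if end >= 0 else m122_text[body_start:]
    stats = {}
    for name, pattern in PATTERNS.items():
        match = re.search(pattern, section)
        if match:
            stats[name] = int(match.group(1))
    return stats


if __name__ == "__main__":
    for name, value in parse_sbc_stats(SAMPLE).items():
        print(f"{name}: {value}")
```

Running it on two reports taken an hour apart should show whether the checksum error and underrun counts are still climbing.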
-
Here is the DWC output when the stall happens. The event at 1:13 was during a print; the event at 1:16 was after.
-
Can you set up some additional monitoring on the pi?
https://duet3d.dozuki.com/Wiki/Getting_Started_With_Duet_3#Section_Monitoring_optional
-
@phaedrux I started DCS with debugging enabled and immediately saw the warnings below. I'm going to run a print and see what happens.
[warn] Bad data CRC32 (expected 0x7122d0a5, got 0x68c5ab97)
[warn] Bad data CRC32 (expected 0xc31600e2, got 0xe7708562)
[warn] Bad data CRC32 (expected 0xc8a7c26b, got 0x5b1340ab)
[warn] Bad data CRC32 (expected 0x312c52a0, got 0x2c1f8ef5)
[warn] Bad data CRC32 (expected 0xd1710152, got 0x1c81bcd5)
edit: couldn't get into the web interface. Got this:
System.OperationCanceledException: Board is not available (no header)
   at DuetControlServer.SPI.DataTransfer.ExchangeHeader() in /home/christian/Duet3D/DuetSoftwareFramework/src/DuetControlServer/SPI/DataTransfer.cs:line 1436
   at DuetControlServer.SPI.DataTransfer.PerformFullTransfer(Boolean connecting) in /home/christian/Duet3D/DuetSoftwareFramework/src/DuetControlServer/SPI/DataTransfer.cs:line 194
[info] Connection to Duet established
[debug] Updated key spindles
[debug] Requesting update of key state, seq 0 -> 1
[debug] Updated key state
[debug] Requesting update of key tools, seq 0 -> 5
[debug] Updated key tools
[debug] Requesting update of key volumes, seq 0 -> 0
[debug] Updated key volumes
[debug] Requesting update of key boards, seq 0 -> 1087
[debug] Updated key boards
[debug] Requesting update of key directories, seq 0 -> 0
[debug] Updated key directories
[debug] Requesting update of key fans, seq 0 -> 7
[debug] Updated key fans
[debug] Requesting update of key global, seq 0 -> 0
[debug] Updated key global
[debug] Requesting update of key heat, seq 0 -> 10
[debug] Updated key heat
[debug] Requesting update of key inputs, seq 0 -> 25
[debug] Updated key inputs
[debug] Requesting update of key job, seq 0 -> 15
[warn] Bad data CRC32 (expected 0x3534c24d, got 0xc6f735d8)
[debug] Updated key job
[debug] Requesting update of key move, seq 0 -> 54
[warn] Bad data CRC32 (expected 0x390f64ca, got 0xffa85249)
[debug] Updated key move
[debug] Requesting update of key network, seq 0 -> 3
[debug] Updated key network
[debug] Requesting update of key sensors, seq 0 -> 8
[debug] Updated key sensors
[warn] Bad data CRC32 (expected 0x4c04f1ba, got 0x11d51962)
[warn] Bad data CRC32 (expected 0x4bd6bb59, got 0x5ca43599)
[warn] Bad data CRC32 (expected 0x4bd6bb59, got 0x01651ad2)
[warn] Bad data CRC32 (expected 0xe75e751f, got 0x20b97c2f)
[warn] Bad data CRC32 (expected 0x0259194f, got 0x1dc9c3ca)
[warn] Bad data CRC32 (expected 0x59691b91, got 0xd61c979f)
[fatal] Abnormal program termination
[fatal] SPI task faulted
System.Exception: RepRapFirmware refused message format
   at DuetControlServer.SPI.DataTransfer.ExchangeHeader() in /home/christian/Duet3D/DuetSoftwareFramework/src/DuetControlServer/SPI/DataTransfer.cs:line 1540
   at DuetControlServer.SPI.DataTransfer.PerformFullTransfer(Boolean connecting) in /home/christian/Duet3D/DuetSoftwareFramework/src/DuetControlServer/SPI/DataTransfer.cs:line 194
   at DuetControlServer.SPI.Interface.Run() in /home/christian/Duet3D/DuetSoftwareFramework/src/DuetControlServer/SPI/Interface.cs:line 1026
   at DuetControlServer.Utility.PriorityThreadRunner.<>c__DisplayClass0_0.<Start>b__0() in /home/christian/Duet3D/DuetSoftwareFramework/src/DuetControlServer/Utility/PriorityThreadRunner.cs:line 25
[fatal] SPI task faulted
System.Exception: RepRapFirmware refused message format
   at DuetControlServer.SPI.DataTransfer.ExchangeHeader() in /home/christian/Duet3D/DuetSoftwareFramework/src/DuetControlServer/SPI/DataTransfer.cs:line 1540
   at DuetControlServer.SPI.DataTransfer.PerformFullTransfer(Boolean connecting) in /home/christian/Duet3D/DuetSoftwareFramework/src/DuetControlServer/SPI/DataTransfer.cs:line 194
   at DuetControlServer.SPI.Interface.Run() in /home/christian/Duet3D/DuetSoftwareFramework/src/DuetControlServer/SPI/Interface.cs:line 1026
   at DuetControlServer.Utility.PriorityThreadRunner.<>c__DisplayClass0_0.<Start>b__0() in /home/christian/Duet3D/DuetSoftwareFramework/src/DuetControlServer/Utility/PriorityThreadRunner.cs:line 25
[debug] Update task terminated
[debug] IPC task terminated
[debug] Job task terminated
[debug] Periodic updater task terminated
[info] Application has shut down
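To see how the CRC failures line up with events like switching the pump on, it may help to tally them from a saved copy of the DCS debug output. The sketch below is an illustration only: `summarize` and the sample `LOG` string are hypothetical, and the patterns assume the message wording shown in the log above.

```python
import re
from collections import Counter

# A few lines copied from the DCS output in this thread, used as sample input.
LOG = """[warn] Bad data CRC32 (expected 0x7122d0a5, got 0x68c5ab97)
[info] Connection to Duet established
[warn] Bad data CRC32 (expected 0x4bd6bb59, got 0x5ca43599)
[fatal] SPI task faulted"""

# Log level tags as they appear in the output above.
LEVEL_RE = re.compile(r"\[(debug|info|warn|error|fatal)\]")
# CRC mismatch warnings, capturing the expected and received checksums.
CRC_RE = re.compile(
    r"Bad data CRC32 \(expected 0x([0-9a-fA-F]{8}), got 0x([0-9a-fA-F]{8})\)"
)


def summarize(log_text):
    """Return (messages per log level, number of CRC32 mismatches)."""
    levels = Counter(LEVEL_RE.findall(log_text))
    crc_mismatches = CRC_RE.findall(log_text)
    return levels, len(crc_mismatches)


if __name__ == "__main__":
    levels, crc_count = summarize(LOG)
    print("CRC32 mismatches:", crc_count)
    for level, count in sorted(levels.items()):
        print(f"{level}: {count}")
```

Because it uses `findall` rather than splitting on lines, it should also work on output that was pasted as one long line.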
-
Try a new SD card with a fresh duetpi image.
-
@phaedrux It seems turning on the Berd-Air pump is causing the problem. Several times now I've switched on the pump only to have the disconnect happen shortly after. I don't have a flyback diode on the pump. I originally had it connected to out3, then tried moving it to out9, which should have an onboard flyback diode, and I have now moved it to an external MOSFET board, but the machine still stops once the pump turns on.
-
@zakm0n It sounds a lot like the SPI communication is being disrupted by external interference. You may want to try moving other cables away from the SBC cable, replacing it with a shorter one (if possible), or reducing the SPI frequency in /opt/dsf/conf/config.json.
-
@chrishamm I tried lowering the SPI frequency to half its default. The SBC cable is run underneath the Duet, so it should be well protected. I'm still getting disconnects because of SPI interruptions.
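For reference, halving the frequency amounts to changing one number in that JSON file. The sketch below assumes the setting is stored under a key named `SpiFrequency` (an assumption, check your own config.json) and uses the 8000000 Hz value that M122 reported above as the starting point. The helper `halve_spi_frequency` is hypothetical and works on the file contents as a string, so nothing is written to disk.

```python
import json


def halve_spi_frequency(config_text, key="SpiFrequency"):
    """Return the config JSON with the SPI frequency halved.

    The key name is an assumption; adjust it to match the actual file.
    """
    config = json.loads(config_text)
    config[key] = config[key] // 2
    return json.dumps(config, indent=2)


if __name__ == "__main__":
    # Minimal stand-in for the contents of /opt/dsf/conf/config.json.
    example = '{"SpiFrequency": 8000000}'
    print(halve_spi_frequency(example))
```

DCS presumably has to be restarted before a changed value takes effect, and M122 should then show the new figure under "Configured SPI speed".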
-
@chrishamm So, I have the chassis of the printer bonded to the mains earth, and after running a grounding wire to the body of the berd air pump, I'm now able to get about 3 hours into a print, but I'm still getting disconnects somewhere around the 3.5-4 hour mark into a print. Any other grounding you might suggest to get rid of noise? The SBC cable is run underneath of the Duet 3 and there's not really any way to further isolate it from the wiring of the machine. I'm at a loss here, and I'm considering ditching the Duet3 for something else going forward, as my personal machines with Klipper have absolutely never had such an issue, even in much less ideal setups.
-
@zakm0n said in Mid print hang 3.4.0b2, duet3, sbc:
The SBC cable is run underneath of the Duet 3 and there's not really any way to further isolate it from the wiring of the machine.
Foil tape as shielding?
-
@phaedrux I could try that, but I'll have to re-route the cable from behind the board. Foil tape has a way of shorting out things, doesn't it? I just can't understand how I'm seemingly the only one to ever have this problem.
-
@zakm0n said in Mid print hang 3.4.0b2, duet3, sbc:
Foil tape has a way of shorting out things, doesn't it?
A layer of non-conductive tape on top? There are purpose-made shielded ribbon cables as well.
@zakm0n said in Mid print hang 3.4.0b2, duet3, sbc:
I just can't understand how I'm seemingly the only one to ever have this problem.
There are maybe a few other cases of similar interference I can think of, but it's usually been with a Duex ribbon cable or PanelDue picking up some noise.
The fact you're getting an improvement by increasing the grounding is promising though.
-
@zakm0n I grab Vin and GND for the Pi's buck converter relatively close to the Duet's Vin power connector in order to minimise potential interference.
I've been printing A LOT in SBC mode (countless prints > 7h) and I have not observed any connection drops on my setup. Do you get CRC errors when you bend or slightly twist the ribbon cable? If yes, try replacing it - a bad cable is the most plausible reason for your connection drops.
-
@chrishamm The Pi currently has an external wall wart. The ribbon cable isn't in a place where it can be moved easily, so that hasn't been something I can test. Adding the grounding strap to the Berd pump got me from 10 minutes to 4 hours though, so I'm thinking grounding and shielding is where I should be focusing.
-
@phaedrux So, I managed to get the ribbon cable wrapped in metal foil tape and successfully printed a 6 hour print yesterday. I tried starting a print today and only made it 10 minutes. It's like all of the gains I made by grounding and such just disappeared. This time there seemed to be a bunch of rapid connects and disconnects in the console on my PanelDue.
-
And just to verify: if you have the Berd-Air completely off, do the prints still consistently complete without issue?