6HC+2x 3HC+SBC. 3.2 Crashes at high print speed.
-
@jrockland said in High performance issues.. Not bashing here, just a question.:
Error: M550: Machine name must consist of the same letters and digits as configured by the Linux hostname
Fan 2, speed: 0%, min: 10%, max: 100%, blip: 0.10
The first one is pretty self-explanatory. The Fan 2 line appears because you have M106 P2 with no other parameters, so it's just reporting the settings of fan 2.
I should have also asked if you could send M122 B1 and M122 B2 to check the expansion boards as well.
@jrockland said in High performance issues.. Not bashing here, just a question.:
Last software reset at 2021-01-21 19:47, reason: MemoryProtectionFault mmarValid daccViol, GCodes spinning, available RAM 145604, slot 2
Was that M122 captured after it crashed?
-
@Phaedrux
I get that M550 error on all my machines; I never took the time to look into it. I will launch another "test run" tomorrow, then take the M122 from all the boards when it crashes.
-
Use M122 B1 and M122 B2 before your test to make sure the expansions are running 3.2 as well. If not, send M997 B1 and M997 B2 to flash them as well.
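If you save the M122 B1 and M122 B2 replies to text, the version check can be scripted. This is only a minimal sketch: it assumes the reply contains a line shaped like "Duet EXP3HC firmware version 3.2 (2021-01-05)", and the function name `expansion_firmware` is made up for illustration.

```python
import re

# Assumed line shape: "Duet EXP3HC firmware version 3.2 (2021-01-05)"
EXP_VERSION = re.compile(r"firmware version (\S+) \((\d{4}-\d{2}-\d{2})\)")


def expansion_firmware(m122_text):
    """Return (version, build_date) found in an M122 B<n> reply, or None."""
    match = EXP_VERSION.search(m122_text)
    return match.groups() if match else None


if __name__ == "__main__":
    reply = "Diagnostics for board 1:\nDuet EXP3HC firmware version 3.2 (2021-01-05)\n"
    print(expansion_firmware(reply))
```

Comparing the tuples returned for each board shows quickly whether any expansion board still needs M997.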
-
@Phaedrux
So I flashed them to the latest update (they were at 2020-12-20, they are now at 2021-01-05),
and here are my M122 and M122 B1/B2:
22/01/2021, 11:47:39 m122
=== Diagnostics ===
RepRapFirmware for Duet 3 MB6HC version 3.2 running on Duet 3 MB6HC v1.01 or later (SBC mode)
Board ID: 08DJM-956L2-G43S4-6J1F8-3SD6T-9V56D
Used output buffers: 1 of 40 (16 max)
=== RTOS ===
Static ram: 149788
Dynamic ram: 63132 of which 56 recycled
Never used RAM 145856, free system stack 200 words
Tasks: Linux(ready,111) HEAT(blocked,302) CanReceiv(blocked,848) CanSender(blocked,371) CanClock(blocked,352) TMC(blocked,53) MAIN(running,1202) IDLE(ready,19)
Owned mutexes: HTTP(MAIN)
=== Platform ===
Last reset 00:01:19 ago, cause: power up
Last software reset at 2021-01-22 11:41, reason: User, none spinning, available RAM 145356, slot 1
Software reset code 0x0012 HFSR 0x00000000 CFSR 0x00000000 ICSR 0x00400000 BFAR 0x00000000 SP 0x00000000 Task Linu Freestk 0 n/a
Error status: 0x00
Aux0 errors 0,0,0
Aux1 errors 0,0,0
MCU temperature: min 38.9, current 39.5, max 39.7
Supply voltage: min 24.1, current 24.2, max 24.2, under voltage events: 0, over voltage events: 0, power good: yes
12V rail voltage: min 12.1, current 12.1, max 12.1, under voltage events: 0
Driver 0: position 0, standstill, reads 54751, writes 14 timeouts 0, SG min/max 0/0
Driver 1: position 0, standstill, reads 54751, writes 14 timeouts 0, SG min/max 0/0
Driver 2: position 0, standstill, reads 54752, writes 14 timeouts 0, SG min/max 0/0
Driver 3: position 0, standstill, reads 54752, writes 14 timeouts 0, SG min/max 0/0
Driver 4: position 0, standstill, reads 54752, writes 14 timeouts 0, SG min/max 0/0
Driver 5: position 0, standstill, reads 54752, writes 14 timeouts 0, SG min/max 0/0
Date/time: 2021-01-22 11:47:38
Slowest loop: 0.41ms; fastest: 0.05ms
=== Storage ===
Free file entries: 10
SD card 0 not detected, interface speed: 37.5MBytes/sec
SD card longest read time 0.0ms, write time 0.0ms, max retries 0
=== Move ===
DMs created 125, maxWait 0ms, bed compensation in use: none, comp offset 0.000
=== MainDDARing ===
Scheduled moves 0, completed moves 0, hiccups 0, stepErrors 0, LaErrors 0, Underruns [0, 0, 0], CDDA state -1
=== AuxDDARing ===
Scheduled moves 0, completed moves 0, hiccups 0, stepErrors 0, LaErrors 0, Underruns [0, 0, 0], CDDA state -1
=== Heat ===
Bed heaters = -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1, chamberHeaters = -1 -1 -1 -1
=== GCodes ===
Segments left: 0
Movement lock held by null
HTTP* is doing "M122" in state(s) 0
Telnet is idle in state(s) 0
File is idle in state(s) 0
USB is idle in state(s) 0
Aux is idle in state(s) 0
Trigger* is idle in state(s) 0
Queue is idle in state(s) 0
LCD is idle in state(s) 0
SBC is idle in state(s) 0
Daemon is idle in state(s) 0
Aux2 is idle in state(s) 0
Autopause is idle in state(s) 0
Code queue is empty.
=== CAN ===
Messages queued 212, send timeouts 0, received 18, lost 0, longest wait 1ms for reply type 6042, free buffers 48
=== SBC interface ===
State: 4, failed transfers: 0
Last transfer: 1ms ago
RX/TX seq numbers: 1725/1725
SPI underruns 0, overruns 0
Number of disconnects: 0, IAP RAM available 0x2c8a8
Buffer RX/TX: 0/0-0
=== Duet Control Server ===
Duet Control Server v3.2.0
Code buffer space: 4096
Configured SPI speed: 8000000 Hz
Full transfers per second: 4.15
Maximum length of RX/TX data transfers: 4102/812
22/01/2021, 11:48:26 m122 b2
Diagnostics for board 2:
Duet EXP3HC firmware version 3.2 (2021-01-05)
Bootloader ID: not available
Never used RAM 154848, free system stack 200 words
HEAT 77 CanAsync 94 CanRecv 87 TMC 64 MAIN 313 AIN 257
Last reset 00:02:07 ago, cause: reset button
Last software reset data not available
Driver 0: position 0, 320.0 steps/mm, standstill, reads 14960, writes 16 timeouts 0, SG min/max 0/0
Driver 1: position 0, 320.0 steps/mm, standstill, reads 14962, writes 16 timeouts 0, SG min/max 0/0
Driver 2: position 0, 80.0 steps/mm, standstill, reads 14970, writes 11 timeouts 0, SG min/max 0/0
Moves scheduled 0, completed 0, in progress 0, hiccups 0
No step interrupt scheduled
VIN: 24.2V, V12: 12.1V
MCU temperature: min 37.3C, current 37.5C, max 37.5C
Ticks since heat task active 248, ADC conversions started 127240, completed 127238, timed out 0
Last sensors broadcast 0x00000000 found 0 1 ticks ago, loop time 0
CAN messages queued 28, send timeouts 0, received 1035, lost 0, free buffers 36
22/01/2021, 11:48:09 m122 b1
Diagnostics for board 1:
Duet EXP3HC firmware version 3.2 (2021-01-05)
Bootloader ID: not available
Never used RAM 154848, free system stack 200 words
HEAT 102 CanAsync 94 CanRecv 87 TMC 64 MAIN 319 AIN 257
Last reset 00:01:50 ago, cause: reset button
Last software reset data not available
Driver 0: position 0, 320.0 steps/mm, standstill, reads 53562, writes 16 timeouts 0, SG min/max 0/0
Driver 1: position 0, 320.0 steps/mm, standstill, reads 53565, writes 16 timeouts 0, SG min/max 0/0
Driver 2: position 0, 80.0 steps/mm, standstill, reads 53573, writes 11 timeouts 0, SG min/max 0/0
Moves scheduled 0, completed 0, in progress 0, hiccups 0
No step interrupt scheduled
VIN: 24.2V, V12: 12.3V
MCU temperature: min 36.6C, current 36.6C, max 36.6C
Ticks since heat task active 50, ADC conversions started 110042, completed 110041, timed out 0
Last sensors broadcast 0x00000000 found 0 53 ticks ago, loop time 0
CAN messages queued 28, send timeouts 0, received 880, lost 0, free buffers 36
-
@Phaedrux I realized something else: the Pi actually reboots when the board shuts down; it doesn't just lose power.
-
Now that they are all on 3.2 hopefully that will show an improvement.
If it does crash again, get an M122 afterwards and set up this monitoring on the Pi.
https://duet3d.dozuki.com/Wiki/Getting_Started_With_Duet_3#Section_Monitoring_optional
-
@Phaedrux
I did, but they didn't give me any info, so I did it again and recorded the whole thing.
video here (1:30 min)
The video is not published publicly; you can only see it from the link. I didn't want people to get the wrong idea. You can see at 1:20 that everything just freezes as the whole system quits.
When I reboot I get no error message or anything; it just boots as usual.
-
And when it boots up again, gather an M122 so we can see what the last reset reason was.
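The line to look for is "Last software reset at ...". A rough sketch for pulling it out of a saved M122 dump follows; the regular expression is an assumption based only on the line format seen earlier in this thread, and `parse_last_reset` is a hypothetical helper name.

```python
import re

# Assumed format, taken from the M122 output quoted earlier in this thread:
# "Last software reset at <date> <time>, reason: <reason>, <module> spinning,
#  available RAM <n>, slot <n>"
RESET_LINE = re.compile(
    r"Last software reset at (?P<when>\d{4}-\d{2}-\d{2} \d{2}:\d{2}), "
    r"reason: (?P<reason>.+?), "
    r"(?P<spinning>\S+) spinning, "
    r"available RAM (?P<ram>\d+), slot (?P<slot>\d+)"
)


def parse_last_reset(m122_text):
    """Return the reset fields as a dict, or None if the line is absent."""
    match = RESET_LINE.search(m122_text)
    if match is None:
        return None
    info = match.groupdict()
    info["ram"] = int(info["ram"])
    info["slot"] = int(info["slot"])
    return info


if __name__ == "__main__":
    line = ("Last software reset at 2021-01-21 19:47, reason: "
            "MemoryProtectionFault mmarValid daccViol, GCodes spinning, "
            "available RAM 145604, slot 2")
    print(parse_last_reset(line))
```

Comparing the `when` field against the time of the crash tells you whether the reported reset is the new one or a leftover from an earlier run.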
-
@Phaedrux
m122
=== Diagnostics ===
RepRapFirmware for Duet 3 MB6HC version 3.2 running on Duet 3 MB6HC v1.01 or later (SBC mode)
Board ID: 08DJM-956L2-G43S4-6J1F8-3SD6T-9V56D
Used output buffers: 1 of 40 (16 max)
=== RTOS ===
Static ram: 149788
Dynamic ram: 63132 of which 56 recycled
Never used RAM 145856, free system stack 200 words
Tasks: Linux(ready,107) HEAT(blocked,302) CanReceiv(blocked,848) CanSender(blocked,371) CanClock(blocked,352) TMC(blocked,53) MAIN(running,1189) IDLE(ready,19)
Owned mutexes: HTTP(MAIN)
=== Platform ===
Last reset 00:04:03 ago, cause: power up
Last software reset at 2021-01-22 13:19, reason: HardFault undefInstr, Platform spinning, available RAM 145856, slot 2
Software reset code 0x4060 HFSR 0x40000000 CFSR 0x00010000 ICSR 0x0440f803 BFAR 0x00000000 SP 0x2041fca8 Task MAIN Freestk 1720 ok
Stack: 0008eddf 00000000 204180d8 20420d58 000000af 00462ac7 00462ac6 610f0000 a5a5a5a5 a5a5a5a5 00000000 0046bd7f 00000001 fffc0000 204193df 00000000 2041dbcc a5a5a5a5 1a28e11f a5a5a5a5 a5a5a5a5 a5a5a5a5 a5a5a5a5 004170a1 204276c0 00462ac3 00000000
Error status: 0x00
Aux0 errors 0,0,0
Aux1 errors 0,0,0
MCU temperature: min 18.5, current 34.4, max 34.5
Supply voltage: min 24.2, current 24.2, max 24.3, under voltage events: 0, over voltage events: 0, power good: yes
12V rail voltage: min 12.0, current 12.1, max 12.1, under voltage events: 0
Driver 0: position 0, standstill, reads 51221, writes 14 timeouts 0, SG min/max 0/0
Driver 1: position 0, standstill, reads 51221, writes 14 timeouts 0, SG min/max 0/0
Driver 2: position 0, standstill, reads 51221, writes 14 timeouts 0, SG min/max 0/0
Driver 3: position 0, standstill, reads 51222, writes 14 timeouts 0, SG min/max 0/0
Driver 4: position 0, standstill, reads 51222, writes 14 timeouts 0, SG min/max 0/0
Driver 5: position 0, standstill, reads 51222, writes 14 timeouts 0, SG min/max 0/0
Date/time: 2021-01-22 14:04:38
Slowest loop: 0.47ms; fastest: 0.06ms
=== Storage ===
Free file entries: 10
SD card 0 not detected, interface speed: 37.5MBytes/sec
SD card longest read time 0.0ms, write time 0.0ms, max retries 0
=== Move ===
DMs created 125, maxWait 0ms, bed compensation in use: none, comp offset 0.000
=== MainDDARing ===
Scheduled moves 0, completed moves 0, hiccups 0, stepErrors 0, LaErrors 0, Underruns [0, 0, 0], CDDA state -1
=== AuxDDARing ===
Scheduled moves 0, completed moves 0, hiccups 0, stepErrors 0, LaErrors 0, Underruns [0, 0, 0], CDDA state -1
=== Heat ===
Bed heaters = -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1, chamberHeaters = -1 -1 -1 -1
=== GCodes ===
Segments left: 0
Movement lock held by null
HTTP* is doing "M122" in state(s) 0
Telnet is idle in state(s) 0
File is idle in state(s) 0
USB is idle in state(s) 0
Aux is idle in state(s) 0
Trigger* is idle in state(s) 0
Queue is idle in state(s) 0
LCD is idle in state(s) 0
SBC is idle in state(s) 0
Daemon is idle in state(s) 0
Aux2 is idle in state(s) 0
Autopause is idle in state(s) 0
Code queue is empty.
=== CAN ===
Messages queued 860, send timeouts 0, received 18, lost 0, longest wait 1ms for reply type 6018, free buffers 48
=== SBC interface ===
State: 4, failed transfers: 0
Last transfer: 1ms ago
RX/TX seq numbers: 7514/7514
SPI underruns 0, overruns 0
Number of disconnects: 0, IAP RAM available 0x2c8a8
Buffer RX/TX: 0/0-0
=== Duet Control Server ===
Duet Control Server v3.2.0
Code buffer space: 4096
Configured SPI speed: 8000000 Hz
Full transfers per second: 2.71
Maximum length of RX/TX data transfers: 4102/904
-
@Phaedrux
I think we have our issue here: Last software reset at 2021-01-22 13:19, reason: HardFault undefInstr, Platform spinning, available RAM 145856, slot 2.
Slot 2 means the second 3HC in line, right?
I'm having some issues with it losing connection (the red LED starts flashing fast). I didn't bother with it, as it does the exact same job as the first 3HC and the print kept going anyway.
-
I've flagged this for review. Will just need to wait for time zones before anyone can take a look.
-
@Phaedrux
I'm getting that fault every 10 min now while trying normal print speed:
HardFault undefInstr, Platform spinning, available RAM 145856, slot 2
since I updated the boards. I'll just shut it down for now. No worries, I know we are not in the same time zone. Thanks for the help.
-
If you disconnect that tool board entirely does the problem go with it?
-
@Phaedrux
Seems like it.
Going by my config.g file, do you think I can move X-Y to the main board and the 4 Z motors to the expansion boards? (An older version wouldn't accept my Z sensor on expansion boards, and the sensor also won't be on the same board as all the Z motors: only 3 ports, 4 motors.) But that would solve the communication issue, as Z barely receives anything compared to X and Y.
-
@jrockland, there have been fixes to CAN communication at high movement rates in 3.3beta. But I don't think any of them accounts for the reset you are seeing. I will analyse the reset data.
-
I've looked at the reset data. It returned from a function call, but then complained that the instruction following the call wasn't valid - whereas it certainly is in the firmware. So either the flash memory or the cache memory has returned the wrong data. The most likely cause I can think of for this happening is a power brownout.
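For anyone wanting to cross-check the register values in the M122 report, here is a small sketch that names the bits set in HFSR and CFSR. It assumes the values M122 prints are the raw ARM Cortex-M fault status registers; the bit positions are the standard ARMv7-M ones and the `decode` helper is made up for illustration.

```python
# Standard ARMv7-M bit positions; assumes M122 prints the raw register values.
HFSR_BITS = {1: "VECTTBL", 30: "FORCED", 31: "DEBUGEVT"}

CFSR_BITS = {
    # MemManage fault status (bits 0-7)
    0: "IACCVIOL", 1: "DACCVIOL", 3: "MUNSTKERR", 4: "MSTKERR",
    5: "MLSPERR", 7: "MMARVALID",
    # BusFault status (bits 8-15)
    8: "IBUSERR", 9: "PRECISERR", 10: "IMPRECISERR", 11: "UNSTKERR",
    12: "STKERR", 13: "LSPERR", 15: "BFARVALID",
    # UsageFault status (bits 16-31)
    16: "UNDEFINSTR", 17: "INVSTATE", 18: "INVPC", 19: "NOCP",
    24: "UNALIGNED", 25: "DIVBYZERO",
}


def decode(value, bit_names):
    """Return the names of all known flag bits that are set in value."""
    return [name for bit, name in sorted(bit_names.items()) if value & (1 << bit)]


if __name__ == "__main__":
    # Values from the "Software reset code" line in the M122 posted above.
    print(decode(0x40000000, HFSR_BITS))  # ['FORCED']
    print(decode(0x00010000, CFSR_BITS))  # ['UNDEFINSTR']
```

FORCED plus UNDEFINSTR is consistent with the "HardFault undefInstr" reason text: a usage fault for an undefined instruction that was escalated to a hard fault.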
-
@dc42 so we are looking at the board's power controller failing at high rates? As for the power supplies, they are stable at +/- 0.15 V during the whole process, and under 30% load.
What can we do to stabilize the power distribution on the board?
-
@dc42 my friend is bringing over his Marlin boards, as we still want to see stability vs speed of the machine. We will be using the same power supply for comparison.
Also, I now need to downgrade the boards, as they have not been usable since I updated.
-
@dc42 it might actually be power distribution to the SBC. I know they are unstable and I wouldn't trust a Raspberry Pi to be anywhere near a functional tool.
I will look into it.
-
@jrockland
Negative, same problem with the SBC self-powered.