'SPI connection has been reset' while tuning
-
Hey all, have been attempting to do some motor tuning with the 1HCL boards, but more often than not I am getting "SPI connection has been reset", usually in the middle of the motor performing a tuning move with data collection.
This is on a Mini 5+ with a Pi4 running all latest updates. The Mini is on 3.5.0-rc.3 along with a TOOL1LC, and two EXP1HCLs which are running the X and Y for the CoreXY motion of this RatRig.
The Mini 5 is running on a 24V Mean Well supply, and the 1HCLs are on a 48V supply; the two supplies are tied together with a common ground. The Pi is running the Duet Pi image so it has the wifi fixes integrated.
The motors are StepperOnline 1000CPR NEMA 17 steppers and have been working fine through calibration and numerous prints. However, a few times now I've been trying to tune these motors (which seem to respond oddly to tuning), and many times when I command a move with the Closed Loop plugin, I receive the above error during the move.
I've tried swapping Pis, re-orienting the ribbon cable, reseating it and a few other things, but I still have the same issue. Not sure where to go from here, as I've tried the troubleshooting guide but gotten nowhere.
Here's my config file. I perform the motor startup calibration in homeall.g:
config.g
homeall.g
-
@Chorca Do you have any heater/stepper cables running next to the ribbon cable? If yes, try moving them away. Also, the output of M122 after a reset may help.
-
@chrishamm
I don't have any heaters or anything nearby. The only steppers the Mini 5 is controlling are on the Z axis, the heaters are controlled up on the toolhead, and the bed heater is controlled via an SSR (both are off during this process).
The part fan is also connected here, but is off during the tuning process. The 1HCLs are about 12" away from the board, along with their power supply, near the stepper motors driving the X/Y.
I've included a couple photos to show the setup, and added the M122 output here:
-
@Chorca You seem to have mounted all your electronics on an acrylic plate. Did you check whether there might be a potential difference between the boards connected via SPI? Does the issue get better if you connect the GND of the PCBs to the metal parts of the printer with a sufficiently thick wire to rule that out?
-
Big plastic sheets like that have a tendency to build up a heck of a static electric charge.
-
@NeoDue
A good suggestion!
I just went through and made sure all power supplies have their negative rails connected together, and tied the connected negative rails to earth ground at the power supply closest to the frame. I ensured all grounds are tied together star style and are connected to earth ground via the outlet. I tied the frame to ground as well and validated that it has low resistance to ground. Each board's negative is tied to ground now, and I added a 5V 10A Mean Well supply for the Raspberry Pi to ensure it's getting its own clean power supply that's also at the same ground potential.
After doing all that, I am still getting the SPI errors when performing moves in the Closed Loop plugin.
I've attached the M122 from the current instance of the issue, but I assume it's similar to the first one.
m122-2.txt -
-
Can you please provide some photos of your wiring? Including the power to the 1HCL boards.
-
@Phaedrux
Sure thing:
-
@Chorca It's rather difficult to see, can you confirm that the termination jumpers have been removed from the 1HCL on the left? And can you confirm that you have CAN termination on the TOOL1LC provided you don't have a tool distribution board somewhere else? If not, see here -> Connecting WITHOUT Duet 3 Tool Distribution Board.
Unfortunately your M122 outputs don't show anything interesting, not even an SPI reset. Please send M122 immediately after an SPI connection reset if you can.
-
@chrishamm I've validated that the 1HCL boards do NOT have the termination jumpers connected, and the 1LC board DOES have its jumper soldered.
I attempted to grab a M122 as fast as possible after the reset this time, and I sent the command 1 second after the notification popped up on the screen.
-
@Chorca I am not too deep into what the M122 output tells us here, but if I interpret that correctly, the Duet is running config.g at the time and just finished a software reset with the reason "User" - a combination which I would interpret in such a way that the Duet actually got the command to reset itself from somewhere - which then logically killed the SPI connection for a brief time.
Just to rule that out: I have no clue what you did run at that time in order to tune the printer, but if any kind of macro (including any system macros) was involved: did you check if there might be any misplaced code in there that could cause the reset?
Edit - one more thing, thinking about your plexiglass mounting plate: do you happen to have a physical reset button or such added / configured?
-
@NeoDue I was running a "Record" command from the Closed Loop tuning plugin with these settings:
Upon clicking Record, there's about a 20% chance the move will begin, and while the move is taking place, the printer will stop moving and the board resets.
After going through plenty of motor tuning, the settings here don't seem to have any effect on the SPI errors. I'm assuming it may have to do with the motor data being transferred out of the controller to the Pi, but I'm not sure whether that would cause a reset or whether some other error is causing it.
I don't have any physical reset buttons installed anywhere, just the board itself.
Would any CAN bus commands from other boards cause a reset of the main Duet board?
-
@Chorca Okay, that excludes any EMF issues with an external reset button.
Then I fear I am out. I do not know the plugin you use, and I do not use CAN with my Duet either.
As I understand your initial post, the setup is new and was not tested before. Therefore, if the printer still works that way, I would simplify the system by disabling and physically removing any parts you deem a possible cause, then repeating your tests. If the error no longer occurs, you know that one of the pieces you unplugged and disabled is the culprit: plug some back in, retry, and so on until you have narrowed down the issue.
While you do that, hopefully someone who knows more about this plugin can check the software side.
-
@NeoDue Sadly, the change here was to move to closed-loop stepper control via the 1HCLs, and the closed-loop tuning is what seems to be instigating this issue, as I've never had this error occur outside running the closed loop tuning commands, so it seems like it's been narrowed down to something going on here.
I appreciate the help!
-
A bit more info from attempting to tune the motors again today. It seems that if the motor is oscillating a bit, i.e. running a low P (<20) with all other parameters set to zero, the resets happen more often and I can't get more than about 300ms of data from the board, but setting higher values seems to make it happen less. An odd thing for sure.
-
Spent some more time on this tonight.
I disconnected everything not necessary from the Duet board, leaving only power, 1 endstop, and the Z stepper motors, along with the CAN bus and SPI cable. Curiously I saw a reset or two happen without movement.
I hooked up a logic analyzer to the SPI lines and checked those. I also enabled debug logging via the USB port.
I checked the Duet's 3.3V line for any issues, spikes, noise, etc., but it is very quiet, and I saw no noise on it during motor movements, so I feel like I've ruled out a brownout on the processor. I looked at some other lines but similarly didn't see much noise at all on them.
Curiously, with debug enabled, the reset errors became much less frequent, which makes me wonder if there's a race or something that's causing a reset of the board.
I've included a CSV of the SPI logging, and a text log from the Duet's serial output while the issue happened. The reset is at the end of the files; I stopped logging right after it died. I have the Saleae Logic file, which I can send on request as the forum doesn't support that filetype.
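For anyone wanting to scan a logic-analyzer CSV like the one mentioned above, here is a minimal Python sketch that looks for long pauses on the SPI clock line. The column names `time_s` and `sclk`, the 50 ms threshold, and the sample rows are all assumptions made for illustration; they are not taken from the attached file, and a real export would need its own column names passed in.

```python
import csv
import io


def clock_gaps(csv_text, time_col="time_s", clk_col="sclk", min_gap_s=0.05):
    """Return (start, end) pairs where the clock level did not change
    for at least min_gap_s seconds. Column names are hypothetical."""
    reader = csv.DictReader(io.StringIO(csv_text))
    gaps = []
    last_edge = None   # time of the most recent level change
    last_level = None  # clock level seen at that time
    for row in reader:
        t = float(row[time_col])
        level = row[clk_col].strip()
        if last_level is None:
            last_edge, last_level = t, level
            continue
        if level != last_level:
            if t - last_edge >= min_gap_s:
                gaps.append((last_edge, t))
            last_edge, last_level = t, level
    return gaps


# Made-up sample data: regular clock edges, then a pause of about 247 ms.
SAMPLE = """time_s,sclk
0.000,0
0.001,1
0.002,0
0.003,1
0.250,0
0.251,1
"""

gaps = clock_gaps(SAMPLE)
for start, end in gaps:
    print(f"no clock activity from {start:.3f}s to {end:.3f}s")
```

Note that this only reports pauses that are followed by another clock edge, so a bus that goes silent at the very end of the capture would have to be checked separately against the last timestamp in the file.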
-
@Chorca okay, that should indeed rule out the "simple" reason I suspected. Then I cross my fingers the devs here will be able to find the issue with your data!
(Probably a bit far-fetched, but anyway: I guess the oscilloscope you used to check the lines is fast enough to let you see possible spikes? This might only apply if you are using a rather antique and/or slow piece of equipment (as I do... ), but it does not hurt to mention it anyway before anyone starts chasing ghosts...)
-
@NeoDue It is indeed a 250 MHz scope, and I tested at a few speeds and triggers to see if there were any detectable spikes.
I saw a mention in another thread about verifying whether NC endstops are used, so I'll double-check that too.
-
@Chorca Can you confirm that this issue persists with 3.5.0-rc.4?
-
@chrishamm It appears to be happening much less with 3.5.1, though I was able to trigger an SPI reset; it happened after about 40 moves instead of one or two.
Instead, it appears the data capture is being cut off somewhere, as the received data should be 1 second long but I am only getting around 150-350ms of data back for each run. Once in a while (about every 10 runs) I get a complete data log.
Here's a screenshot showing my settings and the returned data:
EDIT: Looks like I spoke too soon; it seems to be resetting just as often on longer moves (100mm+).
I also validated that the endstop wasn't the culprit by unplugging it after homing. It still resets without the endstop connected.
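To put numbers on the truncated captures described above (1 second expected, roughly 150-350ms received), here is a minimal Python sketch that compares the time span of a recorded run against the expected length. The function name, the 5% tolerance, and the sample timestamps are all hypothetical; how the timestamps are exported from the plugin is not covered here.

```python
def capture_summary(timestamps_ms, expected_ms=1000.0, tolerance=0.05):
    """Summarise how much of the expected capture window was received.

    timestamps_ms: sample times in milliseconds, in recording order.
    tolerance: allowed shortfall (as a fraction) before a run counts
    as truncated. Both defaults are assumptions.
    """
    if len(timestamps_ms) < 2:
        return {"duration_ms": 0.0, "fraction": 0.0, "complete": False}
    duration = float(timestamps_ms[-1] - timestamps_ms[0])
    fraction = duration / expected_ms
    return {
        "duration_ms": duration,
        "fraction": fraction,
        "complete": fraction >= 1.0 - tolerance,
    }


# Made-up example runs: one cut off at 250 ms, one full-length.
runs = {"run_a": [0, 100, 250], "run_b": [0, 500, 1000]}
for name, ts in runs.items():
    s = capture_summary(ts)
    status = "complete" if s["complete"] else "truncated"
    print(f"{name}: {s['duration_ms']:.0f} ms ({status})")
```

Logging this per run, along with the move length and tuning parameters, could make it easier to see whether truncation tracks the move length or the low-P oscillation mentioned earlier.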