Restart DuetControlServer (SBC)
-
Yes, there is, but only one occurrence. I saw it the other day as well.
Mar 1 09:39:52 duet3 DuetControlServer[30541]: [info] Settings loaded
Mar 1 09:39:52 duet3 DuetControlServer[30541]: [info] Environment initialized
Mar 1 09:39:53 duet3 DuetControlServer[30541]: [fatal] Could not connect to Duet (Timeout while waiting for transfer ready pin)
Mar 1 09:39:53 duet3 systemd[1]: duetcontrolserver.service: Failed with result 'protocol'.
Mar 1 09:39:53 duet3 systemd[1]: Failed to start Duet Control Server.
Mar 1 09:39:53 duet3 systemd[1]: duetcontrolserver.service: Service RestartSec=100ms expired, scheduling restart.
Mar 1 09:39:53 duet3 systemd[1]: duetcontrolserver.service: Scheduled restart job, restart counter is at 5.
Mar 1 09:39:53 duet3 systemd[1]: Stopped Duet Control Server.
Mar 1 09:39:53 duet3 systemd[1]: duetcontrolserver.service: Start request repeated too quickly.
Mar 1 09:39:53 duet3 systemd[1]: duetcontrolserver.service: Failed with result 'protocol'.
Mar 1 09:39:53 duet3 systemd[1]: Failed to start Duet Control Server.
Mar 1 09:39:53 duet3 DuetWebServer[589]: #033[40m#033[1m#033[33mwarn#033[39m#033[22m#033[49m: DuetWebServer.Services.ModelObserver[0]
Mar 1 09:39:53 duet3 DuetWebServer[589]: Failed to synchronize machine model
Mar 1 09:39:53 duet3 DuetWebServer[589]: System.Net.Internals.SocketExceptionFactory+ExtendedSocketException (111): Connection refused /var/run/dsf/dcs.sock
Mar 1 09:39:53 duet3 DuetWebServer[589]: at System.Net.Sockets.Socket.DoConnect(EndPoint endPointSnapshot, SocketAddress socketAddress)
Mar 1 09:39:53 duet3 DuetWebServer[589]: at System.Net.Sockets.Socket.Connect(EndPoint remoteEP)
Mar 1 09:39:53 duet3 DuetWebServer[589]: at DuetAPIClient.BaseConnection.Connect(ClientInitMessage initMessage, String socketPath, CancellationToken cancel$
Mar 1 09:39:53 duet3 DuetWebServer[589]: at DuetWebServer.Services.ModelObserver.Execute() in /home/christian/Duet3D/DuetSoftwareFramework/src/DuetWebServe$
And when I run systemctl status duetcontrolserver with the Duet powered on and the Pi showing a Duet status of Off, I get:
Mar 01 09:53:30 duet3 DuetControlServer[31177]: [warn] Note: RepRapFirmware didn't receive valid data either (code 0xffffffff)
Mar 01 09:53:30 duet3 DuetControlServer[31177]: [warn] Restarting transfer because the number of maximum retries has been exceeded
Mar 01 09:53:30 duet3 DuetControlServer[31177]: [warn] Bad header CRC16 (expected 0x0000, got 0x0272)
Mar 01 09:53:30 duet3 DuetControlServer[31177]: [warn] Note: RepRapFirmware didn't receive valid data either (code 0xffffffff)
Mar 01 09:53:31 duet3 DuetControlServer[31177]: [warn] Bad header CRC16 (expected 0x0000, got 0x0272)
Mar 01 09:53:31 duet3 DuetControlServer[31177]: [warn] Note: RepRapFirmware didn't receive valid data either (code 0xffffffff)
Mar 01 09:53:31 duet3 DuetControlServer[31177]: [warn] Bad header CRC16 (expected 0x0000, got 0x136c)
Mar 01 09:53:31 duet3 DuetControlServer[31177]: [warn] Note: RepRapFirmware didn't receive valid data either (code 0xffffffff)
Mar 01 09:53:31 duet3 DuetControlServer[31177]: [warn] Restarting transfer because the number of maximum retries has been exceeded
Mar 01 10:00:55 duet3 DuetControlServer[31177]: [warn] RepRapFirmware got a bad header checksum
-
If you get that message it basically means that systemd will no longer restart the duetcontrolserver service.
The systemd documentation is a little tricky to follow but this message is generated if the service needs to be restarted more than a certain number of times over a period of time. In this case I think the defaults are 5 times in 10 seconds. I think this can happen if the Duet is turned off and the transfer ready pin floats high.
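To make the arithmetic concrete, here is a rough sketch of that rate limit as a shell function. It is a simplified model, not systemd's actual code: it assumes start attempts are evenly spaced, and the 500 ms run time before each failure is a guess rather than something measured from the logs. The function name hits_start_limit and its arguments are made up for illustration.

```shell
#!/usr/bin/env bash
# Simplified model of systemd start rate limiting (illustration only).
# Args: interval_ms burst period_ms
#   period_ms = assumed time from one start attempt to the next
#               (time the service runs before failing + RestartSec)
hits_start_limit() {
    local interval_ms=$1 burst=$2 period_ms=$3
    # StartLimitIntervalSec=0 turns rate limiting off entirely.
    if [ "$interval_ms" -eq 0 ]; then
        echo no
        return
    fi
    # Start number burst+1 happens burst*period_ms after the first one.
    # If that is still inside the interval, the start is refused.
    if [ $((burst * period_ms)) -le "$interval_ms" ]; then
        echo yes
    else
        echo no
    fi
}

hits_start_limit 10000 5 600    # RestartSec=100ms plus an assumed 500 ms run time
hits_start_limit 10000 5 5500   # a 5 s restart delay plus the same assumed run time
hits_start_limit 0 5 600        # StartLimitIntervalSec=0
```

With the defaults (5 starts in 10 seconds) and a 100 ms restart delay, the sixth start lands well inside the window, so the model says the limit is hit. A restart delay of several seconds, or an interval of 0, avoids it.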
Given that the main purpose of the SBC is to talk to the Duet, I'm not sure it really makes sense to disable the service in this way. If you are happy making changes to the rPi config files (worst case you just reinstall everything), you can tell systemd not to disable a service in this situation by adding the following line:
StartLimitIntervalSec=0
to the [Unit] section of the service control file. In this case that file is:
/etc/systemd/system/multi-user.target.wants/duetcontrolserver.service
If you modify it to look like this:
[Unit]
Description=Duet Control Server
StartLimitIntervalSec=0

[Service]
ExecStart=/opt/dsf/bin/DuetControlServer
TimeoutStopSec=60
Restart=always
Type=notify
User=dsf
Group=dsf
UMask=0002
CapabilityBoundingSet=CAP_SYS_PTRACE CAP_DAC_READ_SEARCH CAP_SYS_TIME
AmbientCapabilities=CAP_SYS_PTRACE CAP_DAC_READ_SEARCH CAP_SYS_TIME

[Install]
WantedBy=sysinit.target
Then the action that disables the service should be turned off. Note that after making the change you should run the following commands:
pi@duet3:~ $ sudo systemctl daemon-reload
pi@duet3:~ $ sudo systemctl restart duetcontrolserver
It may be a good idea to make a copy of the file before you edit it.
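The backup and the edit could be scripted roughly as follows. This is only a sketch: it assumes GNU sed (as on Raspberry Pi OS) and a unit file with [Unit] on a line of its own, and the function name add_start_limit_override is made up for illustration. The demo runs on a scratch file, not the live service file.

```shell
#!/usr/bin/env bash
# Sketch: back up a unit file, then add StartLimitIntervalSec=0 under [Unit].
# Args: unit_file backup_file
add_start_limit_override() {
    local unit_file=$1 backup_file=$2
    # Keep a copy of the current contents before touching anything.
    cp -- "$unit_file" "$backup_file" || return 1
    # Do nothing if the setting is already present.
    if grep -q '^StartLimitIntervalSec=' "$unit_file"; then
        return 0
    fi
    # Append the setting right after the [Unit] header. --follow-symlinks
    # edits the link target, so a link in a *.wants directory is not
    # replaced by a regular file.
    sed -i --follow-symlinks '/^\[Unit\]$/a StartLimitIntervalSec=0' "$unit_file"
}

# Demo on a scratch copy rather than the live file:
demo=$(mktemp)
printf '%s\n' '[Unit]' 'Description=Duet Control Server' '[Service]' 'Restart=always' > "$demo"
add_start_limit_override "$demo" "$demo.bak"
cat "$demo"
rm -f "$demo" "$demo.bak"
```

On the Pi you would run it with sudo against the real service file, keep the backup somewhere outside the systemd directories, and then do the daemon-reload and restart shown above.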
I've seen the service be disabled when updating firmware on the STM32/LPC port of RRF (that I look after). In that case flashing new firmware is different to the Duet and the firmware restart takes longer which can sometimes trigger systemd to disable the control server. Making the above change seems to fix that problem for me. It would be interesting to see if it helps with your problem.
-
@BeosDoc said in Restart DuetControlServer (SBC):
Reseated the ribbon cable...no difference.
Do you have another ribbon cable to test with or possibly make some jumper cables?
@gloomyandy said in Restart DuetControlServer (SBC):
StartLimitIntervalSec=0
I've asked christian to take a look at this. Regardless, the Duet and Pi should be making a connection so the retry shouldn't come into play. Seems like either a bad cable or damaged pins.
-
I don't have that file in /etc/systemd/system/multi-user.target.wants
I did find it in /etc/systemd/system/sysinit.target.wants/
Mine at that location is currently:
[Unit] Description=Duet Control Server [Service] ExecStart=/opt/dsf/bin/DuetControlServer TimeoutStopSec=15 Restart=always Type=notify User=dsf Group=dsf UMask=0002 CapabilityBoundingSet=CAP_SYS_PTRACE CAP_DAC_READ_SEARCH CAP_SYS_TIME AmbientCapabilities=CAP_SYS_PTRACE CAP_DAC_READ_SEARCH CAP_SYS_TIME [Install] WantedBy=sysinit.target
-
@Phaedrux said in Restart DuetControlServer (SBC):
@BeosDoc said in Restart DuetControlServer (SBC):
Reseated the ribbon cable...no difference.
Do you have another ribbon cable to test with or possibly make some jumper cables?
No other ribbon cable, just the one supplied with your board, and I have reseated it. I have short jumper wires I can use. I want to verify the required pins: 19, 21, 22, 23, 24, and 20 for ground. Are pin 17 (3.3V) and/or pins 2/4 (5V) required? The SBC has its own power supply, and the Duet has no jumpers fitted on either 5V -> SBC or SBC -> 5V.
-
@Phaedrux My understanding is that the problem occurs when the rPi is started but the Duet board does not have any power. In that situation it is possible that the duetcontrolserver service has been suspended (due to too many restarts) before the Duet board is even started. If that happens there is no way that the Duet is ever going to connect, because it has nothing to connect to. Certainly, if that "duetcontrolserver.service: Start request repeated too quickly" message is in the syslog, it would indicate that the service is being suspended. Either way, the change is easy to make and then test to see if it helps with the problem.
@BeosDoc That's odd. Perhaps it has been moved at some point; my rPi image was created probably a year or so ago (but has been updated continuously since then).
-
To be honest, I'm not sure about the 3.3V and 5V pins. Try without if you're short on jumper wires.
-
I'll try the jumpers first. Then I'll try the changes to that file.
-
First part. I replaced the ribbon cable with very short, 9cm (ribbon is 20cm), and brand new jumpers. Initial boot worked. After that it still had the status of Off.
FYI: You do need pin 17 (3.3V) hooked up. Without it, it didn't see it at all.
For future reference, Duet 3 > rPi: you need pins 17, 19, 21, 22, 23, 24, 26 (not sure about 26) and one of the grounds (like 20 and/or 25).
Now for the second part with the file
-
@gloomyandy said in Restart DuetControlServer (SBC):
Given that the main purpose of the SBC is to talk to the Duet I'm not sure it really makes sense to disable the service in this way. If you are happy making changes to the rPi config files (worst case you just reinstall everything), you can tell systemd not to disable a service in this situation by adding the following line:
StartLimitIntervalSec=0
to the [Unit] section of the service control file. In this case that file is:
/etc/systemd/system/multi-user.target.wants/duetcontrolserver.service
I modified the file /etc/systemd/system/sysinit.target.wants/duetcontrolserver.service
I tried it several times and it didn't see it
-
@Phaedrux said in Restart DuetControlServer (SBC):
I've asked christian to take a look at this. Regardless, the Duet and Pi should be making a connection so the retry shouldn't come into play. Seems like either a bad cable or damaged pins.
If it was a bad cable or damaged pins, then it probably wouldn't work at all or there would be lots of comm errors. If I reboot the Pi or restart DuetControlServer it works fine; I've done several prints.
-
@BeosDoc Are you getting any
systemd[1]: duetcontrolserver.service: Start request repeated too quickly.
messages now in your syslog file? Are you rebooting the rPi to test this?
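A quick way to answer that from the command line is to grep the log for the message. The sketch below counts occurrences and prints the most recent one. The function names are made up for illustration, and the demo uses a small sample built from the excerpt earlier in this thread rather than the real log.

```shell
#!/usr/bin/env bash
# Sketch: look for systemd's start-limit message in a log file.
# The message text is taken from the log excerpt in this thread.
MSG='duetcontrolserver.service: Start request repeated too quickly'

# Print how many lines of the given file contain the message.
count_start_limit_hits() {
    grep -c -F -- "$MSG" "$1"
}

# Print the most recent matching line, if any.
last_start_limit_hit() {
    grep -F -- "$MSG" "$1" | tail -n 1
}

# Demo on a small sample instead of the real log:
sample=$(mktemp)
cat > "$sample" <<'EOF'
Mar 1 09:39:53 duet3 systemd[1]: Stopped Duet Control Server.
Mar 1 09:39:53 duet3 systemd[1]: duetcontrolserver.service: Start request repeated too quickly.
Mar 1 09:39:53 duet3 systemd[1]: Failed to start Duet Control Server.
EOF
count_start_limit_hits "$sample"
last_start_limit_hit "$sample"
rm -f "$sample"
```

On the Pi the argument would be the syslog file itself. /var/log/syslog is the usual location on Raspberry Pi OS, but that path is an assumption here, and reading it may need sudo.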
-
@gloomyandy Thanks for pointing out this limitation, I'll check if I can integrate your change in the next software version. The automatic fail detection wasn't an issue in earlier DSF versions because those always had a restart delay of 5 seconds.
@BeosDoc What I find odd about your report is that you do get occasional transfers (albeit with bad checksums) so I'm wondering if there is something wrong with the SPI peripheral either on the Pi or the Duet. Can you run the spidev_test procedure as described here? I'll update the docs with better troubleshooting instructions soon.
-
@BeosDoc said in Restart DuetControlServer (SBC):
First part. I replaced the ribbon cable with very short, 9cm (ribbon is 20cm), and brand new jumpers. Initial boot worked. After that it still had the status of Off.
FYI: You do need pin 17 (3.3V) hooked up. Without it, it didn't see it at all.
For future reference, Duet 3 > rPi: you need pins 17, 19, 21, 22, 23, 24, 26 (not sure about 26) and one of the grounds (like 20 and/or 25).
To clarify: using Duet 3 MB6HC version 1.01 and later, or a Duet 3 Mini, you do need the 3.3V pin because it feeds the voltage translation buffer. You do not need the 5V pin.
It's better to connect more than one ground, especially the ground pins close to the SPI pins.
-
@gloomyandy said in Restart DuetControlServer (SBC):
@BeosDoc Are you getting any
systemd[1]: duetcontrolserver.service: Start request repeated too quickly.
messages now in your syslog file? Are you rebooting the rPi to test this?
Actually, I don't see any of those messages since I made the change to that file. The last time it appeared was 18:22 on Mar 1, which is before I made the change. I rebooted the Pi after I changed /etc/systemd/system/sysinit.target.wants/duetcontrolserver.service but not since then, and I even did a print last night. The Pi has its own separate power supply and stays on (several reasons: powering off the Pi without a proper shutdown could cause some corruption, and I don't want to wait for it to boot).
-
@BeosDoc Well that's a good sign then and possibly removes one potential source of the problem you are seeing.
-
@chrishamm said in Restart DuetControlServer (SBC):
@BeosDoc What I find odd about your report is that you do get occasional transfers (albeit with bad checksums) so I'm wondering if there is something wrong with the SPI peripheral either on the Pi or the Duet. Can you run the spidev_test procedure as described here? I'll update the docs with better troubleshooting instructions soon.
Here's the results:
pi@duet3:~ $ RDY=22 CS=24 ; {
> gpio -1 mode $CS out
> gpio -1 mode $RDY in
> gpio -1 write $CS 1 && echo "(Pin RDY/$RDY) `gpio -1 read $RDY` should equal `gpio -1 read $CS` (Pin CS/$CS)"
> gpio -1 write $CS 0 && echo "(Pin RDY/$RDY) `gpio -1 read $RDY` should equal `gpio -1 read $CS` (Pin CS/$CS)"
> { ~/spidev-test/spidev_test -v -s 8000000 -D /dev/spidev0.0 && echo RX should equal TX. ;} | tail -n3 | cut -b-100 ;}
(Pin RDY/22) 0 should equal 0 (Pin CS/24)
(Pin RDY/22) 0 should equal 0 (Pin CS/24)
TX | FF FF FF FF FF FF 40 00 00 00 00 95 FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF F0 0D
RX | FF FF FF FF FF FF 40 00 00 00 00 95 FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF F0 0D
RX should equal TX.
pi@duet3:~ $
SPI should be working; I've done several prints. Also, this isn't a good test: it shows that it's working, but 32 bytes isn't long enough to see if there are any errors during a long transmission.
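One partial workaround would be to repeat the same short test many times and compare TX and RX automatically on each run. That does not make any single transfer longer, so it is not a substitute for a real long-transfer test, but it might catch intermittent errors. The sketch below only covers the comparison step; tx_equals_rx is a made-up helper name, and it assumes the output format shown above (lines starting with "TX | " and "RX | ").

```shell
#!/usr/bin/env bash
# Sketch: read spidev_test -v output on stdin and succeed only if the
# TX and RX dumps are both present and identical.
tx_equals_rx() {
    awk '
        /^TX \|/ { sub(/^TX \| /, ""); tx = tx $0 "\n" }
        /^RX \|/ { sub(/^RX \| /, ""); rx = rx $0 "\n" }
        END { exit !(tx != "" && tx == rx) }
    '
}

# Demo on canned output rather than a live SPI transfer:
printf '%s\n' 'TX | FF FF 40 00' 'RX | FF FF 40 00' | tx_equals_rx && echo "match"

# Hypothetical soak loop using the same spidev_test flags as above.
# Assumption: DuetControlServer is stopped first so it does not compete
# for the SPI device.
# for i in $(seq 1 1000); do
#     ~/spidev-test/spidev_test -v -s 8000000 -D /dev/spidev0.0 | tx_equals_rx \
#         || echo "mismatch on run $i"
# done
```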
Is there an SPI test for the Duet 3?