Run a Command Transiently When a Systemd Service Exits
I have a Systemd oneshot unit that runs a backup on a timer. Recently, I performed an update on that machine while the backup was running, and I wanted to reboot after the backup finished. In the past, I might have started a shell loop in a tmux session like this:
mainpid="$(systemctl show --property=MainPID backup-job.service |
    cut -d= -f2)"
while kill -0 "$mainpid"
do
    sleep 10
done; systemctl reboot
This has never failed for me, but it has a race: the backup job could complete during the sleep, and its PID could be reused before the next execution of kill. Since Systemd was supervising the backup task, it would know exactly when it finished. I decided to investigate using a transient unit started with systemd-run to take advantage of this.
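As an aside, the PID-reuse race can also be avoided while keeping the polling loop, by asking Systemd for the unit's state instead of probing a PID. The following is only a sketch: the function name wait_for_oneshot is made up for illustration, and it assumes a oneshot unit, which reports ActiveState=activating while its process is still running.

```shell
# Sketch: poll systemd's view of a oneshot unit instead of a PID.
# Assumption: a running oneshot reports ActiveState=activating.
wait_for_oneshot() {
    unit="$1"
    interval="${2:-10}"   # seconds between polls
    while [ "$(systemctl show --property=ActiveState --value "$unit")" = activating ]
    do
        sleep "$interval"
    done
}

# Usage (not run here):
#   wait_for_oneshot backup-job.service && systemctl reboot
```

This still polls, so it reacts up to one interval late, which is part of why the transient-unit approach is more appealing.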
Running a Command After a Oneshot Exits
Since I didn’t want to wait for a full backup to complete for every iteration of testing this, I created a new oneshot unit to use while developing:
# /etc/systemd/system/test10s.service
[Unit]
Description=Test Service that runs for 10s and then succeeds
[Service]
Type=oneshot
ExecStart=/usr/bin/sleep 10
And then I tested using the After property of the transient unit to run a command when the test10s unit exited:
# systemctl start --no-block test10s
# systemd-run --no-block --property After=test10s.service sh -c 'date > /tmp/datelog'
Running as unit: run-r154afc9c334e467c9b38c2027e1821b4.service
# journalctl --since=-1min -u test10s.service -u run-r154afc9c334e467c9b38c2027e1821b4.service
Jun 11 18:56:27 localhost systemd[1]: Starting Test Service that runs for 10s and then succeeds...
Jun 11 18:56:37 localhost systemd[1]: test10s.service: Succeeded.
Jun 11 18:56:37 localhost systemd[1]: Finished Test Service that runs for 10s and then succeeds.
Jun 11 18:56:37 localhost systemd[1]: Started /usr/bin/sh -c date > /tmp/datelog.
Jun 11 18:56:37 localhost systemd[1]: run-r154afc9c334e467c9b38c2027e1821b4.service: Succeeded.
# cat /tmp/datelog
Tue 11 Jun 2024 06:56:37 PM UTC
This worked perfectly! I also tested this with a unit that exited with a failure and, as I expected from the Systemd documentation, the transient command was also executed in that case.
If you decide you don’t want the transient follow-up command to execute, you can stop the transient unit before the first service finishes (in this case, systemctl stop run-r154afc9c334e467c9b38c2027e1821b4) and the command will never run.
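Copying the generated unit name around is a little awkward. systemd-run accepts a --unit option for choosing the name up front, so a small wrapper can make cancellation easier. This is a sketch rather than something I tested against the backup job: the function name run_after_oneshot and the unit name reboot-after-backup are invented for illustration.

```shell
# Sketch: queue a command behind a oneshot unit under a name we choose,
# so it can be cancelled later without looking up a generated name.
run_after_oneshot() {
    name="$1"     # name for the transient unit (hypothetical, pick your own)
    after="$2"    # the oneshot unit to wait for
    shift 2
    systemd-run --no-block --unit="$name" --property "After=$after" "$@"
}

# Usage (not run here):
#   run_after_oneshot reboot-after-backup backup-job.service systemctl reboot
#   systemctl stop reboot-after-backup.service   # cancel before it fires
```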
Running a Command After a Non-Oneshot Exits
What if I had written my backup job as another service type, like simple or exec, or if for some reason the service had RemainAfterExit=true but would be stopped in some other way?
# /etc/systemd/system/simple10s.service
[Unit]
Description=Simple Service that runs for 10s and then succeeds
[Service]
Type=simple
ExecStart=/usr/bin/sleep 10
This is a trickier proposition. Generally, Systemd considers a unit started once its main process has started (with various differences between the different Type= specifications); only oneshots are “started” once the main process exits. Thus, using After= would have the transient command executed effectively immediately, not after the unit had exited.
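Since the choice of technique hinges on the service type, it can help to check it before picking one. The sketch below reads the unit's Type property with systemctl show; the helper name is_oneshot is made up, and it deliberately ignores special cases such as RemainAfterExit=true.

```shell
# Sketch: succeed only if the given service unit is Type=oneshot,
# i.e. the After= approach would wait for its process to exit.
is_oneshot() {
    [ "$(systemctl show --property=Type --value "$1")" = oneshot ]
}

# Usage (not run here):
#   if is_oneshot backup-job.service; then echo "After= is enough"; fi
```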
I tried a few ways of overcoming this, and none of them were perfect. The first that worked was:
# systemctl start simple10s
# systemd-run --property ExecStopPost="sh -c 'date > /tmp/datelog'" --property BindsTo=simple10s.service --property RemainAfterExit=true true
Running as unit: run-r2867d350197b4a0fb0c2a7791039e991.service
# journalctl --since=-1min -u simple10s.service -u run-r2867d350197b4a0fb0c2a7791039e991.service
-- Journal begins at Tue 2024-04-23 18:57:44 UTC, ends at Tue 2024-06-11 19:34:49 UTC. --
Jun 11 19:34:29 localhost systemd[1]: Started Simple Service that runs for 10s and then succeeds.
Jun 11 19:34:29 localhost systemd[1]: Started /usr/bin/true.
Jun 11 19:34:39 localhost systemd[1]: simple10s.service: Succeeded.
Jun 11 19:34:39 localhost systemd[1]: Stopping /usr/bin/true...
Jun 11 19:34:39 localhost systemd[1]: run-r2867d350197b4a0fb0c2a7791039e991.service: Succeeded.
Jun 11 19:34:39 localhost systemd[1]: Stopped /usr/bin/true.
# cat /tmp/datelog
Tue 11 Jun 2024 07:34:39 PM UTC
The BindsTo= property links the transient service in such a way that when the main service exits, the transient service is stopped as well, thus running its ExecStopPost command.
The problem with this is that there is no way to cancel the transient service without running its command. Because the “real” command is in ExecStopPost, whatever way you try to stop it will still cause this to execute.
It is possible to allow the cancellation of the transient command, but it is a bit more verbose. If you let it run to completion:
# systemctl start simple10s
# systemd-run --property ExecStopPost="sh -c 'if test "\$SERVICE_RESULT" = success; then date >/tmp/datelog; fi'" --property BindsTo=simple10s.service sleep infinity
Running as unit: run-r3da04227980d47eeaf8d5299b8f14ca8.service
# journalctl --since=-1min -u simple10s -u run-r3da04227980d47eeaf8d5299b8f14ca8.service
-- Journal begins at Tue 2024-04-23 18:57:44 UTC, ends at Tue 2024-06-11 20:25:07 UTC. --
Jun 11 20:24:52 localhost systemd[1]: Started Simple Service that runs for 10s and then succeeds.
Jun 11 20:24:52 localhost systemd[1]: Started /usr/bin/sleep infinity.
Jun 11 20:25:02 localhost systemd[1]: simple10s.service: Succeeded.
Jun 11 20:25:02 localhost systemd[1]: Stopping /usr/bin/sleep infinity...
Jun 11 20:25:02 localhost systemd[1]: run-r3da04227980d47eeaf8d5299b8f14ca8.service: Succeeded.
Jun 11 20:25:02 localhost systemd[1]: Stopped /usr/bin/sleep infinity.
# cat /tmp/datelog
Tue 11 Jun 2024 08:25:02 PM UTC
And if you decide to kill it before the main service finishes:
# systemctl start simple10s
# systemd-run --property ExecStopPost="sh -c 'if test "\$SERVICE_RESULT" = success; then date >/tmp/datelog; fi'" --property BindsTo=simple10s.service sleep infinity
Running as unit: run-r231cd647fb0b4839b8ac10e93b9ead68.service
# systemctl kill --signal SIGKILL run-r231cd647fb0b4839b8ac10e93b9ead68.service
# journalctl --since=-1min -u simple10s -u run-r231cd647fb0b4839b8ac10e93b9ead68.service --no-pager
-- Journal begins at Tue 2024-04-23 18:57:44 UTC, ends at Tue 2024-06-11 20:28:59 UTC. --
Jun 11 20:27:12 localhost systemd[1]: Started Simple Service that runs for 10s and then succeeds.
Jun 11 20:27:12 localhost systemd[1]: Started /usr/bin/sleep infinity.
Jun 11 20:27:16 localhost systemd[1]: run-r231cd647fb0b4839b8ac10e93b9ead68.service: Sent signal SIGKILL to main process 959185 (sleep) on client request.
Jun 11 20:27:16 localhost systemd[1]: run-r231cd647fb0b4839b8ac10e93b9ead68.service: Main process exited, code=killed, status=9/KILL
Jun 11 20:27:16 localhost systemd[1]: run-r231cd647fb0b4839b8ac10e93b9ead68.service: Failed with result 'signal'.
Jun 11 20:27:22 localhost systemd[1]: simple10s.service: Succeeded.
# cat /tmp/datelog
cat: /tmp/datelog: No such file or directory
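It is necessary to use systemctl kill --signal SIGKILL instead of systemctl stop because the transient service has to end with a result other than success. A normal stop leaves $SERVICE_RESULT set to success, so the ExecStopPost command would still run; killing the sleep with SIGKILL makes the service fail with result 'signal', as the log above shows, and the guard skips the command.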
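The guard itself is plain shell, so it can be tried by hand without involving Systemd at all. The sketch below pulls it into a function (the name run_if_success is made up) and relies on SERVICE_RESULT being an ordinary environment variable, which is how Systemd passes it to ExecStopPost commands.

```shell
# Sketch: the same test used in the ExecStopPost= command above.
# Runs the given command only when SERVICE_RESULT is "success".
run_if_success() {
    if test "${SERVICE_RESULT:-}" = success; then
        "$@"
    fi
}
```

Calling it as SERVICE_RESULT=success run_if_success echo ran prints ran, while SERVICE_RESULT=signal run_if_success echo ran prints nothing, which mirrors the two journal transcripts above.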
TL;DR
To run a command after a oneshot service exits, use:
systemd-run --no-block --property After=${the_service} ${the_command}
This can be canceled by simply running systemctl stop ${the_transient_service_name}.
To run a command after a non-oneshot service exits, use:
systemd-run \
--property ExecStopPost="sh -c 'if test "\$SERVICE_RESULT" = success; then ${the_command}; fi'" \
--property BindsTo=${the_service} \
sleep infinity
This can be canceled by running systemctl kill --signal SIGKILL ${the_transient_service_name}.
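The quoting in the non-oneshot recipe is the easiest part to get wrong, so one option is to hide it in a shell function. This is only a sketch under some assumptions: the name run_after_exit is invented, the follow-up command is passed as a single string, and that string must not contain single quotes because it is pasted inside the single-quoted sh -c argument.

```shell
# Sketch: run a command after a non-oneshot service exits, using the
# BindsTo= + ExecStopPost= recipe. Limitation: the command string must
# not contain single quotes.
run_after_exit() {
    service="$1"
    shift
    systemd-run \
        --property "ExecStopPost=sh -c 'if test \"\$SERVICE_RESULT\" = success; then $*; fi'" \
        --property "BindsTo=$service" \
        sleep infinity
}

# Usage (not run here):
#   run_after_exit simple10s.service 'date >/tmp/datelog'
```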
I think the first case with the oneshot is pretty easy to use, but the second case for all other service types is a bit too verbose and error-prone for me to use regularly, especially since the kill-loop is already firmly in my toolbox.