Nested runsvdir

How to have a runit runsvdir started (and stopped!) as a runit service

tl;dr

executable script control/t:

#!/bin/sh
exec sv -v h .

and, as expected, run:

#!/bin/sh
exec runsvdir ./subservices

or for an individual user:

exec chpst -u peter runsvdir ~peter/.config/runit/default

Full description

It can be handy to run runit's runsvdir in a nested way. Possible applications include giving individual users a directory where they can manage their own daemons, or managing a cluster of services that are started and stopped together (my use case is an application made up of semi-independent device synchronization services).

The tricky thing is that just creating a service with the straightforward run file from above does not quite cut it. When the service is shut down with sv d, the runsvdir process receives the TERM and CONT signals (and KILL, if the stop is forced). These make it terminate, and its child processes get reparented to init, so the underlying runsv processes keep running. Moreover, a subsequent sv u creates competing runsv processes that spew errors.
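One way to spot this failure mode is to look for runsv processes whose parent is PID 1 after the outer service has been taken down. The following is only a sketch: the helper name orphaned_runsv is made up for illustration, it expects "pid ppid command" lines on standard input, and it assumes that orphans really are reparented to PID 1 (which is not the case when a subreaper is in use).

```shell
#!/bin/sh
# orphaned_runsv (hypothetical helper): read "pid ppid command" lines on
# stdin and print the PIDs of runsv processes whose parent is PID 1.
# runsvdir itself is not matched, only processes named exactly "runsv".
orphaned_runsv() {
    awk '$2 == 1 && $3 == "runsv" { print $1 }'
}

# Demo with made-up process data: only the first line is an orphaned runsv.
printf '%s\n' '812 1 runsv' '813 640 runsv' '900 1 sshd' | orphaned_runsv
# prints 812
```

On a live system the input would come from something like ps -e -o pid= -o ppid= -o comm=, and an empty result would suggest that no runsv processes were left behind.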

As a remedy, the control/t script has to be added. As described in the runsv documentation, both the d (down) and the x (exit) commands can be implemented in the t (terminate) script, where the "threatening" TERM signal is intercepted and an explicit SIGHUP is sent instead. That signal is interpreted by runsvdir as "terminate each service". To avoid raciness, the -v switch in the t script makes the shutdown wait for the inner runsvdir to actually shut down.
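Put together, the service directory can be created with a few commands. This is a minimal sketch rather than a tested recipe: the service name cluster and the SVROOT variable are assumptions for illustration, and by default everything is created in a scratch directory so that no live service tree is touched.

```shell
#!/bin/sh
# Sketch: create a service directory that runs a nested runsvdir.
# SVROOT and the service name "cluster" are hypothetical; unless SVROOT
# is set, a scratch directory is used.
SVROOT="${SVROOT:-$(mktemp -d)}"
svc="$SVROOT/cluster"

mkdir -p "$svc/control" "$svc/subservices"

# run: start the inner runsvdir on ./subservices
cat > "$svc/run" <<'EOF'
#!/bin/sh
exec runsvdir ./subservices
EOF

# control/t: runsv runs this before sending TERM and skips the signal
# if the script exits with 0; the script sends HUP instead
cat > "$svc/control/t" <<'EOF'
#!/bin/sh
exec sv -v h .
EOF

# Both scripts need the executable bit
chmod +x "$svc/run" "$svc/control/t"

echo "$svc"
```

Once the result looks right, the directory would be symlinked into the active service directory (its location, for example /etc/service or /var/service, depends on the installation).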

Please note that this is a workflow that (so far) has worked for me, and that I am not familiar enough with the details of runit to be confident that there are no conceptual shortcomings or other pitfalls.

More features

It seems that installing the following executable script as check allows startup checks to propagate (so sv -v u root_service would wait for all child services to have started up):

#!/bin/sh
exec sv check ./subservices/*

I suspect this could have a race condition (what if the check were called before the run process had a chance to even create the supervision directories? Might a check find that there is nothing to wait for?), so I urge even more caution with it.

The above script duplicates configuration from the run script (the ./subservices path). I'm pondering a workflow with a static control directory that serves the generic "this service is a runsvdir service" case and is symlinked into the respective services; it might even provide (symlinks to) static run/check scripts. But that's a different story.
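As a rough idea of what such a shared setup might look like, a single control directory could be created once and symlinked into each service. This sketch has not been tried against a live runit; the directory name generic/runsvdir-control, the service names sync-a and sync-b, and the BASE variable are all hypothetical.

```shell
#!/bin/sh
# Sketch: one static control directory shared by several nested-runsvdir
# services. All names here are made up for illustration; unless BASE is
# set, a scratch directory is used.
BASE="${BASE:-$(mktemp -d)}"
shared="$BASE/generic/runsvdir-control"

# The generic control directory holds the same t script for every service
mkdir -p "$shared"
cat > "$shared/t" <<'EOF'
#!/bin/sh
exec sv -v h .
EOF
chmod +x "$shared/t"

for name in sync-a sync-b; do
    mkdir -p "$BASE/$name/subservices"
    # each service's control/ is a symlink to the shared directory
    ln -s "$shared" "$BASE/$name/control"
done
```

The t script refers to its service only as ".", so it relies on being run from within the service directory; that is the assumption that would need checking before depending on this layout.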

Other approaches

A 2004 discussion indicated that svwaitdown could be used in such situations; however, runit has changed in the meantime, and that command is not available any more.

Cluster services can also be included in a runlevel by symlinking all the cluster's individual services in there, but that seems hard to manage to me.
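For comparison, the symlink approach could be sketched as follows. The directory layout, the service names sync-a and sync-b, and the ROOT variable are assumptions for illustration, and the demo works in a scratch directory unless ROOT is set.

```shell
#!/bin/sh
# Sketch: link every service of a cluster into a runlevel directory.
# All paths and service names are hypothetical.
ROOT="${ROOT:-$(mktemp -d)}"
cluster="$ROOT/cluster/subservices"
runlevel="$ROOT/runsvdir/default"

mkdir -p "$runlevel" "$cluster/sync-a" "$cluster/sync-b"

for svc in "$cluster"/*/; do
    svc="${svc%/}"    # strip the trailing slash left by the glob
    ln -s "$svc" "$runlevel/$(basename "$svc")"
done

ls "$runlevel"
```

Every service that joins or leaves the cluster needs its own link added or removed, and starting or stopping the cluster means touching each service individually.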

About this text

This text was written by chrysn <chrysn@fsfe.org> 2016-10-18, and is published under the terms of CC-BY-SA.

The original URI for this document is http://christian.amsuess.com/tutorials/nested-runsvdir/. It is available as HTML or as reStructuredText.