How to daemonize a process

Creating a daemon isn't especially difficult, but it involves some parts of the system that developers often don't deal with explicitly, and may not understand well. This can lead to cargo-cult programming, and sometimes bugs.

Cookbook

Here is a simple Perl routine that will daemonize a process:
use POSIX;

sub daemonize
{
    fork and exit;
    POSIX::setsid();
    fork and exit;
    umask 0;
    chdir '/';
    close STDIN;
    close STDOUT;
    close STDERR;
}
If the daemon has code that relies on STDIO, you can reopen the system file descriptors on /dev/null:
sub daemonize
{
    fork and exit;
    POSIX::setsid();
    fork and exit;
    umask 0;
    chdir '/';
    open STDIN , '<', '/dev/null';
    open STDOUT, '>', '/dev/null';
    open STDERR, '>', '/dev/null';
}
If you want to spawn some other program, add an exec call. Leave the system file descriptors open for the exec'd program:
sub daemonize
{
    fork and exit;
    POSIX::setsid();
    fork and exit;
    umask 0;
    chdir '/';
    exec 'my-daemon';
}
If you want the parent to continue execution, do a return instead of exit:
sub daemonize
{
    fork and return;
    POSIX::setsid();
    fork and exit;
    umask 0;
    chdir '/';
    exec 'my-daemon';
}
For production code, add error checking:
use POSIX;

sub Fork
{
    my $pid = fork;
    defined $pid or die "Can't fork: $!\n";
    $pid
}

sub daemonize
{
    Fork and return;
    POSIX::setsid();
    Fork and exit;
    umask 0;
    chdir '/' or die "Can't chdir to /: $!\n";
    exec 'my-daemon';
    die "Can't exec my-daemon: $!\n";
}
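
One thing the return variant leaves open: the intermediate child exits right after the second fork, and if the parent keeps running without waiting for it, that child lingers as a zombie. Here is a minimal sketch of one way to handle that, assuming the caller doesn't mind blocking for a moment. The name spawn_daemon and its argument list are illustrative, not part of the routines above, and the check on setsid treats both undef and -1 as failure because the exact failure value is an assumption here.

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical helper: start @cmd as a daemon, and return in the parent
# once the short-lived intermediate child has been reaped.
sub spawn_daemon
{
    my @cmd = @_;

    my $pid = fork;
    defined $pid or die "Can't fork: $!\n";

    if ($pid)
    {
        # Parent: the first child exits right after the second fork,
        # so this wait is short and leaves no zombie behind.
        waitpid $pid, 0;
        return $? == 0;
    }

    # First child: start a new session, fork again, and get out of the way.
    # _exit is used in the children so that END blocks and buffered output
    # inherited from the parent are not run or flushed a second time.
    my $sid = POSIX::setsid();
    defined $sid && $sid != -1 or POSIX::_exit(1);

    my $pid2 = fork;
    defined $pid2 or POSIX::_exit(1);
    $pid2 and POSIX::_exit(0);

    # Grandchild: this is the daemon.
    umask 0;
    chdir '/' or POSIX::_exit(1);
    { exec { $cmd[0] } @cmd }
    POSIX::_exit(127);      # only reached if exec failed
}
```

A call would look like spawn_daemon('my-daemon'). The return value only says that the two forks and setsid worked; a failed exec happens in the grandchild, after the parent has already returned.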

Analysis

To create a daemon, you have to do six things.
  1. fork (first time)
    The first fork allows the child to run the daemon while the parent continues execution, or exits. In addition, it ensures that the child is not a process group leader, so that the following setsid() call will succeed.
  2. setsid
    This is how we lose our controlling terminal. setsid makes our process the session leader of a new session, and when a new session is created, the session leader has no controlling terminal.
  3. fork (second time)
    The leader of a new session has no controlling terminal, but on many systems it could acquire one, for example, by opening a terminal device without the O_NOCTTY flag. To ensure that this doesn't happen, we fork again.

    After the second fork, the child is not the session leader (the parent is the session leader) and only a session leader can acquire a controlling terminal. Of course, the child could become a session leader by executing setsid again, but presumably anyone who does that knows what they are doing.

  4. umask 0
    umask is inherited across fork and exec, and we don't want our daemon to be limited by whatever umask it happens to inherit from its parent.
  5. chdir '/'
    The current working directory is inherited across fork and exec. If we leave the daemon running in some random directory, then the file system that contains that directory can never be unmounted, and someone might want to.
  6. close
    Even though the daemon has no controlling terminal, it may still have open file descriptors on a terminal, and these can cause trouble. For reliable operation, close the system file descriptors.
In Perl, non-system file descriptors (above $^F) are automatically marked close-on-exec.

Some Perl modules dup STDERR. You can close STDERR, but the dup'd descriptor will remain open on the terminal. One way to avoid this is to close STDERR in a BEGIN block, before the modules load:

BEGIN { close STDERR; }
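If that isn't convenient, you can hunt down the dup'd descriptors and close them with code like this (it relies on /proc, so it is Linux-specific):
use POSIX ();

# /proc/$$/fd holds one symlink per open descriptor
my $fd2    = "/proc/$$/fd/2";
my $stderr = readlink $fd2 or die "Can't readlink $fd2: $!\n";

for my $link (</proc/$$/fd/*>)
{
    my($fd) = $link =~ /(\d+)$/;
    $fd > $^F or next;                  # leave the system descriptors alone
    my $target = readlink $link;        # undef if the descriptor is already gone
    defined($target) && $target eq $stderr and POSIX::close($fd);
}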
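
The effect of steps 1 to 3 can be checked from a running system. The sketch below runs the fork, setsid, fork sequence in a throwaway child and has the grandchild report its own IDs back through a pipe. It assumes Linux, where /proc/self/stat lists the process group, session, and controlling terminal number after the command name; the names proc_ids and probe_daemon_ids are made up for this illustration.

```perl
use strict;
use warnings;
use POSIX ();

# Linux-only assumption: after the "(comm)" field, /proc/self/stat lists
# state, ppid, pgrp, session, tty_nr, ...
sub proc_ids
{
    open my $fh, '<', '/proc/self/stat' or die "Can't open /proc/self/stat: $!\n";
    my $stat = <$fh>;
    close $fh;
    $stat =~ s/^.*\)\s+//s;     # drop "pid (comm) "; comm may contain spaces
    my ($state, $ppid, $pgrp, $session, $tty_nr) = split ' ', $stat;
    return (pid => $$, pgrp => $pgrp, session => $session, tty_nr => $tty_nr);
}

# Hypothetical probe: returns the grandchild's IDs, plus the pid of the
# intermediate child (the session leader) under the key 'leader'.
sub probe_daemon_ids
{
    pipe(my $r, my $w) or die "Can't pipe: $!\n";

    my $pid = fork;
    defined $pid or die "Can't fork: $!\n";

    if ($pid)
    {
        close $w;
        my $line = <$r>;
        waitpid $pid, 0;
        defined $line or die "Probe failed\n";
        chomp $line;
        return (leader => $pid, split ' ', $line);
    }

    close $r;
    POSIX::setsid();                    # step 2: this child leads a new session
    my $pid2 = fork;                    # step 3
    defined $pid2 or POSIX::_exit(1);
    $pid2 and POSIX::_exit(0);

    # Grandchild: report and leave without touching the caller's code.
    my %ids = eval { proc_ids() };
    %ids and print $w join(' ', %ids), "\n";
    close $w;
    POSIX::_exit(0);
}
```

If the analysis above is right, the reported session equals leader, differs from the grandchild's own pid (so it is not a session leader), and tty_nr is 0 (no controlling terminal).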

Context

Most of the confusion surrounds the setsid() call. The setsid(2) and setpgid(2) man pages state the relevant facts; however, as is typical for Unix documentation, they provide little context.

Here is some context.

Processes and terminals

What all this is about is managing which terminals talk to which processes. A terminal provides three things to a process: input, output, and signals. A terminal that is configured to provide these things is called a controlling terminal for the process. A process may or may not have a controlling terminal. Generally, you want processes that you run in a shell session, like ls or cat, to have a controlling terminal. That's how you control them.

Conversely, you want daemons to not have a controlling terminal. A daemon has no terminal input to read and no terminal to send output to and, perhaps most importantly, it should not receive signals from a terminal. The problem with signals is that they are likely to cause the daemon to exit, and the whole point of running a daemon is to have a process that does not exit (at least, not unless you want it to).

A particular problem is SIGHUP. HUP is short for Hang UP; the terminology dates back to the days when people connected physical terminals to computers using modems and phone lines.

Many programmers will recognize SIGHUP as the signal that you send to a daemon to tell it to re-read its configuration files. However, the original use of SIGHUP was to tell a process that the phone connection to its terminal had been broken: that it literally no longer had a controlling terminal. The default action for a process that receives SIGHUP is to terminate (see signal(7)).

The modern equivalent of a physical terminal is a pseudo-terminal, of the sort provided by XTerm or PuTTY. When a pseudo-terminal exits (for example, because the user closes the window on their screen) the OS sends SIGHUP to processes that were controlled by that terminal. (See setpgid(2) for details.) If your daemon still has a controlling terminal, it is liable to exit when this happens. So you want your daemon to not have a controlling terminal.

A daemon that is properly started with no controlling terminal will never receive SIGHUP from the OS. Thus, it is convenient to repurpose SIGHUP to tell the daemon to read its configuration files.
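
Here is a minimal sketch of that repurposing. The handler only sets a flag, and the daemon's main loop does the actual reload at a convenient point. The "key = value" file format and the names load_config and reload_if_requested are assumptions made for the example, not anything a real daemon is required to use.

```perl
use strict;
use warnings;

my $reload_requested = 0;

# Keep the handler tiny: just note that SIGHUP arrived.
$SIG{HUP} = sub { $reload_requested = 1 };

# Hypothetical loader: parses "key = value" lines from $file.
sub load_config
{
    my ($file) = @_;
    open my $fh, '<', $file or die "Can't open $file: $!\n";
    my %config;
    while (my $line = <$fh>)
    {
        $line =~ /^\s*(\w+)\s*=\s*(.*?)\s*$/ and $config{$1} = $2;
    }
    close $fh;
    return \%config;
}

# Call this from the main loop: it re-reads the file only if a SIGHUP
# has arrived since the last call, and otherwise returns $config as is.
sub reload_if_requested
{
    my ($file, $config) = @_;
    $reload_requested or return $config;
    $reload_requested = 0;
    return load_config($file);
}
```

The main loop would do $config = reload_if_requested($file, $config) once per iteration, and an administrator would trigger a reload with kill -HUP and the daemon's pid.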

Sessions and process groups

A daemon just needs to lose its controlling terminal, but a shell needs to use its controlling terminal, and to manage the use of that terminal by the processes that it starts. For example, if you do
% cat > foo
then you want STDIN, Ctl-D, and Ctl-C to be directed to the cat, not to your shell.

If you do

% cat foo | sort | more
then you want to be able to use STDIN to control the more, and you want Ctl-C to kill off the entire pipeline, which means that it has to be delivered to all three processes.

If you do

% sort foo > bar &
% more baz
then you want the sort to run quietly in the background while you use the terminal to view baz.

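To manage all this, Unix provides some additional structures for processes: sessions and process groups. A session contains process groups, and a process group contains processes.

This is a strict containment hierarchy. It looks like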
+-session 1 ----------------+
|                           |
|  +-process group 1-----+  |
|  |                     |  |
|  | process 1           |  |
|  | process 2           |  |
|  |                     |  |
|  +---------------------+  |
|                           |
|  +-process group 3-----+  |
|  |                     |  |
|  | process 3           |  |
|  | process 4           |  |
|  | process 5           |  |
|  |                     |  |
|  +---------------------+  |
|                           |
+---------------------------+
Generally, a session corresponds to a terminal session: a shell and all the processes that the shell runs on behalf of the user. When you start a new terminal, the OS creates a new session for you, and the shell becomes the session leader.

The shell typically creates a new process group for each pipeline that it runs (even if there is only one process in the pipeline).
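
As a rough sketch of what the shell does here, the routine below starts a child and puts it in a process group of its own. Both parent and child make the setpgid call, so the group exists no matter which of them runs first. The name run_in_own_group is made up for the example, and a real shell does a good deal more (it would also hand the terminal to the new group if the job runs in the foreground).

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical helper: run $code in a child that leads its own process
# group, roughly as a job-control shell does for each pipeline.
# Returns the child's pid, which is also the new process group ID.
sub run_in_own_group
{
    my ($code) = @_;

    my $pid = fork;
    defined $pid or die "Can't fork: $!\n";

    if (!$pid)
    {
        POSIX::setpgid(0, 0);       # child: become leader of a new group
        $code->();
        POSIX::_exit(0);
    }

    # Parent: same call, in case we get here before the child does.
    # A failure is ignored, since the child may already have done it.
    POSIX::setpgid($pid, $pid);
    return $pid;
}
```

After my $pgid = run_in_own_group(sub { ... }), a call such as kill '-INT', $pgid sends the signal to every process in that group, which is how one signal can take out a whole pipeline.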

Any process in the session can write to the terminal. That is why you sometimes see output from background processes in the middle of something else.

One process group is designated foreground; all other process groups are in background. Processes in the foreground group can read from the terminal; processes in a background group are normally stopped (with SIGTTIN) if they try to read from the terminal. If a signal is generated by the terminal (e.g. Ctl-C), then that signal is delivered to each process in the foreground process group.

This system allows shells to manage processes in the way that users want, and to make input, output, and signals work the way that users expect.

A process that wants to get out of the system—to not be in the same session as the shell—calls setsid(). It then becomes the session leader of a new session, and that session has no controlling terminal.
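
The one catch, and the reason for the first fork in the cookbook routines, is that setsid() is refused for a process group leader. The sketch below tries setsid in a child, first as an ordinary group member and then after making the child a group leader with setpgid. The name try_setsid is invented for the example, and the check accepts either undef or -1 as the failure value of POSIX::setsid, since which one it returns is an assumption here.

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical probe: call setsid() in a child and report what happened.
# With a true argument, the child first makes itself a process group leader.
sub try_setsid
{
    my ($as_group_leader) = @_;

    pipe(my $r, my $w) or die "Can't pipe: $!\n";

    my $pid = fork;
    defined $pid or die "Can't fork: $!\n";

    if (!$pid)
    {
        close $r;
        $as_group_leader and POSIX::setpgid(0, 0);
        my $sid = POSIX::setsid();
        my $ok  = defined $sid && $sid != -1;
        print $w $ok ? "ok\n" : "failed: $!\n";
        close $w;
        POSIX::_exit(0);
    }

    close $w;
    my $result = <$r>;
    waitpid $pid, 0;
    defined $result or die "Probe failed\n";
    chomp $result;
    return $result;
}
```

A freshly forked child is never a group leader, so try_setsid(0) should report success, while try_setsid(1) should report a failure (EPERM on the systems this was written against).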


Notes

It then becomes the session leader of a new session
It's a bit like going off into the desert to start your own cult.

Steven W. McDougall / resume / swmcd@world.std.com / 2013 Nov 17