2017-01-13
This post takes a short break from ASDL to expand on one of my Hacker News comments.
I realize I haven't done much of what I mentioned in the second sentence of this blog: explain why the Unix shell is interesting. This is partly because almost nobody has questioned this project — it appears programmers do want a better shell.
But there are a few aspects of the language design that are rare, and they're worth explaining. This is the first of at least three topics.
Configuring Unix Daemons
Systemd has the stated goal of replacing shell scripts in the boot process of a Linux system.
In response to yet another thread about systemd's design tradeoffs, agumonkey claimed that shell doesn't have enough abstraction power. He suggested instead a Lisp-like configuration system:
(-> (path "/usr/bin/some-service" "arg0" ...)
    (wrap :pre (lambda () ...)
          :post (lambda () ..))
    (retry 5)
    ...
    (timeout 5))
I pointed out that shell already supports this kind of higher-order programming. For example, here's a function that takes a command and tries it five times:
retry() {
  local n=$1
  shift
  for i in $(seq $n); do
    "$@"
  done
}
It can be used like this:
$ retry 5 hello-sleep 0.1
hello
hello
hello
hello
hello
Here we are passing an integer 5 and a code snippet hello-sleep 0.1 to the retry function. Because retry treats code as data, you can call it a higher-order function.
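The Lisp example also had a wrap form with pre and post hooks. As a rough sketch of how that could look in the same style, here is a hypothetical wrap function (not from the original demo) that takes two hook names followed by a command, and composes with retry by juxtaposition. The names wrap, say_start, and say_done are made up for illustration.

```shell
# retry as defined above, repeated so this sketch runs on its own.
retry() {
  local n=$1
  shift
  for i in $(seq "$n"); do
    "$@"
  done
}

# Hypothetical helper: run a pre hook, then the command, then a post hook.
# Each hook is a single word naming a function or program.
wrap() {
  local pre=$1 post=$2
  shift 2
  "$pre"
  "$@"
  local rc=$?   # remember the wrapped command's exit status
  "$post"
  return $rc
}

say_start() { echo starting; }
say_done() { echo done; }

# Everything after the two hook names is the command to run.
wrap say_start say_done retry 2 echo hello
```

The last line prints starting, then hello twice, then done. Both wrap and retry treat their trailing words as a command, so they can be stacked in either order.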
Taking it further, we can compose our retry function with the timeout binary in coreutils by prepending two more words:
$ timeout 0.3 $0 retry 5 hello-sleep 0.1
hello
hello
hello # killed after 0.3 seconds
(Runnable code is available in the forth-like directory of the oilshell/blog-code repository.)
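The $0 in that command matters: timeout is an external binary, so it can only run another program, not a shell function defined in the current process. One way to make this work, sketched below, is for the script to end with "$@" so that it runs its own arguments as a command. This is a guess at the shape of such a script, not a copy of the one in the repository. hello_sleep is a hypothetical stand-in for hello-sleep, spelled with an underscore so it also parses in shells that reject hyphens in function names.

```shell
#!/usr/bin/env bash
# Sketch of a self-dispatching script. Assumption: hello_sleep stands in for
# the post's hello-sleep, whose definition is not shown there.

hello_sleep() {
  echo hello
  sleep "$1"
}

retry() {
  local n=$1
  shift
  for i in $(seq "$n"); do
    "$@"
  done
}

demo() {
  # "$0" is this script's path. timeout starts it as a new process, and the
  # dispatch line at the bottom turns the remaining words back into a call
  # to the retry function. The script file must be executable for this.
  timeout 0.3 "$0" retry 5 hello_sleep 0.1
}

# Dispatch: run the script's arguments as a command. With no arguments this
# line does nothing.
"$@"
```

Saved as an executable file, for example demo.sh, running ./demo.sh demo executes the composed command, and ./demo.sh retry 2 hello_sleep 0 calls one function directly.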
Because functions can be composed by simple juxtaposition, I said that shell has a Forth-like quality.
In the Forth language, functions can be composed like this because they work on an implicit stack of arguments and return values. If that doesn't make sense, this blog post may help.
Shell doesn't have an implicit stack, but the uniform representation of words in the argv array, and "splicing" with "$@", results in code that feels similar.
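The quotes in "$@" are what make the splicing reliable. As a small sketch (the function names here are hypothetical), compare a wrapper that forwards its arguments with "$@" against one that uses an unquoted $@:

```shell
# Print each argument on its own line, in brackets, to make word
# boundaries visible.
show_args() {
  for arg in "$@"; do
    printf '[%s]\n' "$arg"
  done
}

run_quoted() { "$@"; }    # splices the words through unchanged
run_unquoted() { $@; }    # re-splits every word on whitespace

run_quoted show_args 'a b' c
run_unquoted show_args 'a b' c
```

The first call prints [a b] and [c]. The second prints [a], [b], and [c], because the argument containing a space was split in two.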
In contrast, this mechanism isn't idiomatic in Python or JavaScript. I tried porting demo.sh to Python with demo.py, and it sort of works if you write all functions like f(*args). But this goes against the grain of the ecosystem. In these languages, functions and arguments are treated differently from a syntactic and semantic point of view.
daemontools and Bernstein Chaining
In the book The Art of Unix Programming, which is a great exposition of the Unix philosophy, Eric Raymond calls the technique Bernstein chaining.
Daniel J. Bernstein uses this shell technique in software like qmail and daemontools to follow the principle of least privilege.
In contrast to systemd, daemontools is a Unix init toolkit which relies on the idiom of small C programs composed with shell scripts. Celebrating daemontools makes a good case for it and shows examples.
Here's an excerpt that uses Bernstein chaining of setuidgid and softlimit, as well as the builtin exec:
# change to the user 'sample', and then limit the stack segment
# to 2048 bytes, the number of open file descriptors to 3, and
# the number of processes to 1:
exec setuidgid sample \
  softlimit -n 2048 -o 3 -p 1 \
  some-small-daemon -n
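The excerpt depends on daemontools being installed. The same chain-loading shape can be sketched with tools from coreutils: each program changes one piece of process state and then execs the rest of its arguments. This is only an illustration with env, nice, and printenv. The variable name GREETING and the function chain_demo are made up for the example.

```shell
# The function body is a subshell, so exec replaces only that subshell
# and not the calling shell.
chain_demo() (
  # env adds a variable, nice lowers the scheduling priority, and
  # printenv is the final program that inherits both changes.
  exec env GREETING=hi \
    nice -n 5 \
    printenv GREETING
)

chain_demo
```

This prints hi. No intermediate shell is involved after the exec: env runs nice, which runs printenv, all in the same process.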
Daemontools is minimally documented and doesn't see much use today, but runit has the same architecture, as well as a collection of tiny shell scripts that illustrate its use.
They are admittedly a bit cryptic, but the architecture is what I care about.
systemd does split some of this functionality into a separate systemd-nspawn binary, but it doesn't appear to be used much without the rest of systemd.
Conclusion
daemontools and systemd are interesting because they represent extremes with respect to the modularity of their design.
Since I'm writing a shell, it shouldn't be a surprise that I'm biased toward the style of daemontools. But systemd has valid criticisms of shell scripts. The language has many problems, array syntax being one example.
On the other hand, I wouldn't be surprised if systemd configuration accidentally turns Turing-complete, as shell and make did.
I don't know what the best answer is, but I think that an improved shell will help the situation. At the very least, Lisp isn't necessary. With oil, I aim to preserve the timeless architectural characteristics of shell, while abandoning ugly, inconsistent syntax and smoothing over its sharp corners.
Appendix: Commands that Compose
Here is a list of tools that can be composed in this Forth-like manner:
- sudo: Run a command as another user.
- chroot: Run a command with a different root directory.
- env: Run a command in a different environment.
- /usr/bin/time: Run a command and summarize system resource usage.
- su: Change user ID. This has the questionable interface of taking a shell string with -c instead of passing its remaining args, which leads to quoting problems.
- ssh: Run a command on a remote machine. Also has quoting problems.
- strace: Trace system calls and signals.
- systemd-nspawn
- Most tools in daemontools, runit, and other toolkits like s6
- gdb: Debug native programs.
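The quoting problem with su -c and ssh comes from flattening the argument list into one string that a second shell parses again. The sketch below imitates both interfaces without needing root or a remote host. run_argv and run_string are hypothetical stand-ins, built on env and sh -c respectively.

```shell
# argv-style interface, as with sudo, chroot, or env: the remaining words
# are passed through as separate arguments.
run_argv() { env "$@"; }

# string-style interface, as with su -c or ssh: the words are joined with
# spaces into one string, which another shell then splits again.
run_string() { sh -c "$*"; }

run_argv printf '%s,' 'a b' c
echo
run_string printf '%s,' 'a b' c
echo
```

The first call prints a b,c, because the argument containing a space arrives intact. The second prints a,b,c, because the boundary between 'a b' and c was lost when the words were joined.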
These are shell builtins that compose:
- exec: Replace the process image; wrapper for the exec() system call.
- time: In bash, this is built into the shell (strictly a keyword rather than a builtin) and also takes a block, e.g. time { echo 1; echo 2; }.
- command and builtin: Change the lookup order of the first word: is it an external command in $PATH or internal to the shell?
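As a small illustration of how command changes lookup, the sketch below defines a function that shadows an external program and then bypasses it. The wrapper is hypothetical, and only command is shown since builtin is not available in every shell.

```shell
# A function with the same name as an external program takes priority
# in normal lookup. Inside it, `command` skips functions, which avoids
# infinite recursion.
basename() {
  command basename "$@" | tr '[:lower:]' '[:upper:]'
}

basename /usr/bin/env            # resolves to the function
command basename /usr/bin/env    # skips the function
```

The first call prints ENV and the second prints env.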
Updates (2017/1/24)
Thanks to Eric Wieser for fixing the style of the Python version. The pattern works better if every function looks like myfunc(myarg, f, *args), but I would still say it goes against the grain of the ecosystem, as mentioned above.
The next post in this series also mentions Forth, because functions in Forth compose in a point-free style. Bernstein chaining is not quite point-free because we mention "$@", but pipelines do compose in a point-free style.