String
interpolation is a convenient language feature which makes creating
human-readable strings that contain variables easier. The classical approach
to this in languages without string interpolation is something like printf
- a function which accepts a format string and a variable number of
arguments. The format string contains special sequences of characters which
get replaced with the values of passed arguments, like this:
This has various problems, for example the possibility to supply more (or less) arguments than expected. String interpolation allows you to embed variables directly into the format string:
As with many features, some languages provide string interpolation and some don't. In the dark and troubled past of JavaScript, for example, some people switched to CoffeeScript for its string interpolation support (among other things.) More recently, string interpolation came to Python in the form of f-strings (PEP-498) and to plain JavaScript in the form of template literals (supported by Babel and other transpilers). While the feature is most often associated with dynamic, dynamically typed languages, some statically typed ones, for example C# and Swift, also support it, so it's not exclusive to the former.
Help! No string interpolation in Racket!
Racket - like JS at one point - lacks the string interpolation support. If Racket was JS, the only options would be to either write a transpiler, or switch to another language. Both choices normally require a lot of work, so developers tend to learn to live without the feature instead and move on.
But Racket is not JS, it's a Lisp. All Lisps are programmable programming languages[1], but Racket takes it up to eleven with the multidute of tools for language extension (and creation)[2]. How much work would it be to make Racket interpolate strings?
How Racket processes code
To process any code, Racket performs these steps:
- reading - accomplished by two functions,
read
andread-syntax
, reading means taking a string and returning an AST[3] (which happens to be a normal list with normal symbols, strings, numbers, etc. Such a list can be wrapped with a syntax object, which adds some meta-data to the raw list. You can always get a raw list out of syntax object withsyntax->datum
). - expansion - all the special forms and macros in the read code are expanded until only the most basic forms (modules, function applications, value literals) are left.
- compilation - Racket performs byte-code compilation on the expanded code. This is where most of the optimizations are applied.
- execution - compiled byte-code is handed to the VM for execution. At this stage, on some architectures, further JIT compilation happens, turning (possibly portions of) bytecode into native code.
The first two steps are completely customizable: code expansion with macros, and code reading via hooks into the Racket parser (you can also write a new parser from scratch (or generate it) if you want, but frequently it's not necessary).
Extending the parser
The Racket parser is a simple, recursive-descent one and it works by
utilizing a readtable
- a mapping from characters to functions responsible
for parsing what follows the given character. The read
and read-syntax
functions utilize current-readtable
, which is
a parameter
you can override. Adding a new syntax to the reader requires just three
steps:
- create a function for parsing the new syntax into valid Racket forms
- get the
current-readtable
and add your function to it - make that expanded readtable from the prev step the new
current-readtable
Interpolated string syntax
We'll get to writing the parsing function in a minute, but first we need to
know what is it we're going to parse and what the results of the parsing
should be. The exact syntax I have in mind would look like this (with #@
distinguishing interpolated strings from normal ones):
Interpolated string compilation
String interpolation, for example in CoffeeScript, is often compiled to an expression which concatenates the chunks of the string, like this:
In Racket we have no +
operator for concatenating the strings and even if
we did have it, the prefix notation and resulting nesting wouldn't be
pretty. The easier and more natural way is to use a string-append
procedure, which - conveniently - accepts any number of arguments. The
syntax introduced in the previous section would compile down to something
like this:
The parsing function
The parsing function needs to take the string form as an argument and output the code as above. Here's the simplest implementation I could think of:
It's dead simple - just a couple of lines - and it glosses over important
details, like the possibility of nesting the @{}
expressions and syntax
errors handling, but it works for the simple cases. We can easily test it,
like this:
The boilerplate code
The hard part ends here - the rest is just some boilerplate we need to make
Racket aware of our extension to its reader. In Racket, this is often done
by defining a new language - working with #lang lang-name
syntax, but in
our case we don't need (or want) to create a whole language (it's simpler
than it sounds, but it's still a bit more work) - we just change the reader
and can reuse everything else from some other language.
For situations like these, Racket supports a special syntax to
its #lang
line, which looks like this:
Generating a reader module
Let's first define reader.rkt
. Writing a whole new reader is of course
an option but it's a lot of unnecessary work, as the generic read
andread-syntax
[4] need to handle some more complex things, like
module declarations, which we don't care about. Instead, we can use a
special language for defining readers, which will generate appropriateread
and read-syntax
implementations for us provided we supply a single
function to it:
The #:language
option takes a function which accepts a string (of what
remains after the reader module name on the #lang
line) and returns a
language name to use. A simple read
happens to work here.
The #:wrapper1
option takes a function, which takes another function as
input and returns the parsed code. The passed in function, when invoked,
performs normal read.
Extending and installing a readtable
With #:wrapper1
option, in conjunction with parameterize
, it's very easy
to install our readtable[5]:
Wrapping parse so that it works inside readtable
The only part missing is a read-str
function, which is a bit more
complicated. The function passed to make-readtable
may be called in two
different ways, depending on whether it's used from inside read
orread-syntax
. The function distinguishes the cases based on a number of
arguments it gets, and returns either syntax object or a simple list
depending on the case. Here's its implementation:
It works!
The reader extension can be used like this:
It also works for more complex expressions out of the box:
This is of course a pretty basic as far as string interpolation features go, but it took thirty (34 to be precise) lines of code (excluding comments) to build it. Adding another twenty lines would be enough to implement error checking, and using one of the parser generators could even shorten the code.
That's it!
Aside from how easy it is to extend Racket syntax one thing deserves a mention. That is, such syntax extensions are easily composable (as long as they don't want to install extensions to the same entry in the readtable) - you can load them one by one, each extending the readtable.
The language extension and creation tools of Racket don't end there. Robust
module system with support for exporting syntax extensions, powerfulsyntax-parse
,
traditional syntax-rules
and syntax-case
,
many tools for generating parsers
and lexers -
all these features are geared towards meta-lingustic programming. Racket is
a programming language laboratory and allows you to bend it to better suit
your needs safely (ie. ensuring that only your code will be affected). Very
few languages or transpilers offer that much freedom with that kind of
safety guarantees.
It's an ideal environment for people who know what they need from their language - with Racket you're never stuck, waiting for the next release to support some feature. On the other hand, despite there being a great many helpers and libraries, you need to know at least some basics of programming language creation to do this yourself. The good news is that the extensions written by others are easily installable with a single command of a Racket package manager.
Comments