In this tenth installment of my series on Scheme macros, we will be looking at some meta macros—macros that make the act of writing macros easier and cleaner.
Specifically, we’re going to look at macros for the following:
- Defining simpler macros more cleanly using `defmacro`,
- generating fresh symbols when we need them using `gensym`, and
- ensuring single evaluation of, for instance, macro arguments using `once`.
All of these are well-known macros and techniques, but we’ll be writing our own very simple implementations to gain an understanding of how an implementation like that could work. This means that for all of them, we are limiting their power. I’ll leave you with references to go deeper into a possible implementation for all of them. We’re not just going to scratch the surface, but we’re also not going spelunking today.
As always, we’re going to use my little (mostly dead) Scheme dialect zepto for the implementation, but all of this should be portable with small tweaks.
I’ve written more define-syntax macros than I can remember, and I still find them awkward to write and look at. I understand their power, but I still prefer a simpler function-like definition most of the time. Something like this:
(defmacro my-when (test body ...)
  (if test
      (begin body ...)
      #f))
(my-when #t (write "hello"))
(my-when #f (write "nope"))
It has less power, but also less overhead, and captures our intent. It’s perfect for simple macros with a single clause and without literal sets. Luckily, we can define this shorthand quite easily:
(define-syntax defmacro
  (syntax-rules ()
    ((_ name (args ...) form)
     (define-syntax name
       (syntax-rules ()
         ((name args ...)
          form))))))
If you’ve followed along for this entire series, this should read almost naturally. In case the nested define-syntax throws you off, however, here’s an explanation: we define a macro that takes a name, an argument list, and a body, and splices them into another macro definition, such that we get this transformation:
(defmacro my-when (test body ...)
  (if test
      (begin body ...)
      #f))
; expands to
(define-syntax my-when
  (syntax-rules ()
    ((my-when test body ...)
     (if test
         (begin body ...)
         #f))))
It’s a nice little shorthand that makes reading the macro much easier!
If you want to learn more about how someone could implement a “full definition” of this, I encourage you to take a look at how Racket implements `define-macro`.
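To get a feel for the shorthand, here is a small sketch that uses it for two more macros. I’m assuming an R7RS-style Scheme here rather than zepto, and both `my-unless` and `swap!` are hypothetical examples of mine, not part of the series so far.

```scheme
;; In a strict R7RS program, start with (import (scheme base)).

;; The defmacro shorthand from above, repeated so the snippet stands alone.
(define-syntax defmacro
  (syntax-rules ()
    ((_ name (args ...) form)
     (define-syntax name
       (syntax-rules ()
         ((name args ...)
          form))))))

;; Hypothetical example: the mirror image of my-when.
(defmacro my-unless (test body ...)
  (if test
      #f
      (begin body ...)))

;; Hypothetical example: swap two variables.
;; tmp lives in the template of the generated syntax-rules macro,
;; so hygiene keeps it from clashing with the caller's variables.
(defmacro swap! (a b)
  (let ((tmp a))
    (set! a b)
    (set! b tmp)))

(define x 1)
(define y 2)
(swap! x y) ; x is now 2, y is now 1
```

Since the body ends up as the template of an ordinary `syntax-rules` clause, we keep hygiene for free. What we give up is exactly what the shorthand leaves out: multiple clauses and literals.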
Sometimes, we just need to generate a unique identifier. Macro hygiene usually handles this for us, but occasionally we need to splice a fresh identifier into a piece of syntax ourselves. In a lot of Lisps, that is exactly what gensym is for.
(defmacro x () (gensym))
(write (x))
(write (x))
The identifier printed out should be unique across calls to x.
Again, we can implement this quite simply with a counter:
(define gensym-prefix "GENSYM-")
(define gensym-counter 1000)
(defmacro gensym ()
  (begin
    (set! gensym-counter (+ gensym-counter 1))
    (string->symbol (++ gensym-prefix (->string gensym-counter)))))
Here, too, our newly minted defmacro makes things quite readable and short. All we do is increment our counter, append it to a prefix, and return the resulting symbol. It’s not exactly the cleanest code, but it approximates what many implementations of gensym do, including the implementation for Carp (a statically compiled Lisp I’ve worked on) by yours truly.
In zepto, compile time and runtime are not separated and thread safety is not an issue, so this is fine. In other languages, this might be a bit more of a problem, which is why they choose a different implementation path.
For instance, in Racket this is implemented in C, although it does essentially the same thing. Still, reading small definitions like these is a good exercise for understanding a virtual machine’s implementation, so if you are interested in how a Scheme might be implemented in C, this might help.
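If you want to try the counter approach outside of zepto, here is a rough port. It assumes an R7RS-style Scheme, swaps zepto’s `++` and `->string` for `string-append` and `number->string`, and uses a plain procedure under the hypothetical name `my-gensym` so it doesn’t shadow a built-in `gensym`.

```scheme
;; In a strict R7RS program, start with (import (scheme base)).

(define gensym-prefix "GENSYM-")
(define gensym-counter 1000)

;; my-gensym is a hypothetical name chosen to avoid clashing with a built-in.
(define (my-gensym)
  (set! gensym-counter (+ gensym-counter 1))
  (string->symbol
    (string-append gensym-prefix (number->string gensym-counter))))

(define a (my-gensym))
(define b (my-gensym))

(eq? a b)            ; => #f, every call yields a different symbol
(eq? a 'GENSYM-1001) ; => #t in a case-sensitive Scheme: the symbol is interned
```

The last line shows the limit of this sketch: the symbols are only unique as long as nobody writes `GENSYM-1001` by hand. Many Lisps sidestep that by returning uninterned symbols from `gensym`.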
Macros have the inherent problem of dealing with syntax and thus behaving differently than we are used to from functions. Consider this:
(defmacro square (x)
  (* x x))
(macro-expand '(square (+ 1 5)))
; => (* (+ 1 5) (+ 1 5))
Already we are duplicating work. Occasionally we’d like to be able to ensure a piece of code is really just evaluated once. Something like this:
(defmacro square (x)
  (once (x)
    (* x x)))
This macro is a classic known originally as once-only, implemented by the brilliant Peter Norvig. I first learned about it in the book Let over Lambda by Doug Hoyte.
Now, unfortunately we cannot do this in `syntax-rules` Scheme, since it would break hygiene, so we’d have to rely on something like this:
(define-syntax once
  (syntax-rules ()
    ((_ ((t1 e1) (t2 e2) ...) body ...)
     (let ((t1 e1) (t2 e2) ...) body ...))))

(defmacro square (s)
  (once ((t s))
    (* t t)))
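To make the difference visible, we can count how often the argument is evaluated. This is a sketch assuming an R7RS-style Scheme; `naive-square`, `calls`, and `noisy-six` are hypothetical names I’m introducing for the demonstration, and I’m writing the macros with plain `define-syntax` so the snippet stands alone.

```scheme
;; In a strict R7RS program, start with (import (scheme base)).

;; The let-based once from above.
(define-syntax once
  (syntax-rules ()
    ((_ ((t1 e1) (t2 e2) ...) body ...)
     (let ((t1 e1) (t2 e2) ...) body ...))))

;; The naive version duplicates its argument expression.
(define-syntax naive-square
  (syntax-rules ()
    ((_ x) (* x x))))

;; The once version binds the argument to t before using it.
(define-syntax square
  (syntax-rules ()
    ((_ s) (once ((t s)) (* t t)))))

;; A hypothetical side-effecting argument that counts its evaluations.
(define calls 0)
(define (noisy-six)
  (set! calls (+ calls 1))
  6)

(define naive-result (naive-square (noisy-six))) ; 36
(define naive-calls calls)                       ; 2, the argument ran twice

(set! calls 0)
(define once-result (square (noisy-six)))        ; 36
(define once-calls calls)                        ; 1, the argument ran once
```

Both versions compute the same result, but only the second one runs the side effect a single time.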
Since this version of `once` is boring and essentially boils down to a small wrapper around let, I believe we can do better if we are willing to throw hygiene under the bus and eval our way to success! This is left as an exercise for the reader.
But fret not! We will still explore this macro, just in a setting that is more suited to it and where hygiene is not a concern. I’ll reach for Carp, but you might just as well look at it in Common Lisp or Clojure.
(defndynamic replacerfn [arg]
  [(gensym) arg])

(defndynamic generate-let [acc replacer]
  (append acc [(list 'quote (car replacer)) (cadr replacer)]))

(defndynamic generate-reverse-let [acc replacer]
  (append acc [(cadr replacer) (list 'quote (car replacer))]))

(defmacro defmacro! [name args body]
  (let [replacer (map replacerfn args)]
    (eval
      `(defmacro %name %args
         %(list 'list '(quote let) (reduce generate-let [] replacer)
            (list 'let (reduce generate-reverse-let [] replacer)
              body))))))

(defmacro! square [y] `(* %y %y))
(eval (square (do (macro-log "hi") 10))) ; will print "hi" once
A few years ago, I implemented a version of this with a few more capabilities for a talk at the Recurse Center. The version here is derived from that. For once, attacking the code from various angles can help you get a handle on it.
This version essentially bakes once into the definition of defmacro, making our macro more robust against re-evaluation.
Nonetheless, I will explain it to you. It essentially relies on a two-way binding at different times. Let’s go by evaluation order instead of definition order. The inner let will be evaluated first. It binds all symbols to their respective gensyms. This will then be evaluated to rewrite the body—every occurrence of the original variable will be replaced by the gensym.
Let’s illustrate it (this is not exactly what happens, but might help with the intuition):
; original pass:
(defmacro! square [y]
  `(* %y %y))
; first pass:
(defmacro square [y]
  (let [y 'gensym-symbol]
    `(* %y %y)))
; evaluated first pass:
(defmacro square [y]
  `(* %gensym-symbol %gensym-symbol))
In the second pass, the outer let kicks in. It will be added to the definition rather than executed, and will bind the generated symbols to the original variable:
(defmacro square [y]
  `(* %gensym-symbol %gensym-symbol))
; expanded:
(defmacro square [y]
  (let [gensym-symbol y]
    `(* %gensym-symbol %gensym-symbol)))
And this is the code we finally end up with, ensuring that y will only be evaluated once, stored in a variable, and then rewritten.
If this was a bit much, don’t fret! This macro is often described as one of the pinnacles of the craft, and Peter Norvig wrote “If you can understand how to write and when to use once-only, then you truly understand macros.” It’s a macro worth studying, but one that feels slippery even after you’ve implemented it a few times in various languages.
And that concludes today’s session on macros! We’re getting into truly advanced territory now, implementing macros to make writing macros more convenient.
I hope you enjoyed this session. I know I did! These macros are near and dear to my heart, and each one represents another layer of the macro onion peeled back!
Let me know whether you liked this one, and if you have any more macro or language feature requests! I’ll be sure to put them in the backlog! See you around!