Learning about macros in Lisps was one of my biggest whoa-moments in my programming career and since then I’ve given presentations about them to audiences ranging from 1 to 100 people. I have a little script that I follow in which I implement a custom form of the if-conditional. Unfortunately, I don’t think I’ve managed to generate many whoa-moments. I’m probably not doing macros justice. It’s not an easy task as they can be complex beasts to play with.

As we are experimenting with Clojure, I eventually needed a tool that I knew was going to be a macro, and I built a simple version of it. The tool is assert_difference. I don’t know where it first appeared, but Rails ships with one and not satisfied with that one I built one a few years ago. In the simplest case it allows you to do this:

assert_difference("User.count()", 1) do

assert_difference("User.count()", 0) do

assert_difference("User.count()", -1) do

The problem

Do you see what’s wrong there? Well, wrong is a strong word. What’s not as good as it could be? It’s the fact that User.count is expressed as a string and not code, when it is code. The reason for doing that is that we don’t want that code to run, we want to have it in a way that we can run it, run the body of the function (adding users, removing users, etc) and run it again comparing it with the output of the previous one.

There’s no (nice) way in Ruby or many languages to express code and not run it. Rephrasing that, there’s no (nice) way to have code as data in Ruby as in most languages. I’m not picking on Ruby, I love that language, I’m just using it because it’s the language I’m most familiar with; what I’m saying is probably true about any programming language you chose.

One of the things you’ll hear repeated over and over in the land of the Lisp, be it Common Lisp, Scheme or Clojure is that code is data. And that’s what we want here, we want to have a piece of code as data. We want this:

assert_difference(User.count(), 1) do

assert_difference(User.count(), 0) do

assert_difference(User.count(), -1) do

which in Lisp syntax it would look like this:

(assert-difference (user-count) 1

(assert-difference (user-count) 0

(assert-difference (user-count) -1

and this, ladies and gentlemen, is not only easy, it’s good practice.

Enter the macro

I achieved it with a simple 4-line macro and here it is:

(defmacro assert-difference [form delta & body]
  `(let [count# ~form]
     (assert-equal (+ count# ~delta) ~form)))

If you are not familiar with Clojure that will be very hard to read, so, let me help you:

The first line defines the macro with the name assert-difference  and getting 3 or more parameters with the first one called form , the second delta  and all other parameters, as a list, in body. So, in this example:

(assert-difference (user-count) 1

we end up with:

  • form => (user-count)
  • delta => 1
  • body => [(add-user-to-database)]

Note that the parameters to the macro didn’t get the value of calling (user-count), it got the code itself, unexecuted, represented as data that we can inspect and play with, not an unparsed string.

The body of the macro is a bit cryptic because it’s a template. The backtick at the beginning just identifies it as a template and ~  means “replace this variable with the parameter”. ~@  is a special version of ~  that we have to use because body contains a list of statements instead of a single one. That means that:

`(let [count# ~form]
    (assert-equal (+ count# ~delta) ~form))

turns into:

(let [count# (user-count)]
   (assert-equal (+ count# 1) (user-count)))

Is it starting to make sense? count#  is a variable that is set to (user-count), then we execute the body, that is (add-user-to-database) and then we execute  (user-count) again and compare it count#  plus delta. This is the code that’s emitted by the macro, this is the code that actually gets compiled and executed.

If you are wondering about why the variable name has a hash at the end, imagine that variable was just named count  instead and the macro was used like this:

(let [count 10]
  (assert-difference (user-count) 1
    (add-users-to-database (generate-users count)))

That snippet defines count, but then the macro defines count again, by the time you reach (generate-users count) that count  was masked by the macro-generated one. That bug can be very hard to debug. The hash at the ends makes it into a uniquely named variable, something like count__28766__auto__ , that is consistent with the other mentions of count#  within that macro.

Isn’t that beautiful?

The real solution

The actual macro that I’m using for now is:

(defmacro is-different [[form delta] & body]
  `(let [count# ~form]
     (is (= (+ count# ~delta) ~form))))

which I’m not going to package and release yet like I did with assert_difference because it’s nowhere near finished and I’m not going to keep on improving it until I see the actual patterns that I like for my tests.

You might notice that it doesn’t use assert-equal. That’s a function I made up because I believe it was familiar for non-clojurians reading this post. When using clojure.test, you actually do this:

(is (= a b))

There’s one and only one thing to remember: is . Think of is  as a generic assert and that’s actually all that you need. No long list of asserts like MiniTest has: assert, assert_block, assert_empty, assert_equal, assert_in_delta, assert_in_epsilon, assert_includes, assert_instance_of, assert_kind_of, assert_match, assert_nil, assert_operator, assert_output, assert_predicate, assert_raises, assert_respond_to, assert_same, assert_send, assert_silent and, assert_throws.

In a programming language like Ruby we need all those assertions because we want to be able to say things such as “1 was expected to be equal to 2, but it isn’t” which can only be done if you do:

assert_equal(1, 2)

but not if you do

assert(1 == 2)

because in the second case, assert doesn’t have any visibility into the fact that it was an equality comparison, it would just say something like “true was expected, but got false” which is not very useful.

Do you see where this is going? is  is a macro, so it has visibility into the code, it can both run the code and use the code as data and thus generate errors such as:

FAIL in (test-users) (user.clj:12)
expected: 1
  actual: 2
    diff: - 1
          + 2

which if you ask me, is a lot of very beautiful detail to come out of just

(is (= 1 2))

But language X!

When talking to people about this, I often get rebuttals that this or that language can do it too and yes, other languages can do some things like this.

For example, we could argue that this is possible in Ruby if you encase the code in an anonymous function when passing it around, such as:

assert_difference(->{User.count}, 1) do

but it’s not as nice and we don’t see it very often. What we see is 20 assert methods like in MiniTest. To have an impact in the language, these techniques need to be easy, nice, first class citizens, the accepted style. Otherwise they might as well not exist.

An even better syntax for blocks in Ruby might help with the issue and indeed what you are left with is a different approach to incredible flexibility and it already exists. It’s called Smalltalk and there’s some discussion about closures, what you have in Lisp,  and objects, what you have in Smalltalk, being equivalent.

I’m also aware of a few languages having templating systems to achieve things such as this, like Template Haskell, but they are always quite hard to use and left for the true experts. You rarely see them covered in a book for beginners of the language, like macros tend to be covered for Lisp.

There are also languages that have a string based macro system, like C and I’ve been told that Tcl does as well. The problem with this is that it’s from hard to impossible to build something that’s of medium complexity, to the point that you are recommended to stay away from it.

All of the alternative solutions I mention so far have a problem: code is not (nice) data. When a macro in Lisp gets a piece of code such as:

(+ 1 2)

that code is received as a list of three elements, containing + , 1  and 2 . If the macro instead received:

(1 2 +)

the code would be a list containing 1, 2 and +. Note that it’s not valid Lisp code, it doesn’t have to be because it’s not being compiled and executed. The output of a macro has to be valid Lisp code, the input can be whatever and thus, making a macro that changes the language from prefix notation to suffix notation, like in the last snippet of code, is one of the few first exercises you do when learning to make macros.

What makes it so easy to get code as data and work with it and then execute the data as code is the fact that Lisp’s syntax is very close to the abstract syntax tree of the language. The abstract syntax tree of Lisp programs is something Lisp programmers are familiar with intuitively while most programmers of other languages have no idea what the AST looks like for their programs. Indeed, I don’t know what it looks like for any of the Ruby code I wrote.

Most programmers don’t know what an AST actually is, but even the Lisp programmers that don’t know what an AST is have an intuition for what the ASTs of their programs are.

This is why many claim Lisp to be the most powerful programming language out there. You could start thinking of another programming language that has macros that receive code as data and that their syntax is close to the AST and if you find one of those, congratulations, you found a programming language of the Lisp family because pretty much those properties make it a member of the Lisp family.


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s