As all students at DIKU know from Standard ML or Haskell, pattern matching is a feature seen in many functional languages. It is often praised for the ability to make recursive function definitions shorter and much more elegant.

There's much more to pattern matching than what we see in Standard ML, however. To explore this, we will be looking at Mathematica, a language where everything you enter is considered as an value of an algebraic data type. This leaves a need for good pattern matching, as it's a good way to operate on ADTs.

Functions are modelled by specifying substitution rules for given patterns. Take the factorial function, for instance

Line one specifies that the symbol fact[0] has the value 1. Line two specifies, that if we see a symbol with head fact, containing one argument, it should perform the substitution shown and evaluate the result.

If we try to evaluate fact[0] and fact[5], we get 1 and 120 in response as we might expect. If we try to evaluate fact[5, 4], we might expect an error, but we simply get fact[5, 4] in response. We haven't defined a substitution rule with a pattern matching that symbol, so we get the symbol back in unevaluated form. (Remember, everything is an ADT.)

It is also possible to make conditional patterns. You may know this from Haskell, where it's called a guard. If you want your factorial function to only be defined on non-negative integers, for instance, you may impose that as a restriction on the latter of the two rules in the definition, like so:

Here we explicitly require n to be positive for the latter pattern to work. As can be seen in the example, either an explicit expression or a predicate can be used.

Unlike many other languages, Mathematica also allows for patterns to match sequences. This could, for instance, have been used to great effect in solving problem 7 of the IP exam set for 2012. The problem states that the students must make a function, which opens a binary PPM file, inverts its pixels, and saves it back to another file. In doing so, one will find it necessary to split the header from the body, since the header doesn't need to be altered. The problem text helpfully says, that one can assume that the header ends with 255 followed by whitespace, and that it's the first occurence of that number.

This can be done rather elegantly in Mathematica, by making use of sequence patterns:

Here the __ in head__ and body__ makes them match a sequence of characters, making splitting the list at the first occurence of 255 a simple and elegant task.

To contrast this with SML, a similar function would look something like this:

Here, we manually have to traverse the list until the pattern matches.


Mathematica has many more offerings as far as pattern matching are concerned. If you have time, I highly recommend taking a look at what the language has to offer. Since Mathematica licenses can be somewhat costly, I can recommend giving Mathematica a spin on DIKU's servers. It's available on the tuxrays, by running the command MathKernel (for the command-line version) or Mathematica (for the graphical version).