Programming with Functions #7: Expressions over statements
Another idea you may already know but not associate with functional programming is that of an expression — as opposed to a statement. When writing code in the imperative style we build functions with statements. It’s how we order the program to do something: get data from here, modify it like that, save it there. From this point of view, the main purpose of a function is to modify the state of the program — that is, data outside the function itself. But in school, in mathematics, we also learned about functions, and they were not like that. Instead of modifying anything, they used the arguments to produce a new result. They were expressions.
An expression is a combination of one or more constants, variables, operators, and functions that the programming language interprets and computes to produce another value.
Please note the words I used: It does not mean that an expression cannot perform operations on the system — so-called side-effects — and it does not mean that a function made of statements cannot return a result. In fact, in most programming languages most functions lie in the grey area between these opposites. The difference is in what we consider to be the main purpose of a given function. In functional programming we favour expressions for many reasons. Some of them we will talk about in a while, but now I’d like to focus on one which is not so easy to define: on how writing expressions instead of statements affects the way we write code in general.
My first attempt at writing Scala after years of Java looked something like this:
def foo(arg1: Something, arg2: SomethingElse): Foo = {
  statement1
  statement2
  ...
  fooValue // some value of type Foo is put at the end,
           // so it’s going to be returned
}
That wasn’t really Scala, was it? Even if the definition of an expression is a bit fuzzy, this is absolutely not what I had in mind when I first mentioned expressions. Let’s start again. And let’s start small.
def mul(x: Int, y: Int): Int = x * y

def catColours(cats: Seq[Cat]): Set[Colour] =
  cats.map(_.colour).toSet

def catColoursNoGinger(cats: Seq[Cat]): Set[Colour] =
  cats.filter(_.colour != CatColour.Ginger)
    .map(_.colour)
    .toSet
Expressions can be much more complicated than that but you should get the feeling: they consist of a chain — or a tree — of operations where each operation returns some data (quite possibly because it’s an expression as well), and that data is then taken over by the subsequent operation, one that again returns data, and so on, and so on. I like to imagine it as a sequence of interconnected pipes and data being something I want to push through those pipes to the other end. With each used pipe, data changes somehow and eventually comes out as something very different from what I put in.
Just as with pipes, when we connect expressions together and test that they work well, we can start treating them as one expression, and use them to form even more complex expressions. For example:
def catColoursNoGinger(cats: Seq[Cat]): Set[Colour] =
  catColours(cats).filter(_ != CatColour.Ginger)
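The snippets above use Cat, Colour, and CatColour without ever defining them. Here is a minimal, self-contained sketch of how they could fit together. The model (a sealed trait Colour, the CatColour object, the Cat case class, and the sample cats) is my assumption for illustration only, not a definition taken from the original code.

```scala
object CatColoursSketch {
  // Hypothetical model: the chapter uses these names but never defines them.
  sealed trait Colour
  object CatColour {
    case object Ginger extends Colour
    case object Black  extends Colour
    case object White  extends Colour
  }
  final case class Cat(name: String, colour: Colour)

  // First pipe: every cat becomes its colour, duplicates disappear in the Set.
  def catColours(cats: Seq[Cat]): Set[Colour] =
    cats.map(_.colour).toSet

  // Second pipe, attached to the end of the first one.
  def catColoursNoGinger(cats: Seq[Cat]): Set[Colour] =
    catColours(cats).filter(_ != CatColour.Ginger)

  // Sample data, made up for this sketch.
  val cats: Seq[Cat] = Seq(
    Cat("Fluff", CatColour.Ginger),
    Cat("Shadow", CatColour.Black),
    Cat("Snow", CatColour.White),
    Cat("Ink", CatColour.Black)
  )
}
```

With this sample data, catColours(cats) contains all three colours, while catColoursNoGinger(cats) contains only Black and White.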
And then, if your language of choice supports expression syntax in constructs like if/else, match/case, and others, you can do some interesting stuff with it:
sealed trait Action

object Action {
  case object AllColours extends Action
  case object NoGinger extends Action
}

def getColours(action: Action, cats: Seq[Cat]): Set[Colour] =
  action match {
    case Action.AllColours => catColours(cats)
    case Action.NoGinger   => catColoursNoGinger(cats)
  }
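The same idea applies to if/else. As a small sketch (the function names and the messages are made up for this example), compare a statement-style version, which declares a variable and assigns to it in every branch, with an expression-style version, where the whole if/else simply evaluates to the result:

```scala
object IfExpressionSketch {
  // Statement style: a mutable variable is declared first and then
  // modified by whichever branch happens to run.
  def describeStatementStyle(count: Int): String = {
    var description = ""
    if (count == 0) description = "no cats"
    else if (count == 1) description = "one cat"
    else description = s"$count cats"
    description
  }

  // Expression style: each branch produces a value and the if/else as a
  // whole is the value of the function. No variable is needed.
  def describe(count: Int): String =
    if (count == 0) "no cats"
    else if (count == 1) "one cat"
    else s"$count cats"
}
```

Both versions return the same strings, but in the second one there is nothing that could be forgotten or assigned twice.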
The mix of code focused on expressions and pattern matching is a powerful thing. It’s often overlooked in tutorials, and my guess is that this is because it becomes too obvious and transparent to people who have used it for a long time. It enables us to write complicated reasoning clearly and concisely. Add to this mix an expressive type system and straightforward, unambiguous names for functions, fields, and classes (sorry, this one is on you: naming is one of the hard problems of Computer Science), and you get very close to the point where you can give your code to another person and they might actually read it and understand it.
You will quickly find out that the more expressions you write, the more expressions you write. Expressions work best when they’re short and do just one thing. If you see that your expression does too much — split it into two or extract a sub-part. In most cases, it’s very easy to do. You may, however, want to watch out for a few things:
1. An expression used only once ➡️ inline it or save the result in a value.
2. An expression used many times but always giving the same result ➡️ save the result in a value.
3. An expression used many times and giving different results, but used only in one expression of a higher order ➡️ make it into a nested function.
4. And, lastly, if there is an expression that is just too complex and unreadable ➡️ split it into a few intermediate steps and save their results in values.
import scala.util.Try
import scala.util.chaining._ // provides .tap

def bathCat(cat: Cat): Try[Cat] = {
  // Nested helper: retries the bath until it succeeds or the tries run out.
  def tryBathing(bath: Bath, safety: Safety, tryN: Int): Try[Cat] = {
    val catCaught = catchCat(cat, safety)
    bath(catCaught).recoverWith {
      case _ if tryN < MAX_NUMBER_OF_TRIES => tryBathing(bath, safety, tryN + 1)
    }
  }

  val bath = prepareBath()
  val safety = wearGloves()
  tryBathing(bath, safety, 0)
    .map(releaseCat)
    .recoverWith { case _ => giveUp() } // recoverWith expects a partial function
    .tap(_ => treatWounds())
}
(Please note that this code is a bit different from what I show in the video. I couldn’t decide between two versions. The difference is that I imagine the Bath class to have an apply method which returns a Try[Cat]: a success if we managed to bathe the cat, or a failure otherwise. The call bath.apply(catCaught) can be shortened to bath(catCaught), and Try has a method recoverWith which lets us react to a failure. In this case, if we fail to bathe the cat, we can try again, unless we have already used up MAX_NUMBER_OF_TRIES attempts, in which case we give up.)
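If you would like to run the retry logic on its own, here is a minimal sketch of the same recoverWith pattern with the cats taken out. Everything in it (retry, flaky, MaxNumberOfTries) is hypothetical and exists only for this example. Note that here I count attempts so that at most MaxNumberOfTries of them are made in total.

```scala
import scala.util.{Failure, Success, Try}

object RetrySketch {
  val MaxNumberOfTries = 3 // hypothetical limit, chosen for this sketch

  // Calls `attempt` with the attempt number, starting from 0. After a
  // failure it tries again, but only while attempts are left. When the
  // guard is false the partial function is not defined for the error,
  // so recoverWith leaves the original Failure untouched.
  def retry[A](attempt: Int => Try[A]): Try[A] = {
    def go(tryN: Int): Try[A] =
      attempt(tryN).recoverWith {
        case _ if tryN < MaxNumberOfTries - 1 => go(tryN + 1)
      }
    go(0)
  }

  // Hypothetical operation that fails on attempts 0 and 1 and then succeeds.
  def flaky(tryN: Int): Try[String] =
    if (tryN < 2) Failure(new RuntimeException(s"attempt $tryN failed"))
    else Success("clean cat")
}
```

Here retry(flaky) ends in a Success on the third attempt, while an operation that always fails gives back its last Failure once the attempts run out.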
These cases indicate that you might benefit from having a statement in your code, but those statements are optional: you can use them for better readability and performance, but you could do without them.
Writing expressions rather than statements is therefore more about the mindset than any coding tricks. In my experience, since I switched from Java to Scala and started paying attention to writing my code in a more expression-oriented style, I have been writing more and smaller methods than before, the number of variables in my code has dropped a lot, and the code I write has become almost instantly reusable. An expression written to work on one piece of data can very easily be used as a function passed as an argument to one of the higher-order functions working on a collection of that data. Almost no refactoring is necessary.
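As a small illustration of that last point, here is a sketch in which a function written for a single cat is passed, unchanged, to filter on a collection of cats. The Cat class with an ageInYears field and the function names are assumptions made up for this example.

```scala
object ReuseSketch {
  // Hypothetical model for this example only.
  final case class Cat(name: String, ageInYears: Int)

  // Written with a single cat in mind.
  def isKitten(cat: Cat): Boolean = cat.ageInYears < 1

  // Reused as-is: the method is passed as an argument to the
  // higher-order function filter, with no refactoring.
  def kittenNames(cats: Seq[Cat]): Seq[String] =
    cats.filter(isKitten).map(_.name)
}
```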
Pure and impure functions
We say that a function (an expression) is pure if it works only on the arguments given to it and returns a new value, without modifying the state of the program. A pure function is very easy to test or even prove correct: in principle, we can go through all possible combinations of its arguments and show that in each case we get a valid result. From then on the function can be a black box: we don’t need the slightest idea how its insides work. It works. That’s it.
One thing it gives us is referential transparency: An expression is called referentially transparent if it can be replaced with its corresponding value without changing the program’s behavior.[1] This, in turn, makes programs better in a number of ways. If we are sure that an expression always gives the same result for the same arguments, the most common arguments and their results can be saved to some kind of a cache, saving the CPU time that otherwise would have to be spent on calculating the same result over and over again. It also means that the compiler can automatically inline our code, reducing the number of function jumps. If the pure expression is accessed by many threads, the program can virtually ignore many safety tricks, knowing for sure that the expression will not try to sneakily access or modify some shared data. And, last but not least, it means that we are able to unit test literally every use case.
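The caching idea can be sketched in a few lines. This is only an illustration under my own assumptions: memoize, slowSquare, and fastSquare are made-up names, the cache is a plain mutable map that is not thread-safe, and the calls counter is there only so that we can observe how often the wrapped function really runs (which, strictly speaking, makes slowSquare itself impure).

```scala
import scala.collection.mutable

object MemoSketch {
  // Wraps a function with a cache. This is only safe for pure functions:
  // replacing a call with a stored result must not change the program.
  def memoize[A, B](f: A => B): A => B = {
    val cache = mutable.Map.empty[A, B]
    a => cache.getOrElseUpdate(a, f(a))
  }

  // Counter used only to observe the number of real computations.
  var calls = 0

  def slowSquare(n: Int): Int = {
    calls += 1
    n * n
  }

  val fastSquare: Int => Int = memoize[Int, Int](slowSquare)
}
```

Calling fastSquare(4) twice gives 16 both times, but slowSquare runs only once. The second result comes from the cache.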
In practice, only a handful of programming languages, Haskell being the most popular of them, require the programmer to write pure functions. Some others, like Kotlin or Pascal/Delphi, have a special syntax to mark that you write expressions, but they still allow side-effects. In Scala, almost everything is an expression, even constructs such as if/else and match/case, but there’s still no way to prevent modifications to the global state.
And when the function does modify the program’s state we say that it is impure. The modification may happen through some sort of global reference to an external mutable data structure which the function has access to, or because one of the arguments is, or holds, such a reference. A special case of an impure function is one that returns nothing (Unit in Scala, void in C, C++, and Java) and does all its work through changes to the global state.
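To make the distinction concrete, here is a minimal sketch with made-up names. The first function returns Unit and does all its work by changing a variable outside of itself. The second one takes everything it needs as arguments and returns a new value.

```scala
object PuritySketch {
  // Shared mutable state, kept here only to demonstrate impurity.
  var total = 0

  // Impure: returns Unit, and its only effect is the change to `total`.
  def addToTotal(n: Int): Unit =
    total += n

  // Pure: the current total comes in as an argument and the new total
  // goes out as the result. Nothing outside the function is touched.
  def add(currentTotal: Int, n: Int): Int =
    currentTotal + n
}
```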
In theory, every impure function can be reorganized into a pure one. All things taken from the context can be replaced with arguments. Whatever side-effects we perform can become a part of the result. Sometimes the amount of required work is so big, and the resulting code is so complex, that it’s not worth it. But it is possible, and a growing number of libraries in FP-oriented languages try to make it easier for programmers. The main concept behind it is sometimes called an IO monad. It’s outside the scope of this tutorial, but in short, you can imagine it this way: Instead of actually making changes to the environment (which is usually the filesystem, hence I/O), you pass around an object which works like a builder of a scenario of what you want to do. You give it as an argument to a pure function; the function, instead of making side-effects, records in it what it needs to do and returns it, so it can be passed to another pure function.
Then, at some point, the program decides that it can wait no longer, and executes all the records in the order they were accumulated. This gives us better control over what changes are done when. On the other hand, many of those pure functions that made records also register callbacks: when the changes are actually made, those callbacks should be executed and perform some additional operations.
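Here is a deliberately tiny sketch of that idea. It is not the API of any real library: the IO class, unsafeRun, record, and the log buffer are all names I made up for this example, and real IO implementations do much more (error handling, asynchrony, callbacks). The point is only that building the program records what should happen, and nothing actually happens until unsafeRun is called.

```scala
import scala.collection.mutable.ListBuffer

object IoSketch {
  // A description of a computation with side-effects. Creating or
  // combining IO values runs nothing; only unsafeRun() executes them.
  final class IO[A](val unsafeRun: () => A) {
    def map[B](f: A => B): IO[B] =
      new IO(() => f(unsafeRun()))
    def flatMap[B](f: A => IO[B]): IO[B] =
      new IO(() => f(unsafeRun()).unsafeRun())
  }

  object IO {
    // The effect is passed by name, so it is not evaluated here.
    def apply[A](effect: => A): IO[A] = new IO(() => effect)
  }

  // Stands in for "the environment" in this sketch.
  val log = ListBuffer.empty[String]

  // Returns a recorded step instead of writing to the log right away.
  def record(message: String): IO[Unit] = IO { log += message; () }

  // The steps are chained in order, but still not executed.
  val program: IO[Int] =
    for {
      _ <- record("first")
      _ <- record("second")
    } yield log.size
}
```

After program is built, log is still empty. Only program.unsafeRun() performs both steps, in the order they were chained, and returns the number of log entries.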
You can read more on the IO monad here from Alvin Alexander.
But how about we talk about monads for a moment? It’s going to be the topic of the next chapter so I will give you some time to mentally prepare for that.