Erlang: Functions (Part 4).

As we have seen in the previous articles, functions elevate some tiny little things to the new heights. I find it really cool you could create something new in terms of existing things just by combining them in the right way (composition). Functions also allow you to manipulate complex objects if you guarantee you could break them to smaller pieces (recursion). Combined together these two principles create a universe of beautiful solutions. In this article we're going to explore some new forms of composition and recursion.

Sometimes recursion could lead you to unexpected results. We once created a hyper operator by just utilizing the successor and predecessor functions. The predecessor function (just to remind you) is a function that decrements its only parameter by one. If we start applying the predecessor function to any positive integer over and over we would eventually reach zero (which is an even number as you might know). Thinking this way we could try and implement two functions that test positive integers for being even or odd numbers.

-export([is_even/1, is_odd/1]).

is_even(0) -> true;
is_even(N) -> is_odd(N - 1).

is_odd(0) -> false;
is_odd(N) -> is_even(N - 1).

Here we introduce a few new concepts. First of all, functions that return a boolean value are called predicates (yet this is not a complete definition and the term itself has different interpretations, if you need more information please start here). So, is_even and is_odd are both unary predicates. It's also very common to utilize the is prefix for unary predicates when naming your functions.

Secondly, an interesting fact here is that both predicates have mutual base cases. If we're calling is_even and the number is zero, then the answer is true, because zero is even. Otherwise, we're decrementing the argument by 1 and giving the control to the is_odd function. If at this point the argument is zero then the result is false, otherwise the argument is decremented again and is_even is called. And so on and so forth until it reaches zero and stops the recursion.

Functions with mutual base cases represent so called mutual recursion. While in everyday programming you will face such recursion rarely it might be common in such tasks as writing parsers and producing some sequences for analytical purposes.

You might be also wondering why implement such simple thing that way while there are generic and performant alternatives. Let me illustrate them.

is_even2(N) -> N rem 2 =:= 0.

is_even3(N) -> 1 band N =:= 0.

I'm omitting is_odd here; those are just to illustrate the idea after all. You could see that the same function could be implemented in many different ways and it's your responsibility to utilize the version that suits your needs best. However while working with negative integers both the above functions require you to have either rem (integer remainder) or band (bitwise and) operator to be supported.

On the other hand our initial version of the is_even and is_odd functions requires nothing but the predecessor function, which could be easily replaced by an analogue operation over some other non-integer structure (for instance, tuples). This might be confusing for now but this topic will be explained in detail as soon as we discuss functions and their relation to types.

Now, let's see how we could filter out a list of integers so that if you provide a lists of integers it would return you either a list of odd or even numbers depending on which version you're using.

filter_even([]) -> [];
filter_even([H|T]) ->
    case is_even(H) of
        true -> [H|filter_even(T)];
        false -> filter_even(T)

filter_odd([]) -> [];
filter_odd([H|T]) ->
    case is_odd(H) of
        true -> [H|filter_odd(T)];
        false -> filter_odd(T)

You might notice these functions are almost identical. The only difference is the exact function we call to test a single number. If only we had a mechanism to generalize such cases...

Erlang is a functional language. Among other properties Erlang treats functions as first-class citizens. It means we could assign functions to variables, test them for equality, consider functions as arguments of other functions and also return them as a result of evaluation of other functions. In other words functions in Erlang are in fact like all other types we use regularly.

To refer to a function we use the keyword fun followed by either the function's name or a combination of module and function depending on where we refer it from and the function's arity. For instance, if we refer to the is_even function from where it's defined (lesson4), we could just use the short form which is fun is_even/1.

The syntax could be generalized as fun M:F/A, where MFA stands for module, function and arity. You should also keep in mind that any component could be either constant or variable.

1> IsOdd = fun lesson4:is_odd/1.
fun lesson4:is_odd/1
2> IsOdd(42).
3> M = lesson4.
4> F = is_even.
5> A = 1.
6> IsEven = fun M:F/A.
fun lesson4:is_even/1
7> IsEven(3).
8> fun lesson4:is_even/1(0).

It's not always necessary to assign a reference to a variable to call it. Instead you could just call it right away by providing its arguments, as you could see in the very last example (step 8). You could also wrap a reference in parentheses before calling it which is a common way to do it. Here is an example.

1> (fun lesson4:is_even/1)(6).

The fact we could associate any function with a variable and then call it means we could now implement a generic version of the filter function. Let's do this.

filter(_, []) -> [];
filter(P, [H|T]) ->
    case P(H) of
        true -> [H|filter(P, T)];
        false -> filter(P, T)

Filter is a higher-order function. Higher-order functions (unlike first-order functions) take other functions as arguments or return other functions as a result, elevating our ability to abstract things to a new level.

To call the filter function you should provide a reference to some unary predicate. Let's illustrate it by simplifying the filter_even and filter_odd functions.

filter_even_simple(L) -> filter(fun is_even/1, L).

filter_odd_simple(L) -> filter(fun is_odd/1, L).

One important thing I have to mention is that our generic filter function is already implemented as a part of the lists module. While it's fun to write our own versions of generic and fundamental functions (not just fun but also really useful for educational purposes) it's always a good idea to utilize Erlang's built-in functions (BIFs) when possible due to performance reasons. It's also a good practice to utilize the Standard Library even if you know some functions aren't BIFs because these modules and functions are well tested and documented.

Now let's get back to the predicates. It's sometimes useful to check whether at least one element in the list satisfies some condition or conditions. This task could be generalized as a higher-order function which is a predicate itself called any. This function takes a predicate P (that's intended to test a single element in the list) and the list itself and returns true if at least one element of the list satisfies the predicate. Otherwise it returns false.

any(_, []) -> false;
any(P, [H|T]) -> P(H) orelse any(P, T).

Oh boy, I do love this function! It's quite unique among all the previous functions we have seen so far because it has two base cases one of which is at the same clause that forms the recursion. I mean, look at the second clause, you could see here is the orelse operator! It does work in that way so we call any (which is the second operand) if only the first operand is false. Hence, P(H) is a base case! Let's do some tests first to get a better understanding.

1> false orelse 1.
2> true orelse 1.
3> true andalso 1.
4> false andalso 1.
5> 2 < 1 andalso io:fwrite("you won't see this~n").

The andalso and orelse operators represent so called Short-Circuit Evaluation, means their second operand won't be evaluated at all if not needed. In the very last example you could see the io:fwrite function is not evaluated because 2 is more than 1 hence the equation is false already and there is no need to evaluate it further.

You could also notice that unlike in and and or the second operand could be evaluated to anything, not just some boolean value. And one last note is that you could use andalso and orelse in guards (we will discuss it one day if there is a luck).

Any has its dual called all. The all function is almost the same but it does return true if all elements in the list satisfy the given predicate.

all(_, []) -> true;
all(P, [H|T]) -> P(H) andalso all(P, T).

As I said both functions are almost the same but you could notice the first clauses return different boolean values. That's because in the case of all if all elements satisfy the given predicate then the recursion terminates when calling the all function against an empty list so that it must return true, otherwise it would produce incorrect results. In the case of any if all elements do not satisfy the given predicate then the very last call would be made against an empty list, hence the result should be false (there are no elements that satisfy P).

Now let me introduce the zip function. This function takes 2 lists and combines them into a single list so that each element of each source list would be combined into a binary tuple. Let me show you how it works.

1> lists:zip([1, 2, 3], [a, b, c]).

Cool, huh? Well, one thing though... I did mention you should always be using the Standard Library whenever possible. But for some weird reason the zip function that's provided by the lists module does only work when there are 2 lists of same size. Giving it lists of different sizes would simply produce an error. So we're going to implement our own version!

% differs from lists:zip!
zip([HX|TX], [HY|TY]) -> [{HX, HY}|zip(TX, TY)];
zip(_, _) -> [].

Let's now try it in erl.

1> lesson4:zip([1, 2, 3], [a, b, c, d]).

Splendid! Now, let me show you something. Consider we have a list of integers. If we try to combine it with its own tail (the list itself minus the head element) we would get each element paired with the element next to it. Let's see.

1> Numbers = [1, 2, 3, 4, 5].
2> lesson4:zip(Numbers, tl(Numbers)).

We could now use this property to define a predicate that checks whether a list is sorted or not.

gte({X, Y}) -> X =< Y.

lte({X, Y}) -> X >= Y.

is_sorted([]) -> true;
is_sorted([_|T] = L) ->
    Pairs = zip(L, T),
    all(fun gte/1, Pairs) orelse
    all(fun lte/1, Pairs).

I suppose the gte and lte functions are simple and straightforward. The is_sorted predicate needs some little clarification though. If you take a look in the line number 7 we define a variable called Pairs. That's because the result of calling the zip function is going to be used twice in the worst scenario (when the first all evaluates to false). So, involving a variable here is quite reasonable as we're getting rid of unnecessary computations making the function more efficient. Though, involving variables in functions is an advanced topic (surprise!) as it requires you to understand scoping. I do recommend you to avoid variables in functions during learning Erlang. This would force you to keep your functions short and simple and as a result teaches you good practices. Scoping rules will be discussed in one of the upcoming articles.

Let's see how is_sorted works.

1> lesson4:is_sorted([1, 2, 3, 4, 5, 6]).
2> lesson4:is_sorted([1, 2, 3, 5, 4, 6]).
3> lesson4:is_sorted([6, 5, 4, 3, 2, 1]).

You could see that our predicate does work correctly for both ascending and descending orders, that's why we needed to involve 2 calls of the all function.

That's it for now. Play with the functions from this article, write your own ones and read more about higher-order functions using the links you could find here or over the Internet. In the next article (articles?) I'm going to introduce you Funs and explain why these tiny little things are really a new level to abstract things. Stay tuned!

P.S.: Don't forget to inspect and download the sources.