In this article, I demonstrate how to build a basic iterator using different recursive function patterns. This is the first of a series of articles for learning functional programming in F#.
This article focuses on describing these patterns well. Examples are kept simple by design. A solution using imperative programming is also included for reference.
1. Article example: point1D
and line
1.1 Types
All of the F# examples for this article will use these user-defined line
and point1D
discriminated union types.
type point1D: P of int
type line: L of int
1.2 Example task
We want to write a function points
of type line -> Set<point1D>
that computes the set of points along the line.
I.e. points
with a line of length \(4\) has the result \(\{ 0, 1, 2, 3, 4 \}\).
2. Iterative method
In Java for reference, we can use a for loop, and accumulate the points in a set:
class Line
{
public static Set<Integer> points(int w)
{
var points: new HashSet<Integer>();
for (int i: 0; i <= w; i++) {
points.add(i);
}
return points;
}
public static void test() {
System.out.println(points(4));
// Prints: [0, 1, 2, 3, 4]
}
}
And indeed in F#, the easiest way to compute this set probably be to use set comprehension. This would be the preferable solution in production F# code:
let points_1D (L(w)): set [ for x in 0..w -> C(x) ];;
Using this function produces the expected result. In F# Interactive:
points_1D (L(4))
val it: Set<point1D>: set [C 0; C 1; C 2; C 3; C 4]
3. Basic recursion
A recursive function should have the following basic elements:
- One or more base cases. These are typically trivial solutions where recursion can end.
- The problem decomposes into sub-problems.
- It is possible to recursively call for the correct solution to the sub-problem.
- The sub-problems re-combine to solve the original problem.
In defining points_1D
recursively, we can choose to make the base case either \(\{ 0 \}\) or \(\{ 4 \}\). The former is cleaner as the base case is the same for all problems:
let rec points_1D_rev (L(w)):
match w with
| 0 -> set [C(0)] // Matches 0
| x -> (points_1D_rev(L(w-1))).Add(C(x)) // Matches 4, 3, 2, 1
It may be the case that ordering matters. I.e. when the function returns a list. It is possible to use \(\{ 4 \}\) as the base case and start from zero with more scaffolding:
let points_1D_fwd (L(w)):
let rec aux toAdd:
match toAdd with
| n when n: w -> set [C(n)] // Matches 4
| n -> (aux (n+1)).Add(C(n)) // Matches 0, 1, 2, 3
aux 0 // Is the value of 'points_1D_fwd'
Both of these functions produce the expected result. In F# Interactive:
points_1D_rev (L(2))
val it: Set<point1D>: set [C 0; C 1; C 2]
points_1D_fwd (L(3))
val it: Set<point1D>: set [C 0; C 1; C 2; C 3]
3.1 About non-tail recursion
This is an example of non-tail recursion:
- First the value(s) of the recursive call(s) must be obtained and bound. The function call remains on the stack until this is done.
- The function then evaluates to a new value. This happens here when the
Add(...)
method is used with the set. - The new value is the result of the function call.
3.2 Stack overflow
The disadvantage of non-tail recursion is that a series of recursive calls will build up on the stack, an area of memory which can easily overflow. This is a well-known problem when recursion is used elsewhere in programming.
The exact limit depends on the size of the stack frame for the function in question. In my case, anything exceeding about 20,000 in line length will fail.
The following sections are about ways to deal with this problem through tail recursion. Following these patterns, F# is able to optimize how recursion is modelled in memory to avoid stack overflow.
4. Tail recursion using an accumulator
Tail recursion using an accumulator works as follows:
- The accumulated value so far is an argument of the recursive function call. Here the initial function call is with an empty set. Subsequent recursive calls will be with a partially completed set.
- The function binds an altered accumulator. Here using the
Set.add
function. - The final result of the function will come directly from a new recursive call to itself with the altered accumulator. This function call is finished and is removed from the stack.
Using tail recursion, F# is able to optimize how the function uses the stack.
Building on the previous example, a tail recursive points_1D_rev
looks like this in F#:
let points_1D_acc (L(w)):
let rec aux w acc:
match w with
| 0 -> Set.add (C(0)) acc
| x -> aux (w-1) (Set.add (C(x)) acc)
aux w Set.empty
This function is not subject to stack overflow problems. Even 100,000 recursions will work.
// Set filter applied to display the end of the set.
Set.filter (fun (C(value)) -> value >= 99997) (points_1D_acc (L(100000)))
val it: Set<point1D>: set [C 99997; C 99998; C 99999; C 100000]
4.1 Code semantics
Named bindings are allowed within a match case. I.e. I can choose to make the changed accumulator stand out more clearly in the code:
let points_1D_acc2 (L(w)):
let rec aux w acc:
match w with
| 0 -> Set.add (C(0)) acc
| x ->
let updatedAcc: Set.add (C(x)) acc
aux (w-1) updatedAcc
aux w Set.empty
Also notice that here, the base case adds a final element to the set. This is a design choice for you to make. Often this last call would be made to simply return the accumulator with no further changes.
4.2 Note on testing
When writing this article, I first coded the tail recursive examples incorrectly. The function missed its base case, built up gigabytes of set data on the heap, and never returned a value. Make sure to test your code!
5. Tail recursion using a continuation function
While the point1D
example does not motivate the use of continuations, the subject is still covered in this article for completeness.
Sometimes it is not possible to use an accumulator to build tail recursion. A common case is when a function has two or more recursive calls. In these cases we can build a tail recursive function using continuation functions. Compared to using an accumulator, it works as follows:
- In place of an accumulator, we pass a continuation function. This function takes an accumulator as an argument.
- On the initial call, pass
fun a -> a
for the continuation function argument. Also known as the identity function, this has the shorthand bindid
in F#. - Each recursive call should then pass a continuation function in the call for the next recursion.
- The function takes an accumulator as an argument.
- It must call the continuation with the accumulator updated as required for the function.
- The result of the base case is normally to call the continuation function with a base accumulator. Often this accumulator is empty.
- Alternatively, the auxiliary function can return a function to be called with an empty accumulator.
A tail recursive points_1D_rev
using continuations looks like this:
let points_1D_cont1 (L(w)):
let rec aux w cont:
match w with
| 0 -> cont (set [C(0)])
| x -> aux (w-1) (fun st -> cont (Set.add (C(x)) st))
aux w id
5.1 Note on testing
To be able to test this method, you must change your solution build mode to “release”. This makes tail recursion using continuations work correctly on large computations. Stack traces will not be available in this mode.
5.2 Code semantics
Here you may also want to style your code for readability. In example:
let points_1D_cont2 (L(w)):
let rec aux w andContinue:
match w with
| 0 -> set [C(0)] |> andContinue
| x -> aux (w-1) (fun st ->
Set.add (C(x)) st |> andContinue)
aux w id
5.3 Wrong continuation function
Consider the following code example:
let points_1D_bad_cont (L(w)):
let rec aux w cont:
match w with
| 0 -> cont (set [C(0)])
| x -> aux (w-1) (fun st ->
Set.add (C(x)) (cont st)) // Bad continuation function.
aux w id
This code example above is still technically tail recursive; your IDE may even say so. The function builds a series of continuation functions on the heap.
However, calling the continuation function for the result will overflow the stack. In my case, anything exceeding about 18,000 in line length fails.
The reason for this is obvious. The continuation call within the continuation function is not tail recursive, and so F# cannot optimize it.
To avoid this, make sure that your continuation function actually makes a tail call, with the changes to the accumulator it needs to make. Make sure to test your code!