Function Pointers, Delegates, and Lambdas
Function pointers are for storing addresses of functions in order to execute those functions at a later time. Function pointers are similar to their counterparts in the C programming language.
Delegates store both a function pointer and the context to execute that function pointer in. The stored context can be either the scope in which the function executes or a struct or class object.
Delegates enable closures as well, a concept that is supported by most functional programming languages.
Function pointers
We have seen in the previous chapter that it is possible to take addresses of functions with the & operator. In one of those examples, we passed such an address to a function template.
Taking advantage of the fact that template type parameters can match any type, let's pass a function pointer to a template to observe its type by printing its .stringof property:
import std.stdio;

int myFunction(char c, double d) {
    return 42;
}

void main() {
    myTemplate(&myFunction);    // Taking the function's address and
                                // passing it as a parameter
}

void myTemplate(T)(T parameter) {
    writeln("type : ", T.stringof);
    writeln("value: ", parameter);
}
The output of the program reveals the type and the address of myFunction():
type : int function(char c, double d)
value: 406948
Member function pointers
The address of a member function can be taken either on a type or on an object of a type, with different results:
struct MyStruct {
    void func() {
    }
}

void main() {
    auto o = MyStruct();

    auto f = &MyStruct.func;    // on a type
    auto d = &o.func;           // on an object

    static assert(is (typeof(f) == void function()));
    static assert(is (typeof(d) == void delegate()));
}
As the two static assert lines above indicate, f is a function and d is a delegate. We will see later below that d can be called directly but f needs an object to be called on.
Definition
Similar to regular pointers, each function pointer type can point only to a particular type of function; the parameter list and the return type of the function pointer and the function must match. Function pointers are defined by the function keyword between the return type and the parameter list of that particular type:
return_type function(parameters) ptr;
The names of the parameters (c and d in the output above) are optional. Because myFunction() takes a char and a double and returns an int, the type of a function pointer that can point at myFunction() must be defined accordingly:
int function(char, double) ptr = &myFunction;
The line above defines ptr as a function pointer taking two parameters (char and double) and returning int. Its value is the address of myFunction().
Function pointer syntax is relatively hard to read; it is common to make code more readable with an alias:
alias CalculationFunc = int function(char, double);
That alias makes the code easier to read:
CalculationFunc ptr = &myFunction;
As with any type, auto can be used as well:
auto ptr = &myFunction;
Calling a function pointer
Function pointers can be called exactly like functions:
int result = ptr('a', 5.67);
assert(result == 42);
The call ptr('a', 5.67) above is the equivalent of calling the actual function by myFunction('a', 5.67).
When to use
Because function pointers store what function to call and they are called exactly like the functions that they point at, function pointers effectively store the behavior of the program for later.
There are many other features of D that are about program behavior. For example, the appropriate function to call to calculate the wages of an Employee can be determined by the value of an enum member:
final switch (employee.type) {

case EmployeeType.fullTime:
    fullTimeEmployeeWages();
    break;

case EmployeeType.hourly:
    hourlyEmployeeWages();
    break;
}
Unfortunately, that method is relatively hard to maintain because every such switch statement has to support all known employee types. If a new type of employee is added to the program, then all such switch statements must be located so that a new case clause is added for the new employee type.
A more common alternative for implementing behavior differences is polymorphism. An Employee interface can be defined and different wage calculations can be handled by different implementations of that interface:
interface Employee {
    double wages();
}

class FullTimeEmployee : Employee {
    double wages() {
        double result;
        // ...
        return result;
    }
}

class HourlyEmployee : Employee {
    double wages() {
        double result;
        // ...
        return result;
    }
}

// ...

double result = employee.wages();
Function pointers are yet another alternative for implementing different behavior. They are more common in programming languages that do not support object oriented programming.
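As a rough sketch of that alternative, the wage calculation can be stored in each employee as a function pointer. The names WageFunc, fullTimeWages, hourlyWages, and the struct layout below are hypothetical, and the wage figures are made up for illustration:

```d
import std.stdio;

// Hypothetical wage functions; the figures are made up.
double fullTimeWages(double hours) {
    return 4000.0;            // fixed salary, 'hours' is ignored
}

double hourlyWages(double hours) {
    return hours * 25.0;
}

alias WageFunc = double function(double);

struct Employee {
    string name;
    WageFunc wages;           // the behavior is stored in the object
}

void main() {
    auto employees = [ Employee("Ann", &fullTimeWages),
                       Employee("Bob", &hourlyWages) ];

    foreach (e; employees) {
        // Called exactly like a function
        writeln(e.name, ": ", e.wages(80));
    }
}
```

With this design, adding a new kind of employee means writing one more function; no switch statement needs to be updated.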
Function pointer as a parameter
Let's design a function that takes an array and returns another array. This function will filter out elements with values less than or equal to zero, and multiply the others by ten:
int[] filterAndConvert(const int[] numbers) {
    int[] result;

    foreach (e; numbers) {
        if (e > 0) {                       // filtering,
            immutable newNumber = e * 10;  // and conversion
            result ~= newNumber;
        }
    }

    return result;
}
The following program demonstrates its behavior with randomly generated values:
import std.stdio;
import std.random;

void main() {
    int[] numbers;

    // Random numbers
    foreach (i; 0 .. 10) {
        numbers ~= uniform(0, 10) - 5;
    }

    writeln("input : ", numbers);
    writeln("output: ", filterAndConvert(numbers));
}
The output contains ten times the original numbers that were greater than zero to begin with. In the run below, the selected originals are 2, 3, and 2:
input : [-2, 2, -2, 3, -2, 2, -1, -4, 0, 0]
output: [20, 30, 20]
filterAndConvert() is for a very specific task: It always selects numbers that are greater than zero and always multiplies them by ten. It could be more useful if the behaviors of filtering and conversion were parameterized.
Noting that filtering is a form of conversion as well (from int to bool), filterAndConvert() performs two conversions:
- number > 0, which produces bool by considering an int value.
- number * 10, which produces int from an int value.
Let's define convenient aliases for function pointers that would match the two conversions above:
alias Predicate = bool function(int);    // makes bool from int
alias Convertor = int function(int);     // makes int from int
Predicate is the type of functions that take int and return bool, and Convertor is the type of functions that take int and return int.
If we provide such function pointers as parameters, we can have filterAndConvert() use those function pointers during its work:
int[] filterAndConvert(const int[] numbers,
                       Predicate predicate,
                       Convertor convertor) {
    int[] result;

    foreach (number; numbers) {
        if (predicate(number)) {
            immutable newNumber = convertor(number);
            result ~= newNumber;
        }
    }

    return result;
}
filterAndConvert() is now an algorithm that is independent of the actual filtering and conversion operations. When desired, its earlier behavior can be achieved by the following two simple functions:
bool isGreaterThanZero(int number) {
    return number > 0;
}

int tenTimes(int number) {
    return number * 10;
}

// ...

    writeln("output: ", filterAndConvert(numbers,
                                         &isGreaterThanZero,
                                         &tenTimes));
This design allows calling filterAndConvert() with any filtering and conversion behaviors. For example, the following two functions would make filterAndConvert() produce the negatives of the even numbers:
bool isEven(int number) {
    return (number % 2) == 0;
}

int negativeOf(int number) {
    return -number;
}

// ...

    writeln("output: ", filterAndConvert(numbers,
                                         &isEven,
                                         &negativeOf));
The output:
input : [3, -3, 2, 1, -5, 1, 2, 3, 4, -4]
output: [-2, -2, -4, 4]
As seen in these examples, sometimes such functions are so trivial that defining them as proper functions with name, return type, parameter list, and curly brackets is unnecessarily wordy.
As we will see below, the => syntax of anonymous functions makes the code more concise and more readable. The following line has anonymous functions that are the equivalents of isEven() and negativeOf(), without proper function definitions:
writeln("output: ", filterAndConvert(numbers,
number => (number % 2) == 0,
number => -number));
Function pointer as a member
Function pointers can be stored as members of structs and classes as well. To see this, let's design a class that takes the predicate and convertor as constructor parameters in order to use them later on:
class NumberHandler {
    Predicate predicate;
    Convertor convertor;

    this(Predicate predicate, Convertor convertor) {
        this.predicate = predicate;
        this.convertor = convertor;
    }

    int[] handle(const int[] numbers) {
        int[] result;

        foreach (number; numbers) {
            if (predicate(number)) {
                immutable newNumber = convertor(number);
                result ~= newNumber;
            }
        }

        return result;
    }
}
An object of that type can be used similarly to filterAndConvert():
auto handler = new NumberHandler(&isEven, &negativeOf);
writeln("result: ", handler.handle(numbers));
Anonymous functions
The code can be more readable and concise when short functions are defined without proper function definitions.
Anonymous functions, which are also known as function literals or lambdas, allow defining functions inside of expressions. Anonymous functions can be used at any point where a function pointer can be used.
We will get to their shorter => syntax later below. Let's first see their full syntax, which is usually too wordy, especially when it appears inside of other expressions:
function return_type(parameters) { /* operations */ }
For example, an object of NumberHandler that produces 7 times the numbers that are greater than 2 can be constructed by anonymous functions as in the following code:
new NumberHandler(function bool(int number) { return number > 2; },
                  function int(int number) { return number * 7; });
Two advantages of the code above are that the functions are not defined as proper functions and that their implementations are visible right where the NumberHandler object is constructed.
Note that the anonymous function syntax is very similar to regular function syntax. Although this consistency has benefits, the full syntax of anonymous functions makes code too wordy.
For that reason, there are various shorter ways of defining anonymous functions.
Shorter syntax
When the return type can be deduced from the return statement inside the anonymous function, then the return type need not be specified (the place where the return type would normally appear is marked by the /**/ code comments):
new NumberHandler(function /**/(int number) { return number > 2; },
                  function /**/(int number) { return number * 7; });
Further, when the anonymous function does not take parameters, its parameter list need not be provided. Let's consider a function that takes a function pointer that takes nothing and returns double:
void foo(double function() func) {
    // ...
}
Anonymous functions that are passed to that function need not have the empty parameter list. Therefore, all three of the following anonymous function syntaxes are equivalent:
foo(function double() { return 42.42; });
foo(function () { return 42.42; });
foo(function { return 42.42; });
The first one is written in the full syntax. The second one omits the return type, taking advantage of the return type deduction. The third one omits the unnecessary parameter list.
Even further, the keyword function need not be provided either. In that case it is left to the compiler to determine whether it is an anonymous function or an anonymous delegate. Unless it uses a variable from one of the enclosing scopes, it is a function:
foo({ return 42.42; });
Most anonymous functions can be defined even shorter by the lambda syntax.
Lambda syntax instead of a single return statement
In most cases even the shortest syntax above is unnecessarily cluttered. The curly brackets that are just inside the function parameter list make the code harder to read, and a return statement as well as its semicolon inside a function argument looks out of place.
Let's start with the full syntax of an anonymous function that has a single return statement:
function return_type(parameters) { return expression; }
We have already seen that the function keyword is not necessary and the return type can be deduced:
(parameters) { return expression; }
The equivalent of that definition is the following => syntax, where the => characters replace the curly brackets, the return keyword, and the semicolon:
(parameters) => expression
The meaning of that syntax can be spelled out as "given those parameters, produce this expression (value)".
Further, when there is a single parameter, the parentheses around the parameter list can be omitted as well:
single_parameter => expression
On the other hand, to avoid grammar ambiguities, the parameter list must still be written as empty parentheses when there is no parameter at all:
() => expression
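The following sketch lines up the syntaxes described so far for one small function. The helper name equivalentForms is hypothetical; every literal in it is meant to have the same behavior:

```d
// Returns the same tiny function written in five different syntaxes.
int function(int)[] equivalentForms() {
    // Full syntax
    int function(int) f1 = function int(int a) { return a + 1; };

    // Return type deduced
    int function(int) f2 = function (int a) { return a + 1; };

    // 'function' keyword omitted
    int function(int) f3 = (int a) { return a + 1; };

    // Lambda syntax
    int function(int) f4 = (int a) => a + 1;

    // Single parameter without parentheses; its type is taken
    // from the type of the variable being initialized
    int function(int) f5 = a => a + 1;

    return [ f1, f2, f3, f4, f5 ];
}

void main() {
    foreach (f; equivalentForms()) {
        assert(f(42) == 43);
    }

    // No parameters: the empty parentheses are required
    int function() g = () => 7;
    assert(g() == 7);
}
```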
Programmers who know lambdas from other languages may make the mistake of using curly brackets after the => characters, which can be valid D syntax with a different meaning:
// A lambda that returns 'a + 1'
auto l0 = (int a) => a + 1;

// A lambda that returns a parameter-less lambda that
// returns 'a + 1'
auto l1 = (int a) => { return a + 1; };

assert(l0(42) == 43);
assert(l1(42)() == 43);    // Executing what l1 returns
Let's use the lambda syntax in a predicate passed to std.algorithm.filter. filter() takes a predicate as its template parameter and a range as its function parameter. It applies the predicate to each element of the range and returns the ones that satisfy the predicate. One of several ways of specifying the predicate is the lambda syntax.
(Note: We will see ranges in a later chapter. At this point, it should be sufficient to know that D slices are ranges.)
The following lambda is a predicate that matches elements that are greater than 10:
import std.stdio;
import std.algorithm;

void main() {
    int[] numbers = [ 20, 1, 10, 300, -2 ];
    writeln(numbers.filter!(number => number > 10));
}
The output contains only the elements that satisfy the predicate:
[20, 300]
For comparison, let's write the same lambda in the longest syntax, where curly brackets define the body of the anonymous function:
writeln(numbers.filter!(function bool(int number) { return number > 10; }));
As another example, this time let's define an anonymous function that takes two parameters. The following algorithm takes two slices and passes their corresponding elements one by one to a function that itself takes two parameters. It then collects and returns the results as another slice:
import std.exception;

int[] binaryAlgorithm(int function(int, int) func,
                      const int[] slice1,
                      const int[] slice2) {
    enforce(slice1.length == slice2.length);

    int[] results;

    foreach (i; 0 .. slice1.length) {
        results ~= func(slice1[i], slice2[i]);
    }

    return results;
}
Since the function parameter above takes two parameters, lambdas that can be passed to binaryAlgorithm() must take two parameters as well:
import std.stdio;

void main() {
    writeln(binaryAlgorithm((a, b) => (a * 10) + b,
                            [ 1, 2, 3 ],
                            [ 4, 5, 6 ]));
}
The output contains ten times the elements of the first array plus the elements of the second array (e.g. 14 is 10 * 1 + 4):
[14, 25, 36]
Delegates
A delegate is a combination of a function pointer and the context that it should be executed in. Delegates also support closures in D. Closures are a feature supported by many functional programming languages.
As we have seen in the Lifetimes and Fundamental Operations chapter, the lifetime of a variable ends upon leaving the scope that it is defined in:
{
    int increment = 10;
    // ...
} // ← the life of 'increment' ends here
That is why the address of such a local variable cannot be returned from a function.
Let's imagine that increment is a local variable of a function that itself returns a function. Let's make it so that the returned lambda happens to use that local variable:
alias Calculator = int function(int);

Calculator makeCalculator() {
    int increment = 10;
    return value => increment + value;    // ← compilation ERROR
}
That code is in error because the returned lambda makes use of a local variable that is about to go out of scope. If the code were allowed to compile, the lambda would be trying to access increment, whose life has already ended.
For that code to be compiled and work correctly, the lifetime of increment must at least be as long as the lifetime of the lambda that uses it. Delegates extend the lifetime of the context of a lambda so that the local state that the function uses remains valid.
delegate syntax is similar to function syntax, the only difference being the keyword. That change is sufficient to make the previous code compile:
alias Calculator = int delegate(int);

Calculator makeCalculator() {
    int increment = 10;
    return value => increment + value;
}
Having been used by a delegate, the local variable increment will now live as long as that delegate lives. The variable is available to the delegate just as any other variable would be, mutable as needed. We will see examples of this in the next chapter when using delegates with opApply() member functions.
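As a small sketch of the "mutable as needed" point, the delegate below modifies the local variable that it has captured. The names Counter and makeCounter are hypothetical and serve only this illustration:

```d
import std.stdio;

alias Counter = int delegate();

Counter makeCounter() {
    int count = 0;           // lives on as part of the delegate's context
    return () => ++count;    // modifies the captured variable
}

void main() {
    auto first = makeCounter();
    auto second = makeCounter();

    first();
    first();

    assert(first() == 3);    // third call on 'first'
    assert(second() == 1);   // 'second' has its own 'count'

    writeln("first and second count independently");
}
```

Each call to makeCounter() produces a separate context, so the two delegates do not share their count variables.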
The following is a test of the delegate above:
auto calculator = makeCalculator();
writeln("The result of the calculation: ", calculator(3));
Note that makeCalculator() returns an anonymous delegate. The code above assigns that delegate to the variable calculator and then calls it by calculator(3). Since the delegate is implemented to return the sum of its parameter and the variable increment, the code outputs the sum of 3 and 10:
The result of the calculation: 13
Shorter syntax
As we have already seen in the previous example, delegates can take advantage of the shorter syntaxes as well. When neither function nor delegate is specified, the type of the lambda is decided by the compiler, depending on whether the lambda accesses local state. If so, then it is a delegate.
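That decision can be observed at compile time. The sketch below assumes the isFunctionPointer and isDelegate templates of std.traits; the function name demo is hypothetical:

```d
import std.traits : isDelegate, isFunctionPointer;

// 'demo' is a hypothetical name used only for this sketch.
int demo() {
    int local = 10;

    auto usesNothing = (int a) => a * 2;      // no local state
    auto usesLocal   = (int a) => a + local;  // accesses 'local'

    // Without local state the lambda is a function pointer ...
    static assert(isFunctionPointer!(typeof(usesNothing)));

    // ... and with local state it is a delegate.
    static assert(isDelegate!(typeof(usesLocal)));

    return usesNothing(1) + usesLocal(1);     // 2 + 11
}

void main() {
    assert(demo() == 13);
}
```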
The following example has a delegate that does not take any parameters:
int[] delimitedNumbers(int count, int delegate() numberGenerator) {
    int[] result = [ -1 ];
    result.reserve(count + 2);

    foreach (i; 0 .. count) {
        result ~= numberGenerator();
    }

    result ~= -1;

    return result;
}
The function delimitedNumbers() generates a slice where the first and last elements are -1. It takes two parameters that specify the other elements that come between those first and last elements.
Let's call that function with a trivial delegate that always returns the same value. Remember that when there is no parameter, the parameter list of a lambda must be specified as empty:
writeln(delimitedNumbers(3, () => 42));
The output:
[-1, 42, 42, 42, -1]
Let's call delimitedNumbers() this time with a delegate that makes use of a local variable:
int lastNumber;
writeln(delimitedNumbers(
            15, () => lastNumber += uniform(0, 3)));

writeln("Last number: ", lastNumber);
Although that delegate produces a random value, since the value is added to the last one, none of the generated values is less than its predecessor:
[-1, 0, 2, 3, 4, 6, 6, 8, 9, 9, 9, 10, 12, 14, 15, 17, -1]
Last number: 17
An object and a member function as a delegate
We have seen that a delegate is nothing but a function pointer and the context that it is to be executed in. Instead of those two, a delegate can also be composed of a member function and an existing object that that member function is to be called on.
The syntax that defines such a delegate from an object is the following:
&object.member_function
Let's first observe that such a syntax indeed defines a delegate by printing its string representation:
import std.stdio;

struct Location {
    long x;
    long y;

    void moveHorizontally(long step) { x += step; }
    void moveVertically(long step)   { y += step; }
}

void main() {
    auto location = Location();
    writeln(typeof(&location.moveHorizontally).stringof);
}
According to the output, the type of moveHorizontally() called on location is indeed a delegate:
void delegate(long step)
Note that the & syntax is only for constructing the delegate. The delegate will be called later by the function call syntax:
// The definition of the delegate variable:
auto directionFunction = &location.moveHorizontally;

// Calling the delegate by the function call syntax:
directionFunction(3);

writeln(location);
Since the delegate combines the location object and the moveHorizontally() member function, calling the delegate is the equivalent of calling moveHorizontally() on location. The output indicates that the object has indeed moved 3 steps horizontally:
Location(3, 0)
Function pointers, lambdas, and delegates are expressions. They can be used in places where a value of their type is expected. For example, a slice of delegate objects is initialized below from delegates constructed from an object and its various member functions. The delegate elements of the slice are later called just like functions:
auto location = Location();

void delegate(long)[] movements =
    [ &location.moveHorizontally,
      &location.moveVertically,
      &location.moveHorizontally ];

foreach (movement; movements) {
    movement(1);
}

writeln(location);
According to the elements of the slice, the location has been changed twice horizontally and once vertically:
Location(2, 1)
Delegate properties
The function and context pointers of a delegate can be accessed through its .funcptr and .ptr properties, respectively:
struct MyStruct {
    void func() {
    }
}

void main() {
    auto o = MyStruct();

    auto d = &o.func;

    assert(d.funcptr == &MyStruct.func);
    assert(d.ptr == &o);
}
It is possible to make a delegate from scratch by setting those properties explicitly:
struct MyStruct {
    int i;

    void func() {
        import std.stdio;
        writeln(i);
    }
}

void main() {
    auto o = MyStruct(42);

    void delegate() d;
    assert(d is null);    // null to begin with

    d.funcptr = &MyStruct.func;
    d.ptr = &o;

    d();
}
Calling the delegate above as d() is the equivalent of the expression o.func() (i.e. calling MyStruct.func on o):
42
Lazy parameters are delegates
We saw the lazy keyword in the Function Parameters chapter:
void log(Level level, lazy string message) {
    if (level >= interestedLevel) {
        writeln(message);
    }
}

// ...

    if (failedToConnect) {
        log(Level.medium,
            format("Failure. The connection state is '%s'.",
                   getConnectionState()));
    }
Because message is a lazy parameter above, the entire format expression (including the getConnectionState() call that it makes) would be evaluated if and when that parameter is used inside log().
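The "if and when" behavior can be checked with a counter. In the following sketch, logIf, expensiveMessage, and evaluationCount are hypothetical names that are not part of the logging example above:

```d
import std.stdio;

int evaluationCount;    // how many times the argument was evaluated

string expensiveMessage() {
    ++evaluationCount;
    return "details";
}

void logIf(bool enabled, lazy string message) {
    if (enabled) {
        writeln(message);    // the argument is evaluated here
    }
}

void main() {
    logIf(false, expensiveMessage());
    assert(evaluationCount == 0);    // the argument was never evaluated

    logIf(true, expensiveMessage());
    assert(evaluationCount == 1);    // evaluated when 'message' was used
}
```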
Behind the scenes, lazy parameters are in fact delegates and the arguments that are passed to lazy parameters are delegate objects that are created automatically by the compiler. The code below is the equivalent of the one above:
void log(Level level, string delegate() lazyMessage) {  // (1)
    if (level >= interestedLevel) {
        writefln("%s", lazyMessage());                  // (2)
    }
}

// ...

    if (failedToConnect) {
        log(Level.medium,
            delegate string() {                         // (3)
                return format(
                    "Failure. The connection state is '%s'.",
                    getConnectionState());
            });
    }
1. The lazy parameter is not a string but a delegate that returns a string.
2. The delegate is called to get its return value.
3. The entire expression is wrapped inside a delegate and returned from it.
Lazy variadic functions
When a function needs a variable number of lazy parameters, it is not possible to mark that unknown number of parameters individually as lazy.
The solution is to use variadic delegate parameters. Such parameters can receive any number of expressions whose type is the same as the return type of those delegates. The delegates cannot take parameters:
import std.stdio;

void foo(double delegate()[] args...) {
    foreach (arg; args) {
        writeln(arg());     // Calling each delegate
    }
}

void main() {
    foo(1.5, () => 2.5);    // 'double' passed as delegate
}
Note how both a double expression and a lambda are matched to the variadic parameter. The double expression is automatically wrapped inside a delegate and the function prints the values of all its effectively-lazy parameters:
1.5
2.5
A limitation of this method is that all parameters must be the same type (double above). We will see later in the More Templates chapter how to take advantage of tuple template parameters to remove that limitation.
toString() with a delegate parameter
We have defined many toString() functions up to this point in the book to represent objects as strings. Those toString() definitions all returned a string without taking any parameters. As noted by the comment lines below, structs and classes took advantage of toString() functions of their respective members by simply passing those members to format():
import std.stdio;
import std.string;

struct Point {
    int x;
    int y;

    string toString() const {
        return format("(%s,%s)", x, y);
    }
}

struct Color {
    ubyte r;
    ubyte g;
    ubyte b;

    string toString() const {
        return format("RGB:%s,%s,%s", r, g, b);
    }
}

struct ColoredPoint {
    Color color;
    Point point;

    string toString() const {
        /* Taking advantage of Color.toString and
         * Point.toString: */
        return format("{%s;%s}", color, point);
    }
}

struct Polygon {
    ColoredPoint[] points;

    string toString() const {
        /* Taking advantage of ColoredPoint.toString: */
        return format("%s", points);
    }
}

void main() {
    auto polygon = Polygon(
        [ ColoredPoint(Color(10, 10, 10), Point(1, 1)),
          ColoredPoint(Color(20, 20, 20), Point(2, 2)),
          ColoredPoint(Color(30, 30, 30), Point(3, 3)) ]);

    writeln(polygon);
}
In order for polygon to be sent to the output as a string on the last line of the program, all of the toString() functions of Polygon, ColoredPoint, Color, and Point are called indirectly, creating a total of 10 strings in the process. Note that the strings that are constructed and returned by the lower-level functions are used only once by the respective higher-level function that called them.
However, although a total of 10 strings get constructed, only the very last one is printed to the output:
[{RGB:10,10,10;(1,1)}, {RGB:20,20,20;(2,2)}, {RGB:30,30,30;(3,3)}]
However practical, this method may degrade the performance of the program because of the many string objects that are constructed and promptly thrown away.
An overload of toString() avoids this performance issue by taking a delegate parameter:
void toString(void delegate(const(char)[]) sink) const;
As seen in its declaration, this overload of toString() does not return a string. Instead, the characters that are going to be printed are passed to its delegate parameter. It is the responsibility of the delegate to append those characters to the single string that is going to be printed to the output.
All the programmer needs to do differently is to call std.format.formattedWrite instead of std.string.format and pass the delegate parameter as its first parameter (in UFCS below). Also note that the following calls are providing the format strings as template arguments to take advantage of formattedWrite's compile-time format string checks.
import std.stdio;
import std.format;

struct Point {
    int x;
    int y;

    void toString(void delegate(const(char)[]) sink) const {
        sink.formattedWrite!"(%s,%s)"(x, y);
    }
}

struct Color {
    ubyte r;
    ubyte g;
    ubyte b;

    void toString(void delegate(const(char)[]) sink) const {
        sink.formattedWrite!"RGB:%s,%s,%s"(r, g, b);
    }
}

struct ColoredPoint {
    Color color;
    Point point;

    void toString(void delegate(const(char)[]) sink) const {
        sink.formattedWrite!"{%s;%s}"(color, point);
    }
}

struct Polygon {
    ColoredPoint[] points;

    void toString(void delegate(const(char)[]) sink) const {
        sink.formattedWrite!"%s"(points);
    }
}

void main() {
    auto polygon = Polygon(
        [ ColoredPoint(Color(10, 10, 10), Point(1, 1)),
          ColoredPoint(Color(20, 20, 20), Point(2, 2)),
          ColoredPoint(Color(30, 30, 30), Point(3, 3)) ]);

    writeln(polygon);
}
The advantage of this program is that, even though there are still a total of 10 calls made to various toString() functions, those calls collectively produce a single string, not 10.
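To see what the delegate parameter receives, such a toString() overload can also be called directly with a hand-written delegate. The following is only a sketch: the render() helper is hypothetical, and the Point struct repeats the one above so that the example is self-contained:

```d
import std.format : formattedWrite;

struct Point {
    int x;
    int y;

    void toString(void delegate(const(char)[]) sink) const {
        sink.formattedWrite!"(%s,%s)"(x, y);
    }
}

// Hypothetical helper: collects everything that is passed to the
// sink into a single buffer.
string render(const Point p) {
    char[] buffer;

    // This delegate uses the local variable 'buffer'; it may be
    // called several times, once for each piece of the output.
    p.toString((const(char)[] chunk) { buffer ~= chunk; });

    return buffer.idup;
}

void main() {
    assert(render(Point(1, 2)) == "(1,2)");
}
```

Here the pieces are appended to one buffer instead of each member producing its own intermediate string.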
Summary
- The function keyword is for defining function pointers to be called later just like a function.
- The delegate keyword is for defining delegates. A delegate is the pair of a function pointer and the context that that function pointer is to be executed in.
- A delegate can be created from an object and a member function by the syntax &object.member_function.
- A delegate can be constructed explicitly by setting its .funcptr and .ptr properties.
- Anonymous functions and anonymous delegates (lambdas) can be used in places of function pointers and delegates in expressions.
- There are several syntaxes for lambdas, the shortest of which is for when the equivalent consists only of a single return statement: parameter => expression.
- A more efficient overload of toString() takes a delegate.