More Functions
Functions have been covered in several earlier chapters of the book. This chapter covers more of their features.
Return type attributes
Functions can be marked as auto, ref, inout, and auto ref. These attributes are about the return types of functions.
auto functions
The return types of auto functions need not be specified:
auto add(int first, double second) {
    double result = first + second;
    return result;
}
The return type is deduced by the compiler from the return expression. Since the type of result is double, the return type of add() is double.
If there is more than one return statement, the return type of the function is their common type. (We have seen common types in the Ternary Operator ?: chapter.) For example, because the common type of int and double is double, the return type of the following auto function is double as well:
auto func(int i) {
    if (i < 0) {
        return i;      // returns 'int' here
    }

    return i * 1.5;    // returns 'double' here
}

void main() {
    // The return type of the function is 'double'
    auto result = func(42);
    static assert(is (typeof(result) == double));
}
ref functions
Normally, the expression that is returned from a function is copied to the caller's context. ref specifies that the expression should be returned by reference instead.
For example, the following function returns the greater of its two parameters:
int greater(int first, int second) {
    return (first > second) ? first : second;
}
Normally, both the parameters and the return value of that function are copied:
import std.stdio;

void main() {
    int a = 1;
    int b = 2;
    int result = greater(a, b);
    result += 10;                // ← neither a nor b changes
    writefln("a: %s, b: %s, result: %s", a, b, result);
}
Because the return value of greater() is copied to result, adding to result affects only that variable; neither a nor b changes:
a: 1, b: 2, result: 12
ref parameters are passed by reference instead of being copied. The same keyword has the same effect on return values:
ref int greater(ref int first, ref int second) {
    return (first > second) ? first : second;
}
This time, the returned reference is an alias to one of the arguments, and mutating the returned reference modifies either a or b:
int a = 1;
int b = 2;
greater(a, b) += 10;         // ← either a or b changes
writefln("a: %s, b: %s", a, b);
Note that the returned reference is incremented directly. As a result, the greater of the two arguments changes:
a: 1, b: 12
Local reference requires a pointer: An important point is that although the return type is marked as ref, a and b would still not change if the return value were assigned to a local variable:
int result = greater(a, b);
result += 10;                // ← only result changes
Although greater() returns a reference to a or b, that reference gets copied to the local variable result, and again neither a nor b changes:
a: 1, b: 2, result: 12
For result to be a reference to a or b, it has to be defined as a pointer:
int * result = &greater(a, b);
*result += 10;
writefln("a: %s, b: %s, result: %s", a, b, *result);
This time result would be a reference to either a or b and the mutation through it would affect the actual variable:
a: 1, b: 12, result: 12
It is not possible to return a reference to a local variable: The ref return value is an alias to one of the arguments, which start their lives even before the function is called. That means, regardless of whether a reference to a or b is returned, the returned reference refers to a variable that is still alive.
Conversely, it is not possible to return a reference to a variable that is not going to be alive upon leaving the function:
ref string parenthesized(string phrase) {
    string result = '(' ~ phrase ~ ')';
    return result;    // ← compilation ERROR

} // ← the lifetime of result ends here
The lifetime of the local variable result ends upon leaving the function. For that reason, it is not possible to return a reference to that variable:
Error: escaping reference to local variable result
auto ref functions
auto ref helps with functions like parenthesized() above. Similar to auto, the return type of an auto ref function is deduced by the compiler. Additionally, if the returned expression can be a reference, that variable is returned by reference as opposed to being copied.
parenthesized() can be compiled if the return type is auto ref:
auto ref string parenthesized(string phrase) {
    string result = '(' ~ phrase ~ ')';
    return result;    // ← compiles
}
Whether the function returns a copy or a reference is determined from its return statements: the function returns by reference only if every returned expression can be a reference; otherwise it returns a copy.
auto ref is more useful in function templates where template parameters may be references or copies depending on context.
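To make that last point concrete, here is a minimal sketch of an auto ref parameter in a function template. The name isPassedByRef is hypothetical and exists only for this illustration; __traits(isRef, ...) is the compile-time query that reports whether a parameter is a reference in a given instantiation.

```d
import std.stdio;

// Hypothetical helper: reports how its argument was passed
// in this particular instantiation of the template.
bool isPassedByRef(T)(auto ref T value) {
    return __traits(isRef, value);
}

void main() {
    int a = 1;
    writeln(isPassedByRef(a));     // lvalue argument: passed by reference
    writeln(isPassedByRef(42));    // rvalue argument: passed as a copy
}
```

The same template is compiled twice here: once taking its parameter by reference and once taking it by copy.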
inout functions
The inout keyword appears for parameter and return types of functions. It works like a template for const, immutable, and mutable.
Let's rewrite the previous function as taking string (i.e. immutable(char)[]) and returning string:
string parenthesized(string phrase) {
    return '(' ~ phrase ~ ')';
}

// ...

    writeln(parenthesized("hello"));
As expected, the code works with that string argument:
(hello)
However, as it works only with immutable strings, the function can be seen as being less useful than it could have been:
char[] m;    // has mutable elements
m ~= "hello";
writeln(parenthesized(m));    // ← compilation ERROR
Error: function deneme.parenthesized (string phrase) is not callable using argument types (char[])
The same limitation applies to const(char)[] strings as well.
One solution for this usability issue is to overload the function for const and mutable strings:
char[] parenthesized(char[] phrase) {
    return '(' ~ phrase ~ ')';
}

const(char)[] parenthesized(const(char)[] phrase) {
    return '(' ~ phrase ~ ')';
}
That design would be less than ideal due to the obvious code duplications. Another solution would be to define the function as a template:
T parenthesized(T)(T phrase) {
return '(' ~ phrase ~ ')';
}
Although that would work, this time it may be seen as being too flexible and potentially requiring template constraints.
inout is very similar to the template solution. The difference is that not the entire type but just the mutability attribute is deduced from the parameter:
inout(char)[] parenthesized(inout(char)[] phrase) {
    return '(' ~ phrase ~ ')';
}
inout transfers the deduced mutability attribute to the return type.
When the function is called with char[], it gets compiled as if inout is not specified at all. On the other hand, when called with immutable(char)[] or const(char)[], inout means immutable or const, respectively.
The following code demonstrates this by printing the type of the returned expression:
char[] m;
writeln(typeof(parenthesized(m)).stringof);

const(char)[] c;
writeln(typeof(parenthesized(c)).stringof);

immutable(char)[] i;
writeln(typeof(parenthesized(i)).stringof);
The output:
char[]
const(char)[]
string
Behavioral attributes
pure, nothrow, and @nogc are about function behaviors.
pure functions
As we have seen in the Functions chapter, functions can produce return values and side effects. When possible, return values should be preferred over side effects because functions that do not have side effects are easier to make sense of, which in turn helps with program correctness and maintainability.
A similar concept is the purity of a function. Purity is defined differently in D from most other programming languages: In D, a function that does not access mutable global or static state is pure. (Since input and output streams are considered mutable global state, pure functions cannot perform input or output operations either.)
In other words, a function is pure if it produces its return value and side effects only by accessing its parameters, local variables, and immutable global state.
An important aspect of purity in D is that pure functions can mutate their parameters.
Additionally, the following operations that mutate the global state of the program are explicitly allowed in pure functions:
- Allocate memory with the new expression
- Terminate the program
- Access the floating point processing flags
- Throw exceptions
The pure keyword specifies that a function should behave according to those conditions and the compiler guarantees that it does so.
Naturally, since impure functions do not provide the same guarantees, a pure function cannot call impure functions.
The following program demonstrates some of the operations that a pure function can and cannot perform:
import std.stdio;
import std.exception;

int mutableGlobal;
const int constGlobal;
immutable int immutableGlobal;

void impureFunction() {
}

int pureFunction(ref int i, int[] slice) pure {
    // Can throw exceptions:
    enforce(slice.length >= 1);

    // Can mutate its parameters:
    i = 42;
    slice[0] = 43;

    // Can access immutable global state:
    i = constGlobal;
    i = immutableGlobal;

    // Can use the new expression:
    auto p = new int;

    // Cannot access mutable global state:
    i = mutableGlobal;    // ← compilation ERROR

    // Cannot perform input and output operations:
    writeln(i);           // ← compilation ERROR

    static int mutableStatic;

    // Cannot access mutable static state:
    i = mutableStatic;    // ← compilation ERROR

    // Cannot call impure functions:
    impureFunction();     // ← compilation ERROR

    return 0;
}

void main() {
    int i;
    int[] slice = [ 1 ];
    pureFunction(i, slice);
}
Although they are allowed to, some pure functions do not mutate their parameters. Following from the rules of purity, the only observable effect of such a function would be its return value. Further, since the function cannot access any mutable global state, the return value would be the same for a given set of arguments, regardless of when and how many times the function is called during the execution of the program. This fact gives both the compiler and the programmer optimization opportunities. For example, instead of calling the function a second time for a given set of arguments, its return value from the first call can be cached and used instead of actually calling the function again.
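The caching idea above can be sketched with memoize from the std.functional module of Phobos, which stores return values keyed by the arguments. The names square and cachedSquare are made up for this example. Note that memoize accepts any function; it is the purity of square() that makes reusing a stored result indistinguishable from calling the function again.

```d
import std.functional : memoize;
import std.stdio;

// The result depends only on the argument and there are no side effects
int square(int x) pure {
    return x * x;
}

// Hypothetical name for the caching wrapper
alias cachedSquare = memoize!square;

void main() {
    writeln(cachedSquare(12));    // calls square() and stores the result
    writeln(cachedSquare(12));    // returns the stored result
}
```

Both calls print the same value; only the first one executes the body of square().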
Since the exact code that gets generated for a template instantiation depends on the actual template arguments, whether the generated code is pure depends on the arguments as well. For that reason, the purity of a template is inferred by the compiler from the generated code. (The pure keyword can still be specified by the programmer.) Similarly, the purity of an auto function is inferred.
As a simple example, since the following function template would be impure when N is zero, it would not be possible to call templ!0() from a pure function:
import std.stdio;

// This template is impure when N is zero
void templ(size_t N)() {
    static if (N == 0) {
        // Prints when N is zero:
        writeln("zero");
    }
}

void foo() pure {
    templ!0();    // ← compilation ERROR
}

void main() {
    foo();
}
The compiler infers that the 0 instantiation of the template is impure and rejects calling it from the pure function foo():
Error: pure function 'deneme.foo' cannot call impure function
'deneme.templ!0.templ'
However, since the instantiation of the template for values other than zero is pure, the program can be compiled for such values:
void foo() pure {
    templ!1();    // ← compiles
}
We have seen above that input and output functions like writeln() cannot be used in pure functions because they access global state. Sometimes such limitations are too restrictive, e.g. when needing to print a message temporarily during debugging. For that reason, the purity rules are relaxed for code that is marked as debug:
import std.stdio;

debug size_t fooCounter;

void foo(int i) pure {
    debug ++fooCounter;

    if (i == 0) {
        debug writeln("i is zero");
        i = 42;
    }

    // ...
}

void main() {
    foreach (i; 0..100) {
        if ((i % 10) == 0) {
            foo(i);
        }
    }

    debug writefln("foo is called %s times", fooCounter);
}
The pure function above mutates the global state of the program by modifying a global variable and printing a message. Despite those impure operations, it still can be compiled because those operations are marked as debug.
Note: Remember that those statements are included in the program only if the program is compiled with the -debug command line switch.
Member functions can be marked as pure as well. Subclasses can override impure functions as pure but the reverse is not allowed:
interface Iface {
    void foo() pure;    // Subclasses must define foo as pure.

    void bar();         // Subclasses may define bar as pure.
}

class Class : Iface {
    void foo() pure {   // Required to be pure
        // ...
    }

    void bar() pure {   // pure although not required
        // ...
    }
}
Delegates and anonymous functions can be pure as well. Similar to templates, whether a function literal, a delegate literal, or an auto function is pure is inferred by the compiler:
import std.stdio;

void foo(int delegate(double) pure dg) {
    int i = dg(1.5);
}

void main() {
    foo(a => 42);                // ← compiles

    foo((a) {                    // ← compilation ERROR
            writeln("hello");
            return 42;
        });
}
foo() above requires that its parameter be a pure delegate. The compiler infers that the lambda a => 42 is pure and allows it as an argument for foo(). However, since the other delegate is impure, it cannot be passed to foo():
Error: function deneme.foo (int delegate(double) pure dg) is not callable using argument types (void)
One benefit of pure functions is that their return values can be used to initialize immutable variables. Although the array produced by makeNumbers() below is mutable, it is not possible for its elements to be changed by any code outside of that function. For that reason, the initialization works.
int[] makeNumbers() pure {
    int[] result;
    result ~= 42;
    return result;
}

void main() {
    immutable array = makeNumbers();
}
nothrow functions
We saw the exception mechanism in the Exceptions chapter.
It would be good practice for functions to document the types of exceptions that they may throw under specific error conditions. However, as a general rule, callers should assume that any function can throw any exception.
Sometimes it is more important to know that a function does not emit any exception at all. For example, some algorithms can take advantage of the fact that certain of their steps cannot be interrupted by an exception.
nothrow guarantees that a function does not emit any exception:
int add(int lhs, int rhs) nothrow {
    // ...
}
Note: Remember that it is not recommended to catch Error or its base class Throwable. What is meant here by "any exception" is "any exception that is defined under the Exception hierarchy." A nothrow function can still emit exceptions that are under the Error hierarchy, which represents irrecoverable error conditions that should preclude the program from continuing its execution.
Such a function can neither throw an exception itself nor call a function that may throw an exception:
int add(int lhs, int rhs) nothrow {
    writeln("adding");    // ← compilation ERROR
    return lhs + rhs;
}
The compiler rejects the code because add() violates the no-throw guarantee:
Error: function 'deneme.add' is nothrow yet may throw
This is because writeln is not (and cannot be) a nothrow function.
The compiler can infer that a function can never emit an exception. The following implementation of add() is nothrow because it is obvious to the compiler that the try-catch block prevents any exception from escaping the function:
int add(int lhs, int rhs) nothrow {
    int result;

    try {
        writeln("adding");    // ← compiles
        result = lhs + rhs;

    } catch (Exception error) {   // catches all exceptions
        // ...
    }

    return result;
}
As mentioned above, nothrow does not include exceptions that are under the Error hierarchy. For example, although accessing an element of an array with [] can throw RangeError, the following function can still be defined as nothrow:
int foo(int[] arr, size_t i) nothrow {
    return 10 * arr[i];
}
As with purity, the compiler automatically deduces whether a template, delegate, or anonymous function is nothrow.
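As a small sketch of that inference, the template below carries no attributes of its own, yet it can be called from a function that is explicitly marked with several of them. The names twice and caller are hypothetical.

```d
// A function template with no attributes written on it
T twice(T)(T value) {
    return value + value;
}

// Compiles only because the compiler infers the attributes
// of the twice!int instantiation from its generated code
int caller(int i) pure nothrow @nogc @safe {
    return twice(i);
}

void main() {
    assert(caller(21) == 42);
}
```

If twice() were a regular function instead of a template, caller() would be rejected unless the same attributes were written on twice() explicitly.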
@nogc functions
D is a garbage collected language. Many data structures and algorithms in most D programs take advantage of dynamic memory blocks that are managed by the garbage collector (GC). The GC reclaims such memory blocks through a process called garbage collection.
Some commonly used D operations take advantage of the GC as well. For example, elements of dynamic arrays live on dynamic memory blocks:
// A function that takes advantage of the GC indirectly
int[] append(int[] slice) {
    slice ~= 42;
    return slice;
}
If the slice does not have sufficient capacity, the ~= operator above allocates a new memory block from the GC.
Although the GC is a significant convenience for data structures and algorithms, memory allocation and garbage collection are costly operations that make the execution of some programs noticeably slow.
@nogc means that a function cannot use the GC directly or indirectly:
void foo() @nogc {
    // ...
}
The compiler guarantees that a @nogc function does not involve GC operations. For example, the following function cannot call append() above, which does not provide the @nogc guarantee:
void foo() @nogc {
    int[] slice;
    // ...
    append(slice);    // ← compilation ERROR
}
Error: @nogc function 'deneme.foo' cannot call non-@nogc function
'deneme.append'
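One way of staying within the @nogc guarantee is to work on storage that the caller provides instead of growing a slice. The following is only a sketch under that assumption; fill() is a hypothetical replacement for append() that overwrites existing elements rather than appending.

```d
// Hypothetical alternative to append(): writes into
// storage provided by the caller instead of allocating
int[] fill(int[] buffer, int value) @nogc {
    buffer[] = value;    // element-wise assignment; no allocation
    return buffer;
}

void foo() @nogc {
    int[4] storage;                       // fixed-length array; not GC memory
    int[] slice = fill(storage[], 42);    // ← compiles
    assert(slice.length == 4);
    assert(slice[3] == 42);
}

void main() {
    foo();
}
```

The trade-off is that the capacity must be decided by the caller up front.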
Code safety attributes
@safe, @trusted, and @system are about the code safety that a function provides. As with purity, the compiler infers the safety level of templates, delegates, anonymous functions, and auto functions.
@safe functions
A class of programming errors involves corrupting data at unrelated locations in memory by writing at those locations unintentionally. Such errors are mostly due to mistakes made in using pointers and applying type casts.
@safe functions guarantee that they do not contain any operation that may corrupt memory. The compiler does not allow the following operations in @safe functions:
- Pointers cannot be converted to pointer types other than void*.
- A non-pointer expression cannot be converted to a pointer value.
- Pointer values cannot be changed (no pointer arithmetic; however, assigning a pointer to another pointer of the same type is safe).
- Unions that have pointer or reference members cannot be used.
- Functions marked as @system cannot be called.
- Exceptions that are not descended from Exception cannot be caught.
- Inline assembler cannot be used.
- Mutable variables cannot be cast to immutable.
- immutable variables cannot be cast to mutable.
- Thread-local variables cannot be cast to shared.
- shared variables cannot be cast to thread-local.
- Addresses of function-local variables cannot be taken.
- __gshared variables cannot be accessed.
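A few of the restrictions above can be checked at compile time. The following sketch uses __traits(compiles, ...) on @safe function literals; it covers only the two pointer-related rules and the exception noted for pointer assignment, and it is an illustration rather than an exhaustive test.

```d
// Converting a non-pointer value to a pointer is rejected:
static assert(!__traits(compiles, () @safe { auto p = cast(int*)42; }));

// Pointer arithmetic is rejected:
static assert(!__traits(compiles, (int* p) @safe { ++p; }));

// Assigning a pointer to another pointer of the same type is accepted:
static assert( __traits(compiles, (int* p, int* q) @safe { p = q; }));

void main() {
}
```

The program does nothing at run time; successfully compiling it is the demonstration.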
@trusted functions
Some functions may actually be safe but cannot be marked as @safe for various reasons. For example, a function may have to call a library written in C, a language that has no support for such safety guarantees.
Some other functions may actually perform operations that are not allowed in @safe code, but may be well tested and trusted to be correct.
@trusted is an attribute that communicates to the compiler that although the function cannot be marked as @safe, it should be considered safe. The compiler trusts the programmer and treats @trusted code as if it is safe. For example, it allows @safe code to call @trusted code.
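As a sketch of the C library case, the following wraps memcpy() from the core.stdc.string module. The wrapper copyInts() is hypothetical; marking it @trusted is the programmer's promise that the length check makes the call memory-safe, which the compiler does not verify.

```d
import core.stdc.string : memcpy;

// Hypothetical wrapper around a C function. The compiler does
// not check this body for safety; the length check is what
// justifies the @trusted attribute.
void copyInts(int[] destination, const(int)[] source) @trusted {
    const count = (source.length < destination.length)
                  ? source.length
                  : destination.length;
    memcpy(destination.ptr, source.ptr, count * int.sizeof);
}

void main() @safe {
    int[] source = [ 1, 2, 3 ];
    int[] destination = new int[3];
    copyInts(destination, source);    // ← @safe code can call @trusted code
    assert(destination == [ 1, 2, 3 ]);
}
```

Keeping @trusted functions this small makes it easier to review the promise that they make.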
@system functions
Any function that is not marked as @safe or @trusted is considered @system, which is the default safety attribute.
Compile time function execution (CTFE)
In many programming languages, computations that are performed at compile time are very limited. Such computations are usually as simple as calculating the length of a fixed-length array or simple arithmetic operations:
writeln(1 + 2);
The 1 + 2 expression above is compiled as if it has been written as 3; there is no computation at run time.
D has CTFE, which allows any function to be executed at compile time as long as it is possible to do so.
Let's consider the following program that prints a menu to the output:
import std.stdio;
import std.string;
import std.range;

string menuLines(string[] choices) {
    string result;

    foreach (i, choice; choices) {
        result ~= format(" %s. %s\n", i + 1, choice);
    }

    return result;
}

string menu(string title, string[] choices, size_t width) {
    return format("%s\n%s\n%s",
                  title.center(width),
                  '='.repeat(width),    // horizontal line
                  menuLines(choices));
}

void main() {
    enum drinks =
        menu("Drinks", [ "Coffee", "Tea", "Hot chocolate" ], 20);

    writeln(drinks);
}
Although the same result can be achieved in different ways, the program above performs non-trivial operations to produce the following string:
       Drinks
====================
 1. Coffee
 2. Tea
 3. Hot chocolate
Remember that the initial value of enum constants like drinks must be known at compile time. That fact is sufficient for menu() to be executed at compile time. The value that it returns at compile time is used as the initial value of drinks. As a result, the program is compiled as if that value were written explicitly in the program:
// The equivalent of the code above:
enum drinks =
    "       Drinks       \n" ~
    "====================\n" ~
    " 1. Coffee\n" ~
    " 2. Tea\n" ~
    " 3. Hot chocolate\n";
For a function to be executed at compile time, it must appear in an expression that in fact is needed at compile time:
- Initializing a static variable
- Initializing an enum variable
- Calculating the length of a fixed-length array
- Calculating a template value argument
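The four contexts listed above can be sketched in one small program. The function square() and the struct template Buffer are hypothetical names used only for this illustration; in each of the marked lines the value is needed at compile time, so square() is executed by the compiler.

```d
size_t square(size_t x) {
    return x * x;
}

// Hypothetical template that takes a value parameter
struct Buffer(size_t N) {
    ubyte[N] data;
}

void main() {
    static size_t s = square(3);    // initializing a static variable
    enum e = square(4);             // initializing an enum
    int[square(2)] array;           // length of a fixed-length array
    Buffer!(square(5)) buffer;      // template value argument

    static assert(e == 16);
    static assert(array.length == 4);
    static assert(buffer.data.length == 25);
    assert(s == 9);
}
```

The static assert lines pass only because those values are already known when the program is being compiled.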
Clearly, it would not be possible to execute every function at compile time. For example, a function that accesses a global variable cannot be executed at compile time because the global variable does not start its life until run time. Similarly, since stdout is available only at run time, functions that print cannot be executed at compile time.
The __ctfe variable
It is a powerful aspect of CTFE that the same function is used for both compile time and run time depending on when its result is needed. Although the function need not be written in any special way for CTFE, some operations in the function may make sense only at compile time or only at run time. The special variable __ctfe can be used to differentiate the code that is only for compile time from the code that is only for run time. The value of this variable is true when the function is being executed for CTFE, false otherwise:
import std.stdio;

size_t counter;

int foo() {
    if (!__ctfe) {
        // This code is for execution at run time
        ++counter;
    }

    return 42;
}

void main() {
    enum i = foo();
    auto j = foo();
    writefln("foo is called %s times.", counter);
}
As counter lives only at run time, it cannot be incremented at compile time. For that reason, the code above attempts to increment it only for run-time execution. Since the value of i is determined at compile time and the value of j is determined at run time, foo() is reported to have been called just once during the execution of the program:
foo is called 1 times.
Summary
- The return type of an auto function is deduced automatically.
- The return value of a ref function is a reference to an existing variable.
- The return value of an auto ref function is a reference if possible, a copy otherwise.
- inout carries the const, immutable, or mutable attribute of the parameter to the return type.
- A pure function cannot access mutable global or static state. The compiler infers the purity of templates, delegates, anonymous functions, and auto functions.
- nothrow functions cannot emit exceptions. The compiler infers whether a template, delegate, anonymous function, or auto function is no-throw.
- @nogc functions cannot involve GC operations.
- @safe functions cannot corrupt memory. The compiler infers the safety attributes of templates, delegates, anonymous functions, and auto functions.
- @trusted functions are indeed safe but cannot be specified as such; they are considered @safe both by the programmer and the compiler.
- @system functions can use every D feature. @system is the default safety attribute.
- Functions can be executed at compile time as well (CTFE). This can be differentiated by the value of the special variable __ctfe.