Programming in D - Function Parameters

Function Parameters

This chapter covers various kinds of function parameters.

Some of the concepts of this chapter have already appeared earlier in the book. For example, the ref keyword that we saw in the foreach Loop chapter made the actual elements available in foreach loops, as opposed to copies of those elements.

Additionally, we covered the const and immutable keywords and the differences between value types and reference types in previous chapters.

We have written functions that produced results by making use of their parameters. For example, the following function uses its parameters in a calculation:

double weightedAverage(double quizGrade, double finalGrade) {
    return quizGrade * 0.4 + finalGrade * 0.6;
}

That function calculates the average grade by taking 40% of the quiz grade and 60% of the final grade. Here is how it may be used:

    int quizGrade = 76;
    int finalGrade = 80;

    writefln("Weighted average: %2.0f",
             weightedAverage(quizGrade, finalGrade));

Parameters are always copied

In the code above, the two variables are passed as arguments to weightedAverage(). The function uses its parameters. This fact may give the false impression that the function uses the actual variables that have been passed as arguments. In reality, what the function uses are copies of those variables.

This distinction is important because modifying a parameter changes only the copy. This can be seen in the following function, which tries to modify its parameter (i.e. to produce a side effect). Let's assume that the function is written to reduce the energy of a game character:

void reduceEnergy(double energy) {
    energy /= 4;
}

Here is a program that tests reduceEnergy():

import std.stdio;

void reduceEnergy(double energy) {
    energy /= 4;
}

void main() {
    double energy = 100;

    reduceEnergy(energy);
    writeln("New energy: ", energy);
}

The output:

New energy: 100     ← Not changed

Although reduceEnergy() drops the value of its parameter to a quarter of its original value, the variable energy in main() does not change. The reason for this is that the energy variable in main() and the energy parameter of reduceEnergy() are separate; the parameter is a copy of the variable in main().

To observe this more closely, let's insert some writeln() expressions:

import std.stdio;

void reduceEnergy(double energy) {
    writeln("Entered the function      : ", energy);
    energy /= 4;
    writeln("Leaving the function      : ", energy);
}

void main() {
    double energy = 100;

    writeln("Calling the function      : ", energy);
    reduceEnergy(energy);
    writeln("Returned from the function: ", energy);
}

The output:

Calling the function      : 100
Entered the function      : 100
Leaving the function      : 25   ← the parameter changes,
Returned from the function: 100  ← the variable remains the same

Referenced variables are not copied

Even parameters of reference types like slices, associative arrays, and class variables are copied to functions. However, the original variables that are referenced (i.e. elements of slices and associative arrays, and class objects) are not copied. Effectively, such variables are passed to functions as references: the parameter becomes another reference to the original object. It means that a modification made through the reference modifies the original object as well.

Because strings are slices of characters, this applies to strings as well:

import std.stdio;

void makeFirstLetterDot(dchar[] str) {
    str[0] = '.';
}

void main() {
    dchar[] str = "abc"d.dup;
    makeFirstLetterDot(str);
    writeln(str);
}

The change made to the first element of the parameter affects the actual element in main():

.bc

However, the original slice and associative array variables are still passed by copy. This may have surprising and seemingly unpredictable results unless the parameters are qualified as ref themselves.

Surprising reference semantics of slices

As we saw in the Slices and Other Array Features chapter, adding elements to a slice may terminate element sharing. Obviously, once sharing ends, a slice parameter like str above would not be a reference to the elements of the passed-in original variable anymore.

For example, the element that is appended by the following function will not be seen by the caller:

import std.stdio;

void appendZero(int[] arr) {
    arr ~= 0;
    writefln("Inside appendZero()       : %s", arr);
}

void main() {
    auto arr = [ 1, 2 ];
    appendZero(arr);
    writefln("After appendZero() returns: %s", arr);
}

The element is appended only to the function parameter, not to the original slice:

Inside appendZero()       : [1, 2, 0]
After appendZero() returns: [1, 2]    ← No 0

If the new elements need to be appended to the original slice, then the slice must be passed as ref:

void appendZero(ref int[] arr) {
    // ...
}

The ref qualifier will be explained below.
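As a minimal sketch of that fix, here is the same program with the parameter marked as ref. Only the function signature differs from the earlier listing; the printed value in the comment is what we would expect because the function now appends to the caller's own slice:

```d
import std.stdio;

// The slice itself is passed by reference, so appending
// affects the variable of the caller.
void appendZero(ref int[] arr) {
    arr ~= 0;
}

void main() {
    auto arr = [ 1, 2 ];
    appendZero(arr);
    writeln(arr);    // [1, 2, 0]
}
```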

Surprising reference semantics of associative arrays

Associative arrays that are passed as function parameters may cause surprises as well because associative arrays start their lives as null, not empty.

In this context, null means an uninitialized associative array. Associative arrays are initialized automatically when their first key-value pair is added. As a consequence, if a function adds an element to a null associative array, then that element cannot be seen in the original variable because although the parameter is initialized, the original variable remains null:

import std.stdio;

void appendElement(int[string] aa) {
    aa["red"] = 100;
    writefln("Inside appendElement()       : %s", aa);
}

void main() {
    int[string] aa;    // ← null to begin with
    appendElement(aa);
    writefln("After appendElement() returns: %s", aa);
}

The original variable does not have the added element:

Inside appendElement()       : ["red":100]
After appendElement() returns: []    ← Still null

On the other hand, if the associative array were not null to begin with, then the added element would be seen by the caller as well:

    int[string] aa;
    aa["blue"] = 10;  // ← Not null before the call
    appendElement(aa);

This time the added element is seen by the caller:

Inside appendElement()       : ["red":100, "blue":10]
After appendElement() returns: ["red":100, "blue":10]

For that reason, it may be better to pass the associative array as a ref parameter, which will be explained below.
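The following sketch shows the ref version of the same function. With ref, the parameter is the caller's variable itself, so the element is visible to the caller even when the associative array is null before the call:

```d
import std.stdio;

// The associative array variable itself is passed by reference.
void appendElement(ref int[string] aa) {
    aa["red"] = 100;
}

void main() {
    int[string] aa;    // null to begin with
    appendElement(aa);
    writeln(aa.length);    // 1
}
```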

Parameter qualifiers

Parameters are passed to functions according to the general rules described above:

Value types are copied, after which the original variable and the copy are independent.
Reference types are copied as well but by nature of reference types, both the original reference and the parameter provide access to the same variable.
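
The two default rules can be observed side by side in a small program. The modify() function below is a hypothetical example, not one from the earlier chapters; it takes one value type parameter and one slice parameter:

```d
import std.stdio;

void modify(int number, int[] numbers) {
    number = 42;        // changes only the local copy
    numbers[0] = 42;    // changes the element shared with the caller
}

void main() {
    int number = 1;
    int[] numbers = [ 1, 2 ];

    modify(number, numbers);

    writeln(number);     // 1
    writeln(numbers);    // [42, 2]
}
```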

Those are the default rules that are applied when parameter definitions have no qualifiers. The following qualifiers change the way parameters are passed and what operations are allowed on them.

`in`

By default, in parameters are the same as const parameters. They cannot be modified:

void foo(in int value) {
    value = 1;    // ← compilation ERROR
}

However, when the -preview=in compiler command line switch is used, in parameters become more useful for expressing the intent of "this parameter is used only as input by this function":

$ dmd -preview=in deneme.d

-preview=in changes the meaning of in parameters and allows the compiler to choose more appropriate methods when passing arguments for in parameters:

The meaning of in changes to mean const scope (see below for scope)
Unlike ref parameters, even rvalues can be passed as in parameters (see below for ref and the next chapter for rvalues)
Types that would cause a side effect when copied (e.g. when the copy constructor is defined) or types that cannot be copied (e.g. copy constructor is disabled) are passed by reference

I recommend in parameters over const parameters regardless of whether the -preview=in command line switch is used or not.
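
As a small illustration, the hypothetical total() function below only reads its parameter, so in documents that intent. This sketch is expected to compile both with and without -preview=in:

```d
import std.stdio;

// 'in' states that the parameter is used only as input.
int total(in int[] values) {
    int sum = 0;

    foreach (value; values) {
        sum += value;
    }

    return sum;
}

void main() {
    int[] numbers = [ 1, 2, 3 ];

    writeln(total(numbers));       // an lvalue argument: 6
    writeln(total([ 10, 20 ]));    // an rvalue argument: 30
}
```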

`out`

We know that functions return what they produce as their return values. The fact that there is only one return value is sometimes limiting as some functions may need to produce more than one result. (Note: It is possible to return more than one result by defining the return type as a Tuple or a struct. We will see these features in later chapters.)

The out keyword makes it possible for functions to return results through their parameters. When out parameters are modified within the function, those modifications affect the original variable that has been passed to the function. In a sense, the assigned value goes out of the function through the out parameter.

Let's have a look at a function that divides two numbers and produces both the quotient and the remainder. The return value is used for the quotient and the remainder is returned through the out parameter:

import std.stdio;

int divide(int dividend, int divisor, out int remainder) {
    remainder = dividend % divisor;
    return dividend / divisor;
}

void main() {
    int remainder;
    int result = divide(7, 3, remainder);

    writeln("result: ", result, ", remainder: ", remainder);
}

Modifying the remainder parameter of the function modifies the remainder variable in main() (their names need not be the same):

result: 2, remainder: 1

Regardless of their values at the call site, out parameters are first set to the .init value of their types automatically:

import std.stdio;

void foo(out int parameter) {
    writeln("After entering the function      : ", parameter);
}

void main() {
    int variable = 100;

    writeln("Before calling the function      : ", variable);
    foo(variable);
    writeln("After returning from the function: ", variable);
}

Even though there is no explicit assignment to the parameter in the function, the value of the parameter automatically becomes the initial value of int, affecting the variable in main():

Before calling the function      : 100
After entering the function      : 0  ← the value of int.init
After returning from the function: 0

As this demonstrates, out parameters cannot pass values into functions; they are strictly for passing values out of functions.

We will see in later chapters that returning a Tuple or a struct is a better alternative to out parameters.
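
As a brief preview of the struct alternative, the division example might be written as follows. This is only a sketch; DivisionResult is a hypothetical name chosen for this illustration, and structs are covered properly in a later chapter:

```d
import std.stdio;

// Hypothetical type that groups the two results together
struct DivisionResult {
    int quotient;
    int remainder;
}

DivisionResult divide(int dividend, int divisor) {
    return DivisionResult(dividend / divisor, dividend % divisor);
}

void main() {
    auto result = divide(7, 3);

    writeln("result: ", result.quotient,
            ", remainder: ", result.remainder);
    // result: 2, remainder: 1
}
```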

`const`

I recommend in parameters over const parameters.

As we saw earlier, const guarantees that the parameter will not be modified inside the function. It is helpful for the programmers to know that certain variables will not be changed by a function. const also makes functions more useful by allowing const, immutable, and mutable variables to be passed through that parameter:

import std.stdio;

dchar lastLetter(const dchar[] str) {
    return str[$ - 1];
}

void main() {
    writeln(lastLetter("constant"));
}

`immutable`

As we saw earlier, an immutable parameter requires that the argument be immutable. Because of that requirement, the following function can only be called with strings that have immutable elements (e.g. string literals):

import std.stdio;

dchar[] mix(immutable dchar[] first,
            immutable dchar[] second) {
    dchar[] result;
    int i;

    for (i = 0; (i < first.length) && (i < second.length); ++i) {
        result ~= first[i];
        result ~= second[i];
    }

    result ~= first[i..$];
    result ~= second[i..$];

    return result;
}

void main() {
    writeln(mix("HELLO", "world"));
}

Since it forces a requirement on the parameter, immutable parameters should be used only when immutability is required. Otherwise, in general const is more useful because it accepts immutable, const, and mutable variables.
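
To see that flexibility in practice, the hypothetical countSpaces() function below takes a const parameter and is called with mutable, const, and immutable strings. The counts in the comments follow from the literals used in this sketch:

```d
import std.stdio;

// A const parameter accepts all three kinds of arguments.
size_t countSpaces(const char[] text) {
    size_t count = 0;

    foreach (c; text) {
        if (c == ' ') {
            ++count;
        }
    }

    return count;
}

void main() {
    char[] mutableText = "a b c".dup;
    const char[] constText = "d e";
    string immutableText = "f g h i";

    writeln(countSpaces(mutableText));      // 2
    writeln(countSpaces(constText));        // 1
    writeln(countSpaces(immutableText));    // 3
}
```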

`ref`

This keyword allows passing a variable by reference even though it would normally be passed as a copy (i.e. by value).

Rvalues (see the next chapter) cannot be passed to functions as ref parameters.

For the reduceEnergy() function that we saw earlier to modify the original variable, it must take its parameter as ref:

import std.stdio;

void reduceEnergy(ref double energy) {
    energy /= 4;
}

void main() {
    double energy = 100;

    reduceEnergy(energy);
    writeln("New energy: ", energy);
}

This time, the modification that is made to the parameter changes the original variable in main():

New energy: 25

As can be seen, ref parameters can be used both as input and output. ref parameters can also be thought of as aliases of the original variables. The function parameter energy above is an alias of the variable energy in main().

Similar to out parameters, ref parameters allow functions to have side effects as well. In fact, reduceEnergy() does not return a value; it only causes a side effect through its single parameter.

The programming style called functional programming favors return values over side effects, so much so that some functional programming languages do not allow side effects at all. This is because functions that produce results purely through their return values are easier to understand, implement, and maintain.

The same function can be written in a functional programming style by returning the result instead of causing a side effect. The function now returns a value, and main() assigns that value back to its variable:

import std.stdio;

double reducedEnergy(double energy) {
    return energy / 4;
}

void main() {
    double energy = 100;

    energy = reducedEnergy(energy);
    writeln("New energy: ", energy);
}

Note the change in the name of the function as well. Now it is a noun as opposed to a verb.

`auto ref`

This qualifier can only be used with templates. As we will see in the next chapter, an auto ref parameter takes lvalues by reference and rvalues by copy.
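
Until then, the following sketch gives a first impression. The hypothetical passedByReference() template uses __traits(isRef), which reports whether the parameter ended up as a reference in a particular instantiation:

```d
import std.stdio;

// Returns true if the argument was taken by reference.
bool passedByReference(T)(auto ref T value) {
    return __traits(isRef, value);
}

void main() {
    int number = 1;

    writeln(passedByReference(number));    // lvalue: true
    writeln(passedByReference(2));         // rvalue: false
}
```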

`inout`

Despite its name consisting of in and out, this keyword does not mean input and output; we have already seen that input and output is achieved by the ref keyword.

inout carries the mutability of the parameter to the return type. If the parameter is const, immutable, or mutable, then the return value is also const, immutable, or mutable, respectively.

To see how inout helps in programs, let's look at a function that returns a slice to the inner elements of its parameter:

import std.stdio;

int[] inner(int[] slice) {
    if (slice.length) {
        --slice.length;               // trim from the end

        if (slice.length) {
            slice = slice[1 .. $];    // trim from the beginning
        }
    }

    return slice;
}

void main() {
    int[] numbers = [ 5, 6, 7, 8, 9 ];
    writeln(inner(numbers));
}

The output:

[6, 7, 8]

According to what we have established so far in the book, in order for the function to be more useful, its parameter should be const(int)[] because the elements are not being modified inside the function. (Note that there is no harm in modifying the parameter slice itself, as it is a copy of the original variable.)

However, defining the function that way would cause a compilation error:

int[] inner(const(int)[] slice) {
    // ...
    return slice;    // ← compilation ERROR
}

The compilation error indicates that a slice of const(int) cannot be returned as a slice of mutable int:

Error: cannot implicitly convert expression (slice) of type
const(int)[] to int[]

One may think that specifying the return type as const(int)[] would be the solution:

const(int)[] inner(const(int)[] slice) {
    // ...
    return slice;    // now compiles
}

Although the code now compiles, it brings a limitation: even when the function is called with a slice of mutable elements, this time the returned slice ends up consisting of const elements. To see how limiting this would be, let's look at the following code, which tries to modify the inner elements of a slice:

    int[] numbers = [ 5, 6, 7, 8, 9 ];
    int[] middle = inner(numbers);    // ← compilation ERROR
    middle[] *= 10;

The returned slice of type const(int)[] cannot be assigned to a slice of type int[], resulting in an error:

Error: cannot implicitly convert expression (inner(numbers))
of type const(int)[] to int[]

However, since we started with a slice of mutable elements, this limitation is artificial and unfortunate. inout solves this mutability problem between parameters and return values. It is specified on both the parameter and the return type and carries the mutability of the former to the latter:

inout(int)[] inner(inout(int)[] slice) {
    // ...
    return slice;
}

With that change, the same function can now be called with const, immutable, and mutable slices:

    {
        int[] numbers = [ 5, 6, 7, 8, 9 ];
        // The return type is a slice of mutable elements
        int[] middle = inner(numbers);
        middle[] *= 10;
        writeln(middle);
    }
    {
        immutable int[] numbers = [ 10, 11, 12 ];
        // The return type is a slice of immutable elements
        immutable int[] middle = inner(numbers);
        writeln(middle);
    }
    {
        const int[] numbers = [ 13, 14, 15, 16 ];
        // The return type is a slice of const elements
        const int[] middle = inner(numbers);
        writeln(middle);
    }
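
For reference, here is a complete program that can be compiled as is. It is a sketch that trims the slice by slicing at both ends instead of modifying .length; that choice is an assumption of this example, made only to keep the body short:

```d
import std.stdio;

inout(int)[] inner(inout(int)[] slice) {
    if (slice.length) {
        slice = slice[0 .. $ - 1];    // trim from the end

        if (slice.length) {
            slice = slice[1 .. $];    // trim from the beginning
        }
    }

    return slice;
}

void main() {
    int[] numbers = [ 5, 6, 7, 8, 9 ];
    int[] middle = inner(numbers);    // mutable elements
    middle[] *= 10;
    writeln(numbers);                 // [5, 60, 70, 80, 9]

    immutable int[] fixed = [ 10, 11, 12 ];
    immutable int[] fixedMiddle = inner(fixed);
    writeln(fixedMiddle);             // [11]
}
```

Because the returned slice shares elements with the argument, multiplying the elements of middle is visible through numbers as well.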

`lazy`

It is natural to expect that arguments are evaluated before entering functions that use those arguments. For example, the function add() below is called with the return values of two other functions:

    result = add(anAmount(), anotherAmount());

In order for add() to be called, first anAmount() and anotherAmount() must be called. Otherwise, the values that add() needs would not be available.

Evaluating arguments before calling a function is called eager evaluation.

However, depending on certain conditions, some parameters may not get a chance to be used in the function at all. In such cases, evaluating the arguments eagerly would be wasteful.

A classic example of this situation is a logging function that outputs a message only if the importance of the message is above a certain configuration setting:

enum Level { low, medium, high }

void log(Level level, string message) {
    if (level >= interestedLevel) {
        writeln(message);
    }
}

For example, if the user is interested only in the messages that are Level.high, a message with Level.medium would not be printed. However, the argument would still be evaluated before calling the function. For example, the entire format() expression below including the getConnectionState() call that it makes would be wasted if the message is never printed:

    if (failedToConnect) {
        log(Level.medium,
            format("Failure. The connection state is '%s'.",
                   getConnectionState()));
    }

The lazy keyword specifies that an expression that is passed as a parameter will be evaluated only if and when needed:

void log(Level level, lazy string message) {
   // ... the body of the function is the same as before ...
}

This time, the expression would be evaluated only if the message parameter is used.
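
The effect can be observed with a counter. In the sketch below, interestedLevel, evaluationCount, and expensiveMessage() are hypothetical names introduced only for this demonstration:

```d
import std.stdio;

enum Level { low, medium, high }

Level interestedLevel = Level.high;    // assumed configuration
int evaluationCount = 0;

// Stands in for an expensive expression like the format() call
string expensiveMessage() {
    ++evaluationCount;
    return "Failure.";
}

void log(Level level, lazy string message) {
    if (level >= interestedLevel) {
        writeln(message);
    }
}

void main() {
    log(Level.medium, expensiveMessage());    // not evaluated
    log(Level.high, expensiveMessage());      // evaluated once
    writeln(evaluationCount);                 // 1
}
```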

One thing to be careful about is that a lazy parameter is evaluated every time that parameter is used in the function.

For example, because the lazy parameter of the following function is used three times in the function, the expression that provides its value is evaluated three times:

import std.stdio;

int valueOfArgument() {
    writeln("Calculating...");
    return 1;
}

void functionWithLazyParameter(lazy int value) {
    int result = value + value + value;
    writeln(result);
}

void main() {
    functionWithLazyParameter(valueOfArgument());
}

The output:

Calculating...
Calculating...
Calculating...
3
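
When repeated evaluation is not desired, one common approach is to copy the lazy parameter to a local variable and use that variable afterwards. The following is a sketch of that idea; tripled() and callCount are hypothetical names:

```d
import std.stdio;

int callCount = 0;

int valueOfArgument() {
    ++callCount;
    return 1;
}

int tripled(lazy int value) {
    const evaluated = value;    // the expression is evaluated here, once
    return evaluated + evaluated + evaluated;
}

void main() {
    writeln(tripled(valueOfArgument()));    // 3
    writeln(callCount);                     // 1
}
```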

`scope`

This keyword specifies that a parameter will not be used beyond the scope of the function. As of this writing, scope is effective only if the function is defined as @safe and the -dip1000 compiler switch is used. DIP is short for D Improvement Proposal. DIP 1000 is experimental as of this writing, so it may not work as expected in all cases.

$ dmd -dip1000 deneme.d

int[] globalSlice;

@safe int[] foo(scope int[] parameter) {
    globalSlice = parameter;    // ← compilation ERROR
    return parameter;           // ← compilation ERROR
}

void main() {
    int[] slice = [ 10, 20 ];
    int[] result = foo(slice);
}

The function above violates the promise of scope in two places: It assigns the parameter to a global variable, and it returns it. Both those actions would make it possible for the parameter to be accessed after the function finishes.
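
For contrast, a function that keeps the promise of scope might look like the following sketch. The hypothetical total() function only reads the elements during the call and neither stores nor returns the parameter:

```d
import std.stdio;

@safe int total(scope const int[] values) {
    int sum = 0;

    foreach (value; values) {
        sum += value;    // the parameter is used only during the call
    }

    return sum;          // returns a copy of an int, not the slice
}

void main() {
    int[] slice = [ 10, 20 ];
    writeln(total(slice));    // 30
}
```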

`shared`

This keyword requires that the parameter is shareable between threads of execution:

void foo(shared int[] i) {
    // ...
}

void main() {
    int[] numbers = [ 10, 20 ];
    foo(numbers);    // ← compilation ERROR
}

The program above cannot be compiled because the argument is not shared. The following is the necessary change to make it compile:

    shared int[] numbers = [ 10, 20 ];
    foo(numbers);    // now compiles

We will see the shared keyword later in the Data Sharing Concurrency chapter.

`return`

Sometimes it is useful for a function to return one of its ref parameters directly. For example, the following pick() function picks and returns one of its parameters randomly so that the caller can mutate the lucky one directly:

import std.stdio;
import std.random;

ref int pick(ref int lhs, ref int rhs) {
    return uniform(0, 2) ? lhs : rhs;    // ← compilation ERROR
}

void main() {
    int a;
    int b;

    pick(a, b) = 42;

    writefln("a: %s, b: %s", a, b);
}

As a result, either a or b inside main() is assigned the value 42:

a: 42, b: 0

or

a: 0, b: 42

Unfortunately, one of the arguments of pick() may have a shorter lifetime than the returned reference. For example, the following foo() function calls pick() with two local variables, effectively itself returning a reference to one of them:

import std.random;

ref int pick(ref int lhs, ref int rhs) {
    return uniform(0, 2) ? lhs : rhs;    // ← compilation ERROR
}

ref int foo() {
    int a;
    int b;

    return pick(a, b);    // ← BUG: returning invalid reference
}

void main() {
    foo() = 42;           // ← BUG: writing to invalid memory
}

Since the lifetimes of both a and b end upon leaving foo(), the assignment in main() cannot be made to a valid variable. This results in undefined behavior.

The term undefined behavior describes situations where the behavior of the program is not defined by the programming language specification. Nothing can be said about the behavior of a program that contains undefined behavior. (In practice though, for the program above, the value 42 would most likely be written to a memory location that used to be occupied by either a or b, potentially currently a part of an unrelated variable, effectively corrupting the value of that unrelated variable.)

The return keyword can be applied to a parameter to prevent such bugs. It specifies that a parameter must be a reference to a variable with a longer lifetime than the returned reference:

import std.random;

ref int pick(return ref int lhs, return ref int rhs) {
    return uniform(0, 2) ? lhs : rhs;
}

ref int foo() {
    int a;
    int b;

    return pick(a, b);    // ← compilation ERROR
}

void main() {
    foo() = 42;
}

This time the compiler sees that the arguments to pick() have a shorter lifetime than the reference that foo() is attempting to return:

Error: escaping reference to local variable a
Error: escaping reference to local variable b

This feature is called sealed references.

Note: Although it is conceivable that the compiler could inspect pick() and detect the bug even without the return keyword, it cannot do so in general because the bodies of some functions may not be available to the compiler during every compilation.
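
For completeness, the following sketch shows a use of the return-qualified pick() that is expected to compile, because both arguments live in main() and outlive the returned reference:

```d
import std.stdio;
import std.random;

ref int pick(return ref int lhs, return ref int rhs) {
    return uniform(0, 2) ? lhs : rhs;
}

void main() {
    int a;
    int b;

    pick(a, b) = 42;    // a and b are alive for the whole assignment

    writefln("a: %s, b: %s", a, b);
}
```

Which of the two variables receives the value 42 still depends on the random choice.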

Summary

A parameter is what the function takes from its caller to accomplish its task.
An argument is an expression (e.g. a variable) that is passed to a function as a parameter.
Every argument is passed by copy by default. (Note that for reference types, it is the reference that is copied, not the original variable.)
in specifies that the parameter is used only for data input. Prefer in over const.
out specifies that the parameter is used only for data output.
ref specifies that the parameter is used for data input and data output.
auto ref is used in templates only. It specifies that if the argument is an lvalue, then a reference to it is passed; if the argument is an rvalue, then it is passed by copy.
const guarantees that the parameter is not modified inside the function. (Remember that const is transitive: any data reached through a const variable is const as well.) Prefer in over const.
immutable requires the argument to be immutable.
inout appears both at the parameter and the return type, and transfers the mutability of the parameter to the return type.
lazy is used to make a parameter be evaluated when (and every time) it is actually used.
scope guarantees that no reference to the parameter will be leaked from the function.
shared requires the parameter to be shared.
return on a parameter requires the parameter to live longer than the returned reference.

Exercise

The following program is trying to swap the values of two arguments:

import std.stdio;

void swap(int first, int second) {
    int temp = first;
    first = second;
    second = temp;
}

void main() {
    int a = 1;
    int b = 2;

    swap(a, b);

    writeln(a, ' ', b);
}

However, the program does not have any effect on a or b:

1 2          ← not swapped

Fix the function so that the values of a and b are swapped.

