Type Conversions
Variables must be compatible with the expressions that they take part in. As has probably been obvious from the programs that we have seen so far, D is a statically typed language, meaning that the compatibility of types is validated at compile time.
All of the expressions that we have written so far had compatible types because otherwise the code would have been rejected by the compiler. The following is an example of code that has incompatible types:
char[] slice;
writeln(slice + 5);    // ← compilation ERROR
The compiler rejects the code due to the incompatible types char[] and int for the addition operation:
Error: incompatible types for ((slice) + (5)): 'char[]' and 'int'
Type incompatibility does not mean that the types are different; different types can indeed be used in expressions safely. For example, an int variable can safely be used in place of a double value:
double sum = 1.25;
int increment = 3;
sum += increment;
Even though sum and increment are of different types, the code above is valid because incrementing a double variable by an int value is legal.
Automatic type conversions
Automatic type conversions are also called implicit type conversions.
Although double and int are compatible types in the expression above, the addition operation must still be evaluated as a specific type at the microprocessor level. As you may remember from the Floating Point Types chapter, the 64-bit type double is wider (or larger) than the 32-bit type int. Additionally, any value that fits in an int also fits in a double.
When the compiler encounters an expression that involves mismatched types, it first converts the parts of the expression to a common type and then evaluates the overall expression. The automatic conversions that are performed by the compiler are in the direction that avoids data loss. For example, double can hold any value that int can hold but the opposite is not true. The += operation above can work because any int value can safely be converted to double.
The value that has been generated automatically as a result of a conversion is always an anonymous (and often temporary) variable. The original value does not change. For example, the automatic conversion during += above does not change the type of increment; it is always an int. Rather, a temporary value of type double is constructed with the value of increment. The conversion that takes place in the background is equivalent to the following code:
{
double an_anonymous_double_value = increment;
sum += an_anonymous_double_value;
}
The compiler converts the int value to a temporary double value and uses that value in the operation. In this example, the temporary variable lives only during the += operation.
Automatic conversions are not limited to arithmetic operations. There are other cases where types are converted to other types automatically. As long as the conversions are valid, the compiler takes advantage of type conversions to be able to use values in expressions. For example, a byte value can be passed for an int parameter:
void func(int number) {
    // ...
}

void main() {
    byte smallValue = 7;
    func(smallValue);    // automatic type conversion
}
In the code above, first a temporary int value is constructed and the function is called with that value.
Integer promotions
Values of types that are on the left-hand side of the following table never take part in arithmetic expressions as their actual types. Each type is first promoted to the type that is on the right-hand side of the table.
From | To |
---|---|
bool | int |
byte | int |
ubyte | int |
short | int |
ushort | int |
char | int |
wchar | int |
dchar | uint |
Integer promotions are applied to enum values as well.
Integer promotions exist both for historical reasons (the rules come from C) and because int is the natural arithmetic type for the microprocessor. For example, although the following two variables are both ubyte, the addition operation is performed only after both of the values are individually promoted to int:
ubyte a = 1;
ubyte b = 2;
writeln(typeof(a + b).stringof);    // the addition is not in ubyte
The output:
int
Note that the types of the variables a and b do not change; only their values are temporarily promoted to int for the duration of the addition operation.
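One practical consequence of integer promotions is that the result of an arithmetic operation on narrow types may not be assignable back to the narrow type without an explicit conversion. The following is a minimal sketch of that situation; addWrapping() is a hypothetical helper introduced only for this illustration, and it uses the cast operator that is covered later in this chapter:

```d
import std.stdio;

// Hypothetical helper: adds two ubyte values and narrows the int result
// back to ubyte. The explicit cast discards the bits that do not fit.
ubyte addWrapping(ubyte a, ubyte b) {
    // a + b has type int because of integer promotions
    static assert(is(typeof(a + b) == int));
    return cast(ubyte)(a + b);
}

void main() {
    ubyte a = 200;
    ubyte b = 100;

    int wide = a + b;            // fits: the addition is performed as int
    // ubyte narrow = a + b;     // ← compilation ERROR (int to ubyte)
    ubyte narrow = addWrapping(a, b);

    writeln(wide);      // 300
    writeln(narrow);    // 44 (300 modulo 256)
}
```

Whether discarding the upper bits is acceptable depends on the program; the sketch only demonstrates that the narrowing has to be requested explicitly.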
Arithmetic conversions
There are other conversion rules that are applied for arithmetic operations. In general, automatic arithmetic conversions are applied in the safe direction: from the narrower type to the wider type. Although this rule is easy to remember and is correct in most cases, automatic conversion rules are very complicated and in the case of signed-to-unsigned conversions, carry some risk of bugs.
The arithmetic conversion rules are the following:
- If one of the values is real, then the other value is converted to real
- Else, if one of the values is double, then the other value is converted to double
- Else, if one of the values is float, then the other value is converted to float
- Else, first integer promotions are applied according to the table above, and then the following rules are followed:
  - If both types are the same, then no more steps are needed
  - If both types are signed or both types are unsigned, then the narrower value is converted to the wider type
  - If the signed type is wider than the unsigned type, then the unsigned value is converted to the signed type
  - Otherwise the signed type is converted to the unsigned type
Unfortunately, the last rule above can cause subtle bugs:
int a = 0;
int b = 1;
size_t c = 0;
writeln(a - b + c);    // Surprising result!
Surprisingly, the output is not -1, but size_t.max (shown here for a 64-bit system):
18446744073709551615
Although one would expect (0 - 1 + 0) to be calculated as -1, according to the rules above, the type of the entire expression is size_t, not int; and since size_t cannot hold negative values, the result wraps around and becomes size_t.max.
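The arithmetic conversion rules can be observed at compile time with typeof, and the surprising result can be avoided by converting the unsigned operand to a signed type before the calculation. The following is only a sketch: signedResult() is a hypothetical helper, and it assumes that the value of c is small enough to fit in a long:

```d
import std.stdio;

// Hypothetical helper: performs the same calculation in a signed type.
// Assumes that 'c' is small enough to fit in a long.
long signedResult(int a, int b, size_t c) {
    return a - b + cast(long)c;
}

void main() {
    // Types chosen by the arithmetic conversion rules
    static assert(is(typeof(1 + 1.5) == double));
    static assert(is(typeof(1.5f + 1L) == float));
    static assert(is(typeof(short.init + byte.init) == int));
    static assert(is(typeof(long.init + uint.init) == long));
    static assert(is(typeof(int.init + uint.init) == uint));
    static assert(is(typeof(int.init + size_t.init) == size_t));

    writeln(signedResult(0, 1, 0));    // -1
}
```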
Slice conversions
As a convenience, fixed-length arrays can automatically be converted to slices when calling a function:
import std.stdio;

void foo() {
    int[2] array = [ 1, 2 ];
    bar(array);    // Passes fixed-length array as a slice
}

void bar(int[] slice) {
    writeln(slice);
}

void main() {
    foo();
}
bar() receives a slice to all elements of the fixed-length array and prints it:
[1, 2]
Warning: A local fixed-length array must not be passed as a slice if the function stores the slice for later use. For example, the following program has a bug because the slice that bar() stores would not be valid after foo() exits:
import std.stdio;

void foo() {
    int[2] array = [ 1, 2 ];
    bar(array);    // Passes fixed-length array as a slice

}  // ← NOTE: 'array' is not valid beyond this point

int[] sliceForLaterUse;

void bar(int[] slice) {
    // Saves a slice that is about to become invalid
    sliceForLaterUse = slice;
    writefln("Inside bar : %s", sliceForLaterUse);
}

void main() {
    foo();

    /* BUG: Accesses memory that is not array elements anymore */
    writefln("Inside main: %s", sliceForLaterUse);
}
The result of such a bug is undefined behavior. A sample execution shows that the memory that used to be the elements of array has already been reused for other purposes:
Inside bar : [1, 2]          ← actual elements
Inside main: [4396640, 0]    ← a manifestation of undefined behavior
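One possible way of avoiding this bug is to store a copy of the elements instead of the slice itself. The sketch below reuses the names of the program above and relies on the .dup property of arrays; it trades an extra allocation for a slice that remains valid after foo() exits:

```d
import std.stdio;

int[] sliceForLaterUse;

void bar(int[] slice) {
    // .dup copies the elements to memory owned by the garbage collector,
    // so the stored slice no longer refers to the caller's local array
    sliceForLaterUse = slice.dup;
}

void foo() {
    int[2] array = [ 1, 2 ];
    bar(array);
}

void main() {
    foo();
    writefln("Inside main: %s", sliceForLaterUse);    // Inside main: [1, 2]
}
```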
const conversions
As we have seen earlier in the Function Parameters chapter, reference types can automatically be converted to the const of the same type. Conversion to const is safe because the width of the type does not change and const is a promise to not modify the variable:
char[] parenthesized(const char[] text) {
    return "{" ~ text ~ "}";
}

void main() {
    char[] greeting;
    greeting ~= "hello world";
    parenthesized(greeting);
}
The mutable greeting above is automatically converted to a const char[] as it is passed to parenthesized().
As we have also seen earlier, the opposite conversion is not automatic. A const reference is not automatically converted to a mutable reference:
char[] parenthesized(const char[] text) {
    char[] argument = text;    // ← compilation ERROR
    // ...
}
Note that this topic is only about references; since variables of value types are copied, it is not possible to affect the original through the copy anyway:
const int totalCorners = 4;
int theCopy = totalCorners;    // compiles (value type)
The conversion from const to mutable above is legal because the copy is not a reference to the original.
immutable conversions
Because immutable specifies that a variable can never change, neither conversion from immutable nor to immutable is automatic:
string a = "hello";    // immutable characters
char[] b = a;          // ← compilation ERROR
string c = b;          // ← compilation ERROR
As with const conversions above, this topic is also only about reference types. Since variables of value types are copied anyway, conversions to and from immutable are valid:
immutable a = 10;
int b = a;    // compiles (value type)
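When a reference type does need to cross the immutable boundary, the elements can be copied explicitly with the .dup and .idup properties of arrays. The following is a minimal sketch; capitalized() is a hypothetical helper that exists only to show both directions, and it handles only ASCII letters:

```d
import std.stdio;

// Hypothetical helper: returns a modified copy of an immutable string
// without touching the original characters. Handles ASCII letters only.
string capitalized(string text) {
    char[] copy = text.dup;      // mutable copy of immutable characters

    if (copy.length && copy[0] >= 'a' && copy[0] <= 'z') {
        copy[0] = cast(char)(copy[0] - ('a' - 'A'));
    }

    return copy.idup;            // immutable copy of mutable characters
}

void main() {
    string a = "hello";
    string c = capitalized(a);

    writeln(a);    // hello
    writeln(c);    // Hello
}
```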
enum conversions
As we have seen in the enum chapter, enum is for defining named constants:
enum Suit { spades, hearts, diamonds, clubs }
Remember that since no values are specified explicitly above, the values of the enum members start with zero and are automatically incremented by one. Accordingly, the value of Suit.clubs is 3.
enum values are automatically converted to integral types. For example, the value of Suit.hearts is taken to be 1 in the following calculation and the result becomes 11:
int result = 10 + Suit.hearts;
assert(result == 11);
The opposite conversion is not automatic: Integer values are not automatically converted to corresponding enum values. For example, the suit variable below might be expected to become Suit.diamonds, but the code cannot be compiled:
Suit suit = 2;    // ← compilation ERROR
As we will see below, conversions from integers to enum values are still possible but they must be explicit.
bool conversions
Although bool is the natural type of logical expressions, as it has only two values, it can be seen as a 1-bit integer and does behave like one in some cases. false and true are automatically converted to 0 and 1, respectively:
int a = false;
assert(a == 0);

int b = true;
assert(b == 1);
Regarding literal values, the opposite conversion is automatic only for two special literal values: 0 and 1 are converted automatically to false and true, respectively:
bool a = 0;
assert(!a);    // false

bool b = 1;
assert(b);     // true
Other literal values cannot be converted to bool automatically:
bool b = 2;    // ← compilation ERROR
Some statements make use of logical expressions: if, while, etc. For the logical expressions of such statements, not only bool but most other types can be used as well. The value zero is automatically converted to false and the nonzero values are automatically converted to true.
int i;
// ...

if (i) {    // ← int value is being used as a logical expression
    // ... 'i' is not zero

} else {
    // ... 'i' is zero
}
Similarly, null references are automatically converted to false and non-null references are automatically converted to true. This makes it easy to ensure that a reference is non-null before actually using it:
int[] a;
// ...

if (a) {    // ← automatic bool conversion
    // ... not null; 'a' can be used ...

} else {
    // ... null; 'a' cannot be used ...
}
Explicit type conversions
As we have seen above, there are cases where automatic conversions are not available:
- Conversions from wider types to narrower types
- Conversions from const to mutable
- immutable conversions
- Conversions from integers to enum values
- etc.
If such a conversion is known to be safe, the programmer can explicitly ask for a type conversion by one of the following methods:
- Construction syntax
- std.conv.to function
- std.exception.assumeUnique function
- cast operator
Construction syntax
The struct and class construction syntax is available for other types as well:
DestinationType(value)
For example, the following conversion makes a double value from an int value, presumably to preserve the fractional part of the division operation:
int i;
// ...
const result = double(i) / 2;
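The difference that the construction syntax makes can be seen by comparing the two divisions side by side. In the following sketch, half() is a hypothetical helper and the value 7 is chosen only for illustration:

```d
import std.stdio;

// Hypothetical helper: halves an int value without losing the fraction
double half(int i) {
    return double(i) / 2;    // the division is performed as double
}

void main() {
    int i = 7;

    writeln(i / 2);      // 3   (integer division truncates)
    writeln(half(i));    // 3.5
}
```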
to() for most conversions
The to() function, which we have already used mostly to convert values to string, can actually be used for many other types. Its complete syntax is the following:
to!(DestinationType)(value)
Being a template, to() can take advantage of the shortcut template parameter notation: When the destination type consists only of a single token (generally, a single word), it can be called without the first pair of parentheses:
to!DestinationType(value)
The following program is trying to convert a double value to short and a string value to int:
void main() {
    double d = -1.75;

    short s = d;     // ← compilation ERROR
    int i = "42";    // ← compilation ERROR
}
Since not every double value can be represented as a short and not every string can be represented as an int, those conversions are not automatic. When it is known by the programmer that the conversions are in fact safe or that the potential consequences are acceptable, then the types can be converted by to():
import std.conv;

void main() {
    double d = -1.75;

    short s = to!short(d);
    assert(s == -1);

    int i = to!int("42");
    assert(i == 42);
}
Note that because short cannot carry fractional values, the converted value is -1.
to() is safe: It throws an exception when a conversion is not possible.
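Because to() reports failures by throwing, a program can catch the exception and decide what to do with invalid input. The following is a sketch of one such policy; toIntOr() is a hypothetical helper, and falling back to a default value is just one of several reasonable choices:

```d
import std.conv;
import std.stdio;

// Hypothetical helper: converts text to int, falling back to a default
// value when the text does not represent a valid int
int toIntOr(string text, int defaultValue) {
    try {
        return to!int(text);

    } catch (ConvException exc) {
        // Also catches ConvOverflowException, which derives from
        // ConvException
        return defaultValue;
    }
}

void main() {
    writeln(toIntOr("42", -1));             // 42
    writeln(toIntOr("forty-two", -1));      // -1
    writeln(toIntOr("99999999999", -1));    // -1 (does not fit in an int)
}
```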
assumeUnique() for fast immutable conversions
to() can perform immutable conversions as well:
int[] slice = [ 10, 20, 30 ];
auto immutableSlice = to!(immutable int[])(slice);
In order to guarantee that the elements of immutableSlice will never change, it cannot share the same elements with slice. For that reason, to() creates an additional slice with immutable elements above. Otherwise, modifications to the elements of slice would cause the elements of immutableSlice to change as well. This behavior is the same as that of the .idup property of arrays.
We can see that the elements of immutableSlice are indeed copies of the elements of slice by looking at the addresses of their first elements:
assert(&(slice[0]) != &(immutableSlice[0]));
Sometimes this copy is unnecessary and may slow the program down noticeably. As an example of this, let's look at the following function that takes an immutable slice:
void calculate(immutable int[] coordinates) {
    // ...
}

void main() {
    int[] numbers;
    numbers ~= 10;
    // ... various other modifications ...
    numbers[0] = 42;

    calculate(numbers);    // ← compilation ERROR
}
The program above cannot be compiled because the caller is not passing an immutable argument to calculate(). As we have seen above, an immutable slice can be created by to():
import std.conv;
// ...
    auto immutableNumbers = to!(immutable int[])(numbers);
    calculate(immutableNumbers);    // ← now compiles
However, if numbers is needed only to produce this argument and will never be used after the function is called, copying its elements to immutableNumbers would be unnecessary. assumeUnique() makes the elements of a slice immutable without copying:
import std.exception;
// ...
    auto immutableNumbers = assumeUnique(numbers);
    calculate(immutableNumbers);
    assert(numbers is null);    // the original slice becomes null
assumeUnique() returns a new slice that provides immutable access to the existing elements. It also makes the original slice null to prevent the elements from accidentally being modified through it.
The cast operator
Both to() and assumeUnique() make use of the conversion operator cast, which is available to the programmer as well.
The cast operator takes the destination type in parentheses:
cast(DestinationType)value
cast is powerful even for conversions that to() cannot safely perform. For example, to() fails for the following conversions at runtime:
Suit suit = to!Suit(7);    // ← throws exception
bool b = to!bool(2);       // ← throws exception
std.conv.ConvException@phobos/std/conv.d(1778): Value (7) does not match any member value of enum 'Suit'
Sometimes only the programmer can know whether an integer value corresponds to a valid enum value or that it makes sense to treat an integer value as a bool. The cast operator can be used when the conversion is known to be correct according to the program's logic:
// Probably incorrect but possible:
Suit suit = cast(Suit)7;

bool b = cast(bool)2;
assert(b);
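When the validity of an integer is not guaranteed by the program's logic, it can be checked before the cast. The following sketch shows one possible check that compares the value against every member of the enum; isValidSuit() is a hypothetical helper, and std.traits.EnumMembers is used here only as one way of listing the members:

```d
import std.stdio;
import std.traits : EnumMembers;

enum Suit { spades, hearts, diamonds, clubs }

// Hypothetical helper: reports whether an integer matches a Suit member
bool isValidSuit(int value) {
    foreach (member; EnumMembers!Suit) {
        if (value == member) {
            return true;
        }
    }

    return false;
}

void main() {
    int value = 2;

    if (isValidSuit(value)) {
        Suit suit = cast(Suit)value;
        writeln(suit);    // diamonds
    }

    writeln(isValidSuit(7));    // false
}
```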
cast is the only option when converting to and from pointer types:
void * v;
// ...
int * p = cast(int*)v;
Although rare, some C library interfaces make it necessary to store a pointer value as a non-pointer type. If it is guaranteed that the conversion will preserve the actual value, cast can convert between pointer and non-pointer types as well:
size_t savedPointerValue = cast(size_t)p;
// ...
int * p2 = cast(int*)savedPointerValue;
Summary
- Automatic type conversions are mostly in the safe direction: From the narrower type to the wider type and from mutable to const.
- However, conversions to unsigned types may have surprising effects because unsigned types cannot have negative values.
- enum types can automatically be converted to integer values but the opposite conversion is not automatic.
- false and true are automatically converted to 0 and 1 respectively. Similarly, zero values are automatically converted to false and nonzero values are automatically converted to true.
- null references are automatically converted to false and non-null references are automatically converted to true.
- The construction syntax can be used for explicit conversions.
- to() covers most of the explicit conversions.
- assumeUnique() converts to immutable without copying.
- The cast operator is the most powerful conversion tool.