Programming in D - Formatted Output

Formatted Output

This chapter is about features of the std.format module, not about the core features of the D language.

Like all modules that have the prefix std, std.format is a module inside Phobos, the standard library of D. There is not enough space to fully explore Phobos in this book.

D's input and output format specifiers are similar to the ones in the C language.

Before going further, I would like to summarize the format specifiers and flags, for your reference:

Flags (can be used together)
     -     flush left
     +     print the sign
     #     print in the alternative way
     0     print zero-filled
   space   print space-filled

Format Specifiers
     s     default
     b     binary
     d     decimal
     o     octal
    x,X    hexadecimal
    f,F    floating point in the standard decimal notation
    e,E    floating point in scientific notation
    a,A    floating point in hexadecimal notation
    g,G    as e or f

     ,     digit separators

     (     element format start
     )     element format end
     |     element delimiter

We have been using functions like writeln with multiple parameters as necessary to print the desired output. The parameters would be converted to their string representations and then sent to the output.

Sometimes this is not sufficient. The output may have to be in a very specific format. Let's look at the following code that is used to print items of an invoice:

    double[] items;
    items ~= 1.23;
    items ~= 45.6;

    for (int i = 0; i != items.length; ++i) {
        writeln("Item ", i + 1, ": ", items[i]);
    }

The output:

Item 1: 1.23
Item 2: 45.6

Despite the information being correct, we may be required to print it in a different format. For example, maybe the decimal marks (the dots, in this case) must line up and we must ensure that there always are two digits after the decimal mark, as in the following output:

Item 1:     1.23
Item 2:    45.60

Formatted output is useful in such cases. The output functions that we have been using so far have counterparts that contain the letter f in their names: writef() and writefln(). The letter f is short for formatted. The first parameter of these functions is a format string that describes how the other parameters should be printed.

For example, writefln() can produce the desired output above with the following format string:

        writefln("Item %d:%9.02f", i + 1, items[i]);

The format string contains regular characters that are passed to the output as is, as well as special format specifiers that correspond to each parameter that is to be printed. Format specifiers start with the % character and end with a format character. The format string above has two format specifiers: %d and %9.02f.

Every specifier is associated with the respective parameter, usually in order of appearance. For example, %d is associated with i + 1 and %9.02f is associated with items[i]. Every specifier specifies the format of the parameter that it corresponds to. (Format specifiers may have parameter numbers as well. This will be explained later in the chapter.)

All of the other characters of the format string that are not part of format specifiers are printed as is. In the format string "Item %d:%9.02f" above, the regular characters are "Item " and ":".
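The pieces above can be put together as a small complete program. This is only a sketch: invoiceLine() is a hypothetical helper introduced for this illustration, and it uses format() (covered later in this chapter) instead of writefln() so that each line is available as a string.

```d
import std.format : format;
import std.stdio : writeln;

// Hypothetical helper: builds one invoice line with the
// format string discussed above.
string invoiceLine(size_t number, double price) {
    // %d     : the item number in decimal
    // %9.02f : the price in a field of 9 characters,
    //          with two digits after the decimal mark
    return format("Item %d:%9.02f", number, price);
}

void main() {
    double[] items = [ 1.23, 45.6 ];

    foreach (i, item; items) {
        writeln(invoiceLine(i + 1, item));
    }
}
```

Because both prices are printed in a field of the same width, the decimal marks line up.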

Format specifiers consist of several parts, most of which are optional. The part named position will be explained later in this chapter. The others are the following: (Note: The spaces between these parts are inserted here to help with readability; they are not part of the specifiers.)

    %  flags  width  separator  precision  format_character

The % character at the beginning and the format character at the end are required; the others are optional.

Because % has a special meaning in format strings, when we need to print a % as a regular character, we must type it as %%.
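As a minimal sketch of this rule, the following program prints a percentage. The progress() helper is a hypothetical name used only for this illustration:

```d
import std.format : format;
import std.stdio : writeln;

// Hypothetical helper: %d is a format specifier, while %%
// produces a single regular % character.
string progress(int percent) {
    return format("Progress: %d%%", percent);
}

void main() {
    writeln(progress(75));
}
```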

format_character

b: An integer argument is printed in the binary system.

o: An integer argument is printed in the octal system.

x and X: An integer argument is printed in the hexadecimal system; with lowercase letters when using x and with uppercase letters when using X.

d: An integer argument is printed in the decimal system; a negative sign is also printed if it is a signed type and the value is less than zero.

    int value = 12;

    writefln("Binary     : %b", value);
    writefln("Octal      : %o", value);
    writefln("Hexadecimal: %x", value);
    writefln("Decimal    : %d", value);

Binary     : 1100
Octal      : 14
Hexadecimal: c
Decimal    : 12
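The difference between x and X is easier to see with a value that needs hexadecimal letters. The following sketch prints the same value in all four systems; inAllBases() is a hypothetical helper, and the value is passed once for every specifier:

```d
import std.format : format;
import std.stdio : writeln;

// Hypothetical helper: one argument per format specifier.
string inAllBases(int value) {
    return format("%d = %b (binary), %o (octal), %x or %X (hexadecimal)",
                  value, value, value, value, value);
}

void main() {
    writeln(inAllBases(255));
    writeln(inAllBases(12));
}
```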

e: A floating point argument is printed according to the following rules.

a single digit before the decimal mark
a decimal mark if precision is nonzero
the required digits after the decimal mark, the number of which is determined by precision (default precision is 6)
the e character (meaning "10 to the power of")
the - character if the exponent is less than zero, the + character otherwise
the exponent, consisting of at least two digits

E: Same as e, with the exception of outputting the character E instead of e.

f and F: A floating point argument is printed in the decimal system; there is at least one digit before the decimal mark and the default precision is 6 digits after the decimal mark.

g: Same as f if the exponent is between -5 and precision; otherwise same as e. precision does not specify the number of digits after the decimal mark, but the significant digits of the entire value. If there are no significant digits after the decimal mark, then the decimal mark is not printed. The rightmost zeros after the decimal mark are not printed.

G: Same as g, with the exception of outputting the character E.

a: A floating point argument is printed in the hexadecimal floating point notation:

the characters 0x
a single hexadecimal digit
a decimal mark if precision is nonzero
the required digits after the decimal mark, the number of which is determined by precision; if no precision is specified, then as many digits as necessary
the p character (meaning "2 to the power of")
the - character if the exponent is less than zero, the + character otherwise
the exponent, consisting of at least one digit (the exponent of the value 0 is 0)

A: Same as a, with the exception of outputting the characters 0X and P.

    double value = 123.456789;

    writefln("with e: %e", value);
    writefln("with f: %f", value);
    writefln("with g: %g", value);
    writefln("with a: %a", value);

with e: 1.234568e+02
with f: 123.456789
with g: 123.457
with a: 0x1.edd3c07ee0b0bp+6
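The way g chooses between the f and e styles can be observed by trying values of different magnitudes with the default precision of 6. The following is a sketch under that assumption; withG() and withE() are hypothetical helpers:

```d
import std.format : format;
import std.stdio : writeln;

// Hypothetical helpers that return the formatted text.
string withG(double value) {
    return format("%g", value);
}

string withE(double value) {
    return format("%E", value);
}

void main() {
    // Exponents from -4 up to (but not including) the
    // precision are printed as with f, without the
    // rightmost zeros.
    writeln(withG(0.0001));
    writeln(withG(100000.0));

    // Smaller and larger exponents are printed as with e.
    writeln(withG(0.00001));
    writeln(withG(1000000.0));

    // E is the same as e except for the uppercase letter.
    writeln(withE(123.456789));
}
```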

s: The value is printed in the same way as in regular output, according to the type of the argument:

bool values as true or false
integer values same as %d
floating point values same as %g
strings in UTF-8 encoding; precision determines the maximum number of bytes to use (remember that in UTF-8 encoding, the number of bytes is not the same as the number of characters; for example, the string "ağ" has 2 characters, consisting of a total of 3 bytes)
struct and class objects as the return value of the toString() member functions of their types; precision determines the maximum number of bytes to use
arrays as their element values, separated by commas and enclosed in square brackets

    bool b = true;
    int i = 365;
    double d = 9.87;
    string s = "formatted";
    auto o = File("test_file", "r");
    int[] a = [ 2, 4, 6, 8 ];

    writefln("bool  : %s", b);
    writefln("int   : %s", i);
    writefln("double: %s", d);
    writefln("string: %s", s);
    writefln("object: %s", o);
    writefln("array : %s", a);

bool  : true
int   : 365
double: 9.87
string: formatted
object: File(55738FA0)
array : [2, 4, 6, 8]

width

This part determines the width of the field that the argument is printed in. If the width is specified as the character *, then the actual width value is read from the next argument (that argument must be an int). If width is a negative value, then the - flag is assumed.

    int value = 100;

    writefln("In a field of 10 characters:%10s", value);
    writefln("In a field of 5 characters :%5s", value);

In a field of 10 characters:       100
In a field of 5 characters :  100
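The * form and the negative width rule can be sketched as follows. inField() is a hypothetical helper; the vertical bars are regular characters that make the edges of the field visible:

```d
import std.format : format;
import std.stdio : writeln;

// Hypothetical helper: the width is taken from the argument
// that precedes the value.
string inField(int width, int value) {
    return format("|%*s|", width, value);
}

void main() {
    writeln(inField(6, 42));      // right-aligned
    writeln(inField(-6, 42));     // negative width: as if the - flag
    writeln(inField(1, 12345));   // the value is never truncated
}
```

Note that the width is a minimum: a value that does not fit in the field is still printed in full.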

separator

The comma character requests that the digits of a number be separated into groups. The default number of digits in a group is 3 but it can be specified after the comma:

    writefln("%,f", 1234.5678);        // Groups of 3
    writefln("%,s", 1000000);          // Groups of 3
    writefln("%,2s", 1000000);         // Groups of 2

1,234.567,800
1,000,000
1,00,00,00

If the number of digits is specified as the character *, then the actual number of digits is read from the next argument (that argument must be an int).

    writefln("%,*s", 1, 1000000);      // Groups of 1

1,0,0,0,0,0,0

Similarly, it is possible to specify the separator character by using a question mark after the comma and providing the character as an additional argument before the number:

    writefln("%,?s", '.', 1000000);    // The separator is '.'

1.000.000
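The separator part works with the d format character as well. The following sketch wraps both forms in hypothetical helpers named grouped() and groupedWith():

```d
import std.format : format;
import std.stdio : writeln;

// Hypothetical helper: groups of 3 digits separated by commas.
string grouped(long number) {
    return format("%,d", number);
}

// Hypothetical helper: the separator character is passed as an
// additional argument before the number.
string groupedWith(char separator, long number) {
    return format("%,?d", separator, number);
}

void main() {
    writeln(grouped(1234567));
    writeln(grouped(999));               // too short to need a separator
    writeln(groupedWith('_', 1234567));
}
```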

precision

Precision is specified after a dot in the format specifier. For floating point types, it determines the precision of the printed representation of the values. If the precision is specified as the character *, then the actual precision is read from the next argument (that argument must be an int). Negative precision values are ignored.

    double value = 1234.56789;

    writefln("%.8g", value);
    writefln("%.3g", value);
    writefln("%.8f", value);
    writefln("%.3f", value);

1234.5679
1.23e+03
1234.56789000
1234.568

    auto number = 0.123456789;
    writefln("Number: %.*g", 4, number);

Number: 0.1235
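Precision is often combined with the f format character and a width. The following is a small sketch of that combination; price(), rounded(), and inColumn() are hypothetical helpers:

```d
import std.format : format;
import std.stdio : writeln;

// Hypothetical helper: always two digits after the decimal mark.
string price(double amount) {
    return format("%.2f", amount);
}

// Hypothetical helper: the precision is read from an argument.
string rounded(int digits, double value) {
    return format("%.*f", digits, value);
}

// Hypothetical helper: width 8 and precision 3 together.
string inColumn(double value) {
    return format("%8.3f", value);
}

void main() {
    writeln(price(45.6));
    writeln(rounded(3, 3.14159));
    writeln(rounded(0, 2.7));
    writeln(inColumn(3.14159));
}
```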

flags

More than one flag can be specified.

-: the value is printed left-aligned in its field; this flag cancels the 0 flag

    int value = 123;

    writefln("Normally right-aligned:|%10d|", value);
    writefln("Left-aligned          :|%-10d|", value);

Normally right-aligned:|       123|
Left-aligned          :|123       |

+: if the value is positive, it is prepended with the + character; this flag cancels the space flag

    writefln("No effect for negative values    : %+d", -50);
    writefln("Positive value with the + flag   : %+d", 50);
    writefln("Positive value without the + flag: %d", 50);

No effect for negative values    : -50
Positive value with the + flag   : +50
Positive value without the + flag: 50

#: prints the value in an alternate form depending on the format_character

o: the first character of the octal value is always printed as 0
x and X: if the value is not zero, it is prepended with 0x or 0X
floating points: a decimal mark is printed even if there are no significant digits after the decimal mark
g and G: even the insignificant zero digits after the decimal mark are printed

    writefln("Octal starts with 0                        : %#o", 1000);
    writefln("Hexadecimal starts with 0x                 : %#x", 1000);
    writefln("Contains decimal mark even when unnecessary: %#g", 1f);
    writefln("Rightmost zeros are printed                : %#g", 1.2);

Octal starts with 0                        : 01750
Hexadecimal starts with 0x                 : 0x3e8
Contains decimal mark even when unnecessary: 1.00000
Rightmost zeros are printed                : 1.20000

0: the field is padded with zeros (unless the value is nan or infinity); if precision is also specified, this flag is ignored

    writefln("In a field of 8 characters: %08d", 42);

In a field of 8 characters: 00000042

space character: if the value is positive, a space character is prepended to align the negative and positive values

    writefln("No effect for negative values: % d", -34);
    writefln("Positive value with space    : % d", 56);
    writefln("Positive value without space : %d", 56);

No effect for negative values: -34
Positive value with space    :  56
Positive value without space : 56
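Since more than one flag can be specified, it may help to see how they interact. The following sketch tries a few combinations; the helper names are hypothetical and exist only for this illustration:

```d
import std.format : format;
import std.stdio : writeln;

// - together with 0: the - flag cancels the 0 flag.
string leftAndZero(int value) {
    return format("|%-08d|", value);
}

// + together with 0: the sign is printed before the zeros.
string signedZeroFilled(int value) {
    return format("%+06d", value);
}

// + together with space: the + flag cancels the space flag.
string plusAndSpace(int value) {
    return format("% +d", value);
}

// # with X: the prefix is uppercase as well.
string upperHex(int value) {
    return format("%#X", value);
}

void main() {
    writeln(leftAndZero(42));
    writeln(signedZeroFilled(42));
    writeln(plusAndSpace(5));
    writeln(upperHex(255));
}
```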

Positional parameters

We have seen above that the arguments are associated one by one with the specifiers in the format string. It is also possible to use position numbers within format specifiers. This enables associating the specifiers with specific arguments. Arguments are numbered in increasing fashion, starting with 1. The argument numbers are specified immediately after the % character, followed by a $:

    %  position$  flags  width  separator  precision  format_character

An advantage of positional parameters is being able to use the same argument in more than one place in the same format string:

    writefln("%1$d %1$x %1$o %1$b", 42);

The format string above uses the argument numbered 1 within four specifiers to print it in decimal, hexadecimal, octal, and binary formats:

42 2a 52 101010

Another application of positional parameters is supporting multiple natural languages. When referred to by position numbers, arguments can be moved anywhere within the specific format string for a given human language. For example, the number of students of a given classroom can be printed as in the following:

    writefln("There are %s students in room %s.", count, room);

There are 20 students in room 1A.

Let's assume that the program must also support Turkish. In this case the format string needs to be selected according to the active language. The following method takes advantage of the ternary operator:

    auto format = (language == "en"
                   ? "There are %s students in room %s."
                   : "%s sınıfında %s öğrenci var.");

    writefln(format, count, room);

Unfortunately, when the arguments are associated one by one, the classroom and student count information appear in reverse order in the Turkish message; the room information is where the count should be and the count is where the room should be:

20 sınıfında 1A öğrenci var.  ← Wrong: means "room 20", and "1A students"!

To avoid this, the arguments can be specified by numbers, such as 1$ and 2$, to associate each specifier with the exact argument:

    auto format = (language == "en"
                   ? "There are %1$s students in room %2$s."
                   : "%2$s sınıfında %1$s öğrenci var.");

    writefln(format, count, room);

Now the arguments appear in the proper order, regardless of the language selected:

There are 20 students in room 1A.

1A sınıfında 20 öğrenci var.
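The same idea can be written as a complete program. In this sketch, studentMessage() is a hypothetical helper, and any language code other than "en" is assumed to mean Turkish:

```d
import std.format : format;
import std.stdio : writeln;

// Hypothetical helper: selects the format string by language.
// The arguments are always passed in the same order; the
// position numbers decide where each one appears.
string studentMessage(string language, int count, string room) {
    auto fmt = (language == "en"
                ? "There are %1$s students in room %2$s."
                : "%2$s sınıfında %1$s öğrenci var.");

    return format(fmt, count, room);
}

void main() {
    writeln(studentMessage("en", 20, "1A"));
    writeln(studentMessage("tr", 20, "1A"));
}
```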

Formatted element output

Format specifiers between %( and %) are applied to every element of a container (e.g. an array or a range):

    auto numbers = [ 1, 2, 3, 4 ];
    writefln("%(%s%)", numbers);

The format string above consists of three parts:

%(: Start of element format
%s: Format for each element
%): End of element format

Each being printed with the %s format, the elements appear one after the other:

1234

The regular characters before and after the element format are repeated for each element. For example, the {%s}, specifier would print each element between curly brackets separated by commas:

    writefln("%({%s},%)", numbers);

However, regular characters to the right of the format specifier are considered to be element delimiters and are printed only between elements, not after the last one:

{1},{2},{3},{4  ← '}' and ',' are not printed after the last element

%| is used for specifying the characters that should be printed even for the last element. Characters that are to the right of %| are considered to be the delimiters and are not printed for the last element. Conversely, characters to the left of %| are printed even for the last element.

For example, the following format specifier would print the closing curly bracket after the last element but not the comma:

    writefln("%({%s}%|,%)", numbers);

{1},{2},{3},{4}  ← '}' is printed after the last element as well
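The element format can contain any specifier, not just %s. The following sketch shows three common uses; joined(), hexDump(), and bracketed() are hypothetical helpers:

```d
import std.format : format;
import std.stdio : writeln;

// The characters after %s are delimiters: printed only
// between elements.
string joined(int[] values) {
    return format("%(%s, %)", values);
}

// Each element as two hexadecimal digits, separated by spaces.
string hexDump(int[] values) {
    return format("%(%02x %)", values);
}

// Characters to the left of %| are printed for the last
// element as well.
string bracketed(int[] values) {
    return format("%(<%s>%|, %)", values);
}

void main() {
    writeln(joined([ 1, 2, 3 ]));
    writeln(hexDump([ 10, 255, 0 ]));
    writeln(bracketed([ 1, 2, 3 ]));
}
```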

Unlike strings that are printed individually, strings that are printed as elements appear within double quotes:

    auto vegetables = [ "spinach", "asparagus", "artichoke" ];
    writefln("%(%s, %)", vegetables);

"spinach", "asparagus", "artichoke"

When the double quotes are not desired, the element format must be started with %-( instead of %(:

    writefln("%-(%s, %)", vegetables);

spinach, asparagus, artichoke

The same applies to characters as well. %( prints them within single quotes:

    writefln("%(%s%)", "hello");

'h''e''l''l''o'

%-( prints them without quotes:

    writefln("%-(%s%)", "hello");

hello

There must be two format specifiers for associative arrays: one for the keys and one for the values. For example, the following %s (%s) specifier would print first the key and then the value in parentheses:

    auto spelled = [ 1 : "one", 10 : "ten", 100 : "hundred" ];
    writefln("%-(%s (%s)%|, %)", spelled);

Also note that, being specified to the right of %|, the comma is not printed for the last element:

1 (one), 100 (hundred), 10 (ten)
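As the output above suggests, the entries of an associative array are not necessarily printed in the order they were written in the source code. The following sketch wraps the same format string in a hypothetical describe() helper; only the single-entry result is fully predictable:

```d
import std.format : format;
import std.stdio : writeln;

// Hypothetical helper: the first %s is for the key and the
// second %s is for the value.
string describe(string[int] table) {
    return format("%-(%s (%s)%|, %)", table);
}

void main() {
    auto spelled = [ 1 : "one", 10 : "ten", 100 : "hundred" ];

    // The order of the entries is not guaranteed.
    writeln(describe(spelled));

    // With a single entry there is no delimiter to print.
    writeln(describe([ 1 : "one" ]));
}
```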

format

Formatted output is also available through the format() function, which is defined in std.format and is made available by the std.string module as well. format() works the same as writef() but it returns the result as a string instead of printing it to the output:

import std.stdio;
import std.string;

void main() {
    write("What is your name? ");
    auto name = strip(readln());

    auto result = format("Hello %s!", name);
}

The program can make use of that result in later expressions.
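For example, the returned string can be passed to another function or can become an argument of another format() call. This is a minimal sketch; greet() and withSize() are hypothetical helpers:

```d
import std.format : format;
import std.stdio : writeln;

// Hypothetical helper: returns the greeting instead of
// printing it.
string greet(string name) {
    return format("Hello %s!", name);
}

// Hypothetical helper: uses an already formatted string in a
// later expression. .length is the number of bytes of a string.
string withSize(string text) {
    return format("%s (%s bytes)", text, text.length);
}

void main() {
    auto result = greet("World");

    writeln(result);
    writeln(withSize(result));
}
```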

Checked format string

There is an alternative syntax for functions like format in the standard library that take a format string (writef, writefln, formattedWrite, readf, formattedRead, etc.). It is possible to provide the format string as a template argument to these functions so that the validity of the format string and the arguments are checked at compile time:

import std.stdio;

void main() {
    writefln!"%s %s"(1);       // ← compilation ERROR (extra %s)
    writefln!"%s"(1, 2);       // ← compilation ERROR (extra 2)
    writefln!"%s %d"(1, 2.5);  // ← compilation ERROR (mismatched %d and 2.5)
}

The ! character above is the template instantiation operator, which we will see in a later chapter.

(Note: Although this syntax is safer because it catches potential programmer errors at compile time, it may also make compilation times longer.)
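For comparison with the erroneous lines above, here is a sketch of calls that pass the compile-time check, because the number and the types of the arguments match the specifiers. describe() is a hypothetical helper:

```d
import std.format : format;
import std.stdio : writefln, writeln;

// Hypothetical helper: the format string is a template
// argument, so it is checked when the program is compiled.
string describe(string name, int age) {
    return format!"%s is %d years old"(name, age);
}

void main() {
    // Three specifiers, three arguments.
    writefln!"%s + %s = %s"(1, 2, 1 + 2);

    writeln(describe("Ali", 30));
}
```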

Exercises

Write a program that reads a value and prints it in the hexadecimal system.
Write a program that reads a floating point value and prints it as a percentage value with two digits after the decimal mark. For example, if the value is 1.2345, it should print %1.23.

