Programming in D - Modules and Libraries

Modules and Libraries

The building blocks of D programs (and libraries) are modules.

D modules are based on a simple concept: Every source file is a module. Accordingly, the single files that we have been writing our programs in have all been individual modules.

By default, the name of a module is the same as its filename without the .d extension. When explicitly specified, the name of the module is defined by the module keyword, which must appear as the first non-comment line in the source file.

For example, assuming that the name of a source file is "cat.d", the name of the module would be specified by the module keyword:

module cat;

class Cat {
    // ...
}

The module line is optional if the module is not part of any package (see below). When it is not specified, the module name is the same as the file name without the .d extension.

`static this()` and `static ~this()`

static this() and static ~this() at module scope are similar to their struct and class counterparts:

module cat;

static this() {
    // ... the initial operations of the module ...
}

static ~this() {
    // ... the final operations of the module ...
}

Code in these blocks is executed once for each thread. (Note that most programs consist of a single thread that starts executing the main() function.) Code that should be executed only once for the entire program (e.g. initializing shared and immutable variables) must be defined in shared static this() and shared static ~this() blocks, which will be covered in the Data Sharing Concurrency chapter.
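The difference between the two kinds of blocks can be observed with a small experiment. The following is only a sketch: the counter names and the worker() function are made up for this example, and the counters are shared and updated atomically merely so that both threads can count into the same variables.

```d
import core.atomic : atomicLoad, atomicOp;
import core.thread : Thread;
import std.stdio : writeln;

// Hypothetical counters for this example only
shared int sharedCtorRuns;
shared int threadCtorRuns;

shared static this() {
    // Executed once for the entire program
    atomicOp!"+="(sharedCtorRuns, 1);
}

static this() {
    // Executed once for each thread, including the main thread
    atomicOp!"+="(threadCtorRuns, 1);
}

void worker() {
    // Nothing to do; starting the thread is what matters here
}

void main() {
    auto thread = new Thread(&worker);
    thread.start();
    thread.join();

    writeln("shared static this() runs: ", atomicLoad(sharedCtorRuns));
    writeln("static this() runs       : ", atomicLoad(threadCtorRuns));
}
```

With one main thread and one worker thread, the program should report one run of shared static this() and two runs of static this().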

File and module names

D supports Unicode in source code and module names. However, the Unicode support of file systems varies. For example, although most Linux file systems support Unicode, the file names in Windows file systems may not distinguish between lower and upper case letters. Additionally, most file systems limit the characters that can be used in file and directory names.

For portability reasons, I recommend that you use only lower case ASCII letters in file names. For example, "resume.d" would be a suitable file name for a class named Résumé.

Accordingly, the name of the module would consist of ASCII letters as well:

module resume;  // Module name consisting of ASCII letters

class Résumé {  // Program code consisting of Unicode characters
    // ...
}

Packages

A combination of related modules is called a package. D packages are a simple concept as well: The source files that are inside the same directory are considered to belong to the same package. The name of the directory becomes the name of the package, which must also be specified as the first part of the module names.

For example, if "cat.d" and "dog.d" are inside the directory "animal", then specifying the directory name along with the module name makes them parts of the same package:

module animal.cat;

class Cat {
    // ...
}

Similarly, for the dog module:

module animal.dog;

class Dog {
    // ...
}

For modules that are parts of packages, the module line is not optional and the whole module name including the package name must be specified.

Since package names correspond to directory names, the package names of modules that are deeper than one directory level must reflect that hierarchy. For example, if the "animal" directory included a "vertebrate" directory, the name of a module inside that directory would include vertebrate as well:

module animal.vertebrate.cat;

The directory hierarchies can be arbitrarily complex depending on the needs of the program. Relatively short programs usually have all of their source files in a single directory.

Importing modules

The import keyword, which we have been using in almost every program so far, is for introducing a module to the current module:

import std.stdio;

The module name may contain the package name as well. For example, the std. part above indicates that stdio is a module that is a part of the std package.

The animal.cat and animal.dog modules would be imported similarly. Let's assume that the following code is inside a file named "deneme.d":

module deneme;        // the name of this module

import animal.cat;    // a module that it uses
import animal.dog;    // another module that it uses

void main() {
    auto cat = new Cat();
    auto dog = new Dog();
}

Note: As described below, for the program to be built correctly, those module files must also be provided to the linker.

More than one module can be imported at the same time:

import animal.cat, animal.dog;

Selective imports

Instead of importing a module as a whole with all of its names, it is possible to import just specific names from it.

import std.stdio : writeln;

// ...

    writefln("Hello %s.", name);    // ← compilation ERROR

The code above cannot be compiled because only writeln is imported, not writefln.

Selective imports are considered to be better than importing an entire module because they reduce the chance of name collisions. As we will see in an example below, a name collision can occur when the same name appears in more than one imported module.

Selective imports may reduce compilation times as well because the compiler needs to compile only the parts of a module that are actually imported. On the other hand, selective imports require more work as every imported name must be specified separately on the import line.
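More than one name can be listed on the same import line, separated by commas. The following minimal sketch imports two names from std.stdio and one from std.algorithm.comparison; the choice of max is just for illustration:

```d
import std.stdio : writeln, writefln;    // only these two names
import std.algorithm.comparison : max;   // only this one name

void main() {
    writeln(max(3, 7));
    writefln("Hello %s.", "world");

    // write("...");    // ← would be a compilation ERROR because
                        //   'write' has not been imported
}
```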

This book does not take advantage of selective imports mostly for brevity.

Local imports

So far we have always imported all of the required modules at the tops of programs:

import std.stdio;     // ← at the top
import std.string;    // ← at the top

// ... the rest of the module ...

Instead, modules can be imported at any other line of the source code. For example, the two functions of the following program import the modules that they need in their own scopes:

string makeGreeting(string name) {
    import std.string;

    string greeting = format("Hello %s", name);
    return greeting;
}

void interactWithUser() {
    import std.stdio;

    write("Please enter your name: ");
    string name = readln();
    writeln(makeGreeting(name));
}

void main() {
    interactWithUser();
}

Local imports are recommended over global imports because instead of importing every module unconditionally at the top, the compiler can import only the ones that are in the scopes that are actually used. If the compiler knows that the program never calls a function, it can ignore the import directives inside that function.

Additionally, a locally imported module is accessible only inside that local scope, further reducing the risk of name collisions.

We will later see in the Mixins chapter that local imports are in fact required for template mixins.

The examples throughout this book do not take advantage of local imports mostly because local imports were added to D after the start of writing this book.

Locations of modules

The compiler finds the module files by converting the package and module names directly to directory and file names.

For example, the previous two modules would be located as "animal/cat.d" and "animal/dog.d", respectively (or "animal\cat.d" and "animal\dog.d", depending on the file system). Considering the main source file as well, the program above consists of three files.
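The conversion from module names to file names can be imitated in code. The function below is a hypothetical helper written for this example, not something that the compiler exposes; it also ignores details like the import directories that the compiler searches and the "package.d" files that we will see later in this chapter.

```d
import std.array : replace;
import std.path : dirSeparator;
import std.stdio : writeln;

// Hypothetical helper: converts a module name like "animal.cat"
// to a relative file name like "animal/cat.d"
string moduleToPath(string moduleName) {
    return moduleName.replace(".", dirSeparator) ~ ".d";
}

void main() {
    writeln(moduleToPath("animal.cat"));
    writeln(moduleToPath("animal.vertebrate.cat"));
    writeln(moduleToPath("deneme"));
}
```

dirSeparator is "/" or "\" depending on the platform, which matches the two spellings mentioned above.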

Long and short module names

The names that are used in the program may be spelled out with the module and package names:

    auto cat0 = new Cat();
    auto cat1 = new animal.cat.Cat();   // same as above

The long names are normally not needed but sometimes there are name conflicts. For example, when referring to a name that appears in more than one module, the compiler cannot decide which one is meant.

The following program spells out the long names to distinguish between two separate Jaguar structs that are defined in two separate modules, animal.jaguar and car.jaguar:

import animal.jaguar;
import car.jaguar;

// ...

    auto conflicted =  Jaguar();            // ← compilation ERROR

    auto myAnimal = animal.jaguar.Jaguar(); // ← compiles
    auto myCar    =    car.jaguar.Jaguar(); // ← compiles
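Long names work for the modules of the standard library as well, which makes it possible to try them in a single source file. In the following sketch, the short and the long spellings refer to the same max() function of std.algorithm.comparison:

```d
import std.stdio;
import std.algorithm.comparison;

void main() {
    auto a = max(3, 7);                             // short name
    auto b = std.algorithm.comparison.max(3, 7);    // long name

    std.stdio.writeln(a == b);    // prints "true"
}
```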

Renamed imports

It is possible to rename imported modules either for convenience or to resolve name conflicts:

import carnivore = animal.jaguar;
import vehicle = car.jaguar;

// ...

    auto myAnimal = carnivore.Jaguar();       // ← compiles
    auto myCar    = vehicle.Jaguar();         // ← compiles
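The same syntax can be tried with standard library modules. In the following sketch, io and cmp are arbitrary names picked for this example. Note that a renamed import makes the names of the module accessible only through the new name:

```d
import io = std.stdio;
import cmp = std.algorithm.comparison;

void main() {
    io.writeln(cmp.max(3, 7));

    // writeln(max(3, 7));    // ← would be a compilation ERROR because
                              //   the names are available only as
                              //   io.writeln and cmp.max
}
```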

Instead of renaming the entire import, it is possible to rename individual imported symbols.

For example, when the following code is compiled with the -w compiler switch, the compiler would warn that the sort() function should be preferred over the .sort property:

import std.stdio;
import std.algorithm;

// ...

    auto arr = [ 2, 10, 1, 5 ];
    arr.sort;    // ← compilation WARNING
    writeln(arr);

Warning: use std.algorithm.sort instead of .sort property

Note: The arr.sort expression above is the equivalent of sort(arr) but it is written in the UFCS syntax, which we will see in a later chapter.

One solution in this case is to import std.algorithm.sort by renaming it. The new name algSort below means the sort() function and the compiler warning is eliminated:

import std.stdio;
import std.algorithm : algSort = sort;

void main() {
    auto arr = [ 2, 10, 1, 5 ];
    arr.algSort;
    writeln(arr);
}

Importing a package as a module

Sometimes multiple modules of a package may need to be imported together. For example, whenever one module from the animal package is imported, all of the other modules may need to be imported as well: animal.cat, animal.dog, animal.horse, etc.

In such cases it is possible to import some or all of the modules of a package by importing the package as if it were a module:

import animal;    // ← entire package imported as a module

This is achieved by a special configuration file in the package directory, which must always be named package.d. That special file includes the module directive for the package and imports the modules of the package publicly:

// The contents of the file animal/package.d:
module animal;

public import animal.cat;
public import animal.dog;
public import animal.horse;
// ... same for the other modules ...

Importing a module publicly makes that module available to the users of the importing module as well. As a result, when the users import just the animal module (which actually is a package), they get access to animal.cat and all the other modules as well.

Deprecating features

Modules evolve over time and get released under new version numbers. Going forward from a particular version, the authors of the module may decide to deprecate some of its features. Deprecating a feature means that newly written programs should not rely on that feature anymore; using a deprecated feature is discouraged. Deprecated features may even be removed from the module in the future.

There can be many reasons why a feature is deprecated. For example, the new version of the module may include a better alternative, the feature may have been moved to another module, the name of the feature may have changed to be consistent with the rest of the module, etc.

The deprecation of a feature is made official by defining it with the deprecated attribute, optionally with a custom message. For example, the following deprecation message communicates to its user that the name of the function has been changed:

deprecated("Please use doSomething() instead.")
void do_something() {
    // ...
}

By specifying one of the following compiler switches, the user of the module can determine how the compiler should react when a deprecated feature is used:

-d: Using deprecated features should be allowed
-dw: Using deprecated features should produce compilation warnings
-de: Using deprecated features should produce compilation errors

For example, calling the deprecated feature in a program and compiling it with -de would fail compilation:

    do_something();

$ dmd deneme.d -de
deneme.d: Deprecation: function deneme.do_something is
deprecated - Please use doSomething() instead.

The name of a deprecated feature is usually defined as an alias of the new name:

deprecated("Please use doSomething() instead.")
alias do_something = doSomething;

void doSomething() {
    // ...
}

We will see the alias keyword in a later chapter.
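The pieces above can be put together in a single program to experiment with the compiler switches. This is a minimal sketch; the return value 42 is made up so that the two calls have something to print:

```d
import std.stdio : writeln;

int doSomething() {
    return 42;    // arbitrary value for this example
}

deprecated("Please use doSomething() instead.")
alias do_something = doSomething;

void main() {
    writeln(doSomething());     // the new name

    // The old name calls the same function. How the compiler
    // reacts to this line depends on -d, -dw, and -de.
    writeln(do_something());
}
```

Compiling this program with -de is expected to fail in the same way as the example above, while -d should compile it silently.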

Adding module definitions to the program

The import keyword is not sufficient to make modules become parts of the program. It simply makes available the features of a module inside the current module. That much is needed only to compile the code.

It is not possible to build the previous program from only the main source file, "deneme.d":

$ dmd deneme.d -w -de
deneme.o: In function `_Dmain':
deneme.d: undefined reference to `_D6animal3cat3Cat7__ClassZ'
deneme.d: undefined reference to `_D6animal3dog3Dog7__ClassZ'
collect2: ld returned 1 exit status

Those error messages are generated by the linker. Although they are not user-friendly messages, they indicate that some definitions that are needed by the program are missing.

The actual build of the program is the responsibility of the linker, which gets called automatically by the compiler behind the scenes. The compiler passes the modules that it has just compiled to the linker, and the linker combines those modules (and libraries) to produce the executable program.

For that reason, all of the modules that make up the program must be provided to the linker. For the program above to be built, "animal/cat.d" and "animal/dog.d" must also be specified on the compilation line:

$ dmd deneme.d animal/cat.d animal/dog.d -w -de

Instead of having to mention the modules individually every time on the command line, they can be combined as libraries.

Libraries

A collection of compiled modules is called a library. Libraries are not programs themselves; they do not have the main() function. Libraries contain compiled definitions of functions, structs, classes, and other features of modules, which are to be linked later by the linker to produce the program.

dmd's -lib command line option is for making libraries. The following command makes a library that contains the "cat.d" and the "dog.d" modules. The name of the library is specified with the -of switch:

$ dmd animal/cat.d animal/dog.d -lib -ofanimal -w -de

The actual name of the library file depends on the platform. For example, the extension of library files is .a under Linux systems: animal.a.

Once that library is built, it is not necessary to specify the "animal/cat.d" and "animal/dog.d" modules individually anymore. The library file is sufficient:

$ dmd deneme.d animal.a -w -de

The command above replaces the following one:

$ dmd deneme.d animal/cat.d animal/dog.d -w -de

As an exception, the D standard library Phobos need not be specified on the command line. That library is automatically included behind the scenes. Otherwise, it could be specified similar to the following line:

$ dmd deneme.d animal.a /usr/lib64/libphobos2.a -w -de

Note: The name and location of the Phobos library may be different on different systems.

Using libraries of other languages

D can use libraries that are written in some other compiled languages like C and C++. However, because different languages use different linkages, such libraries are available to D code only through their D bindings.

Linkage is the set of rules that determines the accessibility of entities in a library as well as how the names (symbols) of those entities are represented in compiled code. The names in compiled code are different from the names that the programmer writes in source code: The compiled names are name-mangled according to the rules of a particular language or compiler.

For example, according to C linkage, the C function name foo would be mangled with a leading underscore as _foo in compiled code. Name-mangling is more complex in languages like C++ and D because these languages allow using the same name for different entities in different modules, structs, classes, etc. as well as for overloads of functions. A D function named foo in source code has to be mangled in a way that would differentiate it from all other foo names that can exist in a program. Although the exact mangled names are usually not important to the programmer, the core.demangle module can be used to mangle and demangle symbols:

module deneme;

import std.stdio;
import core.demangle;

void foo() {
}

void main() {
    writeln(mangle!(typeof(foo))("deneme.foo"));
}

Note: mangle() is a function template, the syntax of which is unfamiliar at this point in the book. We will see templates later in the Templates chapter.

A function that has the same type as foo above and is named deneme.foo has the following mangled name in compiled code:

_D6deneme3fooFZv

Name mangling is the reason why linker error messages cannot include user-friendly names. For example, a symbol in an error message above was _D6animal3cat3Cat7__ClassZ instead of animal.cat.Cat.
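Going in the other direction, the demangle() function of the same core.demangle module converts a mangled D symbol back to a readable form. The following sketch feeds it the two mangled names that appear in this chapter; when demangle() cannot decode its argument, it returns the argument unchanged:

```d
import core.demangle : demangle;
import std.stdio : writeln;

void main() {
    // The mangled name of deneme.foo from above
    writeln(demangle("_D6deneme3fooFZv"));

    // A symbol from the linker error messages above
    writeln(demangle("_D6animal3cat3Cat7__ClassZ"));
}
```

This can be handy when trying to make sense of linker error messages.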

The extern() attribute specifies the linkage of entities. The valid linkage types that can be used with extern() are C, C++, D, Objective-C, Pascal, System, and Windows. For example, when D code needs to make a call to a function that is defined in a C library, that function must be declared as having C linkage:

// Declaring that 'foo' has C linkage (e.g. it may be defined
// in a C library):
extern(C) void foo();

void main() {
    foo();  // this call would be compiled as a call to '_foo'
}
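Since foo() above is not defined anywhere, that program cannot be linked. For an example that can actually be built, we can declare a function of the C standard library by hand. The sketch below assumes that the C library is linked with the program, which is normally the case. (The core.stdc.string module already contains this declaration; it is repeated here only for illustration.)

```d
import std.stdio : writeln;

// Declaring that 'strlen' has C linkage; it is defined in the
// C standard library:
extern(C) size_t strlen(const(char)* s);

void main() {
    // D string literals are followed by a '\0' character, so
    // their .ptr can be passed to C functions
    writeln(strlen("hello".ptr));    // prints 5
}
```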

In the case of C++ linkage, the namespace that a name is defined in is specified as the second argument to the extern() attribute. For example, according to the following declaration, bar() is the declaration of the function a::b::c::bar() defined in a C++ library (note that D code uses dots instead of colons):

// Declaring that 'bar' is defined inside namespace a::b::c
// and that it has C++ linkage:
extern(C++, a.b.c) void bar();

void main() {
    bar();          // a call to a::b::c::bar()
    a.b.c.bar();    // same as above
}

A file that contains such D declarations of the features of an external library is called a D binding of that library. Fortunately, in most cases programmers do not need to write them from scratch as D bindings for many popular non-D libraries are available through the Deimos project.

When used without a linkage type, the extern attribute has a different meaning: It specifies that the storage for a variable is the responsibility of an external library; the D compiler should not reserve space for it in this module. Having different meanings, extern and extern() can be used together:

// Declaring that the storage for 'g_variable' is already
// defined in a C library:
extern(C) extern int g_variable;

If the extern attribute were not specified above, while having C linkage, g_variable would be a variable of this D module.
