Programming in D - Program Environment

Program Environment

We have seen that main() is a function. Program execution starts with main() and branches off to other functions from there. The definition of main() that we have used so far has been the following:

void main() {
    // ...
}

According to that definition main() does not take any parameters and does not return a value. In reality, in most systems every program necessarily returns a value to its environment when it ends, which is called an exit status or return code. Because of this, although it is possible to specify the return type of main() as void, it will actually return a value to the operating system or launch environment.

The return value of `main()`

Programs are always started by an entity in a particular environment. The entity that starts the program may be the shell where the user types the name of the program and presses the Enter key, a development environment where the programmer clicks the [Run] button, and so on.

In D and several other programming languages, the program communicates its exit status to its environment by the return value of main().

The exact meaning of return codes depends on the application and the system. In almost all systems a return value of zero means a successful completion, while other values generally mean some type of failure. There are exceptions to this, though. For instance, in OpenVMS even values indicate failure, while odd values indicate success. Still, in most systems the values in the range [0, 125] can be used safely, with values 1 to 125 having a meaning specific to that program.

For example, the common Unix program ls, which is used for listing contents of directories, returns 0 for success, 1 for minor errors and 2 for serious ones.

In many environments, the return value of the program that has been executed most recently in the terminal can be seen through the special shell variable $?. For example, when we ask ls to list a file that does not exist, its nonzero return value can be observed with $? as seen below.

Note: In the command line interactions below, the lines that start with # indicate the lines that the user types. If you want to try the same commands, you must enter the contents of those lines except for the # character. Also, the commands below start a program named deneme; replace that name with the name of your test program.

Additionally, although the following examples show interactions in a Linux terminal, they would be similar but not exactly the same in terminals of other operating systems.

# ls a_file_that_does_not_exist
ls: cannot access a_file_that_does_not_exist: No such file or directory
# echo $?
2      ← the return value of ls

`main()` always returns a value

Some of the programs that we have written so far threw exceptions when they could not continue with their tasks. As we have seen, when an exception is thrown, the program ends with an object.Exception error message.

When that happens, even if main() has been defined as returning void, a nonzero status code is automatically returned to the program's environment. Let's see this in action in the following simple program that terminates with an exception:

void main() {
    throw new Exception("There has been an error.");
}

Although the return type is specified as void, the return value is nonzero:

# ./deneme
object.Exception: There has been an error.
...
# echo $?
1

Similarly, void main() functions that terminate successfully also automatically return zero as their return values. Let's see this with the following program that terminates successfully:

import std.stdio;

void main() {
    writeln("Done!");
}

The program returns zero:

# ./deneme
Done!
# echo $?
0

Specifying the return value

To choose a specific return code we return a value from main() in the same way as we would from any other function. The return type must be specified as int and the value must be returned by the return statement:

import std.stdio;

int main() {
    int number;
    write("Please enter a number between 3 and 6: ");
    readf(" %s", &number);

    if ((number < 3) || (number > 6)) {
        stderr.writefln("ERROR: %s is not valid!", number);
        return 111;
    }

    writefln("Thank you for %s.", number);

    return 0;
}

When the entered number is within the valid range, the return value of the program is zero:

# ./deneme
Please enter a number between 3 and 6: 5
Thank you for 5.
# echo $?
0

When the number is outside of the valid range, the return value of the program is 111:

# ./deneme
Please enter a number between 3 and 6: 10
ERROR: 10 is not valid!
# echo $?
111

The value of 111 in the above program is arbitrary; normally 1 is suitable as the failure code.
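As a minimal sketch of that convention, the program below uses the EXIT_SUCCESS and EXIT_FAILURE constants from core.stdc.stdlib instead of literal numbers. The statusFor() helper is a hypothetical name introduced only for this illustration, and a fixed number stands in for user input:

```d
import core.stdc.stdlib : EXIT_FAILURE, EXIT_SUCCESS;
import std.stdio;

// Hypothetical helper (not from the chapter): decides the exit status
// for a number, using the same range check as the program above.
int statusFor(int number) {
    if ((number < 3) || (number > 6)) {
        return EXIT_FAILURE;
    }

    return EXIT_SUCCESS;
}

int main() {
    // A fixed value stands in for user input to keep the sketch short.
    const number = 5;
    const status = statusFor(number);

    if (status == EXIT_SUCCESS) {
        writefln("Thank you for %s.", number);

    } else {
        stderr.writefln("ERROR: %s is not valid!", number);
    }

    // The value returned here becomes the exit status of the program.
    return status;
}
```

Named constants mostly help readability; the environment still sees only the integer value.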

Standard error stream `stderr`

The program above uses the stream stderr. That stream is the third of the standard streams. It is used for writing error messages:

stdin: standard input stream
stdout: standard output stream
stderr: standard error stream

When a program is started in a terminal, normally the messages that are written to stdout and stderr both appear on the terminal window. When needed, it is possible to redirect these outputs individually. This subject is outside of the focus of this chapter and the details may vary for each shell program.
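The following sketch writes one line to each output stream so that the difference can be observed when trying out redirection. The errorLine() helper and the messages are made up for this illustration:

```d
import std.stdio;

// Hypothetical helper: formats an error line like the earlier program did.
string errorLine(string message) {
    return "ERROR: " ~ message;
}

void main() {
    // Regular results go to stdout...
    stdout.writeln("Result: 42");

    // ...while diagnostics go to stderr.
    stderr.writeln(errorLine("something went wrong"));
}
```

In many Unix shells, starting the program as ./deneme > results.txt is expected to send only the first line to the file while the error line still appears on the terminal, although the exact syntax depends on the shell.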

Parameters of `main()`

It is common for programs to take parameters from the environment that started them. For example, we have already passed a file name as a command line argument to ls above. There are two command line arguments in the following line:

# ls -l deneme
-rwxr-xr-x 1 acehreli users 460668 Nov  6 20:38 deneme

The set of command line parameters and their meanings are defined entirely by the program. Every program documents its usage, including what every parameter means.

The arguments that are used when starting a D program are passed to that program's main() as a slice of strings. Defining main() as taking a parameter of type string[] is sufficient to have access to program arguments. The name of this parameter is commonly abbreviated as args. The following program prints all of the arguments with which it is started:

import std.stdio;

void main(string[] args) {
    foreach (i, arg; args) {
        writefln("Argument %-3s: %s", i, arg);
    }
}

Let's start the program with arbitrary arguments:

# ./deneme some arguments on the command line 42 --an-option
Argument 0  : ./deneme
Argument 1  : some
Argument 2  : arguments
Argument 3  : on
Argument 4  : the
Argument 5  : command
Argument 6  : line
Argument 7  : 42
Argument 8  : --an-option

In almost all systems, the first argument is the name of the program, in the way it has been entered by the user. The other arguments appear in the order they were entered.
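Because the program name occupies the first element, programs often work only with the rest of the slice. The sketch below shows one way of separating the two by slicing; userArguments() is a hypothetical helper name, not something provided by the language or the library:

```d
import std.stdio;

// Hypothetical helper: returns the arguments without the program name.
string[] userArguments(string[] args) {
    if (args.length <= 1) {
        // Nothing was specified after the program name
        return [];
    }

    // Skip args[0], which conventionally holds the program name
    return args[1 .. $];
}

void main(string[] args) {
    const others = userArguments(args);

    writefln("Number of user arguments: %s", others.length);

    foreach (arg; others) {
        writeln(arg);
    }
}
```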

It is completely up to the program how it makes use of the arguments. The following program prints its two arguments in reverse order:

import std.stdio;

int main(string[] args) {
    if (args.length != 3) {
        stderr.writefln("ERROR! Correct usage:\n" ~
                        "  %s word1 word2", args[0]);
        return 1;
    }

    writeln(args[2], ' ', args[1]);

    return 0;
}

The program also shows its correct usage if you don't enter exactly two words:

# ./deneme
ERROR! Correct usage:
  ./deneme word1 word2
# ./deneme world hello
hello world

Command line options and the `std.getopt` module

That is all there is to know about the parameters and the return value of main(). However, parsing the arguments is a repetitive task. The std.getopt module is designed to help with parsing the command line options of programs.

Some parameters like "world" and "hello" above are purely data for the program to use. Other kinds of parameters are called command line options, and are used to change the behaviors of programs. An example of a command line option is the -l option that has been passed to ls above.

Command line options make programs more useful by removing the need for a human user to interact with the program to have it behave in a certain way. With them, programs can be started from scripts and have their behavior specified up front.

Although the syntax and meanings of the command line arguments of every program are specific to that program, their format is somewhat standard. For example, in the GNU convention that is common on POSIX systems, long command line options start with -- followed by the name of the option, and values come after = characters:

# ./deneme --an-option=17

The std.getopt module simplifies parsing such options. It has more capabilities than what is covered in this section.

Let's design a program that prints random numbers. Let's take the minimum, maximum, and total number of these numbers as program arguments. Let's require the following syntax to get these values from the command line:

# ./deneme --count=7 --minimum=10 --maximum=15

The getopt() function parses and assigns those values to variables. Similarly to readf(), the addresses of variables must be specified by the & operator:

import std.stdio;
import std.getopt;
import std.random;

void main(string[] args) {
    int count;
    int minimum;
    int maximum;

    getopt(args,
           "count", &count,
           "minimum", &minimum,
           "maximum", &maximum);

    foreach (i; 0 .. count) {
        write(uniform(minimum, maximum + 1), ' ');
    }

    writeln();
}

# ./deneme --count=7 --minimum=10 --maximum=15
11 11 13 11 14 15 10

Many command line options of most programs have a shorter syntax as well. For example, -c may have the same meaning as --count. Such alternative syntax for each option is specified in getopt() after a | character. There may be more than one shortcut for each option:

    getopt(args,
           "count|c", &count,
           "minimum|n", &minimum,
           "maximum|x", &maximum);

It is common to use a single dash for the short versions and the = character is usually either omitted or substituted by a space:

# ./deneme -c7 -n10 -x15
11 13 10 15 14 15 14
# ./deneme -c 7 -n 10 -x 15  
11 13 10 15 14 15 14
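As a hedged sketch of one more std.getopt capability, the program below adds a help text for each option and checks the helpWanted member of the value that getopt() returns, so that --help can print a usage summary through defaultGetoptPrinter(). The Settings struct, the parseSettings() function, the help texts, and the default values are assumptions made for this example:

```d
import std.getopt;
import std.stdio;

// Hypothetical container for the three values used in this section.
// The default values are arbitrary choices for this sketch.
struct Settings {
    int count = 5;
    int minimum = 0;
    int maximum = 10;
    bool helpWanted;
}

// Hypothetical helper: parses the options into a Settings object.
Settings parseSettings(string[] args) {
    Settings settings;

    auto result = getopt(args,
        "count|c", "How many numbers to print", &settings.count,
        "minimum|n", "The smallest value", &settings.minimum,
        "maximum|x", "The largest value", &settings.maximum);

    if (result.helpWanted) {
        // Prints the option names together with their help texts
        defaultGetoptPrinter("Usage: deneme [options]", result.options);
        settings.helpWanted = true;
    }

    return settings;
}

int main(string[] args) {
    try {
        const settings = parseSettings(args);

        if (!settings.helpWanted) {
            writefln("count: %s, minimum: %s, maximum: %s",
                     settings.count, settings.minimum, settings.maximum);
        }

    } catch (Exception exc) {
        // Unknown options and values that cannot be converted end up here
        stderr.writefln("ERROR: %s", exc.msg);
        return 1;
    }

    return 0;
}
```

Catching the general Exception type is a simplification here; a real program may prefer to handle option errors and conversion errors separately.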

getopt() converts the arguments from string to the type of each variable. For example, since count above is an int, getopt() converts the value specified for the --count argument to an int. When needed, such conversions may also be performed explicitly by to.

So far we have used std.conv.to only when converting to string. to can, in fact, convert from any type to any type, as long as that conversion is possible. For example, the following program takes advantage of to when converting its argument to size_t:

import std.stdio;
import std.conv;

void main(string[] args) {
    // The default count is 10
    size_t count = 10;

    if (args.length > 1) {
        // There is an argument
        count = to!size_t(args[1]);
    }

    foreach (i; 0 .. count) {
        write(i * 2, ' ');
    }

    writeln();
}

The program produces 10 numbers when no argument is specified, and as many numbers as requested otherwise:

# ./deneme
0 2 4 6 8 10 12 14 16 18
# ./deneme 3
0 2 4
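When the argument cannot be converted, to throws a ConvException. The following sketch shows one possible way of handling that case by falling back to a default value; parseCount() and the wording of the warning are hypothetical and only serve this illustration:

```d
import std.conv;
import std.stdio;

// Hypothetical helper: converts text to size_t, returning the
// fallback value when the conversion is not possible.
size_t parseCount(string text, size_t fallback) {
    try {
        return to!size_t(text);

    } catch (ConvException exc) {
        stderr.writefln("WARNING: '%s' is not a valid count; using %s.",
                        text, fallback);
        return fallback;
    }
}

void main(string[] args) {
    // The default count is 10
    size_t count = 10;

    if (args.length > 1) {
        count = parseCount(args[1], count);
    }

    foreach (i; 0 .. count) {
        write(i * 2, ' ');
    }

    writeln();
}
```

Whether to fall back silently, warn, or terminate with a nonzero return value is a design decision that depends on the program.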

Environment variables

The environment that a program is started in generally provides some variables that the program can make use of. The environment variables can be accessed through the associative array interface of std.process.environment. For example, the following program prints the PATH environment variable:

import std.stdio;
import std.process;

void main() {
    writeln(environment["PATH"]);
}

The output:

# ./deneme
/usr/local/bin:/usr/bin

std.process.environment provides access to the environment variables through the associative array syntax. However, environment itself is not an associative array. When needed, the environment variables can be converted to an associative array by using toAA():

    string[string] envVars = environment.toAA();
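Indexing environment with the name of a variable that is not set throws an exception. As a sketch of a more forgiving alternative, the get() function of environment takes a default value to return in that case. DENEME_GREETING below is a made-up variable name and greetingFromEnvironment() is a hypothetical helper:

```d
import std.process : environment;
import std.stdio;

// Hypothetical helper: reads a made-up variable and falls back
// to "Hello" when that variable is not set.
string greetingFromEnvironment() {
    return environment.get("DENEME_GREETING", "Hello");
}

void main() {
    writeln(greetingFromEnvironment(), ", world!");
}
```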

Starting other programs

Programs may start other programs and become the environment for those programs. A function that enables this is executeShell, from the std.process module.

executeShell executes its parameter as if the command were typed at the terminal. It then returns both the return code and the output of that command as a tuple. Tuples are array-like structures, which we will see later in the Tuples chapter:

import std.stdio;
import std.process;

void main() {
    const result = executeShell("ls -l deneme");
    const returnCode = result[0];
    const output = result[1];

    writefln("ls returned %s.", returnCode);
    writefln("Its output:\n%s", output);
}

The output:

# ./deneme
ls returned 0.
Its output:
-rwxrwxr-x. 1 acehreli acehreli 1359178 Apr 21 15:01 deneme
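The members of the returned tuple can also be accessed by name, as status and output. The sketch below uses those names to check whether a command succeeded; commandSucceeds() is a hypothetical helper and the commands are arbitrary examples:

```d
import std.process : executeShell;
import std.stdio;

// Hypothetical helper: reports whether a shell command ended
// with the conventional success status of zero.
bool commandSucceeds(string command) {
    return executeShell(command).status == 0;
}

void main() {
    const result = executeShell("echo hello");

    // The same values as result[0] and result[1]
    writefln("Return code: %s", result.status);
    writefln("Output: %s", result.output);

    writefln("Did it succeed? %s", commandSucceeds("echo hello"));
}
```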

Summary

Even when it is defined with a return type of void, main() automatically returns zero for success and nonzero for failure.
stderr is suitable to print error messages.
main() can take parameters as string[].
std.getopt helps with parsing command line options.
std.process helps with accessing environment variables and starting other programs.

Exercises

Write a calculator program that takes an operator and two operands as command line arguments. Have the program support the following usage:
```
# ./deneme 3.4 x 7.8
26.52
```
Note: Because the * character has a special meaning on most terminals (more accurately, on most shells), I have used x instead. You may still use * as long as it is escaped as \*.
Write a program that asks the user which program to start, starts that program, and prints its output.
