Programming in D

`Object`

Classes that do not explicitly inherit any class, automatically inherit the Object class.

By that definition, the topmost class in any class hierarchy inherits Object:

// ": Object" need not be written; it is implied when omitted
class MusicalInstrument : Object  {
    // ...
}

// Inherits Object indirectly
class StringInstrument : MusicalInstrument {
    // ...
}

Since the topmost class inherits Object, every class indirectly inherits Object as well. In that sense, every class "is an" Object.
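As a minimal sketch of what "is an" Object means in practice, the following program reuses the two classes above (with empty bodies, which is an assumption made for brevity) and checks that both convert to Object:

```d
class MusicalInstrument {
}

class StringInstrument : MusicalInstrument {
}

void main() {
    // Both classes convert implicitly to Object; this is
    // checked at compile time
    static assert(is(MusicalInstrument : Object));
    static assert(is(StringInstrument : Object));

    // A class object can be handled through an Object variable
    Object o = new StringInstrument();

    // The original type can be recovered by a cast
    assert(cast(StringInstrument)o !is null);
}
```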

Every class inherits the following member functions of Object:

toString: The string representation of the object.
opEquals: Equality comparison with another object.
opCmp: Sort order comparison with another object.
toHash: Associative array hash value.

The last three of these functions emphasize the values of objects. They also make a class eligible for being the key type of associative arrays.

Because these functions are inherited, their redefinitions for the subclasses require the override keyword.

Note: Object defines other members as well. This chapter will include only those four member functions of it.

`typeid` and `TypeInfo`

Object is defined in the object module (which is not a part of the std package). The object module defines TypeInfo as well, a class that provides information about types. Every type has a distinct TypeInfo object. The typeid expression provides access to the TypeInfo object that is associated with a particular type. As we will see below, the TypeInfo class can be used for determining whether two types are the same, as well as for accessing special functions of a type (toHash, postblit, etc.).

TypeInfo is always about the actual run-time type. For example, although both Violin and Guitar below inherit StringInstrument directly and MusicalInstrument indirectly, the TypeInfo instances of Violin and Guitar are different. They are exactly for Violin and Guitar types, respectively:

class MusicalInstrument {
}

class StringInstrument : MusicalInstrument {
}

class Violin : StringInstrument {
}

class Guitar : StringInstrument {
}

void main() {
    TypeInfo v = typeid(Violin);
    TypeInfo g = typeid(Guitar);
    assert(v != g);    // ← the two types are not the same
}

The typeid expressions above are being used with types like Violin itself. typeid can take an expression as well, in which case it returns the TypeInfo object for the run-time type of that expression. For example, the following function takes two parameters of different but related types:

import std.stdio;

// ...

void foo(MusicalInstrument m, StringInstrument s) {
    const isSame = (typeid(m) == typeid(s));

    writefln("The types of the arguments are %s.",
             isSame ? "the same" : "different");
}

// ...

    auto a = new Violin();
    auto b = new Violin();
    foo(a, b);

Since both arguments to foo() are Violin objects for that particular call, foo() determines that their types are the same:

The types of the arguments are the same.

Unlike .sizeof and typeof, which never execute their expressions, typeid always executes the expression that it receives:

import std.stdio;

int foo(string when) {
    writefln("Called during '%s'.", when);
    return 0;
}

void main() {
    const s = foo("sizeof").sizeof;     // foo() is not called
    alias T = typeof(foo("typeof"));    // foo() is not called
    auto ti = typeid(foo("typeid"));    // foo() is called
}

The output indicates that only the expression of typeid is executed:

Called during 'typeid'.

The reason for this difference is that the actual run-time types of expressions may not be known until those expressions are executed. For example, the exact run-time type of the object returned by the following function would be either Violin or Guitar depending on the value of the function argument i:

MusicalInstrument foo(int i) {
    return (i % 2) ? new Violin() : new Guitar();
}
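The following sketch puts that observation to the test. The function name make is hypothetical, and the function is written with if and else instead of the ternary operator; the classes have empty bodies for brevity:

```d
class MusicalInstrument {
}

class Violin : MusicalInstrument {
}

class Guitar : MusicalInstrument {
}

// Same idea as foo() above: the type of the returned object
// depends on the argument
MusicalInstrument make(int i) {
    if (i % 2) {
        return new Violin();
    }

    return new Guitar();
}

void main() {
    // The declared return type is MusicalInstrument for both
    // calls; typeid reports the type of the actual object
    assert(typeid(make(1)) == typeid(Violin));
    assert(typeid(make(2)) == typeid(Guitar));
    assert(typeid(make(1)) != typeid(MusicalInstrument));
}
```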

There are various subclasses of TypeInfo for various kinds of types like arrays, structs, classes, etc. Of these, TypeInfo_Class can be particularly useful. For example, the name of the run-time type of an object can be obtained through its TypeInfo_Class.name property as a string. You can access the TypeInfo_Class instance of an object by its .classinfo property:

    TypeInfo_Class info = a.classinfo;
    string runtimeTypeName = info.name;
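A self-contained version of that snippet might look like the following sketch. Because the name includes the module name, which depends on the name of the source file, the check below looks only at the end of the string:

```d
import std.algorithm.searching : endsWith;

class MusicalInstrument {
}

class Violin : MusicalInstrument {
}

void main() {
    // The variable is declared as the superclass type but the
    // object is a Violin
    MusicalInstrument a = new Violin();

    TypeInfo_Class info = a.classinfo;
    string runtimeTypeName = info.name;

    // The name is of the form <module name>.Violin
    assert(runtimeTypeName.endsWith("Violin"));

    // typeid of a class object provides the same information
    assert(info == typeid(a));
}
```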

`toString`

As with structs, toString enables using class objects as strings:

    auto clock = new Clock(20, 30, 0);
    writeln(clock);         // Calls clock.toString()

The inherited toString() is usually not useful; it produces just the name of the type:

deneme.Clock

The part before the name of the type is the name of the module. The output above indicates that Clock has been defined in the deneme module.

As we have seen in the previous chapter, this function is almost always overridden to produce a more meaningful string representation:

import std.string;

class Clock {
    override string toString() const {
        return format("%02s:%02s:%02s", hour, minute, second);
    }

    // ...
}

class AlarmClock : Clock {
    override string toString() const {
        return format("%s ♫%02s:%02s", super.toString(),
                      alarmHour, alarmMinute);
    }

    // ...
}

// ...

    auto bedSideClock = new AlarmClock(20, 30, 0, 7, 0);
    writeln(bedSideClock);

The output:

20:30:00 ♫07:00
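The excerpts above leave out the members and the constructors. The following complete sketch fills them in; the constructor parameter lists are assumptions that match the calls above, and %02d is used in place of %02s to make the integer formatting explicit. It also shows that calling toString() through an Object variable reaches the most derived definition:

```d
import std.format : format;

class Clock {
    int hour;
    int minute;
    int second;

    this(int hour, int minute, int second) {
        this.hour = hour;
        this.minute = minute;
        this.second = second;
    }

    override string toString() const {
        return format("%02d:%02d:%02d", hour, minute, second);
    }
}

class AlarmClock : Clock {
    int alarmHour;
    int alarmMinute;

    this(int hour, int minute, int second,
         int alarmHour, int alarmMinute) {
        super(hour, minute, second);
        this.alarmHour = alarmHour;
        this.alarmMinute = alarmMinute;
    }

    override string toString() const {
        return format("%s ♫%02d:%02d", super.toString(),
                      alarmHour, alarmMinute);
    }
}

void main() {
    Object o = new AlarmClock(20, 30, 0, 7, 0);

    // AlarmClock.toString is called even through Object
    assert(o.toString() == "20:30:00 ♫07:00");
}
```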

`opEquals`

As we have seen in the Operator Overloading chapter, this member function is about the behavior of the == operator (and the != operator indirectly). The return value of the operator must be true if the objects are considered to be equal and false otherwise.

Warning: The definition of this function must be consistent with opCmp(); for two objects that opEquals() returns true, opCmp() must return zero.

Unlike with structs, the compiler does not call a.opEquals(b) right away when it sees the expression a == b. When two class objects are compared by the == operator, a four-step algorithm is executed:

bool opEquals(Object a, Object b) {
    if (a is b) return true;                          // (1)
    if (a is null || b is null) return false;         // (2)
    if (typeid(a) == typeid(b)) return a.opEquals(b); // (3)
    return a.opEquals(b) && b.opEquals(a);            // (4)
}

1. If the two variables provide access to the same object (or they are both null), then they are equal.
2. Following from the previous check, if only one is null then they are not equal.
3. If both of the objects are of the same type, then a.opEquals(b) is called to determine the equality.
4. Otherwise, for the two objects to be considered equal, opEquals must have been defined for both of their types and a.opEquals(b) and b.opEquals(a) must agree that the objects are equal.
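The first three steps can be observed with a small program. The Clock below is a simplified stand-in with a single member, which is an assumption made to keep the sketch short. Note that comparing null variables with == is safe because the member function is never called on a null variable:

```d
class Clock {
    int hour;

    this(int hour) {
        this.hour = hour;
    }

    override bool opEquals(Object o) const {
        auto rhs = cast(const Clock)o;
        return (rhs !is null) && (hour == rhs.hour);
    }
}

void main() {
    Clock none = null;
    Clock alsoNone = null;
    auto six = new Clock(6);

    assert(none == alsoNone);       // (1) both are null
    assert(six == six);             // (1) the same object
    assert(none != six);            // (2) only one is null
    assert(six != none);            // (2) only one is null
    assert(six == new Clock(6));    // (3) same type; the member
                                    //     function decides
}
```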

Accordingly, if opEquals() is not provided for a class type, then the values of the objects are not considered; rather, equality is determined by checking whether the two class variables provide access to the same object:

    auto variable0 = new Clock(6, 7, 8);
    auto variable1 = new Clock(6, 7, 8);

    assert(variable0 != variable1); // They are not equal
                                    // because the objects are
                                    // different

Even though the two objects are constructed by the same arguments above, the variables are not equal because they are not associated with the same object.

On the other hand, because the following two variables provide access to the same object, they are equal:

    auto partner0 = new Clock(9, 10, 11);
    auto partner1 = partner0;

    assert(partner0 == partner1);   // They are equal because
                                    // the object is the same

Sometimes it makes more sense to compare objects by their values instead of their identities. For example, it is reasonable to expect variable0 and variable1 above to compare equal because their values are the same.

Unlike with structs, the type of the parameter of opEquals for classes is Object:

class Clock {
    override bool opEquals(Object o) const {
        // ...
    }

    // ...
}

As you will see below, the parameter is almost never used directly. For that reason, it should be acceptable to name it simply o. Most of the time the first thing to do with that parameter is to use it in a type conversion.

The parameter of opEquals is the object that appears on the right-hand side of the == operator:

    variable0 == variable1;    // o represents variable1

Since the purpose of opEquals() is to compare two objects of this class type, the first thing to do is to convert o to a variable of the same type of this class. Since it would not be appropriate to modify the right-hand side object in an equality comparison, it is also proper to convert the type as const:

    override bool opEquals(Object o) const {
        auto rhs = cast(const Clock)o;

        // ...
    }

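As you may remember, rhs is a common abbreviation for right-hand side. std.conv.to can be used for the conversion as well. Note that, unlike cast, to does not produce null for a non-null object of an incompatible type; it throws a ConvException instead: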

import std.conv;
// ...
        auto rhs = to!(const Clock)(o);

If the original object on the right-hand side can be converted to Clock, then rhs becomes a non-null class variable. Otherwise, the cast expression sets rhs to null, indicating that the objects are not of the same type.
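The two conversion methods differ in how they report an incompatible type. The following sketch compares them; Clock and Student are empty placeholder classes here. A failed cast produces null, whereas std.conv.to throws ConvException when a non-null object cannot be converted:

```d
import std.conv : to, ConvException;
import std.exception : assertThrown;

class Clock {
}

class Student {
}

void main() {
    Object clock = new Clock();
    Object student = new Student();

    // Compatible type: both methods provide access to the
    // same object
    assert(cast(Clock)clock !is null);
    assert(to!Clock(clock) is clock);

    // Incompatible type: cast produces null...
    assert(cast(Clock)student is null);

    // ... but to throws
    assertThrown!ConvException(to!Clock(student));
}
```

For that reason, the null check in opEquals is meaningful only when the conversion is made by cast.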

Depending on the design of a program, it may make sense to compare objects of two incompatible types. I will assume here that for the comparison to be valid, rhs must not be null; so, the first logical expression in the following return statement checks that it is not null. Otherwise, it would be an error to try to access the members of rhs:

class Clock {
    int hour;
    int minute;
    int second;

    override bool opEquals(Object o) const {
        auto rhs = cast(const Clock)o;

        return (rhs &&
                (hour == rhs.hour) &&
                (minute == rhs.minute) &&
                (second == rhs.second));
    }

    // ...
}

With that definition, Clock objects can now be compared by their values:

    auto variable0 = new Clock(6, 7, 8);
    auto variable1 = new Clock(6, 7, 8);

    assert(variable0 == variable1); // Now they are equal
                                    // because their values
                                    // are equal

When defining opEquals it is important to remember the members of the superclass. For example, when comparing objects of AlarmClock it would make sense to also consider the inherited members:

class AlarmClock : Clock {
    int alarmHour;
    int alarmMinute;

    override bool opEquals(Object o) const {
        auto rhs = cast(const AlarmClock)o;

        return (rhs &&
                (alarmHour == rhs.alarmHour) &&
                (alarmMinute == rhs.alarmMinute) &&
                super.opEquals(o));
    }

    // ...
}

That expression could be written as super == o as well. However, that would initiate the four-step algorithm again and as a result, the code might be a little slower.
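The following sketch shows how these definitions behave when a Clock is compared with an AlarmClock, which is the fourth step of the algorithm. The classes are reduced to one member each, which is an assumption made for brevity:

```d
class Clock {
    int hour;

    this(int hour) {
        this.hour = hour;
    }

    override bool opEquals(Object o) const {
        auto rhs = cast(const Clock)o;
        return (rhs !is null) && (hour == rhs.hour);
    }
}

class AlarmClock : Clock {
    int alarmHour;

    this(int hour, int alarmHour) {
        super(hour);
        this.alarmHour = alarmHour;
    }

    override bool opEquals(Object o) const {
        auto rhs = cast(const AlarmClock)o;
        return (rhs !is null) &&
               (alarmHour == rhs.alarmHour) &&
               super.opEquals(o);
    }
}

void main() {
    Clock plain = new Clock(8);
    Clock alarm = new AlarmClock(8, 6);

    // Clock.opEquals alone would accept the pair...
    assert(plain.opEquals(alarm));

    // ... but AlarmClock.opEquals rejects it
    assert(!alarm.opEquals(plain));

    // Both must agree, so == reports inequality either way
    assert(plain != alarm);
    assert(alarm != plain);

    // Same type: the inherited member is considered as well
    assert(alarm == new AlarmClock(8, 6));
    assert(alarm != new AlarmClock(9, 6));
}
```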

`opCmp`

This operator is used when sorting class objects. opCmp is the function that gets called behind the scenes for the <, <=, >, and >= operators.

This operator must return a negative value when the left-hand object is before, a positive value when the left-hand object is after, and zero when both objects have the same sorting order.

Warning: The definition of this function must be consistent with opEquals(); for two objects that opEquals() returns true, opCmp() must return zero.

Unlike toString and opEquals, there is no useful default implementation of this function in Object: the inherited opCmp merely throws an exception. If a class does not provide its own implementation, comparing its objects for sort order causes that exception to be thrown:

    auto variable0 = new Clock(6, 7, 8);
    auto variable1 = new Clock(6, 7, 8);

    assert(variable0 <= variable1);    // ← Causes exception

object.Exception: need opCmp for class deneme.Clock
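That behavior can be verified without terminating the program by using std.exception.assertThrown. This is a minimal sketch; the Clock below has a single member and deliberately no opCmp:

```d
import std.exception : assertThrown;

class Clock {
    int hour;

    this(int hour) {
        this.hour = hour;
    }

    // opCmp is intentionally not defined
}

void main() {
    auto variable0 = new Clock(6);
    auto variable1 = new Clock(7);

    // The comparison compiles but throws at run time
    assertThrown!Exception(variable0 < variable1);
}
```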

It is up to the design of the program what happens when the left-hand and right-hand objects are of different types. One way is to take advantage of the order of types that is maintained by the compiler automatically. This is achieved by calling the opCmp function on the typeid values of the two types:

class Clock {
    int hour;
    int minute;
    int second;

    override int opCmp(Object o) const {
        /* Taking advantage of the automatically-maintained
         * order of the types. */
        if (typeid(this) != typeid(o)) {
            return typeid(this).opCmp(typeid(o));
        }

        auto rhs = cast(const Clock)o;
        /* No need to check whether rhs is null, because it is
         * known at this line that o has the same type as this
         * object. */

        if (hour != rhs.hour) {
            return hour - rhs.hour;

        } else if (minute != rhs.minute) {
            return minute - rhs.minute;

        } else {
            return second - rhs.second;
        }
    }

    // ...
}

The definition above first checks whether the types of the two objects are the same. If not, it uses the ordering of the types themselves. Otherwise, it compares the objects by the values of their hour, minute, and second members.

A chain of ternary operators may also be used:

    override int opCmp(Object o) const {
        if (typeid(this) != typeid(o)) {
            return typeid(this).opCmp(typeid(o));
        }

        auto rhs = cast(const Clock)o;

        return (hour != rhs.hour
                ? hour - rhs.hour
                : (minute != rhs.minute
                   ? minute - rhs.minute
                   : second - rhs.second));
    }

When they matter for the sort order, the members of the superclass must be compared as well. The following AlarmClock.opCmp calls Clock.opCmp first:

class AlarmClock : Clock {
    override int opCmp(Object o) const {
        auto rhs = cast(const AlarmClock)o;

        const int superResult = super.opCmp(o);

        if (superResult != 0) {
            return superResult;

        } else if (alarmHour != rhs.alarmHour) {
            return alarmHour - rhs.alarmHour;

        } else {
            return alarmMinute - rhs.alarmMinute;
        }
    }

    // ...
}

Above, if the superclass comparison returns a nonzero value then that result is used because the sort order of the objects is already determined by that value.

AlarmClock objects can now be compared for their sort orders:

    auto ac0 = new AlarmClock(8, 0, 0, 6, 30);
    auto ac1 = new AlarmClock(8, 0, 0, 6, 31);

    assert(ac0 < ac1);

opCmp is used by other language features and libraries as well. For example, the sort() function takes advantage of opCmp when sorting elements.
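As a sketch of that use, the following program sorts an array of Clock objects with std.algorithm.sort. The constructor is an assumption; opCmp is the Clock definition from above written with early returns:

```d
import std.algorithm.sorting : sort;

class Clock {
    int hour;
    int minute;
    int second;

    this(int hour, int minute, int second) {
        this.hour = hour;
        this.minute = minute;
        this.second = second;
    }

    override int opCmp(Object o) const {
        if (typeid(this) != typeid(o)) {
            return typeid(this).opCmp(typeid(o));
        }

        auto rhs = cast(const Clock)o;

        if (hour != rhs.hour) {
            return hour - rhs.hour;
        }

        if (minute != rhs.minute) {
            return minute - rhs.minute;
        }

        return second - rhs.second;
    }
}

void main() {
    auto clocks = [ new Clock(9, 0, 0),
                    new Clock(6, 30, 0),
                    new Clock(6, 7, 8) ];

    // sort() compares the elements with <, which calls opCmp
    sort(clocks);

    assert(clocks[0].minute == 7);     // 06:07:08
    assert(clocks[1].minute == 30);    // 06:30:00
    assert(clocks[2].hour == 9);       // 09:00:00
}
```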

`opCmp` for string members

When some of the members are strings, they can be compared explicitly to return a negative, positive, or zero value:

import std.exception;

class Student {
    string name;

    override int opCmp(Object o) const {
        auto rhs = cast(Student)o;
        enforce(rhs);

        if (name < rhs.name) {
            return -1;

        } else if (name > rhs.name) {
            return 1;

        } else {
            return 0;
        }
    }

    // ...
}

Instead, the existing std.algorithm.cmp function can be used, which happens to be faster as well:

import std.algorithm;
import std.exception;

class Student {
    string name;

    override int opCmp(Object o) const {
        auto rhs = cast(Student)o;
        enforce(rhs);

        return cmp(name, rhs.name);
    }

    // ...
}

Also note that Student does not support comparisons with incompatible types: enforce throws an exception unless the conversion from Object to Student succeeds.
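Both points can be tried with the following sketch. The constructor and the error message are assumptions, and the parameter is converted as const for consistency with the earlier examples:

```d
import std.algorithm.comparison : cmp;
import std.exception : enforce, assertThrown;

class Student {
    string name;

    this(string name) {
        this.name = name;
    }

    override int opCmp(Object o) const {
        auto rhs = cast(const Student)o;
        enforce(rhs !is null,
                "Student can be compared only with Student");

        return cmp(name, rhs.name);
    }
}

void main() {
    auto ali = new Student("Ali");
    auto zeynep = new Student("Zeynep");

    assert(ali < zeynep);
    assert(zeynep > ali);
    assert(ali.opCmp(new Student("Ali")) == 0);

    // An object of an incompatible type is rejected
    assertThrown(ali.opCmp(new Object()));
}
```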

`toHash`

This function allows objects of a class type to be used as associative array keys. It does not affect the cases where the type is used as associative array values. If this function is defined, opEquals must be defined as well.

Hash table indexes

Associative arrays are a hash table implementation. A hash table is a very fast data structure when it comes to searching elements in the table. (Note: Like most other things in software, this speed comes at a cost: Hash tables must keep elements in an unordered way, and they may be taking up space that is more than exactly necessary.)

The high speed of hash tables comes from the fact that they first produce integer values for keys. These integers are called hash values. The hash values are then used for indexing into an internal array that is maintained by the table.

A benefit of this method is that any type that can produce integer values for its objects can be used as the key type of associative arrays. Ideally, objects with different values produce different integers. toHash is the function that returns the hash value for an object.
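How a hash value turns into an index can be sketched with the textbook scheme of taking the remainder by the length of the internal array. This is an illustration only; bucketIndex is a hypothetical helper, and the scheme that D's associative arrays actually use is not described here:

```d
// Hypothetical helper: maps a hash value to a position in an
// internal array of bucketCount elements
size_t bucketIndex(size_t hashValue, size_t bucketCount) {
    return hashValue % bucketCount;
}

void main() {
    enum bucketCount = 8;

    assert(bucketIndex(45, bucketCount) == 5);
    assert(bucketIndex(16, bucketCount) == 0);

    // Different hash values may map to the same index; the
    // table must then tell the keys apart by comparing them
    assert(bucketIndex(13, bucketCount) == 5);
}
```

This is also the reason why opEquals must accompany toHash: keys that end up at the same index are distinguished by equality comparison.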

Even Clock objects can be used as associative array key values:

    string[Clock] timeTags;
    timeTags[new Clock(12, 0, 0)] = "Noon";

The default definition of toHash that is inherited from Object produces different hash values for different objects without regard to their values. This is similar to how the default behavior of opEquals considers different objects as being not equal.

The code above compiles and runs even when there is no special definition of toHash for Clock. However, its default behavior is almost never what is desired. To see that default behavior, let's try to access an element by an object that is different from the one that has been used when inserting the element. Although the new Clock object below has the same value as the Clock object that has been used when inserting into the associative array above, the value cannot be found:

    if (new Clock(12, 0, 0) in timeTags) {
        writeln("Exists");

    } else {
        writeln("Missing");
    }

According to the in operator, there is no element in the table that corresponds to the value Clock(12, 0, 0):

Missing

The reason for this surprising behavior is that the key object that has been used when inserting the element is not the same as the key object that has been used when accessing the element.

Selecting members for `toHash`

Although the hash value is calculated from the members of an object, not every member is suitable for this task.

The candidates are the members that distinguish objects from each other. For example, the members name and lastName of a Student class would be suitable if those members can be used for identifying objects of that type.

On the other hand, a grades array of a Student class would not be suitable, both because many objects may have the same array and because the grades array is likely to change over time.

Calculating hash values

The choice of hash values has a direct effect on the performance of associative arrays. Furthermore, a hash calculation that is effective on one type of data may not be as effective on another type of data. As hash algorithms are beyond the scope of this book, I will give just one guideline here: In general, it is better to produce different hash values for objects that are considered to have different values. However, it is not an error if two objects with different values produce the same hash value; it is merely undesirable for performance reasons.

It is conceivable that all of the members of Clock are significant to distinguish its objects from each other. For that reason, the hash values can be calculated from the values of its three members. The number of seconds since midnight would be effective hash values for objects that represent different points in time:

class Clock {
    int hour;
    int minute;
    int second;

    override size_t toHash() const {
        /* Because there are 3600 seconds in an hour and 60
         * seconds in a minute: */
        return (3600 * hour) + (60 * minute) + second;
    }

    // ...
}

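Whenever Clock is used as the key type of associative arrays, that special definition of toHash would be used. As a result, even though the two key objects of Clock(12, 0, 0) above are distinct, they would now produce the same hash value. Because the opEquals function that was defined earlier in this chapter also considers the two objects equal, the element can be found.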

The new output:

Exists
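For reference, the following sketch combines toHash with the opEquals definition from earlier in the chapter in one complete program. The constructor is an assumption; both member functions are needed for the lookup to succeed:

```d
class Clock {
    int hour;
    int minute;
    int second;

    this(int hour, int minute, int second) {
        this.hour = hour;
        this.minute = minute;
        this.second = second;
    }

    override bool opEquals(Object o) const {
        auto rhs = cast(const Clock)o;

        return (rhs !is null) &&
               (hour == rhs.hour) &&
               (minute == rhs.minute) &&
               (second == rhs.second);
    }

    override size_t toHash() const {
        return (3600 * hour) + (60 * minute) + second;
    }
}

void main() {
    string[Clock] timeTags;
    timeTags[new Clock(12, 0, 0)] = "Noon";

    // A distinct object with the same value finds the element
    auto found = new Clock(12, 0, 0) in timeTags;
    assert(found !is null);
    assert(*found == "Noon");

    // An object with a different value does not
    assert(new Clock(12, 0, 1) !in timeTags);
}
```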

Similar to the other member functions, the superclass may need to be considered as well. For example, AlarmClock.toHash can take advantage of Clock.toHash during its hash calculation:

class AlarmClock : Clock {
    int alarmHour;
    int alarmMinute;

    override size_t toHash() const {
        return super.toHash() + alarmHour + alarmMinute;
    }

    // ...
}

Note: Take the calculation above just as an example. In general, adding integer values is not an effective way of generating hash values.
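To see why plain addition is weak, consider the alarm times 06:30 and 30:06 as member values: their sums are the same. The sketch below contrasts addition with one common general-purpose alternative, multiplying the running value by a small prime before adding the next member. Both function names and the constant 31 are illustrative assumptions, not recommendations from this chapter:

```d
// Plain addition: the order of the values is lost
size_t additive(int alarmHour, int alarmMinute) {
    return alarmHour + alarmMinute;
}

// Multiply-then-add: each value contributes differently
// depending on its position
size_t combined(int alarmHour, int alarmMinute) {
    size_t result = alarmHour;
    result = (result * 31) + alarmMinute;
    return result;
}

void main() {
    // Swapping the values produces the same hash value
    assert(additive(6, 30) == additive(30, 6));

    // The swapped values now produce different hash values
    assert(combined(6, 30) == 216);
    assert(combined(30, 6) == 936);
}
```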

There are existing efficient algorithms for calculating hash values for variables of floating point, array, and struct types. These algorithms are available to the programmer as well.

What needs to be done is to call getHash() on the typeid of each member. The syntax of this method is the same for floating point, array, and struct types.

For example, hash values of a Student type can be calculated from its name member as in the following code:

class Student {
    string name;

    override size_t toHash() const {
        return typeid(name).getHash(&name);
    }

    // ...
}
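The following sketch checks the property that matters for associative arrays: two Student objects whose names have the same characters produce the same hash value, even when the strings are stored at different places in memory. The constructor is an assumption:

```d
class Student {
    string name;

    this(string name) {
        this.name = name;
    }

    override size_t toHash() const {
        return typeid(name).getHash(&name);
    }
}

void main() {
    auto a = new Student("Ali");

    // A separate copy of the same characters
    auto b = new Student("Ali".dup.idup);
    assert(a.name.ptr !is b.name.ptr);

    // The hash value depends on the characters, not on where
    // they are stored
    assert(a.toHash() == b.toHash());
}
```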

Hash values for structs

Since structs are value types, hash values for their objects are calculated automatically by an efficient algorithm. That algorithm takes all of the members of the object into consideration.

When there is a specific reason like needing to exclude certain members from the hash calculation, toHash() can be defined for structs as well. (Structs do not inherit from Object, so the override keyword is not used.)
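Both cases are sketched below. Point relies on the automatic hash calculation, while Labeled is a hypothetical struct that excludes its note member from both the hash calculation and the equality comparison. The attributes on the member functions follow the signatures commonly recommended for struct keys:

```d
struct Point {
    int x;
    int y;
}

struct Labeled {
    int id;
    string note;    // not a part of the identity

    size_t toHash() const @safe pure nothrow {
        return id;
    }

    bool opEquals(const Labeled rhs) const @safe pure nothrow {
        return id == rhs.id;
    }
}

void main() {
    // Automatic: all members are considered
    string[Point] names;
    names[Point(1, 2)] = "corner";
    assert(Point(1, 2) in names);
    assert(Point(2, 1) !in names);

    // Special definition: note is ignored
    int[Labeled] counts;
    counts[Labeled(7, "first")] = 1;
    assert(Labeled(7, "second") in counts);
    assert(Labeled(8, "first") !in counts);
}
```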

Exercises

Start with the following class, which represents colored points:

enum Color { blue, green, red }

class Point {
    int x;
    int y;
    Color color;

    this(int x, int y, Color color) {
        this.x = x;
        this.y = y;
        this.color = color;
    }
}

Implement opEquals for this class in a way that ignores colors. When implemented in that way, the following assert check should pass:

    // Different colors
    auto bluePoint = new Point(1, 2, Color.blue);
    auto greenPoint = new Point(1, 2, Color.green);

    // They are still equal
    assert(bluePoint == greenPoint);

Implement opCmp by considering first x then y. The following assert checks should pass:

    auto redPoint1 = new Point(-1, 10, Color.red);
    auto redPoint2 = new Point(-2, 10, Color.red);
    auto redPoint3 = new Point(-2,  7, Color.red);

    assert(redPoint1 < bluePoint);
    assert(redPoint3 < redPoint2);

    /* Even though blue is before green in enum Color,
     * because color is being ignored, bluePoint must not be
     * before greenPoint. */
    assert(!(bluePoint < greenPoint));

Like the Student class above, you can implement opCmp by excluding incompatible types with the help of enforce.

Consider the following class that combines three Point objects in an array:

class TriangularArea {
    Point[3] points;

    this(Point one, Point two, Point three) {
        points = [ one, two, three ];
    }
}

Implement toHash for that class. Again, the following assert checks should pass:

    /* area1 and area2 are constructed by distinct points that
     * happen to have the same values. (Remember that
     * bluePoint and greenPoint should be considered equal.) */
    auto area1 = new TriangularArea(bluePoint, greenPoint, redPoint1);
    auto area2 = new TriangularArea(greenPoint, bluePoint, redPoint1);

    // The areas should be equal
    assert(area1 == area2);

    // An associative array
    double[TriangularArea] areas;

    // A value is being entered by area1
    areas[area1] = 1.25;

    // The value is being accessed by area2
    assert(area2 in areas);
    assert(areas[area2] == 1.25);

Remember that opEquals must also be defined when toHash is defined.
