Read stuff with me: Stroustrup Notes - Chapter 11

Like most languages, C++ supports a set of operators for its built in types. However, most concepts for which operators are conventionally used are not built in types in C++, so they must be represented as user defined types. For example, if you need complex arithmetic, matrix algebra, logic signals, or character strings in C++, you use classes to represent these notions. Defining operators for such classes sometimes allows a programmer to provide a more conventional and convenient notation for manipulating objects than could be achieved using only the basic functional notation.

The usual precedence rules hold. Many of the most obvious uses of operator overloading are for concrete types. However, the usefulness of user defined operators is not restricted to concrete types. For example, the design of general and abstract interfaces often leads to the use of operators such as ->, [], and ().

Operator Functions
The following operators cannot be defined by a user:
:: (scope resolution),
. (member selection), and
.* (member selection through pointer to member)

They take a name, rather than a value, as their second operand and provide the primary means of referring to members. Allowing them to be overloaded would lead to subtleties.

The name of an operator function is the keyword operator followed by the operator itself; for example, operator<<. An operator function is declared and can be called like any other function. A use of the operator is only a shorthand for an explicit call of the operator function.
complex c = a + b; // shorthand
complex d = a.operator+(b); // explicit call

Binary and Unary Operators
A binary operator can be defined by either a nonstatic member function taking one argument or a nonmember function taking two arguments. For any binary operator @, aa@bb can be interpreted as either aa.operator@(bb) or operator@(aa,bb). If both are defined, overload resolution determines which, if any, interpretation is used.

A unary operator, whether prefix or postfix, can be defined by either a nonstatic member function taking no arguments or a nonmember function taking one argument. For any prefix unary operator @, @aa can be interpreted as either aa.operator@() or operator@(aa). For any postfix unary operator @, aa@ can be interpreted as either aa.operator@(int) or operator@(aa,int). This is explained further under Increment and Decrement. In both cases, if both forms are defined, overload resolution determines which, if any, interpretation is used.

An operator can be declared only for the syntax defined for it in the grammar. For example, a user cannot define a unary % or a ternary +.

Predefined Meanings for Operators
Only a few assumptions are made about the meaning of a user-defined operator. In particular, operator=, operator[], operator(), and operator-> must be nonstatic member functions; this ensures that their first operands will be lvalues.

The meanings of some built-in operators are defined to be equivalent to some combination of other operators on the same arguments. For example, if a is an int, ++a means a+=1, which in turn means a=a+1. Such relations do not hold for user-defined operators unless the user happens to define them that way. For example, a compiler will not generate a definition of Z::operator+=() from the definitions of Z::operator+() and Z::operator=().
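The equivalence between operator notation and an explicit call of the operator function can be sketched as follows. The complex type here is a hypothetical stand-in with assumed member names re and im; it is not the class from the book.

```cpp
#include <cassert>

// Hypothetical minimal complex type; re and im are illustrative names.
struct complex {
    double re, im;
    complex(double r, double i) : re(r), im(i) {}

    // Binary + as a nonstatic member function taking one argument.
    complex operator+(const complex& b) const { return complex(re + b.re, im + b.im); }
};

// Unary - as a nonmember function taking one argument.
complex operator-(const complex& a) { return complex(-a.re, -a.im); }

void operator_call_demo()
{
    complex a(1, 2), b(3, 4);

    complex c = a + b;             // shorthand
    complex d = a.operator+(b);    // explicit call of the member
    assert(c.re == d.re && c.im == d.im);

    complex e = -a;                // shorthand
    complex f = operator-(a);      // explicit call of the nonmember
    assert(e.re == f.re && e.im == f.im);
}
```

Both spellings call the same function, so the assertions hold whenever operator_call_demo() is run.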

Because of historical accident, the operators = (assignment), & (address-of), and , (sequencing) have predefined meanings when applied to class objects. These predefined meanings can be made inaccessible to general users by making them private.

Alternatively, they can be given new meanings by suitable definitions.

Operators and User Defined Types
An operator function must either be a member or take at least one argument of a user defined type (functions redefining the new and delete operators need not). This rule ensures that a user cannot change the meaning of an expression unless the expression contains an object of a user defined type. In particular, it is not possible to define an operator function that operates exclusively on pointers. This ensures that C++ is extensible but not mutable (with the exception of operators =, &, and , for class objects).

An operator function intended to accept a basic type as its first operand cannot be a member function. For example, consider adding a complex variable aa to the integer 2: aa+2 can, with a suitably declared member function, be interpreted as aa.operator+(2), but 2+aa cannot because there is no class int for which to define + to mean 2.operator+(aa). Even if there were, two different member functions would be needed to cope with 2+aa and aa+2. Because the compiler does not know the meaning of a user defined +, it cannot assume that it is commutative and so interpret 2+aa as aa+2. This example is trivially handled using nonmember functions.
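A minimal sketch of the aa+2 versus 2+aa case, assuming a small hypothetical complex type: the member function covers the case where the class object is the left-hand operand, and a nonmember function covers the case where the int comes first.

```cpp
#include <cassert>

// Hypothetical complex type, for illustration only.
struct complex {
    double re, im;
    complex(double r, double i) : re(r), im(i) {}

    // Member version: handles aa+2, because aa is the left-hand operand.
    complex operator+(int n) const { return complex(re + n, im); }
};

// Nonmember version: handles 2+aa, which no member of complex can express.
complex operator+(int n, const complex& a) { return complex(n + a.re, a.im); }

void mixed_operands_demo()
{
    complex aa(1, 1);
    complex x = aa + 2;   // aa.operator+(2)
    complex y = 2 + aa;   // operator+(2, aa)
    assert(x.re == 3 && x.im == 1);
    assert(y.re == 3 && y.im == 1);
}
```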

Operators in Namespaces
An operator is either a member of a class or defined in some namespace (possibly the global namespace).

Operators defined in namespaces can be found based on their operand types just like functions can be found based on their argument types. In particular, cout is in namespace std, so std is considered when looking for a suitable definition of <<. In that way, the compiler finds and uses the << declared in std.

For a binary operator @, x@y where x is of type X and y is of type Y is resolved like this:
[1] If X is a class, look for operator@ as a member of X or of a base of X; and
[2] look for declarations of operator@ in the context surrounding x@y; and
[3] if X is defined in namespace N, look for declarations of operator@ in N; and
[4] if Y is defined in namespace M, look for declarations of operator@ in M.
Declarations for several operator@s may be found, and overload resolution rules are used to find the best match, if any. This lookup mechanism is applied only if the operator has at least one operand of a user-defined type. Therefore, user-defined conversions will be considered. Note that a typedef name is just a synonym and not a user-defined type.

Member and Nonmember Operators
I prefer to minimize the number of functions that directly manipulate the representation of an object. This can be achieved by defining only operators that inherently modify the value of their first argument, such as +=, in the class itself. Operators that simply produce a new value based on the values of their arguments, such as +, are then defined outside the class and use the essential operators in their implementation.

Composite assignment operators such as += and *= tend to be simpler to define than their ‘‘simple’’ counterparts + and *. This surprises most people at first, but it follows from the fact that three objects are involved in a + operation (the two operands and the result), whereas only two objects are involved in a += operation. In the latter case, runtime efficiency is improved by eliminating the need for temporary variables.

Mixed Mode Arithmetic
To cope with
complex d = 2+b;
we need to define operator + to accept operands of different types. In Fortran terminology, we need mixed mode arithmetic. We can achieve that simply by adding appropriate versions of the operators.

Initialization
A constructor taking a single argument specifies a conversion from its argument type to the constructor’s type. A constructor is a prescription for creating a value of a given type. The constructor is used when a value of a type is expected and when such a value can be created by a constructor from the value supplied as an initializer or assigned value. Thus, a constructor requiring a single argument need not be called explicitly. For example:
complex b = 3;

Copying
A default copy constructor simply copies all members. For types where the default copy constructor has the right semantics, I prefer to rely on that default. It is less verbose than anything I can write, and people should understand the default. Also, compilers know about the default and its possible optimization opportunities. Furthermore, writing out the memberwise copy by hand is tedious and error-prone for classes with many data members.

I use a reference argument for the copy constructor because I must. The copy constructor defines what copying means, including what copying an argument means, so writing
complex::complex(complex c) : re(c.re), im(c.im) { } // error
is an error because any call would have involved an infinite recursion.

Of the two initialization styles, = and (), I like the look of the version using = better. It is possible to restrict the set of values accepted by the = style of initialization compared to the () style by making the copy constructor private or by declaring a constructor explicit.

Similar to initialization, assignment of two objects of the same class is by default defined as memberwise assignment. We could explicitly define complex::operator= to do that.

The copy constructor, whether user-defined or compiler-generated, is used not only for the initialization of variables, but also for argument passing, value return, and exception handling. The semantics of these operations is defined to be the semantics of initialization.

Constructors and Conversions
The alternative to providing different versions of a function for each combination of arguments is to rely on conversions. For example, our complex class provides a constructor that converts a double to a complex. Consequently, we could simply declare only one version of the equality operator for complex.

There can be reasons for preferring to define separate functions. For example, in some cases the conversion can impose overheads, and in other cases, a simpler algorithm can be used for specific argument types. Where such issues are not significant, relying on conversions and providing only the most general variant of a function, plus possibly a few critical variants, contains the combinatorial explosion of variants that can arise from mixed mode arithmetic.

Where several variants of a function or an operator exist, the compiler must pick ‘‘the right’’ variant based on the argument types and the available (standard and user-defined) conversions. Unless a best match exists, an expression is ambiguous and is an error.

An object constructed by explicit or implicit use of a constructor is automatic and will be destroyed at the first opportunity.

No implicit user-defined conversions are applied to the left-hand side of a . (or a ->). This is the case even when the . is implicit. Thus, you can express the notion that an operator requires an lvalue as its left-hand operand by making that operator a member.
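The ideas above (+= as a member, + as a nonmember built on it, and a single == that relies on the converting constructor) can be sketched together. This is an illustration under assumed names (real(), imag(), re, im), not the book's complex class.

```cpp
#include <cassert>

class complex {
    double re, im;
public:
    // Also serves as the conversion from double to complex.
    complex(double r = 0, double i = 0) : re(r), im(i) {}
    double real() const { return re; }
    double imag() const { return im; }

    // Inherently modifies its first operand, so it is defined in the class.
    complex& operator+=(const complex& a) { re += a.re; im += a.im; return *this; }
};

// Produces a new value; implemented with += and never touches re or im directly.
complex operator+(complex a, const complex& b) { return a += b; }

// One general ==; a double operand is converted by the constructor.
bool operator==(const complex& a, const complex& b)
{
    return a.real() == b.real() && a.imag() == b.imag();
}

void conversions_demo()
{
    complex b(1, 2);
    complex d = 2 + b;          // operator+(complex(2), b)
    assert(d == complex(3, 2));
    assert(complex(3) == 3);    // 3 is converted to complex(3)
    assert(3 == complex(3));    // the conversion applies to the left operand too
}
```

Because operator+ and operator== are nonmembers, the implicit conversion is available for either operand; had they been members, 2 + b and 3 == complex(3) would not compile.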

Literals
It is not possible to define literals of a class type in the sense that 1.2 and 12e3 are literals of type double. However, literals of the basic types can often be used instead if class member functions are used to provide an interpretation for them. Constructors taking a single argument provide a general mechanism for this. When constructors are simple and inline, it is quite reasonable to think of constructor invocations with literal arguments as literals. For example, I think of complex(3) as a literal of type complex, even though technically it isn’t.

NOTE: Always provide Additional Member Functions & Helper functions

Conversion Operators
Using a constructor to specify type conversion is convenient but has implications that can be undesirable. A constructor cannot specify
[1] an implicit conversion from a user defined type to a basic type (because the basic types are not classes), or
[2] a conversion from a new class to a previously defined class (without modifying the declaration for the old class).
These problems can be handled by defining a conversion operator for the source type. A member function X::operator T(), where T is a type name, defines a conversion from X to T.

To enable the usual integer operations on Tiny variables, we define the implicit conversion from Tiny to int, Tiny::operator int().
Note that the type being converted to is part of the name of the operator and cannot be repeated as the return value of the conversion function:
Tiny::operator int() const { return v; } // right
int Tiny::operator int() const { return v; } // error
In this respect also, a conversion operator resembles a constructor.
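These notes do not show the definition of Tiny, so the following is only a sketch under an assumption: Tiny holds a small non-negative integer (0 to 63 here) and checks the range on initialization and assignment. The member name v matches the fragment above; assign() and the range are illustrative.

```cpp
#include <cassert>
#include <stdexcept>

// Assumed design: a 6-bit non-negative integer with range checking.
class Tiny {
    char v;
    void assign(int i)
    {
        // Any bit outside the low six bits means the value does not fit.
        if (i & ~077) throw std::out_of_range("Tiny: value out of range");
        v = static_cast<char>(i);
    }
public:
    Tiny(int i) { assign(i); }
    Tiny& operator=(int i) { assign(i); return *this; }
    operator int() const { return v; }   // conversion from Tiny to int
};

void tiny_demo()
{
    Tiny c1 = 2;
    Tiny c2 = 62;
    int i = c1 + c2;     // both operands are converted to int; no range check on int
    assert(i == 64);

    bool caught = false;
    try {
        Tiny c3 = c1 + c2;   // 64 does not fit in a Tiny
        (void)c3;
    } catch (const std::out_of_range&) {
        caught = true;
    }
    assert(caught);
}
```

Reading a Tiny is trivial (the conversion operator), while writing one goes through the checked assign(), which is the asymmetry the next paragraph describes.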

Conversion functions appear to be particularly useful for handling data structures when reading (implemented by a conversion operator) is trivial, while assignment and initialization are distinctly less trivial.

The istream and ostream types rely on a conversion function to enable statements such as
while (cin>>x) cout<<x;
The input operation cin>>x returns an istream&. That value is implicitly converted to a value indicating the state of cin. This value can then be tested by the while. However, it is typically not a good idea to define an implicit conversion from one type to another in such a way that information is lost in the conversion.
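A small sketch of a stream being tested as a condition. It uses std::istringstream instead of cin so that it is self-contained; sum_ints is a hypothetical helper name. In C++11 and later the conversion involved is an explicit conversion to bool, which still applies in conditions such as this one.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Sums whitespace-separated integers; the loop ends when an extraction fails.
int sum_ints(const std::string& text)
{
    std::istringstream in(text);
    int x = 0;
    int sum = 0;
    while (in >> x)   // in>>x returns the stream, whose state is then tested
        sum += x;
    return sum;
}

void sum_ints_demo()
{
    assert(sum_ints("1 2 3") == 6);
    assert(sum_ints("") == 0);
    assert(sum_ints("4 5 x 6") == 9);   // stops at the first non-integer
}
```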

In general, it is wise to be sparing in the introduction of conversion operators. When used in excess, they lead to ambiguities. Such ambiguities are caught by the compiler, but they can be a nuisance to resolve. Probably the best idea is initially to do conversions by named functions, such as X::makeint(). If such a function becomes popular enough to make explicit use inelegant, it can be replaced by a conversion operator X::operator int().

If both user defined conversions and user defined operators are defined, it is possible to get ambiguities between the user defined operators and the built in operators. For example:

int operator+(Tiny,Tiny);
void f(Tiny t, int i)
{
    t+i; // error, ambiguous: operator+(t,Tiny(i)) or int(t)+i ?
}

It is therefore often best to rely on user defined conversions or user defined operators for a given type, but not both.

Ambiguities
An assignment of a value of type V to an object of class X is legal if there is an assignment operator X::operator=(Z) so that V is Z or there is a unique conversion of V to Z. Initialization is treated equivalently.

In some cases, a value of the desired type can be constructed by repeated use of constructors or conversion operators. This must be handled by explicit conversions; only one level of user defined implicit conversion is legal. In some cases, a value of the desired type can be constructed in more than one way; such cases are illegal.
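The one-level rule can be illustrated with a sketch. X, Y, and g are hypothetical names chosen for this example; the rejected call is left as a comment because it does not compile.

```cpp
#include <cassert>

struct X {
    int v;
    X(int i) : v(i) {}      // int -> X
};

struct Y {
    int v;
    Y(X x) : v(x.v) {}      // X -> Y
};

int g(Y y) { return y.v; }

void one_level_demo()
{
    // g(1);                // error: would need two implicit steps, int -> X -> Y
    assert(g(X(1)) == 1);   // OK: explicit int -> X, then one implicit X -> Y
}
```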

The rules for conversion are neither the simplest to implement, the simplest to document, nor the most general that could be devised. They are, however, considerably safer, and the resulting resolutions are less surprising. It is far easier to manually resolve an ambiguity than to find an error caused by an unsuspected conversion.

The insistence on strict bottom up analysis implies that the return type is not used in overloading resolution. The reason for this design choice is partly that strict bottom up analysis is more comprehensible and partly that it is not considered the compiler’s job to decide which precision the programmer might want for the addition.

Once the types of both sides of an initialization or assignment have been determined, both types are used to resolve the initialization or assignment. In these cases, the type analysis is still bottom up, with only a single operator and its argument types considered at any one time.

Friends
An ordinary member function declaration specifies three logically distinct things:
[1] The function can access the private part of the class declaration, and
[2] the function is in the scope of the class, and
[3] the function must be invoked on an object (has a this pointer).
By declaring a member function static (§10.2.4), we can give it the first two properties only. By declaring a function a friend, we can give it the first property only.

A friend declaration can be placed in either the private or the public part of a class declaration; it does not matter where. Like a member function, a friend function is explicitly declared in the declaration of the class of which it is a friend. It is therefore as much a part of that interface as is a member function. A member function of one class can be the friend of another.

Clearly, friend classes should be used only to express closely connected concepts. Often, there is a choice between making a class a member (a nested class) or a friend.

Finding Friends
Like a member declaration, a friend declaration does not introduce a name into an enclosing scope. For large programs and large classes, it is nice that a class doesn’t ‘‘quietly’’ add names to its enclosing scope.

For a template class that can be instantiated in many different contexts, this is very important. A friend class must be previously declared in an enclosing scope or defined in the non class scope immediately enclosing the class that is declaring it a friend.

A friend function can be explicitly declared just like friend classes, or it can be found through its argument types as if it was declared in the non class scope immediately enclosing its class.

Friends and Members
Some operations must be members – for example, constructors, destructors, and virtual functions – but typically there is a choice.

Member functions can be invoked for objects of their class only; no user-defined conversions are applied. For example, given a class X with a constructor X(int) and a member function m1():
99.m1(); // error: X(99).m1() not tried

The conversion X(int) is not applied to make an X out of 99.

An operation modifying the state of a class object should therefore be a member or a global function taking a non const reference argument (or a non const pointer argument). Operators that require lvalue operands for the fundamental types (=, *=, ++, etc.) are most naturally defined as members for user defined types.

Conversely, if implicit type conversion is desired for all operands of an operation, the function implementing it must be a nonmember function taking a const reference argument or a non-reference argument. This is often the case for the functions implementing operators that do not require lvalue operands when applied to fundamental types (+, -, ||, etc.). Such operators often need access to the representations of their operand class. Consequently, binary operators are the most common source of friend functions.
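The difference in how conversions apply to members and nonmembers can be sketched as follows. The names X, m1, f1, f2, and f3 are illustrative; the calls that would be rejected are left as comments.

```cpp
#include <cassert>

class X {
    int v;
public:
    X(int i) : v(i) {}
    int m1() const { return v; }    // member: no conversion is applied to the object
    friend int f1(X& x);            // non-const reference: needs an lvalue of type X
    friend int f2(const X& x);      // const reference: accepts a converted temporary
    friend int f3(X x);             // by value: accepts a converted value
};

int f1(X& x) { return x.v; }
int f2(const X& x) { return x.v; }
int f3(X x) { return x.v; }

void friends_and_members_demo()
{
    X x(1);
    // 99.m1();             // error: X(99).m1() not tried
    // f1(99);              // error: a temporary X(99) cannot bind to X&
    assert(x.m1() == 1);
    assert(f1(x) == 1);
    assert(f2(99) == 99);   // f2(X(99))
    assert(f3(99) == 99);   // f3(X(99))
}
```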

If no type conversions are defined, there appears to be no compelling reason to choose a member over a friend taking a reference argument, or vice versa. In some cases, the programmer may have a preference for one call syntax over another.

All other things considered equal, choose a member. It is not possible to know if someone someday will define a conversion operator. It is not always possible to predict if a future change may require changes to the state of the object involved. The member function call syntax makes it clear to the user that the object may be modified; a reference argument is far less obvious. Furthermore, expressions in the body of a member can be noticeably shorter than the equivalent expressions in a global function; a nonmember function must use an explicit argument, whereas the member can use this implicitly. Also, because member names are local to the class they tend to be shorter than the names of nonmember functions.

Large Objects
Unfortunately, not all classes have a conveniently small representation. To avoid excessive copying, one can declare functions to take reference arguments.

References allow the use of expressions involving the usual arithmetic operators for large objects without excessive copying. Pointers cannot be used because it is not possible to redefine the meaning of an operator applied to a pointer.

An operator+() written this way accesses the operands of + through references but returns an object value. Returning a reference would appear to be more efficient. This is legal, but it causes a memory allocation problem. Because a reference to the result will be passed out of the function as a reference to the return value, the return value cannot be an automatic variable. Since an operator is often used more than once in an expression, the result cannot be a static local variable. The result would typically be allocated on the free store. Copying the return value is often cheaper (in execution time, code space, and data space) than allocating and (eventually) deallocating an object on the free store. It is also much simpler to program.
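A minimal sketch of the pattern: operands passed by const reference, result returned by value. The 4-by-4 Matrix below is a hypothetical stand-in for a large object, and using () for two-dimensional subscripting is a choice made for this example.

```cpp
#include <cassert>

// A small fixed-size matrix standing in for a "large" object.
class Matrix {
    enum { N = 4 };
    double m[N][N];
public:
    Matrix()
    {
        for (int i = 0; i < N; ++i)
            for (int j = 0; j < N; ++j)
                m[i][j] = 0;
    }
    double& operator()(int i, int j) { return m[i][j]; }
    double operator()(int i, int j) const { return m[i][j]; }

    friend Matrix operator+(const Matrix& a, const Matrix& b);
};

// Operands are accessed through references (no copies); the sum is returned by value.
Matrix operator+(const Matrix& a, const Matrix& b)
{
    Matrix sum;
    for (int i = 0; i < Matrix::N; ++i)
        for (int j = 0; j < Matrix::N; ++j)
            sum.m[i][j] = a.m[i][j] + b.m[i][j];
    return sum;
}

void matrix_demo()
{
    Matrix a, b;
    a(0, 0) = 1.5;
    b(0, 0) = 2.0;
    b(3, 3) = 4.0;
    Matrix c = a + b + a;       // each + yields a new Matrix value
    assert(c(0, 0) == 5.0);
    assert(c(3, 3) == 4.0);
    assert(a(0, 0) == 1.5);     // the operands are unchanged
}
```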

Essential Operators
In general, for a type X, the copy constructor X(const X&) takes care of initialization by an object of the same type X. It cannot be overemphasized that assignment and initialization are different operations. This is especially important when a destructor is declared. If a class X has a destructor that performs a nontrivial task, such as free store deallocation, the class is likely to need the full complement of functions that control construction, destruction, and copying.

There are three more cases in which an object is copied:
as a function argument,
as a function return value,
and as an exception.
When an argument is passed, a hitherto uninitialized variable – the formal parameter – is initialized. The semantics are identical to those of other initializations. The same is the case for function return values and exceptions, although that is less obvious. In such cases, the copy constructor will be applied.

For a class X for which the assignment operator X::operator=(const X&) and the copy constructor X::X(const X&) are not explicitly declared by the programmer, the missing operation or operations will be generated by the compiler.
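For a class whose destructor releases free store memory, the generated memberwise copy would leave two objects owning the same array. A sketch of the full complement of functions, using a hypothetical Table class with assumed members p and sz:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical class that owns an array on the free store.
class Table {
    int* p;
    std::size_t sz;
public:
    explicit Table(std::size_t s) : p(new int[s]()), sz(s) {}   // elements start at 0
    ~Table() { delete[] p; }

    // Copy constructor: initializes raw storage, so there is nothing old to release.
    Table(const Table& t) : p(new int[t.sz]), sz(t.sz)
    {
        for (std::size_t i = 0; i < sz; ++i) p[i] = t.p[i];
    }

    // Copy assignment: must release the old array and guard against self-assignment.
    Table& operator=(const Table& t)
    {
        if (this != &t) {
            int* q = new int[t.sz];
            for (std::size_t i = 0; i < t.sz; ++i) q[i] = t.p[i];
            delete[] p;
            p = q;
            sz = t.sz;
        }
        return *this;
    }

    int& operator[](std::size_t i) { return p[i]; }
    std::size_t size() const { return sz; }
};

void table_demo()
{
    Table t1(3);
    t1[0] = 7;
    Table t2 = t1;   // initialization: copy constructor
    Table t3(1);
    t3 = t1;         // assignment: operator=
    t1[0] = 9;       // the copies are independent of t1
    assert(t2[0] == 7);
    assert(t3[0] == 7 && t3.size() == 3);
}
```

The two copy operations differ precisely because assignment has an existing object to clean up, while initialization does not.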

Explicit Constructors
By default, a single-argument constructor also defines an implicit conversion. For some types, that is ideal. For example:
complex z = 2; // initialize z with complex(2)
In other cases, the implicit conversion is undesirable and error-prone. For example, given a string class with a constructor that takes an int size:
string s = 'a'; // make s a string with int('a') elements
It is quite unlikely that this was what the person defining s meant. Implicit conversion can be suppressed by declaring a constructor explicit. That is, an explicit constructor will be invoked only explicitly. In particular, where a copy constructor is in principle needed, an explicit constructor will not be implicitly invoked.
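A sketch of the effect of explicit, using a hypothetical String class (not std::string) that only records a length. The static_assert checks rely on the standard type trait std::is_convertible, which requires C++11 or later.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

// Hypothetical string-like class; only its constructors matter here.
class String {
    std::size_t len;
public:
    explicit String(int n) : len(static_cast<std::size_t>(n)) {}   // size: not a conversion
    String(const char* p) : len(std::strlen(p)) {}                  // ordinary conversion
    std::size_t size() const { return len; }
};

// String s1 = 'a';   // error: no implicit conversion from char to String
static_assert(!std::is_convertible<int, String>::value,
              "explicit constructor is not used for implicit conversion");
static_assert(std::is_convertible<const char*, String>::value,
              "non-explicit constructor still converts implicitly");

void explicit_demo()
{
    String s2(10);              // OK: explicit call
    String s3 = String(10);     // OK: explicit conversion
    String s4 = "Brian";        // OK: implicit conversion from const char*
    assert(s2.size() == 10);
    assert(s3.size() == 10);
    assert(s4.size() == 5);
}
```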

Subscripting
An operator[] function can be used to give subscripts a meaning for class objects. The second argument (the subscript) of an operator[] function may be of any type. This makes it possible to define vectors, associative arrays, etc.

An operator[]() must be a member function.
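As an illustration of a subscript that is not an integer, here is a sketch of a simple associative array. Assoc and its linear search are assumptions made for brevity; a real program would more likely use std::map.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Sketch of an associative array from strings to doubles.
class Assoc {
    std::vector<std::pair<std::string, double> > vec;
public:
    // Returns a reference to the value stored for key s,
    // adding an entry with value 0 if the key is not yet present.
    double& operator[](const std::string& s)
    {
        for (std::size_t i = 0; i < vec.size(); ++i)
            if (vec[i].first == s) return vec[i].second;
        vec.push_back(std::make_pair(s, 0.0));
        return vec.back().second;
    }
    std::size_t size() const { return vec.size(); }
};

void assoc_demo()
{
    Assoc count;
    count["apple"]++;
    count["pear"]++;
    count["apple"]++;
    assert(count["apple"] == 2);
    assert(count["pear"] == 1);
    assert(count.size() == 2);
}
```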

Function Call
Function call, that is, the notation expression(expression-list), can be interpreted as a binary operation with the expression as the left-hand operand and the expression-list as the right-hand operand. The call operator () can be overloaded in the same way as other operators can. An argument list for an operator()() is evaluated and checked according to the usual argument-passing rules.

Overloading function call seems to be useful primarily for defining types that have only a single operation and for types for which one operation is predominant. The most obvious, and probably also the most important, use of the () operator is to provide the usual function call syntax for objects that in some way behave like functions. An object that acts like a function is often called a function-like object or simply a function object. Such function objects are important because they allow us to write code that takes nontrivial operations as parameters. For example, the standard library provides many algorithms that invoke a function for each element of a container.

At first glance, this technique may look esoteric, but it is simple, efficient, and extremely useful.
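A short sketch of a function object passed to a standard algorithm. The class name Add is illustrative; std::for_each is the standard library algorithm that invokes a function for each element of a sequence.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Function object that adds a stored value to its argument.
class Add {
    int val;
public:
    explicit Add(int v) : val(v) {}
    void operator()(int& x) const { x += val; }
};

void function_object_demo()
{
    std::vector<int> v;
    v.push_back(1);
    v.push_back(2);
    v.push_back(3);

    // Add(10) is called once per element, as if it were a function.
    std::for_each(v.begin(), v.end(), Add(10));

    assert(v[0] == 11 && v[1] == 12 && v[2] == 13);
}
```

Unlike a plain function, the object carries its own state (here, the value to add), which is what makes the technique useful.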

Other popular uses of operator()() are as a substring operator and as a subscripting operator for multidimensional arrays. An operator()() must be a member function.

Dereferencing
The dereferencing operator -> can be defined as a unary postfix operator. That is, given a class
class Ptr {
    // ...
    X* operator->();
};
objects of class Ptr can be used to access members of class X in a very similar manner to the way pointers are used.

The transformation of the object p into the pointer p.operator->() does not depend on the member m pointed to. That is the sense in which operator->() is a unary postfix operator. However, there is no new syntax introduced, so a member name is still required after the ->.

Overloading -> is primarily useful for creating ‘‘smart pointers,’’ that is, objects that act like pointers and in addition perform some action whenever an object is accessed through them.
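A minimal sketch of a ‘‘smart pointer’’ whose extra action is simply counting accesses. Rec and CountingPtr are hypothetical names used only for this example.

```cpp
#include <cassert>

struct Rec { int value; };   // hypothetical record type

// Pointer-like class that counts every access made through ->.
class CountingPtr {
    Rec* p;
    int accesses;
public:
    explicit CountingPtr(Rec* r) : p(r), accesses(0) {}
    Rec* operator->() { ++accesses; return p; }
    int count() const { return accesses; }
};

void smart_pointer_demo()
{
    Rec r = { 1 };
    CountingPtr p(&r);

    p->value = 7;        // (p.operator->())->value = 7
    int v = p->value;    // second access through ->

    assert(v == 7 && r.value == 7);
    assert(p.count() == 2);
}
```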

For ordinary pointers, use of -> is synonymous with some uses of unary * and []. Given
Y* p;
it holds that
p->m == (*p).m == p[0].m
As usual, no such guarantee is provided for user-defined operators. The equivalence can be provided where desired:
class PtrtoY {
    Y* p;
public:
    Y* operator->() { return p; }
    Y& operator*() { return *p; }
    Y& operator[](int i) { return p[i]; }
};
If you provide more than one of these operators, it might be wise to provide the equivalence, just as it is wise to ensure that ++x and x+=1 have the same effect as x=x+1 for a simple variable x of some class if ++, +=, =, and + are provided.

The overloading of -> is important to a class of interesting programs and not just a minor curiosity. The reason is that indirection is a key concept and that overloading -> provides a clean, direct, and efficient way of representing indirection in a program. Another way of looking at operator -> is to consider it as a way of providing C++ with a limited, but useful, form of delegation.

Operator -> must be a member function. If used, its return type must be a pointer or an object of a class to which you can apply ->. When declared for a template class, operator->() is frequently unused, so it makes sense to postpone checking the constraint on the return type until actual use.

Increment and Decrement
Once people invent ‘‘smart pointers,’’ they often decide to provide the increment operator ++ and the decrement operator -- to mirror these operators’ use for built-in types. This is especially obvious and necessary where the aim is to replace an ordinary pointer type with a ‘‘smart pointer’’ type that has the same semantics, except that it adds a bit of runtime error checking.

We might want to replace the pointer p with an object of a class PtrtoT that can be dereferenced only provided it actually points to an object. We would also like to ensure that p can be incremented and decremented, only provided it points to an object within an array and the increment and decrement operations yield an object within the array.

The increment and decrement operators are unique among C++ operators in that they can be used as both prefix and postfix operators:

PtrtoT& operator++(); // prefix
PtrtoT operator++(int); // postfix

The int argument is used to indicate that the function is to be invoked for postfix application of ++. This int is never used; the argument is simply a dummy used to distinguish between prefix and postfix application. The way to remember which version of an operator++ is prefix is to note that the version without the dummy argument is prefix, exactly like all the other unary arithmetic and logical operators. The dummy argument is used only for the ‘‘odd’’ postfix ++ and --.
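The prefix and postfix forms can be sketched with a simplified, non-template checked pointer. CheckedPtr is a hypothetical stand-in for the PtrtoT idea, and throwing std::out_of_range on a range error is an assumption of this example.

```cpp
#include <cassert>
#include <stdexcept>

// Checked pointer into an int array.
class CheckedPtr {
    int* p;       // current position
    int* first;   // start of the array
    int size;     // number of elements

    void check(int* q) const
    {
        if (q < first || q >= first + size)
            throw std::out_of_range("CheckedPtr: outside the array");
    }
public:
    CheckedPtr(int* array, int n) : p(array), first(array), size(n) {}

    // Prefix: no dummy argument; increments and returns the object itself.
    CheckedPtr& operator++() { check(p + 1); ++p; return *this; }

    // Postfix: dummy int argument; returns a copy of the old position.
    CheckedPtr operator++(int) { CheckedPtr old = *this; ++*this; return old; }

    int& operator*() const { check(p); return *p; }
};

void increment_demo()
{
    int a[3] = { 10, 20, 30 };
    CheckedPtr p(a, 3);

    assert(*p++ == 10);   // postfix yields the old position
    assert(*p == 20);
    assert(*++p == 30);   // prefix yields the new position

    bool caught = false;
    try { ++p; }          // would leave the array
    catch (const std::out_of_range&) { caught = true; }
    assert(caught);
}
```

Writing the postfix version in terms of the prefix version keeps the range check in one place.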

Advice
[1] Define operators primarily to mimic conventional usage.
[2] For large operands, use const reference argument types.
[3] For large results, consider optimizing the return.
[4] Prefer the default copy operations if appropriate for a class.
[5] Redefine or prohibit copying if the default is not appropriate for a type.
[6] Prefer member functions over nonmembers for operations that need access to the representation.
[7] Prefer nonmember functions over members for operations that do not need access to the representation.
[8] Use namespaces to associate helper functions with ‘‘their’’ class.
[9] Use nonmember functions for symmetric operators.
[10] Use () for subscripting multidimensional arrays.
[11] Make constructors that take a single ‘‘size argument’’ explicit.
[12] For non-specialized uses, prefer the standard string to the result of your own exercises.
[13] Be cautious about introducing implicit conversions.
[14] Use member functions to express operators that require an lvalue as their left-hand operand.

Tuesday, October 30, 2012

Stroustrup Notes - Chapter 11 - operator overloading
