Old Feb 18th, 2006, 9:58 PM   #6
grumpy
Dawei, as an example of the dedication ...... howsis???

This text (unrelated to the original question on classes) is something I threw together when a bit bored a few months back, but never got around to posting...... I've just topped it with a dedication <evil grin>.

Using exceptions

Dedication: Several authors, including myself, generated posts like this one a couple of years back on devshed forums. The intent of the posts was to help both novice and expert programmers, and they were written by volunteers. This post is therefore dedicated to Thaminda, the site admin at devshed who saw fit to claim ownership of the material and reposted it under his (or her) own name without any attribution or acknowledgement and (worse) making it completely pointless for the authors to correct any errors in the original posts.

Before getting into advanced concepts related to using exceptions, I'll describe the basics. Most of the following may be gleaned from basic texts, although I've worded things slightly differently. This material starts with basics, and becomes moderately advanced. Basic knowledge of C++ syntax is assumed.

Exceptions, as an error reporting and recovering mechanism, are a feature of C++ (not C). As described in many texts, the basic usage comes down to
try
{

     if (some_condition)
     {
         ExceptionType x;
         throw x;
     }
}
catch (ExceptionType x)
{
         // recover from error
}
Core features of exceptions include;
1) They must be caught, or the program will exit (via a call to terminate() which, by default, calls abort()).
2) The throwing of exceptions need not be in the same function as the try/catch block.
3) When an exception is thrown, control is passed to the first matching catch() clause for recovery. Any objects local to intervening functions are also destructed.
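4) Exceptions may be caught via value or reference. If caught by reference, they may be used polymorphically.
5) The exception types that may be thrown by a function may be specified. If an exception is thrown within the function that does not match the "throw specification", the program will exit via a call to unexpected() which, by default, calls terminate() and therefore abort(). (Throw specifications were deprecated in C++11 and removed in C++17, so treat this feature as historical.)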
6) Within a catch handler, a new exception may be thrown or the caught exception may be rethrown.

These features are exhibited in the following sample;
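#include <iostream>
#include <string>

class X
{
 public:
    virtual ~X() {}
    // virtual, so that an exception caught by reference behaves polymorphically
    virtual int Type() const {return 1;}
};
class Y : public X
{
 public:
    virtual int Type() const {return 2;}
};

class Other {};

// note: throw specifications like these were removed in C++17, so this sample
//    needs to be compiled as C++98/03 (C++11/14 accept it, possibly with a warning)
void FunctionThatThrows() throw (X)
{
    Y y;
    throw y;    // OK;  Y is derived from X
}

void AnotherFunctionThatThrows() throw (X)
{
    Other a;
    throw a;   // will result in call to unexpected() which, by default, ends in abort()
}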

void IntermediateFunction(bool call)
{
     std::string hello("Hello");

     // just to illustrate how to do it, we catch all exceptions and rethrow them
     //    Such a construct gives the same effect that the compiler would give us
     //   anyway
     
     try
     {     
         if (call) FunctionThatThrows();
         //  the string hello will be cleaned up (i.e. destructed) under all circumstances
         //    if an exception is not thrown (eg call is false) it will be cleaned up
         //    if an exception is thrown (eg call is true) it will still be cleaned up
     }
     catch (...)
     {
         throw;   // rethrow exception
     }
}

int main()
{
    try
    {
         std::string x("x");
         IntermediateFunction(false);
         std::string y("y");
         // both x and y will be cleaned up
    }
    // no exception thrown, so neither of the catch clauses will be invoked
    catch (X &x)
    {
       std::cout << x.Type() << std::endl;
    }
    catch (...)
    {
    }

    try
    {
         std::string x("x");
         IntermediateFunction(true);   // this will throw
         std::string y("y");           // so this line is never executed
         // but x will still be cleaned up (i.e. it's destructor will be invoked).
    }
    // an exception of type Y (derived from X) will be thrown, so the first catch clause will have effect
    catch (X &x)
    {
       std::cout << x.Type() << std::endl;  // exception of type Y thrown, so value of 2 printed
    }
    catch (...)   // catch clauses checked in order they are declared.
                  //  First clause matches, so this one has no effect
    {
    }

    return 0;
}
The above is what one may normally work out by reading most basic texts. The problem is, while the material is accurate, it does not represent how one may effectively use exceptions. In particular, most people attempting to manage exceptions in their code sprinkle try/catch clauses everywhere. This actually makes their code much more complicated than it needs to be and, because try/catch blocks must wrap other code, makes their code much more difficult to understand and maintain. Ironically, it turns out that a basic rule of thumb for using exceptions effectively is that use of try/catch should be avoided wherever possible. This allows most error handling code to be localised and reduces the need to duplicate exception handling code. The purpose of the remainder of this post is to explain concepts and techniques that help reduce the need to write lots of exception handling code.

Rule 1: View exception handling as part of an error management strategy

Exceptions are not the only way of handling or reporting error conditions. Other ways include;
1) When an error condition is detected, simply exit the program. This approach is often used in C programs when dynamic memory allocation fails.
2) Use return or error codes from functions rather than exceptions. This approach is commonly used in C and C++ for errors that occur with file input and output.

Both of these approaches may be used, whether exceptions are used or not. The first approach is common in C programs as C does not support exceptions, but less common in well-designed C++ code as an unhandled exception gives the same effect, but also provides an opportunity to catch and recover from the error.

The second approach is appropriate if the error condition is not severe. This is a double edged sword, as it requires the caller to deliberately check if an error has occurred, but the error will be silently ignored if the caller forgets to do that check. An error code is therefore appropriate if the program can sensibly continue executing even if the error code is not checked. An example where error codes are appropriately used is "end of file" (EOF) condition with C and C++ file input. This allows parsing of a file using a loop that repeats until EOF is reached. Forgetting to check for EOF is also rarely a critical error (the program can often continue sensibly afterwards). One of my pet peeves with the Java I/O library is that it throws exceptions for some non-critical errors, forcing code that reads from a file to be more complex (several local try/catch blocks) than necessary.
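
As a small illustration of the error code style, here is a minimal sketch that sums integers from a stream and relies on the stream state (not an exception) to notice the end of input. The function names SumIntegers and SumIntegersInText are made up for this sketch.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: sums whitespace-separated integers until extraction fails.
// End of input is reported through the stream state, not through an exception.
int SumIntegers(std::istream &in)
{
    int total = 0;
    int value;
    while (in >> value)     // loop ends when EOF (or bad input) is reached
    {
        total += value;
    }
    return total;
}

// Convenience wrapper so the helper can be tried on a string
int SumIntegersInText(const std::string &text)
{
    std::istringstream in(text);
    return SumIntegers(in);
}
```

Note that malformed input simply ends the loop early here, which is exactly the "silently ignored" risk described above.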

It often makes sense to handle errors by different mechanisms. There is nothing wrong with a program that makes use of multiple error reporting mechanisms. Essentially, the errors encountered by an application fall into four categories.
1) Those that can be recovered from immediately, by the code that detects them.
2) Those that cannot be recovered from immediately, but do not cause chaos if they are ignored.
3) Those that cannot be recovered from immediately, but will cause chaos if recovery is not initiated.
4) Those that are irrecoverable.

An example of the first would be a function that dereferences a pointer detecting that the pointer is NULL. Such an error might be recovered from silently by doing nothing if the pointer is NULL. This would only be appropriate if the behaviour of other code will not be adversely affected by our function silently taking no action.

An example of the second is file I/O. Encountering an EOF is often an expected, and non-critical, event when reading a file.

An example of the third category is a function that detects failure of supplied data to comply with a complex set of required pre-conditions, but has insufficient information to be able to correct the data. Throwing an exception is appropriate to return control to a caller that (presumably) provided the required data.
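
A minimal sketch of this third category might look like the following. The function names and the particular pre-conditions (non-empty, no negative values) are assumptions chosen purely for illustration.

```cpp
#include <stdexcept>
#include <vector>

// Hypothetical function: requires a non-empty set of non-negative samples.
// It cannot repair bad data itself, so it reports the problem to the caller.
double MeanOfSamples(const std::vector<double> &samples)
{
    if (samples.empty())
        throw std::invalid_argument("no samples supplied");
    double sum = 0.0;
    for (std::vector<double>::size_type i = 0; i < samples.size(); ++i)
    {
        if (samples[i] < 0.0)
            throw std::invalid_argument("negative sample supplied");
        sum += samples[i];
    }
    return sum / samples.size();
}

// The caller supplied the data, so the caller decides how to recover
double MeanOrDefault(const std::vector<double> &samples, double fallback)
{
    try
    {
        return MeanOfSamples(samples);
    }
    catch (const std::invalid_argument &)
    {
        return fallback;
    }
}
```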

To develop a useful error handling strategy, it is necessary to characterise the severity of errors, the urgency and feasibility of recovery, and whether or not it is possible to correct an error at the point it is detected. If a particular error falls neatly into one of the categories above, the approach to handle that error will be obvious. A lot of errors will not fall neatly into only one category. In this case, an analysis of benefits and costs will determine the realistic options for handling the error.

Rule 2: Use features of the language to help simplify exception handling

The most helpful language feature is the fact that all locally constructed objects will be destroyed when a function returns (or a block - delimited by {} - completes) whether an exception is thrown or not. This means that, if an exception is thrown from deep within a set of function calls, the destructors will be called for every object created within those functions.

This allows the use of RAII (Resource Acquisition Is Initialisation) techniques. Essentially, this means that resources used by an object (memory, file handles, mutexes, etc) are acquired by constructors, and cleaned up by destructors. As the destructors will be called as control passes from a throw statement to the first matching catch clause, this means that all resources grabbed in between will be cleaned up. While member functions may be used to reinitialise the resource (eg a class working with an array may dynamically resize the array), those functions must ensure that they do not cause a leak, and ensure that the destructor will have something valid to clean up.
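
To see the unwinding behaviour in action, here is a minimal sketch. The Guard class stands in for any resource-owning object, and all the names are invented for this example. The intermediate function contains no try/catch at all, yet every guard is released by the time the handler runs.

```cpp
#include <stdexcept>

// Hypothetical resource wrapper: the constructor "acquires" (increments a counter)
// and the destructor "releases" (decrements it). Neither can throw.
class Guard
{
    public:
        explicit Guard(int &live) : live_(live) { ++live_; }
        ~Guard() { --live_; }
    private:
        Guard(const Guard &);              // copying is disabled
        Guard &operator=(const Guard &);
        int &live_;
};

void Inner(int &live)
{
    Guard g(live);
    throw std::runtime_error("failure deep in the call chain");
}

void Outer(int &live)
{
    Guard g(live);
    Inner(live);        // no try/catch needed here
}

// Returns the number of guards still alive after the exception was handled
int LiveGuardsAfterFailure()
{
    int live = 0;
    try
    {
        Outer(live);
    }
    catch (const std::runtime_error &)
    {
        // by the time we get here, both guards have been destroyed
    }
    return live;
}
```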

The language also guarantees that, if an exception is thrown while an object is being constructed, any parts of the object that were successfully constructed will be destroyed. This can be used to avoid resource leaks. For example;
class Foo {};   // exception type thrown below
class Cat {};   // definition of Cat is incidental to this example
class Base
{
    public:
        Base() : x(new Cat) {}
        virtual ~Base() {delete x;}
    private:
        Cat *x;
};

class Dog {};

class Derived : public Base
{
    public:
       Derived(): Base(), y(new Dog)
       {
           delete y;       // ~Derived() will not run, so release the Dog ourselves
           throw Foo();    // ~Base() will run, releasing the Cat
       }
       ~Derived() {delete y;}
    private:
       Dog *y;
};
The language guarantees that, when creating a Derived, the base classes are constructed first, and then data members are constructed. In this example, Base is constructed, then y is initialised. The constructor of Derived then throws an exception. The language guarantees that the components of Derived that have been successfully initialised will be destroyed, in reverse order of their construction. So, in this example, y is destroyed and then the destructor of Base is invoked, which deletes the Cat. One caveat: destroying a raw pointer like y does not delete what it points at, and ~Derived() is not run because the Derived was never completely constructed. That is why the constructor deletes y itself before throwing; had the Dog been owned by an object with its own destructor (as the Cat is owned by Base), no such care would be needed. This approach means that, if construction of an object fails, the object never exists (and an exception is encountered).

This rule allows one to avoid the common, but naive, "two phase construction" approach, which essentially means that a constructor sets the object into some default "safe state", and then requires calling of something like an init() function to initialise the object into the needed state. The problems with such an approach include the possibility of forgetting to call the init() function, the possibility of calling it more than once, and (often) the need for every member function to check that the object is not in the "default" state before using it. These problems increase likelihood of errors, and make the class implementation more difficult to understand and maintain. The most usual reason for "two phase construction" is avoiding grabbing resources before they are needed. This can also be avoided by not constructing the whole object until it is needed.
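
As a rough sketch of the alternative, the class below either constructs into a valid state or throws, so there is no init() to forget and no member function needs to ask whether the object has been initialised. The Kelvin class and the IsValidKelvin helper are hypothetical, invented only for this illustration.

```cpp
#include <stdexcept>

// Hypothetical class: a temperature in kelvin, which can never be negative.
// The constructor either produces a valid object or throws; there is no init().
class Kelvin
{
    public:
        explicit Kelvin(double value) : value_(value)
        {
            if (value < 0.0) throw std::out_of_range("temperature below absolute zero");
        }
        double Value() const { return value_; }   // no "am I initialised?" check needed
    private:
        double value_;
};

// Returns true if a Kelvin can be constructed from the value
bool IsValidKelvin(double value)
{
    try
    {
        Kelvin k(value);
        return k.Value() == value;
    }
    catch (const std::out_of_range &)
    {
        return false;
    }
}
```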

This practice also simplifies the provision of multiple functions by one class. In this example, class Base manages an instance of a Cat, so the implementer of class Derived need not worry about the creation and destruction of that Cat.

Rule 3: Practice exception safety

Exception safety amounts to coding your program in ways to provide certain guarantees of behaviour when an exception occurs. It is useful to design operations in terms of the guarantees they provide if an exception is thrown. The guarantees that are provided by operations in the standard library are;
1) The strong guarantee means that an operation succeeds or, if it fails, has no effect.
2) The basic guarantee means that the object remains in a valid state, can be destructed, and no resources are leaked.
3) The nothrow guarantee means that an operation is guaranteed to never throw exceptions, as well as not introducing resource leaks or leaving the object in an invalid state. In other words, if an exception is thrown within a function, it is recovered from and the exception is not propagated.

An example where the strong guarantee would be needed is in banking. A transaction on a savings account should either succeed or, if it fails, have no effect. An unauthorised withdrawal should be refused, and should leave the account balance unchanged.
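
A minimal sketch of the banking case, assuming a made-up Account class that stores its balance in cents: every check is done before any state is modified, so a refused withdrawal leaves the object exactly as it was.

```cpp
#include <exception>
#include <stdexcept>

// Hypothetical account class, only to illustrate the strong guarantee
class Account
{
    public:
        explicit Account(long cents) : balance_(cents) {}
        long Balance() const { return balance_; }

        // Strong guarantee: every check happens before the balance is modified,
        // so a failed withdrawal leaves the account exactly as it was.
        void Withdraw(long cents)
        {
            if (cents <= 0) throw std::invalid_argument("amount must be positive");
            if (cents > balance_) throw std::runtime_error("insufficient funds");
            balance_ -= cents;      // cannot throw
        }
    private:
        long balance_;
};

// Attempts a withdrawal and reports the balance afterwards
long BalanceAfterAttempt(long opening, long amount)
{
    Account account(opening);
    try
    {
        account.Withdraw(amount);
    }
    catch (const std::exception &)
    {
        // withdrawal refused; balance is untouched
    }
    return account.Balance();
}
```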

The basic guarantee is a bit weaker. If an error occurs, the object can be successfully reassigned or destroyed, but its contents may be unpredictable.

These guarantees come at different cost. The strong guarantee is the most expensive, as it can mean backtracking to a previous known state when an error occurs. The basic guarantee is the minimum that one should aim for, as it is otherwise not possible to use the object again.

All operations in the standard library provide one of the above guarantees, subject to user-defined objects also providing similar guarantees. For example, std::vector<X>::push_back() [used to add an element to the end of a vector] provides the strong guarantee. However, all variants of std::vector<X>::insert() [used to insert elements into the middle of a vector] only provide the strong guarantee if the copy constructor and assignment operator of X provide the nothrow guarantee. Multi-element insertions into a std::map only provide the basic guarantee.

The tools that we, the humble programmer, have to provide these guarantees are;
1) the try/catch block. For example, one way of implementing a nothrow guarantee is to catch an exception, recover from it, and not throw anything again;
2) the RAII technique mentioned above;
3) never let go of a piece of information until we can store its replacement without triggering an exception;
4) ensure all objects are in a valid state before throwing an exception; and
5) ensure all resources that may need to be released are owned by an object (so that object's destructor will release the resource) or release any unowned resources immediately before throwing an exception.

An example

To illustrate, let's consider how we might implement the constructors and assignment operator for a class that manages a dynamically allocated array. First, I'll give a commonly used approach, but one which does not provide useful guarantees.
#include <cstddef>   // for NULL

//  example of poorly constructed resizable vector
template<class X> class PoorResizableVector
{
    public:
        PoorResizableVector(int default_size);
        void Resize(int size);
        PoorResizableVector(const PoorResizableVector<X> &);
        PoorResizableVector<X> &operator=(const PoorResizableVector<X> &);
        ~PoorResizableVector(); 
    private:
        X *data;
        int size;
};

template<class X> PoorResizableVector<X>::PoorResizableVector(int default_size)
{
    data = NULL;
    size = 0;
    Resize(default_size);
}

template<class X> void PoorResizableVector<X>::Resize(int newsize)
{
    if (newsize > size)
    {
       X *newdata = new X[newsize];
       if (size > 0)
          for (int i = 0; i < size; ++i) newdata[i] = data[i];   
       size = newsize;
       delete [] data;
       data = newdata;
    }
    else if (newsize < size)
    {
       size = newsize;
    } 
}

// copy constructor

template<class X> PoorResizableVector<X>::PoorResizableVector(const PoorResizableVector<X> &a)
{
    data = new X[a.size];
    size = a.size;    
    if (size > 0)
       for (int i = 0; i < size; ++i) data[i] = a.data[i];   
}

template<class X> PoorResizableVector<X> &PoorResizableVector<X>::operator=(const PoorResizableVector<X> &a)
{
    if (this == &a) return *this;
    // note: the old array is never released here, so it is leaked
    data = new X[a.size];
    size = a.size;    
    if (size > 0)
       for (int i = 0; i < size; ++i) data[i] = a.data[i];
    return *this;   
}

template<class X> PoorResizableVector<X>::~PoorResizableVector()
{
    delete [] data;
}
This example uses the "two-phase construction" approach, as the constructor initialises the object into a default "safe" state and then relies on Resize() to do the real work. The problem is that the Resize() function may fail and throw exceptions in two distinct ways. The first way (dynamically allocating a new array) is harmless, as the C++ language guarantees that no memory will be leaked. The second possible failure occurs when copying the elements from the old array to the new: these operations can fail if assignment of an X can fail (eg X::operator=() might dynamically allocate memory, and fail by throwing an exception). If that happens, the newly allocated array is never released. The copy constructor has exactly the same problems. The assignment operator has these problems plus one more: the approach will fail for self-assignment (eg v = v;), so a test is necessary.

The basic problem is that the logic of PoorResizableVector is muddled, and no care is taken to ensure exception safety. Modifying this example, without changing the core logic, to make it provide useful guarantees would actually be quite difficult. For example, when resizing an array, it is necessary to copy existing elements into the resized array. If an exception occurs, it needs to be caught and the memory allocated for the new size array needs to be released. I leave that as an exercise.....
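
For readers who want a starting point for that exercise, here is one possible sketch of a Resize() that releases the new array if copying fails. It uses a cut-down stand-in class (SaferResizableVector, a name invented here) so that it is self-contained, assumes newsize is never negative, and includes a hypothetical Fragile element type whose assignment can be made to throw.

```cpp
#include <cstddef>

// A cut-down stand-in for PoorResizableVector, with only the members needed
// to show an exception-safe Resize(). The class name is made up for this sketch.
template<class X> class SaferResizableVector
{
    public:
        SaferResizableVector() : data(NULL), size(0) {}
        ~SaferResizableVector() { delete [] data; }
        void Resize(int newsize);
        int Size() const { return size; }
        X &operator[](int index) { return data[index]; }
    private:
        SaferResizableVector(const SaferResizableVector<X> &);             // not shown
        SaferResizableVector<X> &operator=(const SaferResizableVector<X> &);
        X *data;
        int size;
};

template<class X> void SaferResizableVector<X>::Resize(int newsize)
{
    if (newsize > size)
    {
        X *newdata = new X[newsize];     // if this throws, nothing has changed
        try
        {
            for (int i = 0; i < size; ++i) newdata[i] = data[i];
        }
        catch (...)
        {
            delete [] newdata;           // release the new array ...
            throw;                       // ... and let the caller see the error
        }
        delete [] data;                  // from here on nothing can throw
        data = newdata;
    }
    size = newsize;
}

// Element type whose assignment can be made to fail, used to exercise Resize()
struct Fragile
{
    static bool fail;
    int value;
    Fragile() : value(0) {}
    Fragile &operator=(const Fragile &other)
    {
        if (fail) throw 1;
        value = other.value;
        return *this;
    }
};
bool Fragile::fail = false;
```

With this ordering the old array is only released once the copy has succeeded, so a failed Resize() leaves the vector as it was.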

A better example is as follows.
//  example of better constructed resizable vector
template<class X> class BetterResizableVector
{
    public:
        BetterResizableVector(int default_size = 10);
        BetterResizableVector(const BetterResizableVector<X> &);
        BetterResizableVector<X> &operator=(const BetterResizableVector<X> &);
        ~BetterResizableVector(); 
        X &operator[](int index);
    private:
        X *data;
        int size;
};

template<class X> BetterResizableVector<X>::BetterResizableVector(int default_size):
   data(new X[default_size]), size(default_size)
{
}

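// copy constructor
template<class X> BetterResizableVector<X>::BetterResizableVector(const BetterResizableVector<X> &a):
   data(new X[a.size]), size(a.size)
{
    // the destructor is not run if a constructor throws, so release the array here
    try
    {
        for (int i = 0; i < size; ++i) data[i] = a.data[i];
    }
    catch (...)
    {
        delete [] data;
        throw;
    }
}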

template<class X> BetterResizableVector<X> &BetterResizableVector<X>::operator=(const BetterResizableVector<X> &a)
{
    BetterResizableVector<X> temp(a);
    // swap contents of temp and *this.
    X *ptemp = data;
    data = temp.data;
    temp.data = ptemp;
    int tsize = size;
    size = temp.size;
    temp.size = tsize;

    // the preceding 6 lines may be more simply expressed as:
    //   std::swap(size, temp.size); std::swap(data, temp.data);

    return *this;   
}

template<class X> BetterResizableVector<X>::~BetterResizableVector()
{
    delete [] data;
}
The default constructor does not rely on "two phase construction". If memory cannot be allocated, an exception is thrown and it is (for the function trying to create the vector) as if the vector has never existed. Similarly, the copy constructor must ensure that, if the copying of elements fails, the memory allocated previously is released before the exception propagates (the destructor is not run for an object whose constructor throws). The assignment operator is interesting, as it immediately creates a temporary copy of the vector being assigned. If this fails, the exception propagates before *this has been touched. The subsequent operations simply swap the data between temp and *this. As these swaps are simply swapping of two raw pointers and of two integers, no exception can occur when doing the swap. The variable temp will be destructed as the function returns, thereby cleaning up the original data in *this.

It must be noted that this approach comes at a cost: an additional object (and associated resources) must exist while the assignment operator is doing its work. In a lot of cases, safer code is sufficient to justify a short term requirement for additional resources. In some cases, it is not, and there is a trade-off between performance and the type of guarantee made when an exception is thrown.

The key, however, is that the type of exception that may be thrown by each line of code needs to be analysed, and the feasibility of recovering from it determined. In the example given, the order of operations (create a complete copy, swap over the internal representations) results in cleaner code because, if an exception occurs, the original object is left unchanged.

An alternate implementation of the assignment operator, which actually does almost the same thing, is;
template<class X> BetterResizableVector<X> &BetterResizableVector<X>::operator=(const BetterResizableVector<X> &a)
{
    X *tempX = new X[a.size];
    try
    {
       for (int i = 0; i < a.size; ++i)  tempX[i] = a.data[i];
    }
    catch (...)
    {
       delete [] tempX; 
       throw;   // rethrow, otherwise we would carry on and use the deleted array
    }
    delete [] data;
    data = tempX;
    size = a.size;
    return *this;   
}
This implementation avoids creating a temporary BetterResizableVector, and saves resources by not needing to store an additional integer value or swapping operations. Because of the need for exception handling and recovery, I would argue it is slightly more difficult to understand. In particular, it is necessary to determine what happens if an exception occurs partway through the loop. Other people may prefer the second version.

Other considerations

There are some other considerations related to using exceptions effectively.

The C++ standard effectively disallows having two active exceptions at one time. If an exception is thrown while control is being passed back up the call stack because of an existing exception, the program is required to terminate(). In practice, the only way in which an exception can be generated while another is active is from an object's destructor. This means that throwing an exception from a destructor is something to avoid. In other words, destructors should provide a "nothrow" guarantee. (Since C++11, destructors are noexcept by default, so an exception escaping one calls terminate() even when no other exception is active.) The C++ standard library also strongly discourages the practice of throwing exceptions in destructors, but for different reasons: recovering from exceptions in destructors of an arbitrary user-supplied type is quite difficult to implement and carries a significant run time cost. An example of this is
class MyException{};
class Foo
{
    public:
       ~Foo() {throw MyException();};
};

void Func()
{
    Foo x;
    throw MyException();
}

int main()
{
    try
    {
       Func();    // this will result in program termination     
    }
    catch (...)
    {
       // we will never get here
    }
    return 0;
}
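
By contrast, a destructor that has to call something that might throw can catch the exception itself so that nothing escapes. The following is only a sketch: the Connection class and its always-failing Close() are invented for illustration.

```cpp
#include <stdexcept>

// Hypothetical connection class whose Close() may fail by throwing
class Connection
{
    public:
        explicit Connection(bool &closed_flag) : closed_(closed_flag) { closed_ = false; }
        void Close()
        {
            closed_ = true;
            throw std::runtime_error("flush failed while closing");
        }
        ~Connection()
        {
            try
            {
                if (!closed_) Close();
            }
            catch (...)
            {
                // nothing sensible can be done here; the exception must not escape
            }
        }
    private:
        bool &closed_;
};

void UseConnection(bool &closed)
{
    Connection c(closed);
    throw std::logic_error("something else went wrong");
    // c is destroyed during unwinding; its destructor swallows the Close() failure
}

// Returns true if the original exception reached the caller intact
bool OriginalExceptionSurvives(bool &closed)
{
    try
    {
        UseConnection(closed);
    }
    catch (const std::logic_error &)
    {
        return true;
    }
    return false;
}
```

The price is that the Close() failure is silently lost, so callers who care about it should call Close() explicitly before the object goes out of scope.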

The basic mechanism by which exceptions are propagated to callers is compiler dependent. However, the C++ standard allows invoking of the exception's copy constructor in that process. This means it is necessary to ensure that copying of exception types cannot throw exceptions (the result may be a call to terminate() which, by default, calls abort()).

It is also not a good idea to define exception types so that constructing them is able to throw an exception. If this is done, an exception may occur in the process of constructing an exception, and the caller might be thrown a different exception from that intended. A side effect of this is that it is not a good idea to use dynamic memory allocation when creating an exception type. This means it is a poor idea to use types like std::string as members of an exception type. For example;
#include <string>

struct MyException
{
    std::string type;   // BAD:  construction of this may throw
};

void Func()
{
    MyException e;      // either line in this function may throw
    e.type = "Hello";
    throw e;
}
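
One way around this, sketched below, is an exception type that holds nothing but a pointer to a string literal, so that neither constructing nor copying it allocates memory. SimpleError and the helper functions are hypothetical names, and the approach assumes the text being pointed at outlives the exception (which is true of string literals).

```cpp
#include <cstring>

// Hypothetical exception type that holds only a pointer to a string literal.
// Constructing and copying it cannot throw, as no memory is allocated.
class SimpleError
{
    public:
        explicit SimpleError(const char *literal) : what_(literal) {}
        const char *What() const { return what_; }
    private:
        const char *what_;    // must point at storage that outlives the exception
};

void FuncThatFails()
{
    throw SimpleError("Hello");
}

// Returns true if the expected message arrives at the handler
bool MessageArrives()
{
    try
    {
        FuncThatFails();
    }
    catch (const SimpleError &e)
    {
        return std::strcmp(e.What(), "Hello") == 0;
    }
    return false;
}
```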

Acknowledgements: Although this material is simplified and structured a little differently, significant parts are based on Appendix E of "The C++ Programming Language, Special Edition" by Bjarne Stroustrup, 2000, ISBN 0-201-70073-5.