Saveen Reddy and G. Bowden Wise
Welcome
to the inaugural edition of the ObjectiveViewPoint column!
Here we will touch on many aspects of object-orientation.
The word object has surfaced in more ways than you can count.
There are OOPLs (Object-Oriented Programming Languages) and
OODBs (Object-Oriented Databases), OOA (object-oriented analysis),
and OOD (object-oriented design). We are sure you can come
up with some OOisms of your own.
Our
goal in this column is to explore object-orientation through
practical object-oriented programming. This time, we look
at C++, but in the future we will explore other areas of object-orientation.
Learning an object-oriented language, a whole new way of programming, will
pave the way for many exciting topics down the road.
Our
intended audience ranges from humble beginners to seasoned
hackers. We assume that you have programmed in at least one
procedural language, such as C or Pascal. Even if you are
familiar with C++, please stay with us; you may learn some
interesting new language features. Also, we will illustrate
our points with many self-contained examples that you may
later wish to incorporate into your own programs.
C++: A Historical Perspective
We
begin our journey through C++ with a little history. C, the predecessor
to C++, has become one of the most popular programming languages.
Originally designed for systems programming, C enables programmers
to write efficient code and provides close access to the machine.
C compilers, found on practically every Unix system, are now
available with most operating systems.
During
the 1980s and into the 1990s, an explosive growth in object-oriented
technology began with the introduction of the Smalltalk language.
Object-Oriented Programming (OOP) began to replace the more
traditional structured programming techniques. This explosion
led to the development of languages which support programming
with objects. Many new object-oriented programming languages
appeared: Object-Pascal, Modula-2, Mesa, Cedar, Neon, Objective-C,
LISP with the Common Lisp Object System (CLOS), and, of course,
C++. Although many of these languages appeared in the 1980s,
many ideas of OOP were taken from Simula-67. Yes! OOP has
been around since 1967.
C++
originated with Bjarne Stroustrup. In the simplest sense,
if not the most accurate, we can consider it to be a better
C. Although it is not an entirely new language, C++ represents
a significant extension of C's abilities. We might then consider
C to be a subset of C++. C++ supports essentially every desirable
behavior and most of the undesirable ones of its predecessor,
but provides general language improvements as well as adding
OOP capability. Note that using C++ does not imply that you
are doing OOP. C++ does not force you to use its OOP features.
You can simply create structured code that uses only C++'s
non-OOP features.
C++: A Better C
The
designers of C++ wanted to add object-oriented mechanisms without
compromising the efficiency and simplicity that made C so popular.
One of the driving principles for the language designers was
to hide complexity from the programmer, allowing her to concentrate
on the problem at hand.
Because
C++ retains C as a subset, it gains many of the attractive
features of the C language, such as efficiency, closeness
to the machine, and a variety of built-in types. A number
of new features were added to C++ to make the language even
more robust, many of which are not used by novice programmers.
By introducing these new features here, we hope that you will
begin to use them in your own programs early on and gain their
benefits. Some of the features we will look at are the role
of constants, inline expansion, references, declaration statements,
user defined types, overloading, and the free store.
Most
of these features can be summarized by two important design
goals: strong compiler type checking and a user-extensible
language.
By
enforcing stricter type-checking, the C++ compiler makes us
acutely aware of data types in our expressions. Stronger type
checking is provided through several mechanisms, including:
function argument type checking, conversions, and a few other
features we will examine below.
C++
also enables programmers to incorporate new types into the
language, through the use of classes. A class is a user-defined
type. The compiler can treat new types as if they are one
of the built-in types. This is a very powerful feature. In
addition, the class provides the mechanism for data abstraction
and encapsulation, which are key to object-oriented programming.
As we examine some of the new features of C++ we will see
these two goals resurface again and again.
A NEW FORM FOR COMMENTS.
It
is always good practice to provide comments within your code
so that it can be read and understood by others. In C, comments
were placed between the tokens /* and */ like
this:
/* This is a traditional C comment */
C++
supports traditional C comments and also provides an easier
comment mechanism, which only requires an initial comment delimiter:
// This is a C++ comment
Everything
after the // and to the end of the line is a comment.
THE CONST KEYWORD.
In
C, constants are often specified in programs using #define
. The #define is essentially a macro expansion facility,
for example, with the definition:
#define PI 3.14159265358979323846
the
preprocessor will substitute 3.14159265358979323846 wherever
PI is encountered in the source file. C++ allows any
variable to be declared a constant by adding the const keyword
to the declaration. For the PI constant above, we would
write:
const double PI = 3.14159265358979323846;
A
const object may be initialized, but its value may never
change. The fact that an object will never change allows the
compiler to ensure that constant data is not modified and to
generate more efficient code. Since each const element also
has an associated type, the compiler can also do more explicit
type checking.
A
very powerful use of const is found when it is combined with
pointers. By declaring a ``pointer to const'', the pointer
cannot be used to change the pointed-to object. As an example,
consider:
int i = 10;
const int *pi = &i;
*pi = 15;
// Not allowed! pi is a pointer to const!
It
is not possible to change the value of i through the pointer
because *pi is constant. A pointer used in this way can
be thought of as a read-only pointer; the pointer can be used
to read the data to which it points, but the data cannot be
changed via the pointer. Read-only pointers are often used by
class member functions to return a pointer to private data stored
within the class. The pointer allows the user to read, but not
change, the private data.
Unfortunately,
the user can still modify the data pointed at by the read-only
pointer by using a type cast. This is called ``casting away
the const-ness''. Using the above example, we can still change
the value of i like this:
// Cast away the constness of the pi pointer and modify i
*((int*) pi) = 15;
By
returning a pointer to const we are telling users to keep their
hands off of internal data. The data can still be modified,
but only with extra work (the type cast). So, in most cases
users will realize they are not to modify that data, but can
do so at their own risk.
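As a minimal sketch of this idea, the class below hands out a read-only pointer to its private data. The names Buffer, Data(), and Sum() are hypothetical and chosen only for illustration.

```cpp
// Hypothetical class, for illustration only: it owns a small buffer and
// hands out a read-only pointer to it.
class Buffer {
public:
    Buffer() { for (int i = 0; i < 4; i++) data[i] = i; }

    // Pointer to const: callers can read the elements but cannot assign
    // through the returned pointer without a cast.
    const int* Data() const { return data; }

private:
    int data[4];
};

// Sums the buffer using only read access.
int Sum(const Buffer& b)
{
    const int* p = b.Data();
    int total = 0;
    for (int i = 0; i < 4; i++)
        total += p[i];      // reading is fine
    // p[0] = 99;           // would not compile: *p is const
    return total;
}
```

A caller who writes Sum(b) for a Buffer b gets the total of the stored values, and any attempt to assign through the pointer returned by Data() is rejected at compile time.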
There
are two ways to add the const keyword to a pointer declaration.
Above, when const comes before the * , what the pointer
points to is constant. It is not possible to change the variable
that is pointed to by the pointer. When const comes after
the *, like this:
int i = 10;
int j = 11;
int* const ptr = &i;
// Pointer initialized to point to i
the
pointer itself becomes constant. This means that the pointer
cannot be changed to point to some other variable after it
has been initialized. In the above example, the pointer ptr
must always point at the variable i. So, statements such
as:
ptr = &j;
// Not allowed, since the pointer is const!
are
not allowed and are caught by the compiler. However, it is possible
to modify the variable that the pointer points to:
*ptr = 15;
// This is ok, what is pointed at is not const
If
we want to prevent modification of what the pointer points to
and prevent the value of the pointer from being changed, we
must provide a const on both sides of the * like
this:
const int * const ptr = &i;
Remember
that adding const to a declaration simply invokes extra
compile time type checking; it does not cause the compiler
to generate any extra code. Another advantage of using the
const mechanism is that the C++ construct will be available
to a symbolic debugger, while the preprocessing symbols generally
are not.
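The three pointer forms discussed in this section can be put side by side. The function below is a sketch written for this summary (the name ConstPointerForms is ours); the commented-out lines are the ones a compiler should reject.

```cpp
// Returns the sum of the values seen through p1 and p3 after the
// permitted writes below.
int ConstPointerForms()
{
    int i = 10;
    int j = 11;

    const int* p1 = &i;        // pointer to const: *p1 is read-only
    p1 = &j;                   // ok: the pointer itself may change
    // *p1 = 0;                // error: cannot write through p1

    int* const p2 = &i;        // const pointer: always points at i
    *p2 = 15;                  // ok: what it points at is not const
    // p2 = &j;                // error: p2 cannot be reseated

    const int* const p3 = &i;  // both: read-only and fixed
    // *p3 = 0;  p3 = &j;      // both are errors

    return *p1 + *p3;          // j (11) plus i (now 15)
}
```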
INLINE EXPANSION
Another
common use of the C #define macro expansion facility
is to avoid function call overhead for small functions. Some
functions are so small that the overhead of invoking the function
call takes more time than the body of the function itself. C++
provides the inline keyword to inform the compiler to place
the function inline rather than generate the code for calling
the routine. For example, the macro
#define max(x, y) ((x)>(y)?(x):(y))
can
be replaced for integers by the C++ inline function
inline int max (int x, int y)
{
return (x > y ? x : y);
}
When
a similar function is needed for multiple types, the C++ template
mechanism can be used.
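A template version might look like the sketch below. We call it Max, with a capital M, purely to avoid clashing with any max that a library may already provide; that name is our choice for this example.

```cpp
// One definition serves every type that supports operator>.
template <class T>
inline T Max(T x, T y)
{
    return (x > y ? x : y);
}
```

The compiler generates a separate function for each type it is used with, so Max(3, 7) compares ints while Max(2.5, 1.5) compares doubles, and both keep full type checking.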
Macro
expansion can lead to notorious results when encountering
an expression with side effects, such as
max (f(x), z++);
which,
after macro expansion becomes:
((f(x)) > (z++) ? (f(x)) : (z++));
The
variable z will be incremented once or twice, depending on
which of the two arguments is larger, and f(x) may likewise be
called twice. Such errors are avoided when using the inline mechanism.
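The difference is easy to observe. In the sketch below, MAX_MACRO, MaxInline, and the two helper functions are names invented for this demonstration; each helper returns the value of z after a single call.

```cpp
#define MAX_MACRO(x, y) ((x) > (y) ? (x) : (y))

inline int MaxInline(int x, int y)
{
    return (x > y ? x : y);
}

// The macro pastes z++ into the expansion twice.
int ZAfterMacro()
{
    int z = 5;
    (void) MAX_MACRO(1, z++);  // z++ runs in the test and again in the result
    return z;                  // 7
}

// The inline function evaluates each argument exactly once.
int ZAfterInline()
{
    int z = 5;
    (void) MaxInline(1, z++);
    return z;                  // 6
}
```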
When
defining a C++ class, the body of a class member function
can also be specified. Such a function is implicitly declared inline,
although inline is only a request: some compilers decline to expand
functions that contain loops (e.g., while).
For example:
class A {
int a;
public:
A() { }
// inline
int Value()
{
return a;
}
// inline
};
Since
the code for both the constructor A() and the member
function Value() are specified as part of the class
definition, the code between the braces will be expanded inline
whenever these functions are invoked.
REFERENCES
Unlike
C, C++ provides true call-by-reference through the use of reference
types. A reference is an alias, or alternative name, for an existing object.
A reference is similar to a pointer in that it must be initialized
before it can be used. For example, let's declare an integer:
int n = 10;
and
then declare a reference to it:
int& r = n;
Now
r is an alias for n; both identify the same object
and can be used interchangeably. Hence, the assignment
r = -10;
changes
the value of both r and n to -10.
It
is important to note that initialization and assignment are
completely different for references. A reference must have
an initializer. Initialization is an operation that operates
only on the reference itself. The initialization
int& r = n;
establishes
the correspondence between the reference and the data object
that it names. Assignment behaves like we expect an operation
to, and operates through the reference on the object referred
to. The assignment,
r = -10;
is
the same for references as for any other lvalue, and simply
assigns a new value to the designated data object.
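A short sketch makes the distinction concrete. The function name ReferenceIsAlias and the extra variable m are ours; the point to notice is that assigning m to r copies a value and does not make r refer to m.

```cpp
// Returns true if the reference behaves as an alias for n throughout.
bool ReferenceIsAlias()
{
    int n = 10;
    int m = 20;

    int& r = n;    // initialization: r now names n, permanently
    r = -10;       // assignment: writes -10 into n
    bool first = (n == -10);

    r = m;         // still assignment: copies m's value into n;
                   // r does not start referring to m
    m = 30;        // changing m afterwards does not affect r or n
    return first && n == 20 && r == 20 && &r == &n;
}
```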
C
programmers know that C uses the call-by-value parameter mechanism.
In order to enable functions to modify the values of their
parameters, pointers to the parameters must be used as the
``value'', which is passed. For example, a routine Swap(),
which swaps its parameters would be written like this in C:
void Swap (int* a, int* b)
{
int tmp;
tmp = *a;
*a = *b;
*b = tmp;
}
The
routine would be invoked like this:
int x = 1;
int y = 2;
Swap (&x, &y);
C
programmers are all too familiar with what happens when one
of the ampersands is forgotten; the program usually ends with
a core dump!
Now
consider the C++ version of Swap() which makes use
of true call-by-reference.
void Swap (int& a, int& b)
{
int tmp;
tmp = a;
a = b;
b = tmp;
}
The
routine would be invoked like this:
int x = 1;
int y = 2;
Swap (x, y);
The
compiler ensures that the parameters of Swap() will be
passed by reference. In C, often a run-time error results if
the value of a parameter is passed instead of its address. References
eliminate these errors and are syntactically more pleasing.
Another
use for references is as return types. Consider this routine:
int& FindByIndex (int* theArray,int index)
{
return theArray[index];
}
Note
that the FindByIndex() returns a reference to the element
in the array rather than its value. The expression FindByIndex
(A, i) yields a reference to the ith element of the array
A. Because a reference is an lvalue, it can be used
on the left-hand side of an assignment, so we can write:
FindByIndex(A, i) = 25;
which
will assign 25 to the ith element of the array A.
Note
that if FindByIndex() is made inline, the overhead
due to the function call is eliminated. Inline functions that
return references are attractive for the sake of efficiency.
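Putting the pieces together, a sketch of the inline version and its use follows. StoreAndRead is a helper invented for this example; like the routine above, FindByIndex performs no bounds checking.

```cpp
inline int& FindByIndex(int* theArray, int index)
{
    return theArray[index];   // no bounds check, as in the routine above
}

// Writes through the returned reference and reports the stored values.
int StoreAndRead()
{
    int A[3] = { 1, 2, 3 };
    FindByIndex(A, 1) = 25;     // assigns to A[1]
    FindByIndex(A, 2) += 10;    // any lvalue operation works
    return A[1] + A[2];         // 25 + 13
}
```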
DECLARATIONS AS STATEMENTS.
In
a C++ program, a declaration can be placed wherever a statement
can appear, which can be anywhere within a program block. Any
initializations are done each time their declaration statement
is executed. Suppose we are searching a linked list for a certain
key:
int IsMember (const int key)
{
int found = 0;
if (NotEmpty())
{
List* ptr = head;
// Declaration
while (ptr && !found)
{
int item = ptr->data;
// Declaration
ptr = ptr->next;
if (item == key)
found = 1;
}
}
return found;
}
By
putting declarations closer to where the variables are used,
you write more legible code.
IMPROVED TYPE SYSTEM.
Through
the use of classes, user-defined types may be created, and if
properly defined, C++ will behave as if they are one of the
built-in types: int, char, float, and double.
It is possible to define a Vector type and perform operations
such as addition and multiplication just as easily as is done
with ints:
// Define some arrays of doubles
double a[3] = { 11, 12, 13 };
double b[3] = { 21, 22, 23 };
// Initialize vectors from the
// double arrays
Vector v1 = a;
Vector v2 = b;
// Add the two vectors.
Vector v3 = v1 + v2;
The
Vector class has been defined with all of the appropriate
arithmetic operations so that it can be treated as a built-in
type. It is even possible to define conversion operators. For example,
if we define a conversion from Vector to double that returns
the magnitude, or norm, of the Vector, we can write:
double norm = (double) v3;
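The definition of this Vector is not shown here. One way it could be written, purely as a sketch under our own assumptions (three elements, a constructor taking a pointer to doubles, and an At() accessor added for inspection), is:

```cpp
#include <cmath>

// Hypothetical three-element Vector, for illustration only.
class Vector {
public:
    // Converting constructor: allows "Vector v1 = a;" for a double array.
    Vector(const double* src)
    {
        for (int i = 0; i < 3; i++) elem[i] = src[i];
    }

    // Element-wise addition.
    Vector operator+(const Vector& rhs) const
    {
        double sum[3];
        for (int i = 0; i < 3; i++) sum[i] = elem[i] + rhs.elem[i];
        return Vector(sum);
    }

    // Conversion operator: a Vector used as a double yields its norm.
    operator double() const
    {
        return std::sqrt(elem[0]*elem[0] + elem[1]*elem[1] + elem[2]*elem[2]);
    }

    // Read access to one element (added for this sketch).
    double At(int i) const { return elem[i]; }

private:
    double elem[3];
};
```

With these three members in place, the initializations, the addition, and the cast shown above all compile without any special support from the language.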
OVERLOADING.
One
of the many strengths of C++ is the ability to overload functions
and operators. By overloading, the same function name or operator
symbol can be given several different definitions. The number
and types of the arguments supplied to a function or operator
tell the compiler which definition to use. Overloading is most
often used to provide different definitions for member functions
of a class. But overloading can also be used for functions that
are not a member of any class.
Suppose
we need to search different types of arrays for a certain
value. We can provide implementations for searching arrays
of integers, floats, and doubles:
int Search (
const int* data,
const int key);
int Search (
const float* data,
const float key);
int Search (
const double* data,
const double key);
The
compiler will ensure that the correct function is called based
on the types of the arguments passed to Search(). When
arguments do not exactly match the formal parameter types, the
compiler will perform implicit type conversions (e.g., int
to float) in an attempt to find a match.
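As a sketch of how two of these overloads might be implemented, consider the code below. The count parameter is our addition, since the declarations above do not say how the array length is known, and returning -1 for a missing key is likewise an assumption.

```cpp
// Each overload returns the index of key in data, or -1 if it is absent.
// The count parameter is an assumption made for this sketch.
int Search(const int* data, int count, const int key)
{
    for (int i = 0; i < count; i++)
        if (data[i] == key) return i;
    return -1;
}

int Search(const double* data, int count, const double key)
{
    for (int i = 0; i < count; i++)
        if (data[i] == key) return i;
    return -1;
}
```

A call with an int array selects the first definition and a call with a double array selects the second; the caller writes Search() in both cases.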
Overloading
is most often used for member functions and operators of classes.
Most classes have overloaded constructors, for there is often
more than one way to create a given object. All of the built-in
types also have operators such as addition, subtraction, multiplication,
and division. In fact, we can mix different types and still
add them together:
int i = 1;
char c = 'a';
float f = -1.0;
double d = 100.0;
int result = i + c + f + d;
The
compiler applies the type conversions appropriate for
the above calculation. When we define our own types, we can
inform the compiler which operations and type conversions can
be applied to our type. The compiler will allow our type to
blend in with the built-in types. We will see more examples
of this when we look at classes in detail.
A FREE STORE IS PROVIDED.
In
C, variables are placed in the free store by using the sizeof
operator to determine the needed allocation size and then calling
malloc() with that size. Variables are removed from the
free store by calling free(). With classes, using malloc()
and free() becomes tedious. C++ provides the operators
new and delete, which can allocate not only built-in
types but also user-defined types. This provides a uniform mechanism
for allocating and deallocating memory from the free store.
For
example, to allocate an integer:
int *pi;
pi = new int;
*pi = 1;
and
to allocate an array of 10 ints:
int *array = new int [10];
for (int i=0;i < 10; i++)
array[i] = i;
Just
as with malloc() the memory returned by new is
not initialized; only static memory has a default initial value
of zero.
Suppose
we have defined a type for complex numbers, called complex.
We can dynamically allocate a complex number as follows:
complex* pc = new complex (1, 2);
In
this case, the complex pointer pc will point to the complex
number 1 + 2i.
All
memory allocated using new should be deallocated using
delete. However, delete takes on different forms
depending on whether the variable being deleted is an array
or a simple variable. For the complex number above, we simply
call delete:
delete pc;
Delete
calls the destructor for the object to be deleted. However,
to delete each element of an array, you must explicitly inform
delete that an array is to be deleted:
delete [] array;
The
C++ compiler maintains information about the size and number
of objects in an array and retrieves this information when deleting
an array. The empty bracket pair informs the compiler to call
the class destructor for each element in the array.
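To see the destructor calls, we can count live objects. Tracked and LiveAfterCleanup are names invented for this sketch.

```cpp
// Hypothetical class that counts live objects so the effect of
// delete and delete [] can be observed.
class Tracked {
public:
    Tracked()  { ++live; }
    ~Tracked() { --live; }
    static int live;
};
int Tracked::live = 0;

// Returns peak * 10 + remaining, where peak is the number of live
// objects after both allocations and remaining is the number left
// after both deletions.
int LiveAfterCleanup()
{
    Tracked* one  = new Tracked;       // one constructor call
    Tracked* many = new Tracked[3];    // three constructor calls
    int peak = Tracked::live;          // 4 at this point

    delete one;                        // one destructor call
    delete [] many;                    // destructor runs for all three elements
    return peak * 10 + Tracked::live;
}
```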
Be
careful: attempting to delete a pointer that has not
been initialized by new results in undefined program behavior.
However, it is safe to apply the delete operator to a null
pointer.
New
and delete are global C++ operators and can be redefined
(e.g., if it is desirable to trap every memory allocation).
This is useful in debugging, but is not recommended for general
programming. More often, the operators new and delete
are overridden by providing new and delete operators for a
specific class.
When
C++ allocates memory for a user-defined class, the new
operator for that class is used if it exists, otherwise the
global new is used. Most often, programmers define
new for certain classes to achieve improved memory management
(e.g., reference counting for a class).
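A class-specific new and delete could be sketched as follows. Node and its two counters are hypothetical; here the operators simply count calls and forward to malloc() and free(), whereas a real class would more likely draw from its own pool of memory.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical class that counts its own allocations.
class Node {
public:
    Node(int v) : value(v) { }

    // Used instead of the global operator new for "new Node(...)".
    static void* operator new(std::size_t size)
    {
        ++allocations;
        void* p = std::malloc(size);
        if (p == 0) throw std::bad_alloc();
        return p;
    }

    // Matching deallocation function, used by "delete ptr".
    static void operator delete(void* p)
    {
        ++deallocations;
        std::free(p);
    }

    static int allocations;
    static int deallocations;
    int value;
};
int Node::allocations = 0;
int Node::deallocations = 0;
```

Allocating a Node goes through Node's operators, while allocating an int still uses the global ones, so the counters record only Node traffic.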
The Class: Data Encapsulation, Data Hiding, and Objects
Like
a C structure, a C++ class is a data type. An object is simply
an instantiation of a class. C++ classes have additional capabilities
as the following example should show:
Vector v1(1,2);
Vector v2(2,3);
Vector vr;
vr = v1 + v2;
Vector
is a class. v1, v2, and vr are objects
of class Vector. v1 and v2 are given initial
values through their constructor. vr is also initialized
through its constructor to certain default values. The example
illustrates a major power of C++. Namely, we can define functions
on a class as well as data members. Here, we have an overloaded
addition operator which makes our expression involving Vectors
seem much more natural than the equivalent C code:
Vector v1, v2, vr;
add_vector( &vr , &v1, &v2 );
The
ability to define these member functions allows us to have a
constructor for Vector, code that creates an object of
class Vector. The constructor ensures proper initialization
of our Vectors.
Though
not illustrated in the above example, a class can limit the
use of its data members and member functions by non-member
code. This is encapsulation. If class K defines member M as
private, then only members of class K can use M. Defining
M as public means any other class or function can use M.
Let's
take a look at a trivial implementation of Vector that will
show us a little about constructors, operators, and references.
#include <iostream>
using namespace std;
class Vector
{
public:
Vector(double new_x=0.0,double new_y=0.0) {
if ((new_x<100.0) && (new_y<100.0))
{
x=new_x;
y=new_y;
}
else
{
x=0;
y=0;
}
}
Vector operator +
( const Vector & v)
{
return
(Vector (x + v.x, y + v.y));
}
void PrintOn (ostream& os)
{
os << "["
<< x
<< ", "
<< y
<< "]";
}
private:
double x, y;
};
int main()
{
Vector v1, v2, v3(0.0,0.0);
v1=Vector(1.1,2.2);
v2=Vector(1.1,2.2);
v3=v1+v2;
cout << "v1 is ";
v1.PrintOn (cout);
cout << endl;
cout << "v2 is ";
v2.PrintOn (cout);
cout << endl;
cout << "v3 is ";
v3.PrintOn (cout);
cout << endl;
}
Encapsulation
of x and y means that they cannot be altered without
the help of specific member functions. Any member function or
data member of Vector can use x and y freely.
For everyone else, the member functions provide a strict interface.
They ensure a particular behavior in our objects. In the example
above, no Vector can be created that has an x
or y component that exceeds 100. If at some point code
tries to do this, then the constructor performs bounds-checking
and sets x and y both to zero. In a normal C structure
we can simply do the following:
Vector v1;
InitVector( & v1, 99 , 99 );
v1.x = 1000;
InitVector()
closely approximates a C++ constructor. Assume it tries to behave
like the constructor Vector() in the class above. This C code
demonstrates how without encapsulation we can easily violate
the rules set up in our pseudo-constructor. With class Vector,
both x and y are private. As a result, they can
only be accessed by member functions. If our goal is to prevent
x and y from exceeding 100, we simply have all
accessor functions perform bounds-checking. In fact, once created
and outside of the addition operation there's no way to modify
x or y. They are private and no member function,
outside of the constructor, sets their values. Notice how the
constructor Vector() limits our Vector component values.
By returning a new object, the addition operator uses the constructor
to check for overflow.
We
could have made `+' do multiplication instead. Though
such manipulation is atypical, it can be quite useful. For
example, C++ comes standard with a streams library which uses
the << operator to provide output.
There
is one useful thing about the addition operator: we don't
have to pass the addresses of arguments. The declaration of
the addition operator specifies that its parameter is a reference
(using the reference declarator &, which is not the same as the address-of
operator &). Recall that a reference lets
us use the same calling syntax as call-by-value while
avoiding the overhead of copying the argument into a new object. Because the
parameter is declared const, the operator cannot modify its argument. We also
avoid the explicit indirection that pointers would require.
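The saving can be measured by counting copy-constructor calls. Big, ByValue, and ByReference are hypothetical names used only in this sketch.

```cpp
// Hypothetical class that counts how often it is copied.
class Big {
public:
    Big() { }
    Big(const Big&) { ++copies; }
    static int copies;
};
int Big::copies = 0;

// Pass by value: the argument is copied into the parameter.
void ByValue(Big) { }

// Pass by const reference: no copy is made, and the callee
// cannot modify the caller's object.
void ByReference(const Big&) { }
```

Calling ByValue(b) for an existing object b increases Big::copies by one, while calling ByReference(b) leaves it unchanged; the call syntax is identical in both cases.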
However,
the most powerful OOP extension C++ provides is probably inheritance.
Classes can inherit data and functions from other classes.
For this purpose we can declare members in the base class
as protected: not usable publicly but usable by derived classes.
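A brief sketch of a protected member in use follows. Shape and Rectangle are hypothetical classes chosen for illustration.

```cpp
// Hypothetical base class, for illustration only.
class Shape {
public:
    Shape(double w, double h) : width(w), height(h) { }
protected:
    double width, height;   // hidden from outside code, visible to derived classes
};

// Rectangle inherits the data members of Shape.
class Rectangle : public Shape {
public:
    Rectangle(double w, double h) : Shape(w, h) { }

    // A derived class may use the protected members of its base.
    double Area() const { return width * height; }
};
```

Code outside these classes can call Area() on a Rectangle, but an expression such as r.width is rejected by the compiler because width is protected.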
In
conclusion, we looked at some of the features that make C++
a better C. C++ provides stronger type checking by checking
arguments to functions and reduces syntactic errors through
the use of reference types. Programmers can also add new types
to the language by defining classes. Although we have only
taken a brief look at classes, we will see more abstract discussion
of C++ object-orientation as well as general OOP concepts
in upcoming columns.
Copyright 1994 by Saveen Reddy and Bowden Wise