A different name was considered for this book: Tips and Tricks of the C++ Masters. But this led to the connotation that this was a book of C++ peculiarities and wizards' tricks. While there might be one or two of those to be found -- most of them gleaned from looking at the source code for early STL versions -- that isn't the focus of the book.
Instead, this book is for the professional developer of large scale C++ programs. It's about standard practices, regularity, simplicity, and correctness.
Experience can be a difficult taskmaster. Hopefully this book can help the reader avoid a few painful lashes.
This book is not a tutorial of the C++ language. Rather, it addresses approaches to successfully using existing C++ compilers to develop real world programs in a professional manner -- and, hopefully, as quickly and as painlessly as possible. The following chapters are organized such that they can be easily used as a reference, but also contain sufficient explanatory material to justify the advice found in them. Each chapter begins with a very quick summary of what is said in the chapter -- usually as an indented list of suggestions. The remainder of each chapter expands on and justifies each suggestion.
The remainder of the book is arranged according to this broad outline:
Object oriented thinking can be extremely helpful in system design and analysis and should be well understood by professional programmers -- particularly for application level class designs. However, focusing only on object oriented methodology can lead to a failure to use some of the more helpful features of C++ -- particularly templates and meta-object construction.
That is, C++ provides OO tools for high level program artifacts, but it also provides its own particular specialties to assist in the day to day mechanics of development, quite apart from OO methodology. Like many complex things, there are several layers at which C++ can be understood. The following paragraphs discuss these layers.
You really only need to understand C to use this level. This level of understanding gives you better compile errors, but will not allow you to take advantage of the C++ features that make the language worthwhile to use. Features you will begin to notice at this level include:
C++ complains more about questionable practices, so it helps you avoid problems. At this level, you need to start learning to do without macros. Templates, inline functions, and reference variables (used as function arguments) almost completely eliminate the need for C macros. Note that const variable declarations and enums remove the need for #defining constants. For example:
enum constants_list
{
    constant_name_1 = constant_value_1,
    constant_name_2 = constant_value_2
};
#define'd constants are really only needed for integers and single characters -- because compile time constants are required for switch statement case selectors and array sizes, and neither strings nor floats can be used in those roles anyway. String and float #defined constants can be just as effectively handled with global const variables instead.
C++ class and struct design allows related functions to be packaged together with an enforced naming convention. That is, the names of class methods all begin with the class' name: SomeClass::some_member(). Further, the member access declarators let the class designer prohibit access to methods that are not part of documented interfaces. Finally, constructors and destructors help guarantee that package specific design assumptions are enforced. That is, when a class is designed, object member variable values are assumed. In plain old C, without constructors and with no power to prevent the calling of the wrong functions, these guarantees are impossible to enforce.
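For example, here is a minimal sketch of such a guarantee (the class is invented for illustration):

class Counter
{
public:
    Counter() : count_(0) {}              // the constructor guarantees a sane start state
    void increment() { ++count_; }
    int count() const { return count_; }  // read-only access for outsiders
private:
    int count_;                           // callers cannot corrupt this directly
};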
The first step in true object oriented programming is understanding the encapsulation of data and methods in classes. The next is to learn to design a class' methods so that they guarantee the internal state of the data in the class objects at all times, and then to prevent access by outside functions which might not obey these state guarantees.
The C++ language does not force object oriented programming on the developer -- it is up to the developer to rigorously attend to this paradigm.

Collections of class objects can be implemented in numerous ways. Some standard container paradigms are: vectors, stacks, lists, associative arrays, etc. Templates let you instantiate behaviorally consistent but class specific collections of objects in an efficient manner. There is an ANSI standard set of container classes, the STL (Standard Template Library). Each compiler also provides its own approximation to the STL; these are typically not 100% compatible. You can make your own container templates and port them, but doing so is by no means a trivial task. It requires that you understand how new and delete really work, how the assignment and copy constructor logic works, etc. You really need a good understanding of pointers, references, inline functions, and operator overloading to build an efficient container class.

Here, you build data structures that emulate the atomic objects. For example, you might want to implement a representation of time that can be treated as an integer, in that you can assign an integer to the time and vice versa. You would want to be able to use the time on the right hand side of operator= to perform calculations involving integers and time values. Further, you probably want a character string representation, e.g. "12/28/57-11:03pm". So, you'd want to be able to write expressions like this:
A = B + C + D + E;
concatenate B with C and store the result in T1
concatenate T1 with D and store the result in T2
concatenate T2 with E and store the result in T3
store T3 in A
T3 contains a pointer to T2 and to E
T2 contains a pointer to T1 and to D
T1 contains a pointer to B and to C
A = B * C * D * E;
There is no substitute for practice when it comes to software development. The more often you deal with your editor, the compiler, its error messages, and porting, the easier it will be to do these things. But don't play in the production code base. Do these things in your own personal directory. If an idea pops into your head, create a small C++ source file to check it out. Then port it to all platforms available to you.

A good configuration management system is essential to the success of any production software product. For individual developers, the most important benefit it provides is the ability to get back the code you used to have after you have screwed it up. This ability to run back time, as it were, allows you to edit your code without fear. Many developers are rightly afraid that they will mess up the product and spend a lot of time and effort getting it fixed. A good CM system will help you overcome the irrational aspects of this fear -- you can always find out from the CM system who did what -- and how to undo it.
A good CM system is not necessarily cheap, nor is it necessarily easy to use. Just because you can get one, like CVS, for free does not necessarily mean that it is a good system. Whatever system you use, be an expert with it. If you don't become an expert, you are likely to be more afraid of it than you are of breaking the code base. Learn the CM system and become an expert.
In particular, you should learn how to make branches and apply labels for your own private use. Once you have learned to branch the code base, you can make any change you want with impunity. If you completely screw it up, just delete the branch -- or simply ignore it and go back to the 'standard' branch. ClearCase, for example, makes this very simple; it is a bit more complex with CVS.

Because people are habitual in nature, we sometimes do the same wrong thing over and over again in dozens of files. Perhaps your organization established standards and practices that later turned out to be a bad idea. The ubiquity of a mistake should not be an impediment to fixing it.
Don't be afraid to edit your code using some text processing script language. The standard unix tool, sed, is relatively easy to use for simple textual substitutions using regular expressions. For more sophisticated changes -- or substitutions that span multiple lines of text -- you'll have to use perl.
If you have a good CM system, you should not be afraid to sweep through every file, check it out and change it with a script. If your script messes up a file, just get back the prior version and make the needed changes by hand. Even if you don't have a good CM system, you can just copy the files to another directory and do your experiments there.
Often, you can only fix some significant fraction of the code using a script -- perfecting the script to the point where it fixes all the cases might well take longer than fixing the remainder by hand. Thus, in many cases, when a script is used there will still be some hand work to be done.
"Algorithms and iterators work so well together because they nothing about each other". -- Andrew Koenig.The standard template library was design by Mr. Andrew Koenig. If ever there was a 'gods gift to programming' -- he's it. The rest of us can relax -- the job is already filled.
(Of course, Bjarne Stroustrup is the giver of the gift!)
In addition to being 'standard', the STL is a great source of programming and implementation paradigms. Chief among them are 'algorithms', 'containers', and 'iterators'.
The STL containers each provide four iterator types:

- iterator
- const_iterator
- reverse_iterator
- const_reverse_iterator (the const version of the reverse_iterator)

Output iterators support only operator* and operator++. A good example is the std::ostream_iterator. In that case, the increment operator causes the data to be physically written. Note that output iterators cannot (conceptually at least) be copied -- although they can be returned from functions. You can't save them in temporary variables and use the saved value later.

Input iterators likewise support only operator* and operator++. The same restrictions about saving input iterators apply here. A good example is the std::istream_iterator.

Bidirectional iterators add operator--. Random access iterators further add operator[]().

When writing your own containers, provide at least an iterator class and a const_iterator. The reverse iterators are rarely needed.
Make sure that all your iterator member functions are as fast as possible and as small as makes sense. Do not, for example, design a forward iterator which supports the array index operator. This gives the user the false impression that the operation will be fast -- which it won't be.
Be aware that different compilers handle const parameter types in slightly different ways.
When writing your own template algorithms, try to "prefer the standard to the offbeat". Do not use operator+() on the iterators unless you really have to -- if you are only going to add one or two, just use operator++(). Ditto for the subtraction operator: do not use operator-() unless you have to; prefer operator--().
When you decide to provide multiple instances of your algorithm, use a 'routing' template function to select among the alternates. See the distance() overloaded function example below.
auto_ptr is a really nice way to tell the compiler, and code maintainers, that a given object is a pointer and, more importantly, that the memory to which it points is owned by the auto_ptr object. There are, however, a lot of caveats with its use (but it is still worthwhile, see below). An auto_ptr is particularly useful in some important cases:
- local variables that hold memory obtained from operator new
- functions that return new'd up objects which must be deleted by the caller
An auto_ptr acts like a pointer but is not a pointer -- thus it cannot be easily passed as a function parameter when a regular pointer is expected. Instead, it must be clearly stated in the code whether the pointer is being temporarily shared or given up completely when it is passed to the function.
Here is a simple example of an auto_ptr's use:
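The original listing is not shown here; the following is a minimal sketch consistent with the discussion that follows (other_function is the name used below):

#include <memory>

void other_function(float *p);    // hypothetical: takes a plain pointer

void example()
{
    std::auto_ptr<float> fp(new float(1.0f));   // fp owns the float
    other_function(fp.get());                   // lend the pointer; fp still owns it
}                                               // the float is deleted automatically here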
Only one auto_ptr object can own a given piece of memory. Whenever the copy constructor or assignment operator of the auto_ptr class is invoked, it takes ownership away from the source variable. Consider the following code:
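A sketch of the ownership transfer (again, the original listing is not shown):

#include <memory>

void transfer_example()
{
    std::auto_ptr<int> a(new int(1));
    std::auto_ptr<int> b = a;   // copy construction: b takes ownership;
                                // a now holds a null pointer
}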
Note that a typical copy constructor declares its source parameter const & when it is used. Normally this is a good thing -- copy constructors should not be modifying the source. It's just that the auto_ptr is a very special case: its copy constructor must modify the source to strip it of ownership.
While auto_ptr is a good tool to use, it is not a good design template to follow. Use the extant auto_ptr but don't make your own!
Should you call fp.get() or fp.release() in the above case? The answer depends on how other_function is written. It is tempting to think that other_function should be declared like this if it were to take ownership of the pointer:
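Something like this sketch:

void other_function(std::auto_ptr<float> p);   // p would take ownership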
But declaring it this way exposes you to auto_ptr implementation differences. It can also result in some great confusion when calling other_function incorrectly. Consider the following code:
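A sketch of the dangerous call (the caller is invented for illustration):

void other_function(std::auto_ptr<float> p);

void caller(float *p)
{
    other_function(p);   // a temporary auto_ptr<float> is created from p here;
                         // when the temporary dies, it deletes what p points to
}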
Here, an auto_ptr temporary variable will be created by the compiler in order to make the code compile. This will be true even if other_function were written to take auto_ptr<float> const & as its parameter. The automatic destruction of the temporary variable will destroy the object referred to by the parameter p. This is not likely to be the desired result -- and it will be very hard to debug without using purify.
In general, it is better not to use auto_ptr as a function parameter type -- although function returns are a very good use:
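A sketch of the return-value pattern (SomeClass and some_function are the names used below):

std::auto_ptr<SomeClass> some_function()
{
    return std::auto_ptr<SomeClass>(new SomeClass);
}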
Writing some_function() such that it returns an auto_ptr to SomeClass guarantees that the returned value will eventually be destroyed -- as is the intent of the author of some_function. Simply documenting that such is needed will not.
Finally, while it is tempting to use a container of auto_ptr's to automate the destruction of objects whose pointers are stored in the container, this won't work very well because the copy constructor logic of auto_ptr does not in fact make a copy. This of course violates a basic principle in the design of STL containers, and the containers cannot be expected to work correctly using auto_ptr as a template parameter.
Instead of using, say, a std::vector< auto_ptr<T> > to automatically destroy the T's owned by the container, derive from the vector and make the derived class' destructor do the freeing:
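A sketch of the idea (a minimal example; a production version would also need a policy for copying):

#include <vector>

template <class T>
class OwningVector : public std::vector<T *>
{
public:
    OwningVector() {}
    ~OwningVector()
    {
        typedef typename std::vector<T *>::iterator Iter;
        for (Iter it = this->begin(); it != this->end(); ++it)
            delete *it;                 // free every object we point to
    }
private:
    OwningVector(OwningVector const &);             // copying would double-delete;
    OwningVector &operator=(OwningVector const &);  // prohibit it in this sketch
};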
Once auto_ptr's are initialized, they can be changed via assignment using operator=(). They can also be changed using the auto_ptr::reset method, which is the basis for the assignment operator. Resetting an auto_ptr deletes the current pointer and takes ownership of another. Generally, it is clearer to use the reset method than the assignment operator.
Consider the following example program fragment that parses expressions
consisting of terms optionally added together:
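The original fragment is not shown; here is a sketch in the same spirit (Node, parse_term, and next_token_is_plus are invented names):

#include <memory>

struct Node                      // expression tree node (simplified)
{
    Node() : left(0), right(0) {}
    ~Node() { delete left; delete right; }
    Node *left;
    Node *right;
};

Node *parse_term();              // hypothetical: parses a single term
bool next_token_is_plus();       // hypothetical: peeks at the input

Node *parse_expression()
{
    std::auto_ptr<Node> result(parse_term());
    while (next_token_is_plus())
    {
        std::auto_ptr<Node> right(parse_term());
        std::auto_ptr<Node> sum(new Node);
        // The new node takes ownership of both operands, and
        // result.reset() makes it the new root of the tree.
        sum->left = result.release();
        sum->right = right.release();
        result.reset(sum.release());
    }
    return result.release();     // caller owns the finished tree
}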
Unfortunately, at this point in the history of the STL, the auto_ptr class is implemented/declared in different ways on different compilers. The HP compiler, for example, requires that the strict ANSI compatibility compile option (-AA) be used even to use auto_ptr -- and even then it is very selective about non-trivial uses.
The following code will likely work even if the runtime library version causes trouble. This code puts its auto_ptr definition in a namespace, alt_tools. Doing this is not a requirement, and it will not work unless namespaces are supported -- simply remove the namespace wrapper if needed.
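The original listing is not shown; the following is a rough sketch of such a work-alike (simplified -- a real replacement would need more care):

namespace alt_tools
{
    template <class T>
    class auto_ptr
    {
    public:
        explicit auto_ptr(T *p = 0) : ptr_(p) {}
        // Copying takes ownership away from the source.  The const_cast
        // is the "ugly casting away of const" mentioned below.
        auto_ptr(auto_ptr const &rhs)
            : ptr_(const_cast<auto_ptr &>(rhs).release()) {}
        auto_ptr &operator=(auto_ptr const &rhs)
        {
            reset(const_cast<auto_ptr &>(rhs).release());
            return *this;
        }
        ~auto_ptr() { delete ptr_; }
        T &operator*() const { return *ptr_; }
        T *operator->() const { return ptr_; }
        T *get() const { return ptr_; }
        T *release() { T *tmp = ptr_; ptr_ = 0; return tmp; }
        void reset(T *p = 0)
        {
            if (p != ptr_)
                delete ptr_;
            ptr_ = p;
        }
    private:
        T *ptr_;
    };
}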
Due to the unusual nature of auto_ptr's many uses, and bugs in various compilers, there is some ugly casting away of const that cannot be avoided. This means that care must be taken with this auto_ptr's use (for example, don't use it as a function parameter's type).
In summary:

- Use auto_ptr to make sure that memory is not accidentally lost due to a failure to call delete, either on local variables or on class members.
- Use auto_ptr to declare the return value from functions returning memory created via operator new (assuming the caller is supposed to delete the memory).
- Do not use auto_ptr as a function parameter type.
For example, if you decide (as I once did) that it would be really cool to write your own operating system, don't jump at the chance to write your first operating system as part of some production product development cycle (as I did). Instead, do this in your own private directories -- and stick to traditional operating systems until you perfect yours (and can convince other people to use it by force of argument rather than by shoving it down their throats just before leaving them to debug it when you go on to the next project where you get to write the next cool thing and stick it in the production code base).
Practice is good, but don't practice in the code base -- that is for work, not play.

One of the advantages found in the development of good library modules is that they are easily tested. Unfortunately, a lot of application programs quickly lose this ability due to complex internal interfaces that do not lend themselves to easy testing. For example, there may be many functions in a program which were written with the assumption that they will be used, but which, at any given time, the program never calls.
The fact that these functions have never been tested will be forgotten, and when the program is changed and the code finally does call the neglected functions, they will malfunction in strange ways. This often leads developers to tear their hair out in frustration with the famous exclamation, "How did this ever work?!" In this case, it didn't -- but no one knew.
The principal problem here is that the program was thought of as a sand pile rather than as a building constructed of floors -- each of which could have been individually built and tested. But this view is simply a matter of choice on the developer's part. There is no reason that the "application" cannot be a thin layer of glue logic on top of an existing set of libraries whose interfaces are clearly documented, well defined, and easily tested early.
Another important advantage of library design is that it encourages re-use. Re-use is a complex thing that is best accomplished by practice. The more you try to make the code re-usable, the more often you will be successful at it. People writing "programs" rather than libraries rarely pay attention to the usability and documentation details that make or break re-usability.
The pursuit of re-usability is to some extent like a moral philosophy. At any given point in time it might not be in one's short term best interest to adhere to a particular precept of the philosophy, but the belief is that, if done consistently, one's long term best interest will be ultimately served by conformity.
Of course, deadlines and ill-considered promises interfere with perfect adherence to the goal of writing all code to be re-used. Further, in many cases, code is known to be discardable. In these situations, it is still valuable to follow the general outline of creating all programs as a collection of libraries which are glued together by the main program. This design makes it easier to understand the parts of the program -- even if they were hastily constructed. It also makes it easier to write a new library that replaces an old buggy one!
In programs that are not constructed as collections of libraries, there are often interconnections between disjoint parts that are difficult to break or even understand. Even worse, some of the interconnections are not physical so much as accidental. For example, if a function never returns a number greater than 10 during its initial implementation phase, a consumer of this function might mistakenly assume that it never will and forget to test for this condition. Part of designing code as libraries is the act of documenting the behavior. Library design makes the need for this documentation more obvious than is often the case in "application" code, where all the source code is bundled together for easy viewing and misunderstanding.

Human beings cannot remember to always do the right thing, but computers can help -- use automation to make sure that mistakes are either prevented or caught early.
Products, of course, should be built using either Makefiles, scripts, or IDE project build instructions. This is a good place to automate the detection of the use of prohibited symbols or coding practices -- if such can be quickly detected using grep, perl, or another fast text processing language. Just add additional rules that search the code for violations at build time. The following paragraphs describe some approaches to using the compiler to help prevent problems. Also, see Early Error Detection in the templates chapter.

The principal advantage of C++ is that it provides automatic behaviors such as the firing of constructors and destructors. A standard automatic way of making sure that allocated resources get freed is to declare a variable of a special type whose destructor performs the release. That is, declare an object that represents the allocation of a resource. When the object is destructed -- if not before -- free the allocated resource. Here's an example code fragment:
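The original fragment is not shown; here is a sketch in the same spirit (Resource is an invented stand-in for any allocated resource):

#include <memory>

class Resource
{
public:
    Resource()  { /* acquire the resource here */ }
    ~Resource() { /* release the resource here */ }
};

void worker()
{
    std::auto_ptr<Resource> guard(new Resource);
    // ... work that may return early or throw ...
}   // guard's destructor deletes the Resource on every exit path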
Here, auto_ptr's are used to make sure that the allocations are in fact deleted as needed -- but the principal use of this design pattern is to make sure that destructors get called properly on all possible exits from a function.
When using old fashioned C code, failure to call free() after having called malloc() is a major source of memory leaks. Consider a case like the following:
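The original listing is not shown; this sketch reproduces the kind of leak described below (NameBuffer and read_name are assumed names):

struct NameBuffer { char text[256]; };

bool read_name(char *dest);       // hypothetical helper; false on error

NameBuffer *get_name()
{
    NameBuffer *buf = new NameBuffer;
    if (!read_name(buf->text))
        return 0;                 // error path: buf is never deleted -- a leak
    return buf;
}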
This code allocates a NameBuffer in get_name(), but it does not handle the cleanup correctly -- it forgets to delete the NameBuffer in the event of an error. The auto_ptr class can be used to make sure that the buffer gets deleted. Here's a rewritten version:
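A sketch of the rewrite, under the same assumed names:

#include <memory>

NameBuffer *get_name()
{
    std::auto_ptr<NameBuffer> buf(new NameBuffer);
    if (!read_name(buf->text))
        return 0;                 // buf's destructor deletes the NameBuffer
    return buf.release();         // success: hand ownership to the caller
}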
Handles are similar to auto_ptr's in that destruction is guaranteed (almost), but differ in that there can be more than one handle to a heap packet -- not just the one allowed by an auto_ptr. Unfortunately, there is no standard 'handle' as of yet. The following paragraphs describe one possible implementation.
Typically, a handle is a class object that is used like a pointer. However, its implementation holds not only a pointer to the 'handled' object, but also a reference count. Creating a handle to a given physical object increments the reference count associated with that object. Destroying the handle decrements the count. When a handle's destructor fires and the reference count decrements to 0, the object referred to by the handle is destroyed.
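A minimal sketch of such a handle (no thread locking, and the count lives in a separately allocated integer -- real implementations vary):

template <class T>
class Handle
{
public:
    explicit Handle(T *p) : ptr_(p), count_(new int(1)) {}
    Handle(Handle const &rhs) : ptr_(rhs.ptr_), count_(rhs.count_)
    {
        ++*count_;                      // one more handle to this object
    }
    ~Handle()
    {
        if (--*count_ == 0)             // last handle out turns off the lights
        {
            delete ptr_;
            delete count_;
        }
    }
    T &operator*() const  { return *ptr_; }
    T *operator->() const { return ptr_; }
private:
    Handle &operator=(Handle const &);  // omitted from this sketch
    T *ptr_;
    int *count_;
};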
In a single threaded application, handles are a clear win because they prevent memory leaks (mostly) in cases where ownership of objects must be shared between multiple code fragments or classes. However, in a threaded situation, where multiple threads may share control of an object, and thus its handle's reference count must be thread locked, significant performance penalties can occur. Multiple copies may well be the best solution in a multi-threaded environment. Performance analysis tools may be required to understand one way or the other.
The construction of an auto_ptr object looks basically like this:
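Something like this sketch (ClassName is the name used below):

{
    std::auto_ptr<ClassName> p(new ClassName);
    // ... use p ...
}   // p goes out of scope here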
When the auto_ptr goes out of scope, the ClassName object is destroyed -- unless ownership has somehow been released. The same would be true of an object owned by a handle -- except that if another handle refers to the same ClassName object, it would not be deleted until all the handles had been de-scoped. Consider the following example:
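A sketch of the shared-ownership behavior, using the Handle sketch from above:

void handles_example()
{
    Handle<ClassName> h1(new ClassName);   // count is 1
    {
        Handle<ClassName> h2(h1);          // count is 2
    }                                      // h2 de-scoped: count back to 1
}                                          // h1 de-scoped: object destroyed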
Only developers can write 'white box' tests because they know how the code works. They have to think about the things that can go wrong given the architecture of the code and make sure that tests are written that detect and report such malfunctions.
The QA team should only be writing 'black box' tests -- that is, tests written using only properly documented functions that the user is specifically paying for. If the QA team writes tests that rely on undocumented features, it will be very difficult to get rid of those features -- and likely expensive to maintain them -- even though the customer never wanted them.

Only developers can write tests that validate the test coverage levels on a per line of code basis. It is extremely important that code be tested to a level of at least 85%. That is, 85% of the lines of code should have been executed during the course of a suite of tests, the tests should in fact operate on real data, and -- most importantly -- the code should in fact give the right answers. There are several tools that can help you determine if your tests have achieved at least 85% coverage:
There is a free program that works roughly like purify and is available for linux: valgrind. It does not have all of purify's features, but since you can download it for free, it can give you a taste of how valuable purify can be -- and valgrind can be quite helpful in its own right.
Using purify correctly is somewhat complex, but if you learn it and set up regular regression testing using it, you will find that the payoff can be very high. Imagine trying to catch a random memory misuse that only occurs after 12 hours of program execution, when the program's memory size has grown to 20 gigabytes or so -- how would you go about it? You get a machine loaded with 56 GB of RAM, purify the executable, and let it run for 2 weeks until it prints the first message: memory misuse caused by line 102 in file "bla.c" with the following calling stack ...
Luckily, most problems can be duplicated quickly with a little work, and even then purify can often point you to the exact line of code that caused the problem and describe the circumstances in such detail as to make it easy to fix.

If you are unable to use purify or valgrind to help you detect memory bugs, try using the memory allocator's internal tables to detect when its memory chains are broken. Some bugs involve randomly writing all over memory, and the memory allocator has pointers throughout your heap space -- the random writes may hit the allocator's chains. The malloc header file may well describe how to iterate over the chains and detect bad links. The Microsoft compiler provides a function called heapwalk to help you do this; other platforms do not.
If you are using a 3rd party memory allocation tool, it might have special features of its own.
Another 'do it yourself' technique is to supersede the global ::operator new. This function basically just calls malloc and returns its return value. You can intercept these calls to new and add some extra space at the beginning and end of the heap packets requested from malloc. You fill this extra space with known values in ::operator new. Then, during ::operator delete(), you check to make sure the correct values are still there. If they are not, then you know that your program has misbehaved.
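A minimal sketch of the technique (the packet layout and sentinel values are invented; a real version must also handle the array forms, new[] and delete[]):

#include <cstdlib>
#include <cstring>
#include <new>

static const std::size_t GUARD_BYTES = 8;
static const unsigned char GUARD_FILL = 0xAB;

// Layout of each packet: [size][guard][user data][guard]
void *operator new(std::size_t size) throw(std::bad_alloc)
{
    unsigned char *raw = static_cast<unsigned char *>(
        std::malloc(sizeof(std::size_t) + 2 * GUARD_BYTES + size));
    if (!raw)
        throw std::bad_alloc();
    *reinterpret_cast<std::size_t *>(raw) = size;
    std::memset(raw + sizeof(std::size_t), GUARD_FILL, GUARD_BYTES);
    std::memset(raw + sizeof(std::size_t) + GUARD_BYTES + size,
                GUARD_FILL, GUARD_BYTES);
    return raw + sizeof(std::size_t) + GUARD_BYTES;
}

void operator delete(void *p) throw()
{
    if (!p)
        return;
    unsigned char *user = static_cast<unsigned char *>(p);
    unsigned char *raw = user - GUARD_BYTES - sizeof(std::size_t);
    std::size_t size = *reinterpret_cast<std::size_t *>(raw);
    for (std::size_t i = 0; i < GUARD_BYTES; ++i)
    {
        // A clobbered sentinel means the program wrote outside its packet.
        if (raw[sizeof(std::size_t) + i] != GUARD_FILL ||
            user[size + i] != GUARD_FILL)
        {
            std::abort();
        }
    }
    std::free(raw);
}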
If you try something like this, remember that every form of operator new() you intercept has a matching form of operator delete() that must be intercepted consistently.
As annoying as it may sound, if you are working in a porting environment, you have to take into account the mistakes of the compiler vendors and those of the vendors of any third party libraries.
One of the most tedious problems encountered is the preprocessor being used to define names that you want to use for something else. For example, the names 'unix', 'UNIX', and 'Unix' are highly likely to be #define'd to '1' on some system or other. This means that you cannot define a class, a macro, a function, or a variable named simply 'unix', 'UNIX', or 'Unix'.
Sadly, there is a host of simple computer science related words that should not be used for the same reason. There is no fixed list, but a simple rule to follow is this: do not use any simple English word that might be computer science related. Instead, use underscores to separate word fragments. For example, instead of using 'unix', use 'unix_flag', or some other such thing.
The compiler vendors are supposed to put leading underscores on their #define's -- so you should never define a symbol that begins with an '_'. For example, don't use '_unix', nor '__unix', nor '_bob', '_srikanth', etc. Leading underscores belong to the compiler vendor.
Nesting your symbols in namespaces, functions, and classes does not help if the symbol is defined in the preprocessor.
The only work-around to this problem, if you have unwittingly caused it, is to put #undef's in your source somewhere after including the header file that defines the troublesome symbol. For example, the stdio.h header file defines the name 'fileno()' as a macro. If you had a large body of code and could not easily change your own variable name, fileno, to something else, but you absolutely had to include stdio.h somewhere your own fileno is visible, you could do it like this:
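A sketch:

#include <stdio.h>
#undef fileno      /* discard the macro so the name is usable again */

int fileno;        /* your own (pre-existing) variable */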
Developing and sticking to a standard naming convention can greatly ease your porting burdens -- and improve the understandability of your code even if you aren't porting. Here is a simple example of such a naming convention:
- Avoid #define whenever possible -- prefer typedef's, inline functions, and const variables instead.
- Avoid #define, symbol, class, and variable names that are simple computer science terms or operating system names, etc. -- use longer names separated by underscores instead.
- Capitalize class names: Some_Class, or SomeClass.
- Use lower case with underscores for variables and functions: fred_is_bad, func(), etc.
- #define'd constants can be all upper case with separating underscores.
- Give member variables a trailing underscore: member_one_.
- Use forward slashes ('/') as directory separators in #include directives. This is because the Microsoft compiler will recognize the '/' as a path separator, but the unix compilers will not recognize the backslash ('\').
In #include directives, it is generally inadvisable to use fully qualified pathnames. No two versions of the same operating system / compiler combination will guarantee continuity of the names of the directories where standard tools can be found. Usually they are consistent, but there is no guarantee. Rather than embed specific pathnames in your source code, rely on the compiler's -I parameters to help you find the code.
Instead of writing this:

#include <myfile.h>

you would do something like this:

#include <my_project/myfile.h>

This prevents trouble when a 'myfile.h' is defined in some third party library as well as in yours.
Note that this approach can also reduce the number of -I statements needed by the compiler on a large project. For example, suppose your project source code tree looks like this:
main_dir
    lib1
        header1.h
        header2.h
    lib2
        header3.h
        header4.h

Instead of using two -I statements on the command line to the compiler, like this:

-Imain_dir/lib1 -Imain_dir/lib2
You could just use one:
-Imain_dir
This means, of course, that the header files would have to be included using the directory name fragment:
#include <lib1/header1.h>
#include <lib1/header2.h>
#include <lib2/header3.h>
#include <lib2/header4.h>
Give each header file a file name extension, normally .h. Unfortunately, writing 'make' rules that properly handle all the cases is made much more complex if you leave off the file name extension. Also, Microsoft Windows and most browsers rely on the file name extension for type information.
Here is a short example source file obeying these rules:
#ifndef SOME_CLASS_DEF_H
#define SOME_CLASS_DEF_H

extern const int MAX_DOG_COUNT;

namespace our_company
{
    void some_function(int some_parm)
    {
        float some_variable = some_parm;
    }

    struct Some_Struct
    {
        typedef std::vector<int> Vector;
        Vector vec_;
        Vector const &vec() const { return vec_; }
        enum Constants { ONE=1, TWO=2 };
    };
}

#endif

Despite people's best efforts to review and understand code, the language is sufficiently complicated that accidental features can become essential to the proper operation of your code. This is very bad. Out of the blue, one day, your code will suddenly malfunction after some trivial change -- and after spending many hours tracking down the problem -- you will say to yourself, "How did this ever work?!"
Porting your code to other compilers can help detect such things with little work on your part. Other compilers were written by other developers, and those developers will have made different choices about warnings and error detection. If you port your code to a variety of platforms, one of them will likely give you a good error message about the problem you have unwittingly created. Therefore, it is wise to start the porting process as early as possible -- rather than leave it to the end.
Porting the code involves both getting it to compile and running your regression tests on that platform. You should do both of these things early and often. When the time between a code commit and the discovery of the breakage is large, it becomes more difficult to remember what you did that might have caused the problem. Not all porting errors appear as compile errors -- some are just bugs. Detecting a bug on a porting platform long after the author created it is very tedious.
Some porting difficulty is caused by floating point numbers -- you will get minor differences in roundoff or stream output behavior. These differences are not necessarily bugs -- but they might be.
One way to work around such 'allowed' differences between operating system behaviors -- particularly with respect to floating point numbers -- is to allow different results on different platforms based on specific test results on specific platforms. You should be wary of just ignoring the test results on a platform, though. You need some way of officially stating that 'test A works differently on platform P than on platform O'. Don't let these differences between platforms dissuade you from automated test mechanisms -- use scripts if you have to, to enable automated testing on all platforms.
A common mistake is to make a class with a variety of data members and functions but which makes no guarantees about the state of those data members. In that case you do not have a class, but rather a named 'pile'. And just like a pile of sand, as you add more grains to the top, you will eventually see a catastrophic collapse as all the sand grains slide to the ground.

Don't make piles of data. C++ classes should be objects whose internal state is guaranteed to make sense at all times. Here are some general guidelines for completeness:
C++ allows code to run before main() begins executing. Here is an example of a static initialization:
// at file scope
extern int fred();
int bill = fred();
A slightly different variation of the above theme occurs when you declare a global variable of a type which is a class having a constructor -- or having members with constructors. For example:
// at file scope
#include <class.h>
Class varname; // this is a static init
An almost infinite number of useful ways can be imagined to use this feature. However, it should be used with great caution. The following sections describe various problems that can occur if you do use it. The problems are such that you should be able to avoid them all and use static initialization to your heart's content. But "should" and "likely to" are very different things in this case -- for reasons discussed below.
There is always a special case that will work, for a few releases of your product at least, but in the long term, after newly hired people have taken over maintenance of your product, static initialization will eventually come back to bite you for reasons described below. It will be safer if you find a way to initialize all your program variables after main() begins executing.
Typically, class objects should not be allocated statically because of their constructors -- but you can allocate static pointers and fill in those pointers after main() begins executing.
In the example from the prior section, a declaration of an external function, fred(), is made, and a variable named bill is defined. Before main begins executing, bill is initialized with the return value of fred().
This is a nice feature, but it leaves you open to some very nasty shocks. What if fred() requires that bill be properly initialized before you call it? It will return an undefined value -- which may be annoyingly constant until you port to another platform, and then you will get nasty surprises.
You might think that an intelligent developer is not likely to make this mistake. However, experience shows this not to be true. And worse, it is usually not just one developer that is involved in an error of this form. The original author of fred() may well know not to use variable bill as part of its implementation -- but subsequent maintainers may not know that bill is initialized by calling fred() and may unwittingly make fred() dependent on bill -- perhaps even accidentally. Consider: if a change to fred() involves a call to function tom(), and the developer who modifies fred() did not write tom(), how will he know that tom() uses the global variable bill?
There is no solution to this problem -- you just cannot safely write code that depends on a global variable if that code is used to initialize the variable, or any other variable involved in that variable's initialization. Luckily, bugs of this form are easily caught.

Unlike the obvious order dependency bug described above, other, more subtle order dependency bugs can creep into your static initialization logic. Suppose your program has two global variables which are both statically initialized, and further suppose that the second variable's initialization requires that the first be initialized before the second can be initialized correctly. Will this situation work? Here is a concrete example. Consider the following files:
File h.h
extern int g ();
extern int f ();
extern int var_one;
extern int var_two;
File one.c
#include <h.h>
int var_one = f();
File two.c
#include <h.h>
int var_two = var_one + g();
This program will work correctly only if var_one is statically constructed before var_two. That will only happen if file one.c is linked into the program before file two.c.
In a program with only two object modules, this is easily controlled. Typically, the linker places the objects into the executable in the order in which it encounters them. All you have to do, then, is make sure that file one.c's object module appears on the command line to the linker before file two.c's object module. But what if one.c and two.c are in libraries? The linker has no way of knowing that it should put one.c's object first. It might put it first, and it might not, depending on a large number of variables. Further, different linkers on different platforms will make these decisions differently -- and thus the code might work on one platform in one release, but not on other platforms in that same release -- and maybe not on any platform in a different release.
Here are some approaches for forcing initialization after main():

- Build a global linked list of 'things to do' and have main() execute code to 'do' all the stuff. This approach requires that you create static initializations that append the 'stuff to do' for each file onto the global linked list. You also have to verify that the global linked list gets linked into the executable first.
- Have the main() program call a single function to handle all initializations everywhere, rather than maintaining a long list of them. A sketch of this approach follows.
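A minimal sketch of the single-initialization-function approach, reusing the fred()/bill example:

extern int fred();

int bill;                   // no static initializer -- filled in below

void initialize_globals()
{
    bill = fred();          // the order is now explicit and controllable
}

int main()
{
    initialize_globals();   // all cross-file ordering happens here
    // ... rest of the program ...
}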
The performance trade offs among the standard containers -- std::vector, std::list, std::map, and so on -- are best described in 'big O' notation. Note that just because an algorithm is O(1) does not mean that it will be fast: it could be a large O(1). However, O(1) does mean that the amount of data being processed does not matter, which means that adding a million times as much data does not make the code any slower.
Consider the cost of the following operations on an STL container: first, insert 1 million items, then find each item in a random order. The following table describes the cost (time) required for each step:
Container | Insert Cost | Find Cost
----------|-------------|-----------
vector    | O(N^2)      | O(N^2)
list      | O(N)        | O(N^2)
map       | O(N*ln(N))  | O(N*ln(N))
set       | O(N*ln(N))  | O(N*ln(N))
The above table is an example of the 'worst case' situation. It is applicable when large amounts of data must be randomly created and searched.
The use of threads may or may not improve the performance or responsiveness of your programs. Merely enabling thread support typically causes malloc to run a lot slower than it did before. If your program makes a lot of small malloc calls (or, in this case, calls to operator new), a threaded program will spend up to 50% of its total runtime waiting for memory. This is true even if all threads save one are actually waiting for something to do.
To overcome the above problems, pre-allocate all the memory for the threads, and architect the program so that each thread has its own data to process -- with little or no overlapping access to global variables. However, retrofitting a large application that was not designed with threads in mind can be difficult or impossible -- and simply turning on threads may cut your performance in half.
In general, before committing to the path of using threads in an application, verify that doing so will in fact speed it up. Numerous experiments are generally required for this. These experiments should be done on a multi-processor machine, because single cpu boxes will hide certain programmatic bugs. Failure to put mutex locks around accesses to global variables will result in program malfunction much faster on a multiple cpu machine than on one having only one cpu.
It may be advisable to break an application into two programs: one that does not use threads but executes commands serially, and a second that is threaded and handles i/o, such as web transactions, which benefits from thread parallelism. The threaded application can marshal data and serialize commands to the non-threaded engine (thus allowing it to run at full speed).
An alternative to this draconian separation is to have an executable which is only threaded during certain phases. When performing lots of small mallocs, for instance, threading could be turned off; then, when the memory is allocated, turn threading back on. This only works if the threads are all dead during the non-threaded periods, though -- not just stopped.
Another problem with threads is that the stack size is generally fixed. Unlike the primary thread of an application, the secondary thread stacks do not grow as needed. This means that you have to be more careful with recursive algorithms implemented in threads than in a non-threaded program -- which is to say, you have to guess correctly about the stack size. Technically, of course, all recursive algorithms can overflow any stack size -- even the program's main stack. However, threads tend to make the problem worse.
Starting threads is faster than launching a whole new process -- but it is not infinitely fast. In really high performance situations, a pool of threads which are already up and running but waiting to handle the next command is likely to give the best results. The threads in this pool might not terminate until the application does.
Declaring a method virtual does not add considerable overhead in most situations, but there are a few cases where care should be taken to avoid virtual calls. All these cases boil down to code fragments that make a large number of calls to functions which do not in themselves do very much. One obvious case is stream i/o through a std::istream_iterator<T>. Such an iterator allows algorithms of the following form to be written:
//
// Function to read 1 megabyte from an input stream
// using an std::istream_iterator.
//
#include <vector>
#include <iterator>

typedef std::vector<char> CharVec;
typedef std::istream_iterator<char> StrmItr;

const int one_million = 1000000;

void
read_1mb(StrmItr &it,
         CharVec &buffer   // must already hold at least one_million chars
        )
{
    StrmItr end;
    CharVec::iterator out = buffer.begin();
    int count = 0;
    while(count != one_million && it != end)
    {
        *out++ = *it++;
        ++count;
    }
}
Despite the fact that iterators are involved, the above code can approach the speed of simple character pointer operations -- if the following methods are all inline and simple:

CharVec::iterator::operator++(int);
CharVec::iterator::operator*();
StrmItr::operator*();
StrmItr::operator!=(StrmItr const &);

If these functions are not inline, there will be very noticeable performance impacts compared to implementations where they are, because of the very large amount of data being processed (1 megabyte) and there being several function calls per byte.
In fact, in all major implementations, the istream_iterator and the ostream_iterator are not truly inline. No virtual method actually gets called in an inline manner unless it is declared inline and is referenced using the class member operator (::). In some implementations, the ostream and istream i/o methods are virtual members of a base class -- resulting in at least one true function call for every i/o operation in the calling code.
Luckily, there is a pair of alternate classes that are: std::istreambuf_iterator and std::ostreambuf_iterator. The function above runs much faster using istreambuf_iterator.
In all major implementations, the std::vector member accessors are made as close to purely inline as possible.
Using the inline keyword is not always sufficient to ensure that a given function or class method is in fact inlined. The reason for this is that the language definition allows compiler vendors to determine that a given function is too complex to inline, and in these cases the function is implemented as a static function in all translation units that reference it. Understanding the rules for all compilers is a bit of a challenge. In general, the following things prevent inlining on at least one, and probably more, of the major compilers:

- loop constructs (while, for)
For example, suppose a function has a while loop in it -- but the loop is only sometimes used, whereas most of the time the function does something very trivial. This code can be split into two functions, one inline without the loop, and the other out of line with the loop. Here is an example of the original function:
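The original listing is not shown; this sketch matches the description that follows (the function name is invented):

// Return the next character, skipping repeated spaces.
inline char next_non_space(const char *&p)
{
    ++p;
    while (*p == ' ')   // this loop defeats inlining on many compilers
        ++p;
    return *p;
}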
As can be seen, there is a loop in the above code to ignore repeated space characters in the input data. This will prevent inlining on many compilers. Inlining can be achieved, though, with a simple change: add a new function that begins inline and switches to out of line code only if it has to:
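A sketch of the rewrite:

char skip_spaces(const char *&p);      // out of line: contains the loop

// The common, trivial case stays fully inline.
inline char next_non_space(const char *&p)
{
    ++p;
    if (*p != ' ')
        return *p;
    return skip_spaces(p);             // rare case: one real function call
}

char skip_spaces(const char *&p)
{
    while (*p == ' ')
        ++p;
    return *p;
}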
The C++ operator new function ultimately calls the C language function malloc. Most operating systems provide an implementation thereof that solves most application problems in an effective, if not optimal, manner. There are a couple of cases where the built in malloc may turn out to be less than optimal:
If your program is slowed down by its calls to malloc, the first step should be to determine whether the program cannot easily be changed in some way to reduce the number of calls. However, this is not always easy, nor always the best solution. After market memory allocators are available that can give significant performance optimizations without major code rewrites. Consider trying:
In addition to speeding up malloc, these tools typically provide additional built in error detection that can be turned on with little effort.
It is generally inadvisable to use these tools when building a purified executable -- so the build process must provide a way of choosing whether or not to use the third party memory allocator in a given build.
Using operator new to initialize local variables is generally dangerous -- although it is certainly a ubiquitous practice. Consider using auto_ptr's or handles.
The mere fact that a program is compiled in such a way that allows exceptions to be handled forces it to accept a performance degradation of between 5% and 25% depending on the compiler.
Compilers often provide a way to turn exception handling off at compile time. If exceptions are not used in the program, then it will be worthwhile to use this option. However, this might disable use of the STL, etc.
If a specific function does not use the throw keyword and does not call any other functions that do, it can be individually marked as 'not supporting exceptions' by using the following syntax:
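That syntax is the (pre-C++11) empty exception specification, sketched here on an invented function:

void some_function() throw();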
Sometimes the bulk of a program does not use exceptions but some small part does. If the part that does use exceptions is invoked from a single function call, or small number thereof, it might make sense to compile that section of the code with exception support turned on -- and never let any exceptions leave -- that is, have the highest level function in the exception handling part of the program catch all exceptions.
If you must live with exceptions and are noticing a significant performance drop, try increasing the compiler optimization levels. Unfortunately, this may run into compiler bugs, and thus some work to find the modules that cannot be successfully compiled at higher optimization levels.

Object oriented programming methodologies spend a lot of energy on determining the specific needs of the high level object types -- the application objects. This is obviously good, in that all requirements should be met and no unnecessary functionality should be included, as it merely burdens the implementation without adding value.
However, the bulk of the C++ classes that get implemented are not high level application objects but rather represent temporary values and other objects not directly visible in the UML, design diagrams, etc. Attempting to design all these classes using design diagrams may provide a low return on the investment of effort.
Further, focusing only on the specific needed features of these many "helper" classes, as understood at the time the project is begun, can result in program bugs because the objects will be used in ways not conceived of up front. The trade off between over design and under design of simple classes will often fall towards over design -- simply to eliminate unnecessary debug effort when these classes are used in ways not originally conceived. At a high level, however, the trade off often falls the other way.

Even the low level classes need to be implemented completely. Documenting that such and such a feature does not work is not good enough -- people often think that they understand code when they don't. When designing classes, try to help prevent such errors using one or all of the following:
- Use the private keyword to force the compiler to complain about misuses.
For example, the compiler-generated copy constructor produces bad code in the event that the class owns a pointer that should be unique to every instance of the object. The compiler won't know this, so it won't copy the objects pointed to, only the pointers -- leaving two objects pointing to the same thing.

A key reason that C++ is better than C as a programming language is that it provides for data hiding. That is, instead of simply exposing data members to all callers -- some of whom will not be perfect in their understanding of their responsibilities when using the class -- hide the data and provide members to correctly perform all needed activities on that class.
Unfortunately, a lot of code is written that does something like the following:
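A sketch of the anti-pattern (the class is invented):

class Pile
{
public:
    int &count() { return count_; }   // exports a writeable reference:
                                      // any caller can now break any
                                      // invariant involving count_
private:
    int count_;
};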
The same problem arises when friend classes are used extensively. Instead of exporting references to members, export public functions that do all the needed things with the members -- and implement them in a way that maintains the class' design assumptions.
Of course, from time to time one might want to export a member reference for speed reasons -- but this should almost always be limited to const references. Consider:
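A sketch of the pattern described below:

class Sample
{
public:
    int const &member() const { return member_; }   // fast, read-only
    void change_member(int v) { member_ = v; }      // modifications go here
private:
    int member_;
};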
In this design, member() is an inline function which gives fast but un-modifiable access to the member. If changes need to be made quickly, make change_member() fast rather than making member() return a writeable reference.
Object oriented programming techniques tend to focus on class hierarchies, but this causes an inherent reduction in execution speed when polymorphism is used. See Virtual Function Call Performance above. As described there, the penalty is not always high enough to be concerned with, but when implementing container classes, I/O methods, or any methods where large numbers of objects will be processed, it can be high enough to warrant consideration.
An alternative to inheritance is the use of template algorithms. Templates increase the chance of achieving inline code performance, and because of template specialization they can actually be more flexible -- in that basic assumptions can be changed for specific template parameter types. Of course, templates do not provide the 'is a' relationship that inheritance provides.
That is, a family of template classes provides the same basic functionality as a class hierarchy where the inheritance is private rather than public.
The two approaches are not mutually exclusive, of course. A template class specialization can be designed so that it is derived from a non-template class. For example:
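A sketch (SomeTemplate and SomeBase are the names used below):

class SomeBase
{
public:
    virtual ~SomeBase() {}
};

template <class T>
class SomeTemplate : public SomeBase
{
public:
    explicit SomeTemplate(T const &value) : value_(value) {}
private:
    T value_;
};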
In this case, any SomeTemplate 'is_a' SomeBase object. There are several reasons to design a class hierarchy using templates for the derived classes -- among them, the ability to emulate private inheritance.
Consider the following example:
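The original example is not shown; the following sketch reconstructs the technique from the description below (Templ and Blurb are taken from the quoted error messages):

class Blurb { };

template <class T>
class Templ
{
public:
    Templ()
    {
        // You can only use Blurb-derived types (e.g. X, Y, and Z) as
        // template parameters to Templ.  The next line exists solely to
        // force a clear compile error when T is anything else.
        Blurb *type_check = static_cast<T *>(0);
        (void)type_check;   // silence 'unused variable' warnings
    }
};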
As you can see in this example, a single line of code that does nothing more than declare a pointer and set it to zero is used to intentionally cause a compile error if the wrong template parameter type is used. Further, the line of code about which the error occurs has only one purpose -- to cause the error. This makes it very easy to understand what the error means. On the other hand, if you left it to chance, you might get an error of the form:

Error, file "y.c", line 237: Plarf has no member type named 'known_member'

And what would that mean? If you looked up line 100 in file x.c, you would likely see some perfectly normal code with no hint of the fact that you should never have been using a Plarf reference as a template parameter in the first place. But when you get a compile error telling you that

Error, file "x.c", line 100: can't construct a Blurb from a Plarf.

and a comment, such as the one in the example above, tells you that you can only use X, Y, and Z as template parameters to template Templ, you can easily understand your mistake.
The following sub-topics are covered in this section:
Templates are normally a mechanism for implementing the same algorithms and data structures across a wide variety of data types. Specialization lets you actually have slightly different algorithms based on the types actually used. That is, you get to have it both ways.
There are two types of specialization -- overloaded functions and explicit specialization. Like normal functions, template functions can be overloaded (at least in compilers conforming to the 1998/09/01 standard -- and most do conform). Unlike normal function signature overloads, however, overloaded template functions let you apply algorithms to broad classes of types. For example, you could write one template function that applies to pointers, and another that applies only to non-pointers.
Consider the following functions that let you have different algorithms
for different data types using the same function name (check()
):
In this case, the difference in algorithms is quite trivial: we are just printing different text. However, it is possible using this approach to have completely different algorithms. Here are some example uses of the above functions:
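A sketch of some uses (ta is the array discussed in the portability note below):

int i = 0;
int *ip = &i;
int ta[10];

check(i);    // prints 'Not a Pointer'
check(ip);   // prints 'Pointer'
check(ta);   // 'Pointer' on most compilers -- but see below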
Be aware that compilers differ in their handling of const and the reference operator (&), and it may be that you cannot successfully port your code if you get too sneaky.
For example, the HPUX aCC compiler, at least in its older versions, will not promote an array name to be treated as a pointer. That is, on older HP compilers, the check(ta); line in the above example will print 'Not a Pointer'. But on solaris, aix, g++, Microsoft 7.0+, and the Intel C++ compiler, it will print 'Pointer'.
Additionally, some older compilers, like HP's, will produce copious quantities of meaningless warnings about the above code. The warnings are mistaken per the language standard, but it will be faster to suppress them using compiler directives than to wait for HP to fix the compiler (;->)
Note also that you cannot, portably at least, declare this overload in addition to the ones above:
If you do, you will get numerous overload resolution conflict errors, particularly on aix.
Well, I'm glad you asked. Because template functions can be overloaded based on the types of their parameters, you can create a family of functions with the same name but providing slightly different algorithms based on the types of the parameters actually passed. See Overloaded template functions above. And, yes, you can do this without doing anything special -- it is just helpful sometimes to be able to explicitly pass a parameter whose only purpose is to help establish which algorithm to use. For example, you might want a variety of copy() algorithms, each differing only in the type of iterators it deals with, albeit with profoundly different algorithms based on those iterator types. You could have a variety of algorithms with exactly the same name and almost identical parameter lists, or you could have a single copy() interface that detects the types of the iterators and then invokes a different algorithm, say copy_helper(), with an additional parameter whose type makes it absolutely clear how one copy_helper() differs from another.
Passing a 'type' is really nothing more than passing a 0 -- it's just that the compiler knows 'what kind of 0' you are talking about. The runtime performance implication is that of passing one extra parameter to the function.
When you pass a 'type' as a parameter to a function, you are really doing nothing more than forcing the compiler to pick a specific overload from the family of functions with the same name. This is done extensively in the STL and can occasionally be useful as a general purpose tool to reduce unnecessary duplication of similar text (and the associated duplicate maintenance).
A good example of this from the STL is the distance() family of functions, whose simple task is counting the number of objects between two iterators. The general form is like this:
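A sketch of that general form; Size_T stands in for whatever count type is in use, and the count is accumulated through a reference parameter:

    template <class Iter, class Size_T>
    void distance(Iter first, Iter last, Size_T& size)
    {
        // Step through every position between the two iterators.
        while (first != last)
        {
            ++first;
            ++size;
        }
    }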
As you can see, the general algorithm must actually step through all the locations the iterator can point to. This is O(N) -- a good 'Big O', but not a great one. For some kinds of iterator, this is the best you can do. However, for iterators that are actually implemented as pointers, it is horrible: you should be able to do the calculation in O(1) time. In fact, for all random access iterators you should be able to calculate the distance between two iterators in constant time.
That is, the random access iterator form should look like this:
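A sketch of that form follows -- though note that this body cannot simply coexist with the generic one under the very same signature, which is exactly the dispatch problem solved next:

    template <class Iter, class Size_T>
    void distance(Iter first, Iter last, Size_T& size)
    {
        // Random access iterators support subtraction directly.
        size += last - first;
    }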
So the question is, how do you instruct the compiler as to which algorithm to use? The STL answer is to provide a family of related functions that perform the task in different ways and to use overloaded template functions to select among the alternatives. The general scheme works like this: the public interface asks a family of 'category' functions what kind of iterator it was given, then forwards the call to whichever helper overload matches that category. In the distance() example, you could implement the public interface like this:
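A sketch, assuming the distance_helper() and iterator_category() families described next:

    template <class Iter, class Size_T>
    void distance(Iter first, Iter last, Size_T& size)
    {
        // The tag object returned by iterator_category() selects the
        // right distance_helper() overload at compile time.
        distance_helper(first, last, size, iterator_category(first));
    }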
In this case, there are two distance_helper() implementations: one for plain vanilla iterators and one for random access iterators. There must also be a family of overloaded functions that help you detect which of the two helper algorithms should be used. Of course, this concept of 'iterator category' goes well beyond the distance() algorithm, so the STL provides a general purpose technique -- the iterator_category() family.
This functionality is documented in other places, but the general idea is that each kind of iterator is represented as a struct with no members. Something like:
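    // Empty structs used purely as compile time labels.  Deriving the
    // tags from one another lets, for example, a random access
    // iterator qualify wherever a forward iterator is accepted.
    struct input_iterator_tag {};
    struct output_iterator_tag {};
    struct forward_iterator_tag : public input_iterator_tag {};
    struct bidirectional_iterator_tag : public forward_iterator_tag {};
    struct random_access_iterator_tag : public bidirectional_iterator_tag {};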
There is also a collection of functions, each of which returns an object of one of the above iterator category tag types. Pointers are treated as random access iterators. The general format of these routines is like this:
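A sketch of the two overloads -- a real library derives the category from iterator_traits, but the idea is the same:

    // Pointers are reported as random access iterators:
    template <class T>
    inline random_access_iterator_tag iterator_category(const T*)
    {
        return random_access_iterator_tag();
    }

    // Class type iterators report their own nested category typedef:
    template <class Iter>
    inline typename Iter::iterator_category iterator_category(const Iter&)
    {
        return typename Iter::iterator_category();
    }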
Given that this family of routines exists, and that there are two distance_helper() functions with signatures like this (size is passed by reference so that the count reaches the caller):

    // The generic algorithm, used for any iterator category:
    template <class Iter, class Size_T, class TagType>
    void distance_helper(Iter first, Iter last, Size_T& size, TagType)
    {
        // see the generic O(N) algorithm
    }

    // Selected whenever the tag is random_access_iterator_tag:
    template <class Iter, class Size_T>
    void distance_helper(Iter first, Iter last, Size_T& size,
                         random_access_iterator_tag)
    {
        // see the fast O(1) algorithm
    }
then, the optimal algorithm will be chosen:
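For example:

    int main()
    {
        int  a[100];
        long n = 0;

        // Built-in pointers select the O(1) random access helper:
        distance(a, a + 100, n);

        // A user written iterator whose iterator_category typedef
        // names one of the other tags would select the generic
        // O(N) helper instead.
        return 0;
    }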
numeric_limits is a good example of this. There is also the char_traits template, among others. For an example of class specialization in a manner similar to the STL approach, see below.
Specializing a template class is done in a manner quite different from function overloading (although it is possible to specialize template functions in a manner similar to template class specialization). Explicit template function specialization lets you override the function body for a given type; explicit class specialization lets you override a class' body for a given template parameter. Template specialization provides an alternative to virtual methods. Virtual function calls are neither terribly slow nor terribly fast, but, as Stroustrup points out, the iostream interface was designed so that it would not be necessary to have a virtual function call on every character read or written -- so it is an issue worth taking into account.
Here's how to use template specialization instead of virtual methods:
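A sketch, with hypothetical names -- Writer and its write() members are illustrative, not part of any library:

    #include <iostream>

    // Primary template: the general algorithm.
    template <class T>
    struct Writer
    {
        static void write(const T& value) { std::cout << value << '\n'; }
    };

    // Explicit specialization: a different body for bool, chosen at
    // compile time -- no virtual dispatch involved.
    template <>
    struct Writer<bool>
    {
        static void write(const bool& value)
        {
            std::cout << (value ? "true" : "false") << '\n';
        }
    };

    int main()
    {
        Writer<int>::write(42);      // general version
        Writer<bool>::write(true);   // specialized version
        return 0;
    }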
It is worth comparing the above method with simply using virtual functions: specialization resolves the call at compile time, while virtual functions defer the choice to run time at the cost of an indirect call per invocation.
Use template class specialization instead of template class derivation
to cut down on namespace pollution when you need conceptually related
classes to have different numbers of members:
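A sketch with hypothetical names:

    // Primary template: the general case has two members.
    template <class T>
    struct Packet
    {
        T payload;
        int checksum;
    };

    // Specialization: Packet<void> carries no payload at all -- a
    // conceptually related class with a different member list, but
    // no extra class name added to the namespace.
    template <>
    struct Packet<void>
    {
        int checksum;
    };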
Use specialization to handle weird stuff that should be associated with a class but which might be so bulky as to obscure the readability of the original class:
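A sketch -- the Helpers name is merely illustrative:

    #include <iostream>

    // Bulky, special purpose code is kept out of the class itself
    // and attached through specializations of a separate template.
    template <class T>
    struct Helpers;                  // primary template, never defined

    template <>
    struct Helpers<int>              // the 'weird stuff' for int
    {
        static void dump(int v) { std::cout << "int: " << v << '\n'; }
    };

    // Template code can now assume every supported type -- including
    // built in types like int -- has an associated Helpers class:
    template <class T>
    void dump(const T& v) { Helpers<T>::dump(v); }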
Note that this mechanism helps make built in types more usable in templates, because you can write code that assumes every type -- built in or not -- has a Helpers class associated with it.
A nice thing about this approach is that it lets you separate 'views' of a class. In one way of looking at a vector, it is only an array of ints -- dependent on nothing but operator new and delete for its implementation. In another view, it is an object which can be read from and written to a persistent store -- and thus has dependencies on I/O concepts.
Bjarne Stroustrup asserts in his excellent book, "The C++ Programming Language -- 3rd Edition", that

    Vague and unrealistic goals are the primary cause of software project failure.

Truer words were never spoken -- but sometimes vague and unrealistic goals are all you have to work with. It would be wonderful to have all the project requirements neatly wrapped up in a box and handed to you in toto before you start. However, this almost never occurs. Even the most oppressive military style development process does not prevent significant design changes from arriving after you have made critical design decisions.
Nonetheless, the inevitability of coming changes does not give you an excuse to do sloppy design work. In fact, it makes it more important that designs be carefully thought out in light of all the changes you are likely to face. Changes rarely occur out of the blue; there is usually plenty of advance warning of the broad outlines you can expect. You may, however, be forced to take a proactive role in ascertaining the range of changes. Customers often have no idea what they really want. Make sure you find out, even if they don't.
Requirements analysis is by no means a simple process. Requirements do not simply exist, waiting to be gathered; rather, they are manufactured through hard work and planning. Good requirements are far more than just a list of things that the customers would like to find in the product when it is finished.
The work product of the requirements analysis phase of a product development cycle should be a top level design from the customer's perspective -- if not from the implementor's. It is a base document that establishes the language by which the customers will communicate their understanding of their needs. Unfortunately, most customers do not have a full understanding of what their needs are and will only be able to dribble them out as questions are asked.
The software vendors must be able to elicit the customer's understanding
of what is needed and be able to communicate that understanding effectively
to the customer as well as to the software designers. Properly done, the
customer will dribble out all the requirements in this phase, rather than
as the software is demonstrated for the first time.
Experience has shown that customers who are not themselves software developers will be unable to read and understand developer oriented documentation. Instead, they need to focus on concrete, customer oriented artifacts such as use cases. While the customers may not immediately see the need for these, they will at least understand, when asked, how to provide "use cases" -- actual or approximately real examples of data that can be used to demonstrate proper operation as the developers near completion. This is a complex subject in its own right; consider reading "Contextual Design" by Beyer and Holtzblatt.
There is a tendency, as the saying goes, to have the programmers start coding while the system engineers go find out what the customers want. This is the source of many major missteps. Code written in this way is like a "black hole" in astronomy. One way for a black hole to form is by accretion: material falls onto a star until its mass grows so large that gravity exceeds the outward pressure of the star's thermonuclear reactions. When this occurs, the star collapses under its own weight, and the rising density further accelerates the gravitational collapse. The star literally disappears from the face of the universe, never to be seen again -- directly, at least.
At first, the rate of progress on a project run in the Nike way, by "just doing it", will seem high. Unfortunately, at some point the lack of planning will inevitably result in violations of the basic principles of good design, and progress will slow to a halt. Even an army of programmers won't be able to speed things up. This is why, on many traditional projects, the initial group of developers all seem to be geniuses and the later groups seem less gifted.
Following good architectural principles is more important when building a skyscraper than when building a chicken coop. Small programs don't benefit that much from a lot of up front thought. However, unlike chicken coops, computer programs are often extended well beyond their original design goals. If a chicken coop's requirements change and a skyscraper is needed, the coop will likely be discarded and the skyscraper designed from scratch. Unfortunately, there is a tendency in software to over extend extant implementations until progress becomes impossible.
In any complex multi-person activity, it is particularly important that interfaces between products created by different individuals or teams be designed very carefully. The implementations of the functionality behind these interfaces require less (or no) coordination. However, the functions, objects, and protocols that are provided or required across teams should be thought about carefully, and changes to them very carefully managed across the teams affected.
A key feature that many projects neglect is software layering. Rather than simply creating a large pile of code, software should be implemented in layers, like the floors in a building, with the lower levels strong enough to support the upper levels. Like the floors of a building, the layers should only be allowed to interact with one another at key interfaces. Buildings don't have stairwells placed at random locations, and pipes don't run willy nilly through the floors at odd angles. Software created by accretion, however, is likely to. These inappropriate interfaces are the source of the gradual reduction in productivity that randomly written programs will inevitably face.
In addition to layers, code should be implemented in a modular fashion. That is, a layer is not a pool of water into which salt is poured and evenly distributed. Rather, it is like the floor of a building with rooms in it: each room serves its own purpose and interacts with the rest of the rooms on the floor only through specific doors and windows. The 'rooms' are analogous to software modules. A module should be highly cohesive -- meaning that it serves a single purpose and does not contain extraneous functions; those go in a different module. For example, one does not expect to find a sink in the living room. Program modules should also have low coupling between them. That is, modules should not be broken into such fine detail that there are a huge number of interfaces from one module to another in the same layer.
Software architecture is a complex problem. Much is already written
on that subject. Consider
"Large Scale C++ Software Design"
by John Lakos.
Portability means that you are able to compile and execute your program correctly on more than one platform. It is easier to write portable code if you know what the target platforms are. Despite the similarities between operating systems and the numerous standards available, there will be minor differences that make portability a taxing experience if you don't attend to it properly from the start of the project. The best way to achieve portability is to port your code early and often.
Porting your code to platforms that you do not strictly think you need, if they are easily available to develop on (such as Linux if you are doing PC development, and vice versa), will help you in two ways: each compiler catches a different set of mistakes, and porting early keeps platform specific assumptions from creeping into your design.
A common trick to aid in porting is to require that the first header file included in all compilations be a 'porting' header file -- a header whose only purpose is to define flags and work around problems on specific machines. This approach also allows you to force standard symbols into every compilation -- but you have to be careful not to include the entire C++ include file set in every compilation!
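A sketch of such a header; the file and macro names are hypothetical, while __hpux and _MSC_VER are the usual vendor predefined symbols:

    // all_first.h -- a hypothetical 'porting' header, included first
    // in every source file.
    #ifndef ALL_FIRST_H
    #define ALL_FIRST_H

    #if defined(__hpux)
    // Flags and work-arounds for HP go here, for example:
    #define OUR_NO_ARRAY_DECAY_OVERLOADS 1
    #endif

    #if defined(_MSC_VER)
    // Microsoft specific adjustments go here.
    #endif

    #endif // ALL_FIRST_H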
Decide at a product level whether you are using C++ exceptions or not. Decide how you are going to handle floating point exceptions -- divide by zero, NaNs, infinities, and so on. Individual programmers should not be making these strategy decisions, because everyone will come to a different understanding of how to do it, and it will be harder to work on someone else's code.
Usually, compilers also provide a way to convert .C source files into preprocessed output source files. These are useful when trying to debug compile errors, if the compiler decides to be annoying and fails to give you exact line numbers describing your mistakes. See Understanding compile errors below.
The 'compiler' command -- the command line driver or the equivalent IDE functionality -- usually allows for automatic invocation of the linker and sometimes the librarian. That is, the compiler not only compiles .C source files into object modules but also lets you automatically link those objects into programs or archive them into libraries; the command you actually invoke is just a wrapper that drives the whole compilation workflow. Generally, with C++ programs, you should use the compiler to link rather than trying to figure out how to run the linker yourself. Often the compiler vendor doesn't document all the steps truly needed for linking, and trying to figure this out for yourself leads only to lost time, trouble, and non-portability. Further, if your compiler provides a mechanism to build libraries, you should probably use that instead of manually running the librarian.
Older compilers using non-standard template instantiation logic will provide a mechanism to 'close' the library that is not easy to duplicate any other way. Library 'closure' refers to its completeness. Earlier compilers generated out of line template functions as a separate step from normal compilation. This means that merely compiling an object module does not, with these older compilers, actually build all the object code needed. So, if you only take the generated objects and archive them into a library, you cannot successfully link with that library -- you'll get missing symbols. To get 'closed' libraries, that is, libraries with all the needed symbols, you need to use the compiler to make the library. Again, this applies to older compilers; newer compilers use compile time instantiation of all needed object code, so the problem does not usually occur.
Compilers often provide a variety of optimization levels. Optimization can refer to the efficiency of the generated object code, but it can include other things as well. The question is, "optimized for what?". These levels exist primarily because the compiler vendors are only human and sometimes make mistakes writing the compiler for the higher levels of optimization. Usually, the lower optimization levels are likelier to give you properly working code, but not always: since the lower optimization levels are less desirable to users, they get less testing than the higher levels. The lesson here is that if code gets generated badly at one optimization level, it might work at another. When this occurs, a small example program that demonstrates the problem should be sent to the compiler vendor so that the compiler bug can be fixed (assuming your company is important enough to the compiler vendor ;-).
Optimization levels typically work like this: a debug level, which disables optimization entirely in favor of accurate debugging information; an O1 level, which performs conservative local optimizations; and an O2 (or higher) level, which performs aggressive global optimizations.
Sadly, you may well be forced to turn off compiler optimization, or change the level thereof, in one file or another, and your build process should account for this. The Microsoft compiler allows you to insert #pragma statements in your code to set the compiler options; others do not. For those compilers, you will have to ensure that the build rules supply the needed optimization selection options. Sadly, even the Microsoft compiler occasionally has bugs that prevent the proper operation of these pragmas. Before using pragmas to select compiler optimization levels, first make sure that changing the levels with command line options does fix your problem. Then, if the change helps, try the pragmas.
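For example, with the Microsoft compiler (the function name is hypothetical):

    // Compile one troublesome function without optimization, then
    // restore the command line settings.
    #pragma optimize("", off)
    void miscompiled_function()
    {
        // ... the code the optimizer mishandles ...
    }
    #pragma optimize("", on)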
In addition to the debugging information provided by the compiler, it is often desirable to add your own additional code to perform error detection as the program runs. In some cases, that code has low cost and should be left in the final product builds. In other cases, the diagnostic code may be too expensive for this.
A common approach to writing such code is to have more than one kind of compilation (i.e., "build"). Typically, a developer selects the kind of build desired by editing a file and adding or changing a #define'd symbol -- common build types are a debug build, an optimized production build, and sometimes a separate diagnostic build. The build process is involved in selecting these types of build: you have to set the optimization level to debug, O1, or O2 as described above. However, there must also be some #define'd symbol, perhaps named "OurCompany_Diagnostic_Build" or some such. When this symbol is defined, your code should include additional error checks as needed to exhaustively verify the correctness of its own internal state; without the symbol, these exhaustive checks should be left out. Here is an example class definition that uses this technique:
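A sketch, using the hypothetical symbol name from above:

    class Account
    {
    public:
        Account() : balance_(0) {}

        void deposit(long cents)
        {
            balance_ += cents;
    #ifdef OurCompany_Diagnostic_Build
            check_invariants();   // expensive; diagnostic builds only
    #endif
        }

    private:
    #ifdef OurCompany_Diagnostic_Build
        void check_invariants() const
        {
            // exhaustive validation of internal state goes here
        }
    #endif
        long balance_;
    };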
In addition to the obvious kinds of errors most people would expect to get -- such as syntax errors, using undefined type names, and so on -- compilers often provide some helpful warnings and errors that may or may not require work on your part to enable. Turn them on!
As Scott Meyers says in his book, "Effective C++":

    Prefer compile time and link time errors to run time errors.

Finding and fixing your errors at compile time is likely to be ten times as easy as fixing a bug that accidentally got to your customer's web site -- and is a lot less annoying (although it might not seem like it at the time). Fixing all warnings will give you safer code, and once the act of fixing the problems has trained you not to create them in the first place, your code quality will be greatly improved. If possible, use a compiler option (if available) to convert warnings to errors -- which will require that all warnings be fixed in order to compile successfully.
Many compilers allow you to turn on 'portability' errors and warnings. The exact meaning of 'portability problems' will vary by vendor. Still, it is advisable to try turning them on. If you get an error that you just can't fix -- probably because the problem was incorrectly detected by the compiler -- you can always turn the flag back off.
When faced with compiler bugs, it might take you considerable time to find a single syntax that works on all the platforms to which you are trying to port. Luckily, since February 2001, this kind of thing has greatly abated, but it has not gone away completely. The search is generally time well spent, however: using #ifdef's to work around compiler differences can greatly complicate your code, and it is better to find one way than to try to support many.
One of the most annoying problems developers face is determining what the compiler is trying to tell you. This is particularly true of compile errors in templates -- or even worse if you use a lot of macros. If you are lucky, your compiler will give you good errors of the form:

    Error 122, "something went wrong" in file "name.c", line 247
      where instantiated from file "other.c", line 1000
      where instantiated from file "different.c", line 12345
      where instantiated from non-template code in file "main.c", line 100
You will get a message of this form only if you are lucky: all compilers have limitations in producing these error tracebacks, and sometimes the traceback just gets cut off.
In these cases, you have to figure the problem out for yourself. One approach is analogous to Newton's method of root location: remove (or comment out) roughly half of the suspect code; if the error disappears, the problem lies in the half you removed, otherwise in the half that remains; repeat, narrowing the region until the offending construct is isolated. To reduce the difficulty of understanding errors involving templates which you write yourself, see Early Error Detection.
Dynamic linking is the mechanism that lets us build our programs in pieces which can be shipped to the customer separately; the pieces can even be built in different languages and come from different product builds. Dynamic linking means that a program fragment is loaded at runtime rather than being bound into the executable at link time. Such a fragment is actually a program unto itself -- just a simplistic program whose only purpose is to provide access to its functions for other programs. On Windows, a dynamically linked library is named Something.DLL. On Unix, it might be named 'Something.so' or 'Something.sl'. The concept, however, is pretty much the same.
The C++ habit of mangling function names makes the use of C++ in dynamic libraries a bit tedious. That is, the name of the function actually stored in the library will be long and ugly: instead of MyClass::member(int), you might see something roughly like _Fx7MyClass6Member_fi. To make matters worse, the exact name varies from compiler version to version, so code compiled with a new compiler version may not link with code compiled with an older version of the same compiler.
The solution to this problem is relatively simple, however. Instead of calling the functions in your dynamic library directly, declare them to be virtual members of some class. Then declare, inside the dynamic library, a C (not C++) function that returns a pointer to an object of this class. Once that pointer is returned, the virtual calls should work as expected -- although even this could malfunction if the compiler changes the way virtual tables are laid out. With a given compiler, however, this is unlikely. Here is an example:
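A sketch with hypothetical names -- the only symbol with an unmangled C name is the factory function:

    // interface.h -- shared by the library and its clients.
    class Engine
    {
    public:
        virtual int run() = 0;
        virtual ~Engine() {}
    };

    // Inside the dynamic library: everything except the factory is
    // reached through the virtual table, so mangled names never
    // cross the library boundary.
    class EngineImpl : public Engine
    {
    public:
        virtual int run() { return 42; }
    };

    extern "C" Engine* create_engine()
    {
        return new EngineImpl;
    }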
When you wrap a function declaration or definition in an extern "C" wrapper, the compiler does not mangle the name according to C++ rules, but rather according to C rules. The function calling convention (i.e., how the parameters are passed) is also likely to be different in C than in C++. This has little practical importance, but you must remember to keep the declarations identical so that you do not get link errors.
Template instantiation is the act of creating a real function or class from the 'template' provided to the compiler; newer language specifications call the result a 'specialization'. Template bodies can be instantiated in two fundamental ways: implicitly, when the compiler creates an instance because your code uses it, and explicitly, when you force the instantiation yourself. The STL, and presumably any templates you write, will make the template bodies available to the compiler (in header files) so that it can make instances as it needs them -- but there are occasional cases wherein you will need to manually force template instantiation. Manual instantiation occurs in either of the following ways: you can name an entire class, in which case every member is instantiated, or you can name an individual function or member:
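For example, with hypothetical template names:

    // Template library code:
    template <class T>
    class Buffer
    {
    public:
        void clear() {}
    };

    template <class T>
    void reset(Buffer<T>&) {}

    // Explicit instantiation of an entire class -- every member:
    template class Buffer<int>;

    // Explicit instantiation of a single function:
    template void reset(Buffer<int>&);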
One use of manual instantiation is to close a library. That is, if you have written a library and that library uses templates, you need to make sure that all template bodies needed by the library are in fact in it. Since libraries don't require linking to create them, there is always some chance you'll forget some bodies. Thus, you can force the compiler to instantiate all template bodies for all the template signatures you need. When you use the template keyword to instantiate a class, all member functions and variables get instantiated.
This, however, can lead to problems. Sometimes templates are written to take any kind of parameter, and some functions in the template will not work with some kinds of template parameters. Forcing the template to instantiate in its entirety will then result in compile errors that you really cannot deal with -- nor do you need to. The work around is painful, however: manually instantiate the specific members you do need -- and there might be a lot of them. A work around for having to write lots of manual instantiations is to write a function, which is never called, but which when compiled forces the instantiation of all the members you need -- and none you don't.
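A sketch, again with hypothetical names:

    // A template whose sort() member will not compile for every T:
    template <class T>
    struct Table
    {
        void clear() {}

        // Uses operator< -- fails to compile for some Ts, but only
        // if it is actually instantiated.
        void sort(T& a, T& b)
        {
            if (b < a) { T tmp = a; a = b; b = tmp; }
        }
    };

    // Never called; compiling it instantiates exactly the members
    // the library needs (Table<int>::clear) and leaves sort() alone.
    static void force_instantiations()
    {
        Table<int> t;
        t.clear();
    }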
In the past, there was a great deal of concern about the time the compiler would take when instantiating template function bodies. This led to the view that the bodies of templates should be separated from their declarations, and compiler writers went to great lengths to create repositories of template bodies which had already been instantiated, so that each compilation would not have to duplicate this work. On single processor machines with a very fast operating system, this seemed to make sense. On multiprocessor machines, however, this approach actually makes things worse: the cost of detecting whether a template body had already been instantiated by some other compilation stream was shown to be larger than the cost of simply compiling the template body again. Therefore, modern compilers employ the 'compile time instantiation' strategy, wherein the source for the template bodies is placed in header files (or is otherwise available at compile time).
Thus, every compilation you invoke will normally instantiate, in every object module, all the templates needed by that object module (assuming the bodies are available, of course). The linker is then required to ignore duplicate template bodies. This doesn't sound faster, but all compiler vendors have moved to the compile time instantiation approach -- so it must be.
As strange as it may seem, it is occasionally necessary to compile your C++ code into assembly language source files rather than into object modules. All compilers provide some sort of command line option to allow this; usually, it is -S.
So why would you need to do this? To find out what the compiler is actually doing, for one thing. If you have a lot of overloaded functions with similar signatures, it may become very difficult to determine which of the alternatives was actually selected. Of course, if you have the source to all these functions, you need only debug the program and single step into the called function. But what if the function is part of the STL or a third party run time library -- and you have no source?
The solution here is to compile the calling code to assembly and use c++filt to demangle the mangled names you will see in the assembly source.
Another reason to generate assembly language source files is to diagnose problems with the compiler's name mangling logic. Yes, as mentioned before, compilers are written by human beings and occasionally have bugs. By producing the assembly source, you can see which functions were actually implemented by the compiler (with mangled names), and you can also look at the raw function calls (with mangled names) to make sure the compiler did in fact generate a call to the function that got defined.
Hopefully, you won't have to do this very often. Whenever the name mangling goes wrong, it is often only wrong by a single character in a very long name filled with seemingly random characters, so lean on c++filt. But c++filt sometimes has bugs too! If you get a link error but feel certain that the symbols are in fact defined, you must compare the object module symbol tables directly.
Program variables cannot be stored in ROM, although simple numeric and "C style" string constants can. Class objects which require construction typically cannot, because the 'static construction' logic makes no sense for objects stored in ROM. While it is not possible to write to ROM as the program runs, there must be some section of writable memory wherein the program stack resides -- usually a very limited amount. There may or may not be any heap space in such a situation, and there may or may not be any 'global data' space wherein global and static variables might be found. If heap and global data space is available, it might be accessible only through pointers rather than through the normal linking and operator new mechanisms. Code not originally written to run in ROM will likely either not run or will require rework to make it run from ROM.
Luckily, C++ provides a built in mechanism for dealing with special purpose memories -- the std::allocator concept.
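A minimal sketch of the idea -- not a complete standard allocator (a real pre-2011 allocator also needs a family of typedefs, rebind, and construct/destroy members), and alignment handling is omitted:

    #include <cstddef>

    // Draws storage from a special purpose memory region; region_
    // stands in for, say, a dedicated RAM bank in a ROM based system.
    template <class T>
    class RegionAllocator
    {
    public:
        T* allocate(std::size_t n)
        {
            char* p = next_;
            next_ += n * sizeof(T);      // simple bump pointer scheme
            return reinterpret_cast<T*>(p);
        }

        void deallocate(T*, std::size_t)
        {
            // Space in the region is never recycled in this sketch.
        }

    private:
        static char  region_[4096];
        static char* next_;
    };

    template <class T> char  RegionAllocator<T>::region_[4096];
    template <class T> char* RegionAllocator<T>::next_ =
        RegionAllocator<T>::region_;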