Error handling, exceptions, common headers, code reuse, debug output.
The biggest decision to be made, before the implementation can even begin, is how to handle errors and exceptions. There are a few major sources of errors
Bugs are not supposed to get through to the final retail version of the program, so we have to deal with them only during development. (Of course, in practice most retail programs still have some residual bugs.) Since during development we mostly deal with debug builds, we can protect ourselves from bugs by sprinkling our code with assertions. Assertions can also be used to enforce contracts between components.
User input, and in general input from other less trusted parts of the system, must be thoroughly tested for correctness before proceeding any further. "Typing monkeys" tests have to be done to ensure that no input can break our program. If our program provides some service to other programs, it should test the validity of externally passed arguments. For instance, operating system API functions always check the parameters passed from applications. This type of errors should be dealt with on the spot. If it's direct user input, we should provide immediate feedback; if it's the input from an untrusted component, we should return the appropriate error code.
Any kind of persistent data structures that are not totally under our control (and that is always true, unless we are a file system; and even then we should be careful) can always get corrupted by other applications or tools, not to mention hardware failures and user errors. We should therefore always test for their consistency. If the corruption is fatal, this kind of error is appropriate for turning it into an exception. A common programming error is to use assertions to enforce the consistency of data structures read from disk. Data on disk should never be trusted, therefore all the checks must also be present in the retail version of the program.
Running out of resources-- memory, disk space, handles, etc.-- is the prime candidate for exceptions. Consider the case of memory. Suppose that all programmers are trained to always check the return value of operator new (that's already unrealistic). What are we supposed to do when the returned pointer is null? It depends on the situation. For every case of calling new, the programmer is supposed to come up with some sensible recovery. Consider that the recovery path is rarely tested (unless the test team has a way of simulating such failures). We take up a lot of programmers' time to do something that is as likely to fail as the original thing whose failure we were handling.
The simplest way to deal with out-of-memory situations is to print a message "Out of memory" and exit. This can be easily accomplished by setting our own out-of-memory handler (_set_new_handler function in C++). This is however very rarely the desired solution. In most cases we at least want to do some cleanup, save some user data to disk, maybe even get back to some higher level of our program and try to continue. The use of exceptions and resource management techniques (described earlier) seems most appropriate.
If C++ exception handling is not available or prohibited by managers, one is left with conventional techniques of testing the results of new, cleaning up and propagating the error higher up. Of course, the program must be thoroughly tested using simulated failures. It is this kind of philosophy that leads to project-wide conventions such as "every function should return a status code." Normal return values have then to be passed by reference or a pointer. Very soon the system of status codes develops into a Byzantine structure. Essentially every error code should not only point at the culprit, but also contain the whole history of the error, since the interpretation of the error is enriched at each stage through which it passes. The use of constructors is then highly restricted, since these are the only functions that cannot return a value. Very quickly C++ degenerates to "better C."
Fortunately most modern C++ compilers provide exception support and hopefully soon enough this discussion will only be of historical interest.
Another important decision to be made up front is the choice of project-wide debugging conventions. It is extremely handy to have progress and status messages printed to some kind of a debug output or log.
The choice of directory structure and build procedures comes next. The structure of the project and its components should find its reflection in the directory structure of source code. There is also a need for a place where project-wide header files and code could be deposited. This is where one puts the debugging harness, definitions of common types, project-wide parameters, shared utility code, useful templates, etc.
Some degree of code reuse within the project is necessary and should be well organized. What is usually quite neglected is the need for information about the availability of reusable code and its documentation. The information about what is available in the reusability department should be broadcast on a regular basis and the up-to-date documentation should be readily available.
One more observation-- in C++ there is a very tight coupling between header files and implementation files-- we rarely make modifications to one without inspecting or modifying the other. This is why in most cases it makes sense to keep them together in the same directory, rather than is some special include directory. We make the exception for headers that are shared between directories.
It is also a good idea to separate platform dependent layers into separate directories. We'll talk about it soon.
The implementation process should model the design process as closely as possible. This is why implementation should start with the top level components. The earlier we find that the top level interfaces need modification, the better. Besides, we need a working program for testing as soon as possible.
The goal of the original implementation effort is to test the flow of control, lifetimes and accessibility of top level objects as well as initialization and shutdown processes. At this stage the program is not supposed to do anything useful, it cannot be demoed, it is not a prototype. If the management needs a prototype, it should be implemented by a separate group, possibly using a different language (Basic, Smalltalk, etc.). Trying to reuse code written for the prototype in the main project is usually a big mistake.
Only basic functionality, the one that's necessary for the program to make progress, is implemented at that point. Everything else is stubbed out. Stubs of class methods should only print debugging messages and display their arguments if they make sense. The debugging and error handling harness should be put in place and tested.
If the program is interactive, we implement as much of the View and the Controller as is necessary to get the information flowing towards the model and showing in some minimal view. The model can be stubbed out completely.
Once the working skeleton of the program is in place, we can start implementing lower level objects. At every stage we repeat the same basic procedure. We first create stubs of all objects at that level, test their interfaces and interactions. We continue the descent until we hit the bottom of the project, at which point we start implementing some "real" functionality. The goal is for the lowest level components to fit right in into the whole structure. They should snap into place, get control when appropriate, get called with the right arguments, return the right stuff.
This strategy produces professional programs of uniform quality, with components that fit together very tightly and efficiently like in a well designed sports car. Conversely, the bottom-up implementation creates programs whose parts are of widely varying quality, put together using scotch tape and strings. A lot of programmer's time is spent trying to fit square pegs into round holes. The result resembles anything but a well designed sports car.
In the ideal world (from the programmer's point of view) every project would start from scratch and have no external dependencies. Once in a while such situation happens and this is when real progress is made. New languages, new programming methodologies, new team structures can be applied and tested.
In the real world most projects inherit some source code, usually written using obsolete programming techniques, with its own model for error handling, debugging, use or misuse of global objects, goto's, spaghetti code, functions that go for pages and pages, etc. Most projects have external dependencies-- some code, tools, or libraries are being developed by external groups. Worst of all, those groups have different goals, they have to ship their own product, compete in the marketplace, etc. Sure, they are always enthusiastic about having their code or tool used by another group and they promise continuing support. Unfortunately they have different priorities. Make sure your manager has some leverage over their manager.
If you have full control over inherited code, plan on rewriting it step by step. Go through a series of code reviews to find out which parts will cause most problems and rewrite them first. Then do parallel development, interlacing rewrites with the development of new code. The effort will pay back in terms of debugging time and overall code quality.
A lot of programs are developed for multiple platforms at once. It could be different hardware or a different set of APIs. Operating systems and computers evolve-- at any point in time there is the obsolete platform, the most popular platform, and the platform of the future. Sometimes the target platform is different than the development platform. In any case, the platform-dependent things should be abstracted and separated into layers.
Operating system is supposed to provide an abstraction layer that separates applications from the hardware. Except for very specialized applications, access to disk is very well abstracted into the file system. In windowing systems, graphics and user input is abstracted into APIs. Our program should do the same with the platform dependent services-- abstract them into layers. A layer is a set of services through which our application can access some lower level functionality. The advantage of layering is that we can tweak the implementation without having to modify the code that uses it. Moreover, we can add new implementations or switch from one to another using a compile-time variable. Sometimes a platform doesn't provide or even need the functionality provided by other platforms. For instance, in a non-multitasking system one doesn't need semaphores. Still one can provide a locking system whose implementation can be switched on and off, depending on the platform.
We can construct a layer by creating a set of classes that abstract some functionality. For instance, memory mapped files can be combined with buffered files under one abstraction. It is advisable that the implementation choices be made in such a way that the platform-of-the-future implementation be the most efficient one.
It is worth noticing that if the platforms differ by the sizes of basic data types, such as 16-bit vs. 32-bit integers, one should be extremely careful with the design of the persistent data structures and data types that can be transmitted over the wire. The fool proof method would be to convert all fundamental data types into strings of bytes of well defined length. In this way we could even resolve the Big Endian vs. Little Endian differences. This solution is not always acceptable though, because of the runtime overhead. A tradeoff is made to either support only these platforms where the sizes of shorts and longs are compatible (and the Endians are the same), or provide conversion programs that can translate persistent data from one format to another. In any case it is a good idea to avoid using ints inside data types that are stored on disk or passed over the wire.
Modifications of existing code range from cosmetic changes, such as renaming a variable, to sweeping global changes and major rewrites. Small changes are often suggested during code reviews. The rule of thumb is that when you see too many local variables or objects within a single function, or too many parameters passed back and forth, the code is ripe for a new abstraction.
It is interesting to notice how the object oriented paradigm gets distorted at the highest and at the lowest levels. It is often difficult to come up with a good set of top level objects, and all too often the main function ends up being a large procedure. Conversely, at the bottom of the hierarchy there is no good tradition of using a lot of short-lived lightweight local objects. The top level situation is a matter of good or poor design, the bottom level situation depends a lot on the quality of code reviews. The above rule of thumb is of great help there. You should also be on the lookout for too much cut-and-paste code. If the same set of actions with only small modifications happens in many places, it may be time to look for a new abstraction.
Rewrites of small parts of the program happen, and they are a good sign of healthy development. Sometimes the rewrites are more serious. It could be abstracting a layer, in which case all the users of a given service have to be modified; or changing the higher level structure, in which case a lot of lower level structures are influenced. Fortunately the top-down object-oriented design makes such sweeping changes much easier to make. It is quite possible to split a top level object into more independent parts, or change the containment or access structure at the highest level (for example, move a sub-object from one object to another). How is it done? The key is to make the changes incrementally, top-down.
During the first pass, you change the interfaces, pass different sets of arguments-- for instance pass reference variables to those places that used to have direct access to some objects but are about to loose it. Make as few changes to the implementation as possible. Compile and test.
In the second pass, move objects around and see if they have access to all the data they need. Compile and test.
In the third pass, once you have all the objects in place and all the arguments at your disposal, start making the necessary implementation changes, step by step.
Testing starts at the same time as the implementation. At all times you must have a working program. You need it for your testing, your teammates need it for their testing. The functionality will not be there, but the program will run and will at least print some debugging output. As soon as there is some functionality, start regression testing.
The scaleability under heavy loads should be tested. Depending on the type of program, it could be processing lots of small files, or one extremely large file, or lots of requests, etc.