Programming With Style 

Contents of Chapter 2

  1. Construction Advice

2 Construction Advice

Construction is the act, process, or work of building. The word itself conjures the image of a detailed plan and its implementation, as well as the double checking of previous work for defects, and remedying problems when necessary. When related to programming, the construction of quality software involves many practical aspects, such as design, testing, debugging, and optimization. The overall goal of these four phases of construction is to produce a program of superior quality and easier long-term maintainability.

Up to this point, we have dealt with programming and coding standards from the abstract perspective of effective communication. It is now time to explore the nitty-gritty coding considerations that developers face every day. The job of a programmer is challenging because each program is different and presents a unique set of problems. As will become clear in this chapter, there are no hard and fast rules for the construction of quality code. This chapter does, however, suggest helpful advice and useful techniques to guide you through each stage of program construction.

While studying this chapter, take the time to stop and consider how you write code and compare your methods to the advice and guidelines presented in each section. For example, do you simply sit down at the terminal and type a program out? Or, do you reflect on the problem first, and code only when you have thought it through? Have you ever broken down your habits into steps and asked yourself if they work against you? In short, keep an open mind to the ideas presented and honestly consider whether they would help you write code of higher quality. Remember that the calibre of the end product is directly related to the quality of the construction. Therefore, in the end, your understanding of the main phases of code construction will determine how good a programmer you are.

In the real world, there is more to building a program than simply creating programmer-friendly code; that is, source code which is easy for others to read, comprehend, and revise. Skilful programmers never forget that programs are also used. With this thought in mind, section 2.1 will begin with an examination of what makes a program user-friendly, as opposed to programmer-friendly. Following this, we will take a closer look at the four phases of construction mentioned earlier.

2.1 Writing user-friendly programs

There are two audiences that every programmer must consider when developing an application. First, the code must be "programmer-friendly." In other words, it must be easy for programmers, other than the original coder, to read and understand. Second, the program must be "user-friendly" by showing consideration for the people who will operate it on a regular basis. Creating programmer-friendly code is covered in the latter half of this handbook. This section deals with how to write applications with the user in mind.

Being computer users, programmers deal with a host of tools and utilities every day. Some applications are a joy to use, others a nightmare. This fact raises two key questions: "What does user-friendly mean?" and "How can I make my program more friendly?" The simple answer to the first question is that a user-friendly program is one that the user considers a friend instead of an enemy. Answering the second question is more challenging. For years, computer scientists have tried to come up with a set of laws to define what makes a program amicable. Later in this section, I provide a list of tips that you can apply to your user interfaces to improve the overall user-friendliness of your programs.

When working on user interfaces, every programmer should keep in mind three key areas: data validation, input, and output.[Robertson 93] Data validation refers to the process of checking that input is valid. One of the main reasons that programs produce unexpected results is invalid or unforeseen input; for example, an out-of-bounds value. This is commonly referred to as the "garbage in, garbage out" principle. As a safety precaution, data should be validated as close to the point of entry as possible. You may even want to consider data entry and validation as the same step.
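To make the idea of validating at the point of entry concrete, here is a minimal C++ sketch. The helper name parse_in_range() and its parameters are hypothetical, chosen only for illustration; the point is that text from the user is checked once, where it enters the program, so that the rest of the code can rely on a value that is known to be in bounds.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: convert text to an integer in [low, high].
// Returns true and stores the number in result only when the whole
// text is a valid integer inside the range; otherwise result is untouched.
bool parse_in_range( const std::string& text, int low, int high, int& result ) {
    try {
        std::size_t used = 0;
        const int number = std::stoi( text, &used );
        if ( used != text.size() ) return false;              // trailing junk, e.g. "12abc"
        if ( number < low || number > high ) return false;    // out of bounds
        result = number;
        return true;
    } catch ( const std::exception& ) {                       // not a number, or too large for int
        return false;
    }
}
```

A caller would typically keep prompting until parse_in_range() returns true, which treats data entry and validation as a single step.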

As far as program input and output is concerned, do not make assumptions about what the user wants, needs, or likes. If you are unclear as to how the user expects the interface to work, ask! This should be standard practice, as it will save time in the long run if the client has unique requirements that you may not be aware of; for example, perhaps the user expects all of the output to be sent to a remote printer, and all input to come from a special data file.

Above all, keep in mind that the only experts on the user interface are the people who use it. If you put yourself in the shoes of the user, you can better manage the development of a user-friendly program. Following this fundamental idea, the list below gives some specific advice for the construction of programs which show consideration for the user.

Tips For Creating An Effective User Interface [McConnell 93]

Follow the "Principle of Least Astonishment." In other words, your program should always act in a way that least surprises the user. Be sure that you understand exactly what the client wants from the system. Remember that the user has a job to do, and they want the computer to do that job, their way, with minimal effort.

Make error/warning messages clear and helpful. It goes without saying that a user-friendly program must include messages to indicate errors and warnings, either caused by the program, or the user. As a convention, you should begin each error message with "Error:", and each warning message with "Warning:". Messages should not be cryptic or require the user to know any computer terminology. Use plain English. Below is a poor example of an ambiguous error message many PC users have encountered during boot-up:

Error: keyboard missing

Press <F1> to continue.

Provide some sort of help system. In the past, programs were designed to conserve resources. Today, they are designed to maximize usability. The amount of help to provide depends on the application and the skill level of the user. The help given may be as elaborate as a context-sensitive help system, or as trivial as a simple message; but it should always be provided.

Friendly programs have built-in safety nets. Occasionally, a user may try to do something that will result in permanent damage or loss of data. A user-friendly program should provide a safety net preventing truly stupid actions, or at least warn users if they are about to do something they may regret. In short, do not let users shoot themselves in the foot unless they really want to.

Furnish accelerators for experienced users. "Power users" always appreciate the ability to use a program with fewer keystrokes. For example, to print in WordPerfect 5.1, Shift-F7 is preferred by the veteran user over Alt-=, F, P.

Apply the principle of modularization to the user interface. Isolate as much of the interface code as possible from the mechanics of the rest of the program and the machine it runs on. Modular code, and User Interface Management Systems (UIMS) can help you avoid rebuilding your program's user interface from scratch, if your system is ported to a new platform.
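Two of the tips above, consistent message prefixes and built-in safety nets, can be sketched in a few lines of C++. The names severity, format_message(), and confirmed() are hypothetical, and the accepted answers are an assumption made for the sake of the example.

```cpp
#include <cassert>
#include <string>

enum class severity { warning, error };

// Build a message with the conventional "Error:" or "Warning:" prefix.
// what says what went wrong in plain English; advice says what to do next.
std::string format_message( severity level, const std::string& what,
                            const std::string& advice ) {
    const std::string prefix = ( level == severity::error ) ? "Error: " : "Warning: ";
    return prefix + what + " " + advice;
}

// Safety net: a destructive action goes ahead only on an explicit yes.
// Anything else, including an empty answer, is treated as "no".
bool confirmed( const std::string& answer ) {
    return answer == "y" || answer == "Y" || answer == "yes";
}
```

Defaulting to "no" means that a stray keystroke cannot trigger a loss of data.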

2.2 Modular code construction 

Earlier, you were asked to take the time to consider your programming methodology. If you are the kind of programmer who prefers to implement a rough design on the fly, you have probably faced the complication of trying to keep track of what is what in your code. Perhaps you have found that your code seems to spring leaks, become tangled, or grow difficult to follow. What can you do to avoid these difficulties? Modularize!

Before presenting a detailed explanation of modular code construction, let us consider the numerous benefits that make modularization worthwhile. The following are the main advantages: [McConnell 93]

Minimum connectedness: More specifically, modular designs focus on cutting down on the connections between routines and groups of related routines (modules). By designing every module and routine to be autonomous from others, each component can independently represent part of the solution to a larger problem. Later in this section, it will be explained how this independence can be achieved through strong cohesion and loose coupling.

Maximum code reusability, portability, and extensibility: Since each routine or module is designed to function independently from others, it is easier to reuse pieces of code, move code to new operating environments, and extend, change, or replace a routine or module without causing trauma to the underlying program structure, or surrounding source code.

Reduced system complexity; making code intellectually more manageable: By dividing a software system into independent components, initial building and long-term code maintenance are simplified. For example, by designing an independent user interface module, it should be possible to replace it with a different implementation without having to radically change other routines and modules within the program.

Promotion of a lean, stratified, design: Modular designs discourage systems with superfluous parts. Remember that extra code has to be designed, debugged, tested, optimized, and maintained in the future. The term "stratified design" means that the levels of decomposition are structured such that you can get a consistent view of the system at any level. That is, the system is designed so that you can view any single level of code without having to understand or look at the lower level details.

Humans solve major problems and deal with complexity by breaking things down into smaller and smaller subproblems. This logical process can be repeated for each minor problem until all of the subproblems are so simple that they can be effortlessly worked out, one by one.

Similarly, in software construction, routines are employed by developers to apply this problem solving strategy to programming. By definition, a routine is a construction which can be used to bring together a group of actions into a logical unit. Each routine should describe part of the solution to a more complicated problem, have a well-defined purpose, and be as independent as possible from outside code. The advantage of routines is that they can hide unessential details that a developer does not need to know, and allow him or her to fully concentrate on one component at a time.[Maguire 93]

There exists another helpful construction for bringing together related parts of a program into a logical unit. It is known as a module. Modules usually consist of shared data and related routines to operate on the data. Although the data within modules is often shared, it should not be directly accessible outside of the module. The difference between module and global data will be expanded on at the end of this section. When building a module, its interface to the rest of the program (in other words, the part of the module that will be visible to outside code) must be specified. This way, any details that are unessential to the user of the module can be concealed within it.[Robertson 93]

When building a complicated end-product, like a car, it is necessary to manufacture the different parts separately to enhance productivity and deal with complexity. Eventually, the individual components are assembled into a finished product. In order to ensure that the parts fit together as expected, a specification of each component must be drawn up during the design phase.

In software construction, the development of complicated programs by large programming teams should be handled similarly. As with building a car, it is important to spend enough time designing the initial system. Also, each part should be built separately and integrated together later. As with routines, it should be possible to develop, test, and debug a module separately from the rest of the program that it will be applied to.

Modules allow applications to be built from several separate, logical, units. In order to make modular code construction possible, modules should have a well-defined objective and be independent from external code. This way, creating potentially unmanageable programs is avoided since one subproblem can be tackled at a time.

The next subsection will examine the development of a module to implement a circular number data type. Subsection 2.2.2 will clarify the metrics used to appraise modular code quality: strong cohesion and loose coupling.

2.2.1 Abstract data types

Recall that a data type is characterized by the values that its objects can assume, and by the operations that can be carried out on them. In typed languages, such as C and C++, the allowed values for a specific type are determined when the type declaration is made. An important property of any general purpose programming language is the ability to allow the creation of user defined types. User defined types that encapsulate both data and functions together are referred to as abstract data types, or ADTs.[McConnell 93]

The concept of creating brand new types is exciting because it makes it possible to represent real-world entities within the code itself. More specifically, ADTs allow programmers to solve problems in the vocabulary of the problem domain rather than in computer science lingo. For example, instead of implementing a conventional queue data structure, with get_front() and add_back() functions to represent a line of people at a bank machine, we could define a "bank_line_up" abstract data type. Within this type we could further define routines such as serve_next_patron() and add_new_patron(). The ability to manipulate real-world objects in the problem domain not only helps to deal with complexity, but it makes for a more readable and maintainable program!
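As a rough illustration of the bank line-up idea, the sketch below wraps a standard queue in a class that speaks the vocabulary of the problem domain. Only the names bank_line_up, serve_next_patron(), and add_new_patron() come from the text; representing a patron as a name string, and the extra routines is_empty() and length(), are assumptions added for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <string>

class bank_line_up {
    public:
        // A new patron joins the back of the line.
        void add_new_patron( const std::string& name ) { patrons.push( name ); }

        // Remove and return the patron at the front; the line must not be empty.
        std::string serve_next_patron() {
            assert( !patrons.empty() );
            std::string next = patrons.front();
            patrons.pop();
            return next;
        }

        bool is_empty() const { return patrons.empty(); }         // assumed helper
        std::size_t length() const { return patrons.size(); }     // assumed helper
    private:
        std::queue<std::string> patrons;    // hidden: callers never see the queue
};
```

Code that uses the class reads like a description of the bank, and the underlying queue could later be replaced without touching that code.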

Virtually all modern languages are capable of supporting the implementation of some form of abstract data type, although some languages are better at it than others. The ability of object oriented languages to support real-world entities through classes and objects has made languages like C++ very popular. For those unfamiliar with the Object Oriented Programming (OOP) paradigm, a class is a type definition that merges data and functions together under a single framework, and an object is an instance of a class.

A circular number is an integer which may assume only those values within a given range. If you try to increment past its range it will return to its minimum limit and, similarly, if you decrement past the beginning of its range, the number will shift back to its maximum limit. So, as far as data is concerned, each circular number has a value, a minimum limit, and a maximum limit. The class definition also requires numerous functions to facilitate the manipulation of the circular number. Consider the C++ circular number abstract data type below:

class circular_num {
    private:
        int value;                                 // current value [min..max]
        int min;                                   // minimum limit for value
        int max;                                   // maximum limit for value
        int add( int number );                     // support fn to change value
    public:
        void set_limit( int upper, int lower );    // set min,max
        int set_value( int number );               // set value
        int get_value( void ) { return value; }    // get value
        int get_min( void ) { return min; }        // get minimum
        int get_max( void ) { return max; }        // get maximum
        int inc( void ) { return add(1); }         // increment
        int dec( void ) { return add(-1); }        // decrement
};

Notice that all of the data values are declared within the private section of the definition. This means that any functions outside of the class (ADT) cannot directly access them. In order to enable outside code to look at or modify the module data, seven simple access functions have been provided in the public section of the data type. Finally, notice that a supporting function, add(), has also been made private. In this implementation, a design decision was made to hide the add() function from general use outside of the ADT.
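The class definition leaves the bodies of set_limit(), set_value(), and add() unspecified. The sketch below shows one possible way to complete it; it is not necessarily how the original was implemented. It assumes that set_limit() resets the value to the minimum, that set_value() wraps an out-of-range number back into the range and returns the value actually stored, and that a default constructor (not shown in the original) starts the object in a known state.

```cpp
#include <cassert>

class circular_num {
    private:
        int value;                                 // current value [min..max]
        int min;                                   // minimum limit for value
        int max;                                   // maximum limit for value
        int add( int number );                     // support fn to change value
    public:
        circular_num() : value(0), min(0), max(0) {}   // assumed: known initial state
        void set_limit( int upper, int lower );    // set min,max
        int set_value( int number );               // set value
        int get_value( void ) { return value; }    // get value
        int get_min( void ) { return min; }        // get minimum
        int get_max( void ) { return max; }        // get maximum
        int inc( void ) { return add(1); }         // increment
        int dec( void ) { return add(-1); }        // decrement
};

// Store the limits (swapping them if they arrive reversed) and
// restart the value at the minimum limit.
void circular_num::set_limit( int upper, int lower ) {
    if ( lower > upper ) { int temp = lower; lower = upper; upper = temp; }
    min = lower;
    max = upper;
    value = min;
}

// Move to the requested number, wrapping it into [min..max] if needed.
// Assumes number - min does not overflow an int.
int circular_num::set_value( int number ) {
    value = min;
    return add( number - min );
}

// Change the value by number steps, wrapping around at either limit.
int circular_num::add( int number ) {
    const long long span = static_cast<long long>( max ) - min + 1;
    long long offset = ( static_cast<long long>( value ) - min + number ) % span;
    if ( offset < 0 ) offset += span;              // % may yield a negative remainder
    value = static_cast<int>( min + offset );
    return value;
}
```

With limits of 1 and 12, for example, the object behaves like the hour hand of a clock: incrementing from 12 gives 1, and decrementing from 1 gives 12.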

Please realize that even if your language does not support classes and objects directly, you should still be able to implement abstract data types very effectively. I have simply chosen to use C++ due to its clarity and the fact that classes very closely parallel abstract data types and modules.

Since classes, abstract data types, and modules are all defined as entities which include both data and the operations to manipulate the data, all three share these benefits:[McConnell 93]

They hide implementation details. With modular code it is possible to change the internal design of module data or functions, without affecting the rest of the program that uses it. A good analogy would be that you can change the interior of your house as much as you want. As long as the exterior is unchanged, your friends can still easily find it. For example, if the add() function was optimized with assembler code, as long as it performed the same task, no other code would need to be changed. Also, any code within the data type that called the function would benefit from the optimization.

They remove the need to pass data throughout the program. If the class, abstract data type, or module internally hides data that it manipulates, it promotes data integrity because you have centralized control over the data. In other words, data is "fire-walled" from inadvertently being damaged through unexpected direct access.

They allow you to work in the vocabulary of the problem domain. This helps to deal with complexity and make programs more self-documenting.

The purpose of this section is not to demonstrate how to implement abstract data types and modules in every language, but rather to make you aware of the main ideas behind them. Often the hardest part of implementing modular code is deciding which features should be known outside of the class, ADT, or module, and which should not. Below are common details to hide.

Any variable or routine that does not need to be directly accessed externally. By hiding as much as possible, the benefits of modularity may be maximized.

Areas likely to change. The goal here is to isolate code so that the effect of any modification will be limited to one module. Some good things to isolate in modules are user interface details, system calls, hardware dependencies, and low level screen and printer use. This way, if the system is ported, perhaps only that class, ADT, or module will need to be rewritten.

Complex data or logic. By hiding complicated code within a class, ADT, or module, complexity is reduced, and code is more readable overall. Also, if you discover a better way to implement an intricate feature, you can simply change the access functions; the outside code will be unaffected. Data management routines may be packaged to hide unnecessary details for various data or file structures thus allowing the user to manipulate data in an abstract way.

You should make an effort to assemble a library of general classes, ADTs, or modules, that can be used in several contexts within different projects. Your routines and modules should be general enough to be applicable to a wide variety of programs and make it possible to replace or change the internal implementation of a routine, or group of associated routines, without seriously affecting the rest of the program.

2.2.2 Cohesion and coupling

Strong cohesion and loose coupling are two metrics used to gauge the level of modularity for a routine or set of related routines (class, ADT, or module). To begin, a cohesive routine is one in which the entire routine is dedicated to a single purpose. For example, the trigonometric function Tan() is perfectly cohesive because it clearly performs only one job. The function TanAndCos() would be classified as less cohesive because it is intended to perform more than one task. Design your routines for strong cohesion. The payoff will be code which is less complex and easier to debug and modify.

Within classes, ADTs, and modules, the criterion for strong cohesion is very similar to that with individual routines. They should provide a group of services that clearly belong together. For example, the circular number class earlier consisted only of data and functions related to a single purpose. The class was strongly cohesive. Appreciate that just because the routines in a module are related does not guarantee that each individual routine is cohesive. Always aim for strong cohesion with both routines, and sets of routines.
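The difference between the two trigonometric routines mentioned above can be seen in their interfaces. In the sketch below, the names tan_and_cos(), tangent_of(), and cosine_of() are hypothetical stand-ins for TanAndCos() and Tan(), built on the standard library's std::tan and std::cos.

```cpp
#include <cassert>
#include <cmath>

// Less cohesive: one routine performs two jobs, so every caller must
// supply storage for both results even when only one is wanted.
void tan_and_cos( double angle, double& tangent, double& cosine ) {
    tangent = std::tan( angle );
    cosine  = std::cos( angle );
}

// Strongly cohesive: each routine is dedicated to a single purpose.
double tangent_of( double angle ) { return std::tan( angle ); }
double cosine_of( double angle )  { return std::cos( angle ); }
```

The cohesive versions are easier to name, to test, and to reuse on their own.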

The second measure of modular code quality is coupling. Coupling refers to how simply routines and modules are physically connected to each other. Loose coupling is the complement of strong cohesion. Good coupling should be loose enough that each routine can easily be called by others. In order to achieve this goal two main criteria must be met.

First, the number of parameters to interface with a routine should be minimized.[McConnell 93] It also helps if you use a parallel ordering of parameters for routines that perform similar jobs. For example, a set of mathematical functions that only have two or three parameters each, and where the ordering of parameters is consistent, would be easier to remember how to use than a set of routines with a dozen parameters each, and where the ordering of the parameters is dissimilar in each case.

Second, the connections between all routines should be as visible as possible.[McConnell 93] In other words, you should avoid using global data to pass information to and from routines, as this hampers the "plug and play" benefit of modularity. For example, a routine that depends on a global variable is not as easy to reuse in a different project as a routine which does not rely on an external variable to do its job.
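A small sketch may help show what a "visible" connection looks like. Both routines below add tax to a price in cents; the names tax_rate_percent, total_hidden(), and total_visible() are invented for the example.

```cpp
#include <cassert>

int tax_rate_percent = 7;    // hypothetical global variable

// Hidden connection: the result depends on a global that does not
// appear anywhere in the routine's interface.
long total_hidden( long cents ) {
    return cents + cents * tax_rate_percent / 100;
}

// Visible connection: everything the routine needs arrives through
// its parameters, so it can be lifted into another program as-is.
long total_visible( long cents, int rate_percent ) {
    return cents + cents * rate_percent / 100;
}
```

The same call to total_hidden() can return different results depending on what some distant piece of code last stored in the global, whereas total_visible() always returns the same result for the same arguments.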

In any language, variables with large scope should be avoided because their use does not encourage loose coupling. Their use is also risky because inadvertent modification of data with a broad scope is hard to detect. As well, they hinder code reuse because routines that access data outside of their scope are harder to "pull out and plug into" another program. To avoid using variables of large scope, start by minimizing the scope of new variables and increase their scope only as needed. The reason is that a variable which starts out with a large scope rarely has its scope reduced later, whereas a variable which starts out with a small scope is widened only when a real need arises. Lastly, realize that some variables only need to be accessed by a set of routines; these are module variables. Rarely, some variables need to be accessed throughout an entire program; these are the global ones.
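In C++, one common way to obtain a module variable rather than a global one is to place it in an unnamed namespace, which limits its visibility to a single source file. The ticket counter below is a hypothetical example; next_ticket, take_ticket(), and reset_tickets() are invented names.

```cpp
#include <cassert>

namespace {                  // unnamed namespace: visible only in this source file
    int next_ticket = 1;     // module variable shared by the routines below
}

// The only ways for outside code to reach the counter.
int take_ticket()    { return next_ticket++; }
void reset_tickets() { next_ticket = 1; }
```

Code in other source files can call the two access routines, but it cannot read or overwrite next_ticket directly.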

The overall idea behind modular code construction is that once a routine or set of routines is written and tested, you should be able to take them for granted. Remember, routines are just an intellectual tool for dealing with complexity. If they are not simplifying your code, they are not doing their job.

2.3 Designing routines

Effective design is not easy to learn or teach. This is unfortunate because it is the key ingredient of a quality program. A good design should promote strong modular cohesion with loose coupling. As well, it should minimize complexity and make the system easier to maintain and extend in the future. Following a top-down or bottom-up approach, or perhaps a mixture of the two, routines should be logically planned to perform a well-defined task in the problem domain. Although there are many different methods for designing routines, here is a basic five-step process that you might find helpful: [McConnell 93]

1] Check to ensure that the routine is actually needed based on the functional requirements of the program. The routine should have a well-defined purpose, and fit cleanly into the existing architecture. See if it is possible to reuse a previously constructed, generalized, routine. Code reuse is a good way to improve your productivity and the overall quality of your code.

2] If a new routine is needed, consider exactly how it will handle errors and the type of errors which are most likely to occur. It is important to think about this now, as it affects the next step. Figure out what the routine will hide, its inputs and outputs (including any global variables affected), and performance requirements. Based on the job the routine will do, give it a meaningful name. Good names help to improve readability; naming is fully covered in Part 2.

3] Keeping in mind the most probable sources of error identified in step #2, prepare a detailed test plan or suite. Maintaining a list of what to test, how to test it, and the expected result(s) is a step toward a more reliable program.

4] Write the routine in a Program Design Language (PDL) or pseudo-code (more information on PDLs can be found in McConnell's book Code Complete). The purpose of the previous steps was to establish a mental familiarisation with the overall problem the routine must solve. Start with a general idea and work toward something more specific. Keep refining it until you feel confident that your design is "bullet proof" and easy to translate into the actual programming language. Try alternative designs with a PDL before you start coding.

5] Write the heading comment for the routine to help focus your thoughts, and then translate from the PDL to the actual implementation language. Next, move on to the testing, debugging, and optimization stages of construction.
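Steps 4 and 5 can be illustrated with a small routine. In the sketch below, the heading comment and the PDL were written first and are kept as comments; the C++ beneath each PDL line is its translation. The routine average_score() and its rules (scores outside 0 to 100 are skipped) are hypothetical, invented only to show the process.

```cpp
#include <cassert>
#include <vector>

// average_score
//   Purpose: return the mean of the valid scores in a list.
//   Input:   scores, a list of integers; valid scores lie in 0..100
//   Output:  the mean of the valid scores, or 0.0 if there are none
//   Errors:  invalid scores are skipped rather than reported
double average_score( const std::vector<int>& scores ) {
    // set the running total and the count of valid scores to zero
    long total = 0;
    int count = 0;

    // for each score in the list
    for ( int score : scores ) {
        // if the score is valid, add it to the total and count it
        if ( score >= 0 && score <= 100 ) {
            total += score;
            ++count;
        }
    }

    // if no valid scores were found, return zero
    if ( count == 0 ) return 0.0;

    // otherwise return the total divided by the count
    return static_cast<double>( total ) / count;
}
```

Because the PDL was written at a level just above the code, each statement translates into one or two lines, and the PDL survives as the routine's internal comments.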

Helpful Advice For Program Design [Maguire 93]

Be sure you completely understand the functional requirements. Functional requirements describe what the overall software system is expected to do. Explicit requirements remove the need to guess what the user expects, help avoid arguments, and minimize major changes to the system after development has started. If possible, get the requirements in writing or write them yourself and have them approved. Appreciate that in virtually any project the functional specifications are likely to change slightly as you go. To be a good programmer you must be able to adapt to requirement modifications. Always keep the requirements in mind to ensure that you fulfil the client's needs.

If possible, try to negotiate difficult requirements away; offer alternatives. Question any specifications that seem too difficult or complex to implement. Understand that the client may be unaware of the difficulties presented by certain requirements and may be willing to accept alternatives.

Outline your code in a Program Design Language (PDL or pseudo-code). A PDL should use English-like statements to express specific operations in a program. Write at a level that will make generating code almost automatic; that is, not so high that problematic details are glossed over, or so low that the benefit of designing at a higher level is negated. Most developers find that during the design stage a PDL is easier to maintain and make changes to than designing in a compilable language from the start.

Let each function do one task well. In other words, do not create "Swiss-army knife" functions. It is preferable to have several specific functions, as opposed to one giant multipurpose routine. If you cannot explain what a function does in a few sentences, restructure. Only compress two or more functions into one when it will markedly improve efficiency or if the tasks are often used together.

Try to develop generalized functions. This idea does not contradict the previous point. Look for opportunities where the task executed by a function can be used in other projects. Remember, code reuse saves time and money.

Keep It Simple, Stupid (KISS). A complex, twisted design will inevitably result in unmodular code. When translating from PDL to the actual implementation language, do not try to demonstrate how smart you are by using as many tricks as possible in each line of code. If you are doing something in a complicated way, make sure that there is a good reason for it.

2.4 Testing tips

Testing is a difficult art. It requires a unique set of skills and a different mental attitude than analysis and coding. Interestingly, programmers often have difficulty testing their own code. The reason is that they focus on making a program fulfil certain criteria and, as a result, have difficulty imagining some of the more unorthodox ways others may use their code or application.

Section 2.2 extolled the virtues of designing routines and abstract data types such that other programmers can use them without having to understand their internal workings; that is, to treat them like "black boxes," where they know what goes in and what to expect out, but nothing about the actual mechanics of the code. Testing done by the original developer is known as "glass box" testing, because they fully understand not only what goes in and what to expect out of their routines, but also what goes on internally.[Robertson 93] Glass box testing will be the focus of this section.

Glass box testing can be a time-consuming process. How much time should be spent testing code? In reality, exhaustively testing code can take longer than it took to write it. Surprised? A major misconception about testing is that testing time cuts into programming time. This is simply untrue, because testing time is programming time. Put another way, you should constantly test code as you write it, as well as fully testing the end product.

Consider testing an input routine which accepts a string of 20 digits. To demonstrate that the routine works for all possible inputs would involve 10^20 test cases; even at a million tests per second, that would take more than three million years. Clearly, in a large scale application, it would be virtually impossible to exhaustively test potentially thousands of routines. Therefore, practical testing must involve a systematic approach, or test plan, to intelligently focus on cases that are most likely to cause problems.

Developing an effective plan is not easy and requires the ability to anticipate planned design elements and exigencies of change over time. Testing can be done manually or in an automated fashion. Manual testing works well for checking individual routines or modest sized applications. However, in general, it usually involves less coverage than automated testing. Automated testing requires a master program or test scripts to monitor multiple iterations of detailed test cases. The main benefit of this type of testing is that it always follows the same path; therefore, detected errors can be recorded and reproduced. Testing without a plan is like flying without an airplane -- it is over quickly and the results are less than satisfactory. Regardless of the scale of your project or the method of testing used, the following advice should help you detect errors with minimal effort.

Helpful Advice For Code Testing [McConnell 93]

Walk through the code pretending that you are the computer. By performing structured walkthroughs you are likely to discover problems that compilers often miss, such as repeating a line of code, inefficient statement structuring, ambiguous code, or situations where logic breaks from the design. Simulate test cases which are uncomplicated enough to check by hand.

Ask someone else to test your program. As mentioned, developers are usually not predisposed toward finding bugs in their own programs. The programmer tends to check cases he or she has provided for, potentially missing errors or non-conformance to functional specifications. Usually a person not directly involved in the code creation can more easily find bugs and can test areas that the original programmer may not have thought of.

Test each routine separately. Obviously, by isolating and thoroughly testing each unit of a program you are more likely to detect errors not apparent during a system wide test. If individual testing is not possible, test functions in small groups, especially checking routines that heavily depend on others. Be sure to double check any code which is new or modified since the last test.
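As an illustration of testing one routine in isolation, here is a minimal C sketch. The routine is_digit_string and its driver are hypothetical, invented for this example; the point is only that the driver exercises a single unit and depends on nothing else in the program.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical routine under test: returns 1 if `s` consists of exactly
   `len` decimal digits, and 0 otherwise. */
int is_digit_string(const char *s, size_t len)
{
    size_t i;

    assert(s != NULL);
    for (i = 0; i < len; i++) {
        /* Also stops at the terminating '\0', which is not a digit. */
        if (!isdigit((unsigned char)s[i]))
            return 0;
    }
    return s[len] == '\0';      /* reject strings longer than len */
}

/* Test driver for this one routine. */
void test_is_digit_string(void)
{
    assert(is_digit_string("12345678901234567890", 20));    /* typical  */
    assert(!is_digit_string("1234567890123456789", 20));    /* too short */
    assert(!is_digit_string("123456789012345678901", 20));  /* too long  */
    assert(!is_digit_string("1234567890123456789x", 20));   /* non-digit */
    assert(is_digit_string("", 0));                         /* empty     */
}
```

Note that the cases concentrate on the boundaries (one character too few, one too many) rather than on a large number of ordinary inputs.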

Automate testing for large scale or critical applications. Using test scripts provides a fast and efficient way to test consistently. Create and maintain a list of what to test, how to test it, and what the result of the test should be.
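One possible way to keep such a list is as a table inside the test program itself. The C sketch below is only an illustration: the routine clamp, the table layout, and the runner are all hypothetical. Each table entry records what is being tested, the inputs to use, and the expected result.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical routine under test: limits `value` to the range [lo, hi]. */
int clamp(int value, int lo, int hi)
{
    assert(lo <= hi);
    if (value < lo) return lo;
    if (value > hi) return hi;
    return value;
}

/* One entry in the test list: what to test, how, and the expected result. */
struct test_case {
    const char *what;
    int value, lo, hi;
    int expected;
};

static const struct test_case cases[] = {
    { "below range",    -5, 0, 10,  0 },
    { "at lower bound",  0, 0, 10,  0 },
    { "inside range",    7, 0, 10,  7 },
    { "at upper bound", 10, 0, 10, 10 },
    { "above range",    11, 0, 10, 10 },
};

/* Runs every case the same way each time and writes any failures to
   `report`. Returns the number of failed cases. */
int run_clamp_tests(FILE *report)
{
    size_t i;
    int failures = 0;

    for (i = 0; i < sizeof cases / sizeof cases[0]; i++) {
        int actual = clamp(cases[i].value, cases[i].lo, cases[i].hi);
        if (actual != cases[i].expected) {
            fprintf(report, "FAIL %s: expected %d, got %d\n",
                    cases[i].what, cases[i].expected, actual);
            failures++;
        }
    }
    return failures;
}
```

Because the runner follows the same path on every run, adding a case means adding one line to the table, and a failure can be reproduced simply by running the program again.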

Keep a record of frequent errors. Statistically, most errors tend to be concentrated in a few highly defective routines.[Maguire 93] By monitoring the type and location of problems within your code, you will be in a better position to discover trends and determine the kind of errors that you make most often.

Check for off-by-one errors. This type of error is very common and often goes unnoticed for long periods of time before acting up.[Robertson 93] To avoid this problem, be sure to scrutinize conditions and watch the use of operators. For example, in C, make sure <= is not used when < is needed, or that ++i is not used when i++ is intended.
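The two C routines below sketch where these particular slips tend to hide. Both routines are hypothetical examples, and the comments describe what the one-character mistake would do.

```c
#include <assert.h>
#include <stddef.h>

/* Sums the first `n` elements of `a`. Valid indices run from 0 to n - 1,
   so the loop condition must be i < n; writing i <= n would read one
   element past the end. */
int sum_array(const int *a, size_t n)
{
    size_t i;
    int total = 0;

    assert(a != NULL || n == 0);
    for (i = 0; i < n; i++)
        total += a[i];
    return total;
}

/* Appends `value` to `buf`, which currently holds *count elements; the
   caller must guarantee room for one more. (*count)++ yields the old
   count, so the value lands in the first free slot. ++(*count) would
   skip that slot and write one position too far. */
void append(int *buf, size_t *count, int value)
{
    assert(buf != NULL && count != NULL);
    buf[(*count)++] = value;
}
```

A useful habit is to test such routines with zero elements, one element, and exactly the maximum number of elements, since those are the cases in which an off-by-one error shows itself.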

Make sure code "does nothing" successfully. That is, make sure that your code does not have unexpected side effects. The most effective way to contain side effects is through modular code construction. You cannot expect the compiler to catch this type of error; this is the programmer's responsibility.

2.5 Debugging help

Beginning programmers often have difficulty distinguishing between testing and debugging. Recall that testing is the process of executing the code for the purpose of detecting errors, whereas debugging is the process of diagnosing and correcting their root causes. One of the major difficulties of debugging is that the causes of errors vary depending on the language, hardware, and operating environment used. Most commonly, bugs are caused by: off-by-one errors, array overruns, dynamic memory allocation, invalid pointer assignments, incorrect function parameters, signed/unsigned confusions, type conversions, and incorrect termination of loops and recursion.[Robertson 93]

Ironically, good programmers often do not make good debuggers. As with testing, debugging can be an exhausting art. The most important thing when debugging is not to make assumptions about your code. Far too often programmers immediately blame the computer or compiler for problems with their code. Programs are logical; they do not "do something different" every time they are run. Remember that your program did not make itself -- you did! A respectable programmer takes full responsibility for his or her code.

Aside from the proper mind set, debugging demands the skills of both a physician and a detective. Like a doctor, a programmer must determine which symptoms are relevant to the error (illness) and which are not. As well, through experience, you must learn what kind of bug (germ) tends to generate the kind of problem (ailment) in question. Once the source of the problem is isolated you must decide how to remedy the situation. Is it simply a typo, or does the design have to be rethought?

As a detective, a programmer must be able to examine the code (facts) and find clues regarding the nature of the problem. Again, it is most helpful to assume that the source of the error is in fact within the code. Identifying the source of the problem can take a long time. A good debugger must have the persistence to locate bugs. This is especially true when you are learning new things along the way and have to start over because of it. Remember, these skills come from experience. You must practice debugging and develop a system that works best for you.

When debugging it is always a good idea to start tracking the bug at the highest level most likely to contain the error in your code. If you are unable to isolate the cause of the problem at that level, move down one level in the code and repeat the process until the bug is found. The reasoning behind this method is that debugging complexity and effort increase as you go into the deeper layers of source code.

Helpful Advice For Code Debugging [McConnell 93]

Think. Do not blame the system without a good reason or blindly attempt to fix a bug before fully understanding what caused the problem in the first place. This is the only way to gain valuable debugging experience. Never ignore bugs which are difficult to reproduce, thinking that they will go away. Do not consider bugs that went away to be fixed. Always retest "fixed" code.

Do not make assumptions about your code. Just because a section of code performs flawlessly for months does not mean it will continue to do so. How many times have you said "I can skip this part, it works fine," only to discover hours later that "that part" has a bug? Even the best programmers have.

Disable sections of code to isolate bugs. When searching for bugs, using comments or conditional compilation switches can help you to discern working code from broken code. Searching for bugs can be just as logical as looking for an item in a list. If you have no idea where to start looking, try a binary search technique. In other words, disable half of your code and see if the bug persists. If it does, the bug lies in the half that is still enabled, so re-enable the disabled section and continue the search in the other half. If it does not, then you can be sure that the bug exists in the disabled portion, so continue the search process there. Develop a technique that works best for you.
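In C, conditional compilation switches of this kind are usually written with the preprocessor. The sketch below is a minimal illustration; the switch names ENABLE_SCALE and ENABLE_OFFSET and the routine process are hypothetical.

```c
#include <assert.h>

/* Hypothetical switches: set one to 0 to compile that stage out while
   hunting for a bug, then rebuild and rerun the failing case. */
#ifndef ENABLE_SCALE
#define ENABLE_SCALE 1
#endif
#ifndef ENABLE_OFFSET
#define ENABLE_OFFSET 1
#endif

/* Hypothetical two-stage calculation. */
int process(int x)
{
#if ENABLE_SCALE
    x = x * 3;                  /* stage 1 */
#endif
#if ENABLE_OFFSET
    x = x + 4;                  /* stage 2 */
#endif
    return x;
}

/* Expected result with both stages enabled: 2 * 3 + 4. */
void check_process(void)
{
    assert(process(2) == 10);
}
```

Unlike commenting code out, a switch leaves the source text untouched, so there is less risk of forgetting to restore a line once the bug is found. With a compiler that accepts a -D option (an assumption about your toolchain), a stage can be disabled from the command line, for example with -DENABLE_OFFSET=0, without editing the file at all.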

Print debugging information to a file. In some cases it may not be desirable to disrupt the normal video presentation by printing debugging output to the screen. You may opt to send all of your debug information to a file instead. Recording this information can be helpful when comparing different runs. Note that it is a good idea to close and reopen this file often, to ensure that information is not lost in case of a system crash.
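A minimal C sketch of this idea follows. The function debug_log and its interface are hypothetical; it takes the simplest approach of opening the file in append mode and closing it again on every call, which is slow but means that no debug line is left sitting in a program buffer.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Appends one formatted line to the debug file at `path`. The file is
   opened and closed on every call, so buffered output is flushed before
   the program continues. Returns 0 on success, or -1 if the file could
   not be opened or written. */
int debug_log(const char *path, const char *fmt, ...)
{
    FILE *fp;
    va_list args;
    int ok;

    assert(path != NULL && fmt != NULL);
    fp = fopen(path, "a");          /* "a" appends; earlier lines survive */
    if (fp == NULL)
        return -1;

    va_start(args, fmt);
    ok = vfprintf(fp, fmt, args) >= 0;
    va_end(args);

    ok = (fputc('\n', fp) != EOF) && ok;
    ok = (fclose(fp) == 0) && ok;   /* closing flushes buffered output */
    return ok ? 0 : -1;
}
```

A call such as debug_log("debug.log", "count=%d", count) adds one line per call, so the files from two different runs can be compared line by line. If reopening the file on every call proves too slow, keeping the file open and calling fflush after each line is a common compromise.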

Trace evaluation of suspect functions. This can be accomplished by printing the arguments and return values at entry and exit of suspect functions. By tracing the execution of a function you can be sure that it does what you expect it to. Some interpreters provide special trace functions to handle this for you.
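When the language offers no built-in trace facility, as is the case in C, a small wrapper can do the job. In the sketch below, gcd stands in for the suspect function and gcd_traced is a hypothetical wrapper that reports the arguments on entry and the return value on exit.

```c
#include <assert.h>
#include <stdio.h>

/* Suspect function: greatest common divisor by Euclid's algorithm. */
static unsigned gcd(unsigned a, unsigned b)
{
    while (b != 0) {
        unsigned r = a % b;
        a = b;
        b = r;
    }
    return a;
}

/* Hypothetical tracing wrapper: prints the arguments on entry and the
   return value on exit, then passes the result through unchanged. */
unsigned gcd_traced(unsigned a, unsigned b, FILE *trace)
{
    unsigned result;

    assert(trace != NULL);
    fprintf(trace, "enter gcd(a=%u, b=%u)\n", a, b);
    result = gcd(a, b);
    fprintf(trace, "exit  gcd -> %u\n", result);
    return result;
}
```

Temporarily calling gcd_traced(48, 18, stderr) in place of gcd(48, 18) leaves the program's behaviour unchanged while recording exactly what went in and what came out.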

Take advantage of available construction tools. Most modern compilers and interpreters allow you to walk through (or trace) source code line by line and to examine the contents of any data object. Being able to see exactly what your code is doing can be a huge time saver -- use it to your advantage! Aside from debuggers, learning to use a language checker, pretty printer, and cross-reference utility can save you much aggravation.

2.6 Optimization advice

In the past, many programmers tended to focus on squeezing as much functionality out of each line of code as possible. After all, it was thought, is this not how you maximize code performance? Well, the answer to that question depends on the current definition of "performance." Today, performance is only loosely related to code speed and size. Other factors are now considered part of overall performance: an easy-to-use interface, reliability, readability, and long-term maintainability are all key metrics of performance.[Abrash 94] Clearly, a few outdated misconceptions must be dealt with:

Fallacy: You should optimize as you go.

Response: It is practically impossible to identify performance bottlenecks until you have a completely working program. Prematurely focusing on optimization is a waste of time and in the end hurts code quality by distracting concentration from the main objective.

Fallacy: It is always worthwhile to optimize code by hand.

Response: Modern compilers are often better at optimizing code than many people think. In fact, hand optimization may defeat compiler optimizations that have been designed to work with more straightforward code. Therefore, unless you have a poor design or algorithm to begin with, compilers today should do a good job of producing efficient code -- don't make extra work for yourself by fussing over it. Remember, it takes time to hand tune initially, and it is harder to read later on.

Fallacy: Certain operations are faster or smaller than others.

Response: This is partially true. Generally, I/O, system calls, and floating point calculations take time. However, these facts change among operating systems, hardware configurations, languages, compilers, and even versions of a compiler. Therefore, to maximize portability hand tuning is not always the best idea.

As a general rule of thumb, you should begin with a good algorithm and high-quality design, make the code correct and easy to maintain, and then check performance. If it lumbers, then it may be worthwhile to fine tune. Do not optimize unless you are sure that it is absolutely essential.

Helpful Advice For Code Optimization [Abrash 94]

Make it work first, optimize later. When implementing a design, efficiency should be a secondary concern. Of course, you should use a reasonable algorithm from the start by analysing the performance issues when designing the program. It is easier to optimize when you have working code because you have a starting point and a clear definition of the problem. Optimizing every line of code from the beginning is a waste of time and distracts from other important concerns, such as readability and reliability.

Optimize by finding a better algorithm. Tweaking code without changing the underlying algorithm seldom improves performance by more than one order of magnitude. A better algorithm can improve performance by several orders of magnitude. If you need speed, look for a better algorithm; often there is one.
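Searching is a familiar case in point. The C sketch below compares a linear search with a binary search over a sorted array; the step counters are an addition made for this example so that the amount of work each routine does can be observed directly.

```c
#include <assert.h>
#include <stddef.h>

/* Linear search: examines elements one by one. Returns the index of
   `key`, or -1 if it is absent, and stores the number of elements
   examined in *steps. */
long linear_search(const int *a, size_t n, int key, size_t *steps)
{
    size_t i;

    assert(steps != NULL);
    *steps = 0;
    for (i = 0; i < n; i++) {
        (*steps)++;
        if (a[i] == key)
            return (long)i;
    }
    return -1;
}

/* Binary search: requires `a` to be sorted in ascending order, and
   halves the remaining range on every step. */
long binary_search(const int *a, size_t n, int key, size_t *steps)
{
    size_t lo = 0, hi = n;          /* search the half-open range [lo, hi) */

    assert(steps != NULL);
    *steps = 0;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        (*steps)++;
        if (a[mid] == key)
            return (long)mid;
        if (a[mid] < key)
            lo = mid + 1;
        else
            hi = mid;
    }
    return -1;
}
```

On a sorted array of 1,000 elements, finding the last element takes 1,000 steps with the linear search and at most 10 with the binary search. No amount of tuning the loop in linear_search could close that gap, and the gap widens as the array grows.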

Optimize code only if it takes a significant percentage of execution time. Usually, most execution time is spent in a few small portions of code. Before optimizing, establish what improvement in speed you expect, and optimize only if it is honestly worthwhile. An execution profiler can be instrumental in measuring code performance. Related to the point above: Optimize data objects if they take a large percentage of program memory. Often, most memory is taken by a few large arrays or static data structures. Again, you should optimize only when you expect significant performance improvements. Blindly optimizing all code is not very effective and requires great effort.
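Where no profiler is at hand, a rough measurement can be made with the standard clock function. The sketch below is a crude stand-in for a real profiler, not a replacement for one; cpu_seconds and sample_routine are hypothetical names introduced for this example.

```c
#include <assert.h>
#include <time.h>

/* Returns the processor time, in seconds, used by `repeats` calls of
   `routine`. clock() is coarse on many systems, so time a large number
   of repetitions rather than a single call. */
double cpu_seconds(void (*routine)(void), long repeats)
{
    clock_t start, stop;
    long i;

    assert(routine != NULL && repeats > 0);
    start = clock();
    for (i = 0; i < repeats; i++)
        routine();
    stop = clock();
    return (double)(stop - start) / CLOCKS_PER_SEC;
}

/* Hypothetical candidate for optimization. The volatile variable keeps
   the compiler from discarding the loop as useless work. */
static volatile long sink;

void sample_routine(void)
{
    long i;

    for (i = 0; i < 1000; i++)
        sink += i;
}
```

Timing each candidate with a call such as cpu_seconds(sample_routine, 100000) before touching any code shows whether a routine accounts for enough of the total run time to be worth the effort, and timing it again afterwards shows whether the change actually helped.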

Do not repair bad code, rewrite it. Statistically, code that caused problems in the past is most likely to cause problems in the future.[Oman 93] Although many programmers find it difficult to throw away even bad code, rewriting can improve your productivity in the long run.

Periodically step back, look at your code and rewrite it. Often the first pass of code implementation acquaints you with the subtleties of the problem. The second pass allows you to write better code. As a system evolves, change upon change clutters the original code. Thus, code and data objects that once made sense become unnecessary or inefficient. A rewrite allows you to clean up the mess. When possible, rewrite your code at least once.

Never sacrifice code clarity for trivial gains in efficiency. Clarity is important; sacrifice it only when the expected gains are significant.
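A classic example of this trade is swapping two integers. Both C routines below are illustrative sketches with hypothetical names: the first states its intent plainly, while the second uses the well-known exclusive-or trick to avoid a temporary variable.

```c
#include <assert.h>
#include <stddef.h>

/* Clear: the intent is obvious at a glance. */
void swap_clear(int *a, int *b)
{
    int tmp;

    assert(a != NULL && b != NULL);
    tmp = *a;
    *a = *b;
    *b = tmp;
}

/* Clever: exclusive-or swap. It avoids the temporary, but without the
   a != b guard it would zero the value when both pointers refer to the
   same object. */
void swap_clever(int *a, int *b)
{
    assert(a != NULL && b != NULL);
    if (a != b) {
        *a ^= *b;
        *b ^= *a;
        *a ^= *b;
    }
}
```

Whatever the second version saves is unlikely to be measurable in most programs, yet it costs every later reader a moment of puzzlement and hides a special case that the first version handles without any extra thought.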

2.7 Maintenance concerns

Code maintenance begins when the first bug occurs and you must go back and repair it. In the software engineering industry, a great percentage of resources go to program upkeep, or the act of replacing unmaintainable code. One of the reasons for this problem is that human nature tends to be more interested in the here and now than later.

Inconsiderate programmers often feel that it is easier to make a program that simply works today than to design it to last for years. Thoughtful programmers always remember that the reward for well-written code may not be realized until much later in the process -- when it is time for a major upgrade or if a significant flaw is uncovered. In the real world no one codes in a vacuum, so be considerate of the next programmer down the road, and expect others to be considerate toward you.

2.8 Chapter summary

Too many helpful construction tips have been presented in this chapter to reiterate here. A few of the key ideas for constructing quality code are:

Quality construction involves: design, testing, debugging, and optimization.

The overall goal of the four phases is long-term program maintainability.

Understanding the four phases is the key to becoming a good programmer.

Programs must not only be programmer-friendly, but user-friendly as well.

When developing a user interface always put yourself in the user's position.

Modularity promotes maintainable, independent, lean, and stratified designs.

The notion of strong cohesion and loose coupling is essential to modularity.

Always minimize the scope of data objects, and avoid global variables.

ADTs are easy to implement if you understand modular code construction.

A modular (maintainable) design must involve a logical development process.

Effective testing (manual or automated) must be systematically conducted.

Debugging requires the right mind set, and the skills of a physician/detective.

It is best to debug from the highest (least complex) level of code to lowest.

Today, code performance involves factors other than simply speed and size.

Optimize working code only if a good algorithm and design are not enough.

The long-term rewards for quality code far outweigh any short-term costs.