Programming With Style 

 Contents of Chapter 5

  1. General Principles

5 General Principles

Few non-programmers can appreciate the aesthetic aspects of a well structured program. The main goal of standardizing code layout is to provide a means of consistently indicating the structure of code. This chapter, and the subsequent four, are full of what some people might consider "picky details." Good programmers would argue that it is the attention to details that enhances a program's quality. A professional programmer must write readable code, period. Please realize that coding standards and guidelines are merely another programming tool and are not guaranteed to prevent a poor programmer from writing unreadable and unmaintainable code.

There is an adage in computer science: "Write programs for people first, computers second."[Robertson 93] Writing code that a computer can understand is easy. After all, to the compiler, a source file is merely a stream of characters; it literally does not matter how code is indented, variables are named, or comments are placed. On the other hand, it is more difficult to produce code that people can comprehend because of the many intellectual elements examined in the previous chapter.

Psychological factors, code layout, variable naming, and effective commenting directly affect how readable code is. How you actually use the tools provided by the language is another element of good style. Proper language usage is a large topic, and it would be impossible to cover it fully in this manual. Still, when examining each of the six languages in detail, some of the more common elements of usage will be explained along with the rules for proper formatting and layout.

There is more than one valid coding style. In fact, it could be said that there are almost as many styles as programmers. Some are more sensible than others -- styles that is. You may not agree with all of the ideas presented, but above all remember to use the style you choose consistently. Mixed styles are harder to maintain than bad styles; so, when working on an existing application, emulate its style rather than blindly following this document. Even if you think that you are the only one who will need to read your code, in the real world the chances are good that at some point in the program's life span another programmer will need to modify it. If you are a student, realize that someone usually has to mark your work. A readable program makes it clearer to your audience how your code operates and reveals a sense of pride in your work.

5.1 The foundations of good style

Before getting into the principles involved in effectively commenting code, creating meaningful names, and basic code formatting, it would be worthwhile to recap three of the major theoretical concepts presented up to this point. The three pillars of programming with style are:

Always be considerate toward the reader. Chapter 1 began by explaining the five C's of good communication (clarity, comprehensiveness, conciseness, consistency, and correctness). When writing code or considering the value of any point of style, be sure to ask yourself whether these five elements are being satisfied. Also, always take into account the expertise level of your audience. In this book, for example, an effort has been made to keep the code examples simple so that it is easier to grasp the underlying concepts.

Always try to design modular code and minimize scope. Chapter 2 clarified the benefits of modular code construction, minimizing the scope of symbols, and avoiding global variables. It should not be possible to reference a data object at a point in the code where it is not needed, unless there is a very good reason for it. Keeping this notion in mind when structuring and arranging code will help you to build a modular program and avert side effects.

Always strive for simplicity (Keep It Simple, Stupid - KISS). The elegance of languages such as C and Scheme gives them tremendous power, as well as elephantine potential for complexity. Remember, the compiler or interpreter is not concerned with how nicely code is formatted, nor does the end user care if the code is easy to read, as long as the application performs its job reliably. Only the maintainer of a program may care how code is styled. Complexity increases with every variable added to the code, every extra function parameter, and every level of nesting. Strive to keep your code simple and avoid a potential mind stack overflow!
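The second pillar, minimizing scope, can be made concrete with a short C sketch. The names below (next_ticket, issue_ticket, sum_first) are hypothetical and do not come from an earlier chapter; the point is only that each data object is visible no further than it needs to be.

```c
/* Module-level data: "static" limits visibility to this source file. */
static int next_ticket = 1;

/* Hand out ticket numbers; callers cannot touch next_ticket directly. */
int issue_ticket(void)
{
    int ticket = next_ticket;   /* value handed to the caller */

    next_ticket = next_ticket + 1;
    return ticket;
}

/* Sum the first "count" elements of "values". */
int sum_first(const int values[], int count)
{
    int total = 0;              /* running sum, local to this function */
    int index;                  /* loop index, local to this function */

    for (index = 0; index < count; index++) {
        total = total + values[index];
    }

    return total;
}
```

Because next_ticket is declared static at file level and total and index are local, no other file or routine can modify them by accident.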

These three themes are the driving forces behind the style rules to come. The author will be the first to acknowledge that the style standards and guidelines on the ensuing pages do not cover every potential use of the six languages. The overall goal of this handbook is to instill the reader with the sensitivity required to make decisions regarding coding style and apply them logically and consistently. Next, we will examine some of the more concrete, physical factors which influence code style and organization.

5.2 Physical restrictions

Aside from the three themes of programming with style, there are also three physical limitations which must be taken into account when cultivating a usable coding style:

Limit file length to 500 lines or less. Most compilers can comfortably handle files with thousands of lines of code. However, the majority of programmers feel that a file with too many lines is unwieldy.[Oualline 92] For practical reasons, limiting each source file to no more than 500 lines (or approximately ten 8 1/2" x 11" pages) is advisable; even half of this limit could be considered reasonable, depending on the complexity of the code or module. Excessively large files have several disadvantages; for example, they take longer to edit, print, and compile, and they are harder to share using source code control systems like SCCS.

Limit line length to 80 characters or less. Most older terminals and printers, using standard 8 1/2" x 11" paper, are confined to a horizontal display length of 80 characters. This is becoming less of a restriction with scalable type faces and landscape printing modes, so the argument is not as convincing as it once was. Still, to support complete backward compatibility and amplify code readability, a limit is necessary. This restriction could be considered a blessing in disguise because one of the main causes of going over the 80 character limit is deep nesting of indentation levels. Therefore, controlling line length has the added benefit of helping to limit nesting, which diminishes code complexity.

Use only the standard ASCII character set (avoid extended characters). This restraint is mainly for portability reasons. Although many systems support the extended ASCII character set of 256 values, others do not. It is best to stick with the 128 characters of standard ASCII. Also, early terminals had fixed tab stops every eight characters. Today, many editors allow the user to vary the tab size setting. However, changing the value to something other than eight characters is not advisable for the simple reason that the tab will no longer be equivalent to the standard ASCII representation. As a result, different environments may interpret the tab differently, thus skewing the indentation style of your code. Only change the tab size if portability is nonessential.
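As a rough illustration, the three limits above could be checked mechanically. The sketch below assumes the whole file is already in memory as a NUL-terminated string with lines ending in '\n'; the function name check_limits and the constant names are hypothetical, and a tab is counted as a single character for simplicity.

```c
#include <stddef.h>

#define MAX_FILE_LINES  500     /* suggested upper limit on lines per file */
#define MAX_LINE_LENGTH 80      /* suggested upper limit on line length */

/*
** Return 1 if "text" respects all three limits, 0 otherwise.
** Assumes "text" is a NUL-terminated copy of the source file.
*/
int check_limits(const char *text)
{
    size_t line_count  = 0;     /* number of lines seen so far */
    size_t line_length = 0;     /* characters on the current line */
    size_t index;               /* position within text */

    for (index = 0; text[index] != '\0'; index++) {
        unsigned char ch = (unsigned char) text[index];

        if (ch > 127) {
            return 0;           /* outside the 128 standard ASCII codes */
        }

        if (ch == '\n') {
            line_count++;
            line_length = 0;
        } else {
            line_length++;
            if (line_length > MAX_LINE_LENGTH) {
                return 0;       /* line is too long */
            }
        }
    }

    if (line_length > 0) {
        line_count++;           /* last line had no trailing newline */
    }

    return line_count <= MAX_FILE_LINES;
}
```

A check like this only reports that a limit was exceeded; deciding how to split a long file or flatten deep nesting is still up to the programmer.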

Next, we will explore the general principles of effectively documenting code, regardless of the particular language being used.

5.3 A few comments on comments

Code should be documented directly via comments, as well as indirectly through the use of thoughtful variable, function, and file naming. Both elements of documentation are indicative of the professional pride that a programmer has in his or her work. Although the compiler does not care how code is internally documented, it is critical to any reader who must interpret the code. This section deals with direct documentation; the next section handles indirect documentation.

Direct documentation helps to visually unlock the power inherent in the code. Well written comments are invaluable to saving code maintenance time. It could be said that if you write flawless code and ignore commenting, you are only half done. Effective commenting should not be tedious or contain details which are hard to maintain when code changes. If you find that you are having difficulty documenting your own code, there is a good chance that you do not fully understand your code. By following the advice from Section 2.3, which promoted the idea of using some form of Program Design Language (PDL), you may be able to recycle some of your PDL statements as comments.

Frequently, programmers assume that the design of their code is clear and fail to realize that someone else, unfamiliar with it, may find it very difficult to understand without proper documentation. In general, most code can benefit from extra commenting. It is advisable to write comments before and as you write the actual code. Otherwise you will end up with comments of poorer quality and fewer of them. Psychologically, if you leave comments until the end, you are more likely to forget some of the code's subtleties and hurry to get the "chore" of commenting out of the way. Good comments explain what is being done rather than how. "How" comments should never be used because they usually restate what is clear by simply reading the code. To illustrate:

Good: /* if we are out of memory (tells what code is doing) */

if (memory_status == 0) ...

Bad: /* if memory status = zero (reiterates the obvious) */

if (memory_status == 0) ...

 

Different languages indicate comments in different ways. In C/C++, the compiler treats anything between the "/* ... */" pair as a comment; in C++, text after "//" up to the end of the line is also a comment. With Scheme, Lisp, and PC Assembler, any text following a ";" is ignored until the end of the line. Although there are different commenting implementations, the expectation that comments exist at well established points within the code is the same.

5.3.1 Heading comments

Heading comments are required at the start of any major item, such as the top of a file or routine. These comments are typically 5 to 50 lines in length and act as an introduction and navigation aid for the reader of the code that follows. Heading comments also help to focus the thoughts of the programmer on what exactly is required. There are several different styles; below are two:


;**************************               /***************************
;*                                        **
;* Example Heading Comment                ** Example Heading Comment
;* For PC Assembler,                      ** For C and C++
;* Lisp, and Scheme                       **
;*                                        */

 

Heading comments should make their extent clear through the use of delimiters at the top and left side of the block. Enclosing the bottom of the comment is a matter of personal preference -- just be consistent about it. Notice that neither example encloses the right side of the comment. The practical reason for this is that maintaining an aligned column of delimiters can become laborious every time a word is changed, removed, or added.

Aside from using a consistent style, it is necessary to include named sections. The amount of information in each section will vary depending on the size and complexity of the file or routine it serves. Many textbooks urge students to pack heading comments with as much information as humanly possible. This book takes a more conservative approach because if comments are too verbose they are unlikely to be used or maintained.

File heading comments are the first thing that readers will see when they pick up a listing or edit a file. They should follow the ideas presented above, and at least include sections describing the file's name, author's name, date, purpose, and warnings or restrictions on the usage of the file. Optionally, this heading may include a list of external functions, PDL level algorithms, and file formats used by the program. It is best to simply provide what is needed to maximize initial reader comprehension yet minimize comment maintenance.

Similarly, function heading comments should describe the purpose, inputs (parameters, global variables), expectations or limitations on the inputs, outputs (returns, global variables), and describe the process that the function performs in a few sentences. If you are using an exceptionally complex algorithm, it would be worthwhile to explain it in greater detail or furnish the name of the reference where the algorithm can be found. For a more specific illustration of heading comments inspect the style templates in the Appendices.
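For orientation, a function heading comment with the sections just listed might look like the sketch below. The section names, their order, and the routine count_free_nodes itself are illustrative assumptions; the style templates in the Appendices remain the reference.

```c
/***************************************************************
**
** Function: count_free_nodes
**
** Purpose:  Count how many nodes in a table are not in use.
**
** Inputs:   in_use     - array of flags, one per node
**                        (0 = free, non-zero = in use)
**           node_count - number of entries in in_use;
**                        must not be negative
**
** Outputs:  Returns the number of free nodes (0 to node_count).
**           No global variables are read or modified.
**
** Process:  Scan the array once, counting entries equal to 0.
**
*/
int count_free_nodes(const int in_use[], int node_count)
{
    int free_node_count = 0;    /* free nodes found so far */
    int index;                  /* current position in in_use */

    for (index = 0; index < node_count; index++) {
        if (in_use[index] == 0) {
            free_node_count++;
        }
    }

    return free_node_count;
}
```

The heading stays short because the routine is simple; a more complex algorithm would justify a longer Process section or a reference to where the algorithm is described.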

5.3.2 Block comments

Block comments are typically found within routines or describing major data objects. They are usually employed to explain what code is going to do, and occasionally what code has already done. The dimensions of a block comment can vary from a single line which describes a simple item, to a multiple line comment which summarises a complex object or collection of statements. As with header comments, the amount of information contained in a block comment will depend on the situation and on what details are unclear from the code. Obviously, it would be overkill to have more comments than code in a routine. There are several elements which must be taken into account when using block comments. Consider the two fragments below:

for ( ) ...                        for ( ) ...
/* Poor positioning
for the comment. */                    /*
    printf("...");                     ** Better positioning
                                       ** for the comment.
                                       */
                                       printf("...");

The first example, on the left, demonstrates an unacceptable block comment placement. There are three major problems with it. First, the opening "/*" and the closing "*/" should be vertically aligned so that it is clear where the comment begins and ends. Second, it is advisable to put a blank line separating the block comment from the code that it is least associated with. In this case, it is unclear if the comment applies to the code above or below. Finally, a block comment must always be indented to the same level as the code it identifies. Even though the corrected version, on the right, requires more vertical space it is also a great deal clearer.

Unlike this manual, where I can emphasize my thoughts using different type faces, source code is normally limited to a single, monospaced font and the coder's imagination. It is up to the programmer to use mechanisms like capitalization and delimiter "weighting" to make certain comments stand out over others.[Oualline 92] Single line block comments are often used as delimiters between sections of code. Delimiters make code chunks easier to see and understand. By using characters other than the standard asterisk, the importance of delimiters can be made clearer. For example, in Scheme or Assembler you might use the following delimiter styles; if not, come up with your own set:

;***** MAJOR SECTION DIVIDER (Data/Code Section) *****

;===== Major Subsection (Within a routine?) =====

;----- minor subsection (For Minor Detail?) -----

5.3.3 Trailing comments

As the name implies, a trailing comment usually follows the line of code that it describes. The content of a trailing comment should be simple, because each line of code ought to perform only one action. For code developed following the construction advice outlined in Chapter 2, the need to document individual lines of code should be rare, except with Assembler. In most cases, it is better to rewrite tricky code than to try and improve it with comments. Ideally, well structured code should require minimal internal documentation. Comments appended after individual statements or expressions represent more maintenance hassle than they are usually worth. It is best to use a block comment to explain what a group of lines is doing, rather than commenting an individual statement, unless it is especially critical or complex.

When using several trailing comments you should line them up with tabs. This better distinguishes code and comments; however, it is also harder to maintain the alignment during code changes. Often there will not be enough space on a line to fit a complete comment. In these cases it is best to put the comment on the line directly above the statement it explains, or optionally split the comment up over several lines. If you decide to split the comment, all lines except the first should be on blank lines. Also, make sure that the continuation is obvious through the use of ellipses or additional tabbing within the comment. Both of these fragments are acceptable:

/* Get valid move, then move player 1 */
make_a_move( player_1 );

make_a_move( player_1 );     /* Get valid move ...     */
                             /* ... then move player 1 */

Other common uses for trailing comments are for indicating the closing braces of exceptionally nested code and commenting all variable declarations. Figuring out what uncommented executable code does is possible through careful tracing, although it may take a while. Data, on the other hand, does not do anything. As a result, discovering the purpose of an uncommented and badly named data object can be an impossible feat! Good programmers treat all data declarations as if they were complex statements, and comment them accordingly. Ideally, data should be well described by name (see 5.4.3), but comments should help to clarify the intent for usage, possible ranges in value that the variable may contain, and any units which may be appropriate (m/s, kg, etc.). Take special care when commenting tables and complex structures, because they can often appear to be a jumble of information. Lastly, treat each member of a structure, union, or typedef distinctly and comment each one individually.
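A brief sketch of commented data declarations in C follows. The structure, its members, and the ranges and units chosen are hypothetical; they only show each declaration carrying its purpose, range, and units in a trailing comment.

```c
#include <stddef.h>

#define MAX_PASSENGERS 8        /* capacity of one elevator car */

/* State of a single elevator car. */
struct elevator {
    int    floor_number;        /* current floor, 1 (lobby) to 40 */
    double speed;               /* speed in m/s; negative going down */
    double load;                /* passenger mass in kg, 0.0 or more */
    int    passenger_count;     /* people on board, 0 to MAX_PASSENGERS */
};

/* Return 1 if the car cannot accept another passenger, 0 otherwise. */
int elevator_is_full(const struct elevator *car)
{
    int is_full = 0;            /* result; assume there is room */

    if (car != NULL && car->passenger_count >= MAX_PASSENGERS) {
        is_full = 1;
    }

    return is_full;
}
```

Without the trailing comments, a reader could not tell from the declarations alone whether speed is in m/s or floors per second, or whether a negative value is legal.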

5.3.4 Commenting suggestions

To comment effectively, consistently follow the 5 C's of good communication, and pay attention to these general recommendations:

Comment before and as you write code, not at the end. If you fail to follow this advice, you will end up viewing commenting as a chore, rather than an integral part of the coding process. Writing function heading comments before the actual routine will help you to better focus on what the code must do. Going back to comment a program after it has been written is a poor idea, because it often takes excess time to reinterpret your old code and you run the risk of missing subtle design details. Some coders are under the misconception that they should not break their concentration to write comments. This is silly because having to concentrate that hard when coding is a symptom that they have not spent enough time determining how to best solve the problem at hand.

Comment on what is being done rather than how. "How" should be obvious from the code; if not, you should rethink your design. Always comment on "what" complicated, unusual, or possibly misleading code is doing. Comments should provide any information not easily attainable from the code itself.

Comment consistently and unobtrusively. Follow a standard commenting style, verb tense, and comment density throughout your source code. Indent block and trailing comments so that they do not visually interfere with the overall code structure.

Make comments distinguishable from the code and from each other. Use white space before a block comment to distinguish it from the code above and associate it with the code following. It is also a good idea to develop a set of delimiter styles to characterise the "weight" of every comment.

Make sure every comment is worthwhile and correct. A comment that merely repeats information, obvious from the code, is worthless. Make sure all documentation is updated to reflect any maintenance changes because an incorrect comment is worse than no comment at all.

Make code readable to minimize the need for comments. You should never solely rely on direct documentation to make your code readable. First, apply the principles of indirect documentation as discussed in the next sections.

5.4 Naming conventions and techniques

Indirect documentation, through effective variable, routine, and file naming, is just as important as directly commenting code. As mentioned earlier, good naming is the sign of a superior program. To effectively name a symbol you must fully understand its purpose. Regrettably, many textbooks, articles, and lectures on computer programming seem to ignore or neglect this essential component of code documentation. This is perhaps understandable because they are primarily interested in teaching the features of a language rather than illustrating good style. Also, in many cases, the code examples are very small, with variables of limited significance, so good style is not promoted.

The fact remains that one of the most difficult, yet common, tasks that a professional programmer faces daily is to come up with suitable names for all symbols within a program. Good names should be clear, yet easy to use. Bad names can be too long to be easily remembered or too short to convey any meaning. As with most elements of programming style, balance is the key. The best length for a name is between 8 and 16 characters. Constructive naming can be achieved by following well established conventions, abbreviations, and other techniques.

5.4.1 Conventions

There is a multitude of naming conventions in use today. Each system has its ardent supporters and zealous antagonists. This section will not attempt to survey them all, rather, it will suggest some of the things to look for when selecting a naming system. Above all, if you elect to follow an established naming convention, be sure to use it consistently.

Naming conventions are helpful because they provide a very consistent, almost automatic, way to name new symbols added to a program. This allows the programmer to concentrate on the more important characteristics of the code. A solid naming system should provide a way to recognize global variables, module variables, local variables, type definitions, named constants, enumerated types, and other data types supported by the language. The convention should also help to explain the variable's content, data type, and intended purpose in the overall code structure. Clearly, this is a tall order. All conventions have their benefits and drawbacks. It is usually up to the development team to decide which system is appropriate for the language and application being developed; for example, Microsoft has standardized on Hungarian notation for MS-Windows development.[McConnell 93]

5.4.2 Abbreviations

Good abbreviations can significantly decrease the number of characters in a name, while still keeping it visually and audibly discernible. Although they are valuable for decreasing name length, they can also be dangerous if not used responsibly. Over the years many techniques have evolved to help shorten symbol names. Some of them include:

Removing needless words; eg, the_character_buffer => character_buffer.

Removing suffixes, like "er" and "ing"; eg, character_buffer => charact_buff.

Using shorter synonyms; eg, character_buffer => key_buffer.

Using the beginning of each word; eg, character_buffer => char_buf.

Using the most noticeable sounds; eg, character_buffer => crctr_bufr.

Using well established conventions; eg, in C, character_buffer => ch_buf.

Obviously, all of these methods are quite mechanical in nature. Every technique above is capable of converting long names to shorter ones, although the shortened versions may be incomprehensible! It is the responsibility of each programmer to combine these techniques, depending on the situation, to come up with sensible names. Always use abbreviations consistently and only if they significantly shorten the name. There are a few pitfalls of abbreviations and naming in general that should be kept in mind:

Names should not look similar (eg, token vs. tokens), sound similar (eg, word vs. ward), or have similar or equivocal meanings (eg, file_1 vs. file_2). Also, avoid any hard to read character pairs like: (l and 1), (G and 6), (S and 5).

Avoid extreme abbreviations for temporary variables. Programmers often make the mistake of using very short, undescriptive, names like "temp_1", "temp_2", and "x" for temporary variables. Treat temporary variables as being just as important as any other variable and name them appropriately.

Avoid abbreviations that result in mispronunciation. An abbreviated name should be recognizable over the telephone.[Kernighan and Ritchie] Try to avoid removing vowels, as this quickly degrades pronunciation quality.

Avoid phonetic spellings, and cultural nomenclature. Appreciate that other programmers may not be native English speakers. Avoid names like "hi" and "lo," as these are less clear than "high" and "low." Say what you mean.

5.4.3 Variables

Like comments, a variable name should explain what it is for, rather than how it is used. Good variable names are important to both the code reader and the code writer. Using simple, yet descriptive, identifier names will make your source code more readable. An equilibrium must be reached to ensure symbol names are long enough to be descriptive yet short enough to be memorable and useful.

The construction of variable and routine names is more fully described in the next four chapters. Generally, variable and routine names should be in lower case letters only. In most programming languages, names consisting of multiple words should be separated by an "_"; for Scheme and Lisp use a "-" instead. An alternative method is to capitalize the first letter of each word in a name, rather than separate them with an underscore. Regardless of the method you adopt, be consistent about it.

Because variables hold information, they are usually described by an abstract noun, for example, count. A name this short may be suitable for very simple applications, but for most practical cases this name should be expanded, for example, node_count. This name is better because it is more specific. In even more complex situations, adjectives should be added in front to qualify the name further, for example, free_node_count. Again, the amount of detail required depends on the situation and the scope of the data object. Always follow each variable declaration with a comment explaining it and: [Straker 92]

Be sure that you are not making any spelling mistakes. As well, avoid words which are likely to be misspelled by others.

Never differentiate names by case only. In C, for example, the names "FLAG", "Flag", and "flag" can confusingly represent three different things!

Name length should be proportional to the symbol's scope. That is, the larger the scope of a data object, the more descriptive the name should be.

Never declare a variable with the same name as one with a larger scope. The local declaration will effectively make the global variable inaccessible.

Booleans should be named so that they read naturally in the code. Boolean variable names should imply either true or false. They should describe the positive. To illustrate, "if (source_file_open) ..." is clearer than a less specific name such as "if (source_file) ..." -- if "source_file" what?
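The recommendations above might look as follows in C. All identifiers here are hypothetical, chosen only to illustrate a qualified noun (score_count), a positively named Boolean (scores_available), and a temporary that is named for what it holds instead of "temp".

```c
static int score_count = 0;         /* scores recorded so far */
static int highest_score = 0;       /* best score; valid only when */
                                    /* scores_available is 1       */
static int scores_available = 0;    /* 1 once a score is recorded */

/* Record one score and return the highest score seen so far. */
int record_score(int new_score)
{
    if (!scores_available || new_score > highest_score) {
        highest_score = new_score;
    }

    score_count = score_count + 1;
    scores_available = 1;

    return highest_score;
}

/* Return how many scores have been recorded. */
int get_score_count(void)
{
    return score_count;
}

/* Exchange two scores. */
void swap_scores(int *first_score, int *second_score)
{
    int saved_score = *first_score;     /* copy kept during the swap */

    *first_score = *second_score;
    *second_score = saved_score;
}
```

Note that no local variable reuses a file-level name, so every file-level object stays reachable from every routine, and "if (!scores_available ...)" reads almost like the sentence it represents.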

5.4.4 Routines

Routines differ from variables in that they actually do things, which can usually be described with a plain verb; for example, check(). In a very simple application what is being "checked" may be obvious; however, since functions generally do actions on things, a noun should follow the verb in the name of a routine, for example, check_file(). This is better, but, in many situations, more than a simple noun is needed, for example, check_file_exists().

If you have difficulty naming a routine, it is usually a sign that it is not cohesive (see section 2.2.2). In these cases, it is best to restructure the design so that the function has a well-defined purpose. Related functions are often grouped into modules. In order to help associate a function with a module, it is a good idea to add a prefix to the names of all functions in a module, identifying them with that module. For example, a module of database functions may all be prefixed by "db_". Be consistent when using any module prefixing scheme.
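A small sketch of the verb-plus-noun pattern combined with a module prefix follows. The "db_" routines and the fixed-size in-memory table are hypothetical, standing in for a real database module.

```c
#define DB_MAX_RECORDS 3        /* capacity of the in-memory table */

static int db_records[DB_MAX_RECORDS];  /* stored record keys */
static int db_record_count = 0;         /* records currently stored */

/* Return 1 if a record with this key is stored, 0 otherwise. */
int db_check_record_exists(int record_key)
{
    int index;                  /* position in db_records */

    for (index = 0; index < db_record_count; index++) {
        if (db_records[index] == record_key) {
            return 1;
        }
    }

    return 0;
}

/* Store a new key; return 1 on success, 0 if full or a duplicate. */
int db_add_record(int record_key)
{
    if (db_record_count >= DB_MAX_RECORDS) {
        return 0;
    }
    if (db_check_record_exists(record_key)) {
        return 0;
    }

    db_records[db_record_count] = record_key;
    db_record_count = db_record_count + 1;

    return 1;
}
```

A reader who meets db_add_record() elsewhere in the program knows at once which module to open, and the name itself says what is added.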

5.4.5 Files and directories

We will conclude this section by looking at some of the considerations that go into naming files. For maximum portability, source files should consist of a base name, followed by an optional period and suffix (also known as an extension). The base name should consist of no more than eight characters, and the extension no more than three. The first character of the base name should be alphabetic, and the remaining characters alphanumeric.[Maguire 93] All of the naming rules given up to this point should be applied to the base name of each source file and any files created or manipulated by the program itself.
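The portability rule just described is simple enough to check in code. The sketch below is one possible reading of it: the function name check_file_name is hypothetical, and requiring a non-empty, purely alphanumeric extension after the period is an assumption, since the rule above only limits the extension's length.

```c
#include <ctype.h>
#include <string.h>

#define MAX_BASE_LENGTH   8     /* longest portable base name */
#define MAX_SUFFIX_LENGTH 3     /* longest portable extension */

/*
** Return 1 if "file_name" has an alphabetic first character, an
** alphanumeric base name of 1 to 8 characters, and optionally a
** period followed by an extension of 1 to 3 characters.
*/
int check_file_name(const char *file_name)
{
    const char *period = strchr(file_name, '.');    /* start of suffix */
    size_t base_length;         /* characters before the period */
    size_t index;               /* position being examined */

    if (period == NULL) {
        base_length = strlen(file_name);
    } else {
        base_length = (size_t) (period - file_name);
    }

    if (base_length < 1 || base_length > MAX_BASE_LENGTH) {
        return 0;
    }
    if (!isalpha((unsigned char) file_name[0])) {
        return 0;
    }
    for (index = 1; index < base_length; index++) {
        if (!isalnum((unsigned char) file_name[index])) {
            return 0;
        }
    }

    if (period != NULL) {
        size_t suffix_length = strlen(period + 1);

        if (suffix_length < 1 || suffix_length > MAX_SUFFIX_LENGTH) {
            return 0;
        }
        for (index = 1; index <= suffix_length; index++) {
            if (!isalnum((unsigned char) period[index])) {
                return 0;
            }
        }
    }

    return 1;
}
```

Under this reading, "main.c" and "parser.cpp" pass, while "1stfile.c" (numeric first character), "verylongname.c" (base too long), and "my_file.c" (underscore is not alphanumeric) are rejected.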

Obviously, because of the eight character restriction, the use of abbreviations may need more emphasis. Never use names similar to those of any system files; for example, stdlid.h. Standard conventions exist for the extensions of C/C++, Smalltalk, Assembler, Scheme, and Lisp code files:

.c      C source            .asm    PC Assembler source
.cpp    C++ source          .inc    PC Assembler include
.h      C/C++ header        .mac    PC Assembler macros
.scm    Scheme source       .lsp    PC Lisp source
.st     Smalltalk source

As far as directories go, it is most convenient if all file names can be viewed on the screen at one time. Therefore, having no more than 24 source files per directory is a good guideline. For projects with hundreds of files it is advisable to divide the files into subdirectories.

5.5 Basic code formatting

The actual layout of individual lines of code is the most hotly debated element of programming style.[Ranade and Nash 93] Most programmers consider "their way" to be the "best way." This section will present some of the more practical aspects of code layout. The next four chapters deal with the specifics of C/C++, Smalltalk, Assembler, and Scheme/Lisp.

Generally, programs should be organized like good technical references. [Rabinowitz and Chaim 90] Consider this handbook, for example. The preface is the first section of text that the reader will come across when perusing this handbook. Similarly with source code, the file heading comment is the first thing that the reader will see when studying a code listing. Both briefly explain what to expect in the rest of the work and give the reader enough information to begin to understand the problem and the approach that will be taken to solve it.

The similarities do not end there. The chapters of this manual are like the modules in a program. Each chapter covers an individual topic, yet they are all interlinked to support a single goal. As well, each section of this book parallels the functions in a program. In other words, just as each chapter is divided into sections and subsections, each module is divided into functions and supporting functions. When organizing a program for readability, try to arrange your code as you would expect an author to organize a book -- for easy reference and understanding.

Paragraphs and sentences make books readable. With code, multiple lines of related code (blocks) are equivalent to paragraphs, and individual statements or expressions are similar to sentences. In both English and programming, these groupings are defined by the semantics of the information rather than the syntax of the language. It is up to the author/programmer to effectively arrange the information.

Just as section headings and indentation are used to distinguish how the information in this handbook is organized, comments and indentation should be used to clarify the structure of program code. Also, in English, run on sentences that seem to go on and on and on and try to get too many different points across at a time without any punctuation or breaks in the text are difficult to read and frustrating to try to assimilate much like this sentence! Always remember these two basic ideas when writing code:

Each statement should perform only one action to avoid side effects.

Complex or lengthy statements should be broken down into smaller chunks.
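As a small illustration of these two ideas, the C sketch below contrasts a dense string copy with an equivalent version in which each statement performs a single action. The function names copy_text_dense and copy_text_clear are hypothetical and exist only for this example; both assume the destination buffer is large enough to hold the copy.

```c
#include <assert.h>
#include <stddef.h>

/* Dense version: a single statement copies a character, advances both
 * pointers, and tests for the terminator, all through side effects. */
size_t copy_text_dense(char *dest, const char *src)
{
    size_t length = 0;

    assert(dest != NULL && src != NULL);
    while ((*dest++ = *src++) != '\0')
        length++;
    return length;
}

/* Clearer version: each statement performs exactly one action. */
size_t copy_text_clear(char *dest, const char *src)
{
    size_t index = 0;

    assert(dest != NULL && src != NULL);
    while (src[index] != '\0') {
        dest[index] = src[index];
        index = index + 1;
    }
    dest[index] = '\0';    /* terminate the copy */
    return index;
}
```

Both functions return the number of characters copied, not counting the terminator. The second is longer, but a reader can check each line on its own without tracking several side effects at once.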

5.5.1 Tabs and spaces

Couldyouimagineabookwithoutspaces? It may be readable, but not easily. Items closer together form visual chunks which have an implied association. By increasing the separation between chunks, you effectively weaken the association. White space is used to separate information vertically, with form feeds and blank lines, as well as horizontally, with tabs and spaces. Of course, too m u c h white space should be avoided. A consistent, well-placed use of space can be very effective.

There has been a "religious" argument raging for years as to whether tabs or spaces are best for horizontally indenting code. Some programmers contend that tabs are easier to use and require less disk space than eight spaces. Opponents counter that tabs can cause problems: if the tab size is redefined to match the optimal indentation size of four characters, another environment may not represent the tab the same way (see 5.5.2), and the code indentation will be thrown off. Spaces guarantee the indentation level of statements in any editor or printout, even if they take a little longer to type and use more disk space. Both opinions are valid. As a compromise, if you are certain that your code will only be used in one environment, use tabs and feel free to change the tab size. If portability is a major concern, it is best to use only spaces. Avoid an inconsistent mixture of tabs and spaces, as this can do more harm than good.

Spaces can also be used to make tokens more clearly identifiable under the following conditions:

Commas (and semicolons) are most closely associated with the previous item. As in English, put no space before them and at least one space after them.

All unary operators (++ -- etc.) should have no space between them and the identifier they are associated with; use a single space otherwise.

All binary operators (+ - * / % etc.) should have an equal number of spaces on either side; for example, result = *employee_age + *years_to_retire;

Keywords are not functions, so space them out; for example, in C, write switch ( with a space between the keyword "switch" and its parenthesized expression.

The contents of parenthesized expressions can be made to stand out by putting a space before every opening "(" and after every closing ")". With nested parentheses, the inner sets need not follow this guideline; just be consistent. For example: if (current_page > (first_page - last_page) ) ... Also, since it is often hard to remember the precedence levels of the various operators, never assume that the reader has memorized the precedence rules as well as you have. Use parentheses liberally to clarify and emphasize code.
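The spacing conventions above can be sketched in a few lines of C. The functions years_remaining and pages_left are hypothetical, invented here only to show the rules in context; the inner parentheses are kept tight, which the guideline permits as long as it is done consistently.

```c
#include <assert.h>
#include <stddef.h>

/* Years left until retirement, never negative (hypothetical helper). */
int years_remaining(const int *employee_age, const int *retirement_age)
{
    int result;

    assert(employee_age != NULL && retirement_age != NULL);

    /* Binary '-' and '=' have one space on each side; the unary '*'
     * (dereference) sits directly against its identifier. */
    result = *retirement_age - *employee_age;

    /* The keyword 'if' is followed by a space; a function name is not. */
    if (result < 0) {
        result = 0;
    }
    return result;
}

/* Pages from current_page through last_page, or 0 when out of range. */
int pages_left(int current_page, int first_page, int last_page)
{
    /* Commas take no space before and one after (see the parameter
     * list). Parentheses spell out the precedence instead of relying
     * on the reader's memory of it. */
    if ((current_page < first_page) || (current_page > last_page)) {
        return 0;
    }
    return (last_page - current_page) + 1;
}
```

The parentheses in the last two expressions are not required by C, since relational operators already bind more tightly than ||, but they save the reader from having to recall that.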

5.5.2 Indenting code

Indentation is the most effective way to make chunks of code clearly distinguishable from others. It is the key to showing the logical structure of a code listing. Statements should be indented under the statement or block of statements that they are logically subordinate to. Always indent one level for each new level of logic.
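A minimal C sketch of this rule follows. The function count_positive is hypothetical, and the four-column indent step is simply one possible choice; what matters is that each statement sits exactly one step to the right of the statement that controls it.

```c
#include <assert.h>
#include <stddef.h>

/* Counts the positive entries in a rows-by-columns table that is
 * stored row by row in a flat array. */
int count_positive(const int *table, int rows, int columns)
{
    int row;
    int column;
    int count = 0;

    assert(table != NULL);
    for (row = 0; row < rows; row++) {                  /* level 1 */
        for (column = 0; column < columns; column++) {  /* level 2 */
            if (table[(row * columns) + column] > 0) {  /* level 3 */
                count++;                                /* level 4 */
            }
        }
    }
    return count;
}
```

Reading down the left margin alone is enough to see which loop or condition governs each statement.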

Many studies have been done to determine the optimal code indentation size.[Oman 93] It is generally agreed that, although two space indentation lets the programmer fit more code on each line, deeply nested levels become easy to confuse with one another. Eight space indentation is handy because eight is the conventional default tab size, but it tends to gobble up a lot of horizontal space with each new level of code nesting. For these reasons, it is advisable to avoid both two and eight space indents. Most research agrees that four spaces is the best compromise.[Straker 92] Regardless of the number of spaces used to indent code, be consistent and keep in mind the pros and cons of tabs and spaces (5.5.1).
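To make the tab portability concern concrete, the C sketch below computes the column at which a line's first visible character is displayed for a given tab size. The function indent_width is a hypothetical helper written for this illustration, and it assumes tab stops fall at every multiple of the tab size.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the column (counting from 0) at which the first non-blank
 * character of 'line' is displayed when tab stops fall every
 * 'tab_size' columns. */
int indent_width(const char *line, int tab_size)
{
    int column = 0;

    assert(line != NULL && tab_size > 0);
    while (*line == ' ' || *line == '\t') {
        if (*line == '\t') {
            /* advance to the next tab stop */
            column = ((column / tab_size) + 1) * tab_size;
        } else {
            column = column + 1;
        }
        line++;
    }
    return column;
}
```

With this helper, a line that begins with two tabs has an indent width of 8 when the tab size is 4 but 16 when the tab size is 8, whereas a line that begins with eight spaces has a width of 8 under either setting. This is the behavior that makes space-only indentation the safer choice when code moves between environments.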

5.5.3 Line wrapping

Line wraps should occur when a single statement or expression is too long to fit within the 80 character line length standard. There are three general principles here:

The continued line should be indented one level under the line it is extended from.

When wrapping long expressions, try to leave an operator at the end of the line to be continued; this flags a continuance when reading.

When wrapping (nested) parenthesized expressions, always try to wrap the expression at the lowest possible level, and align like level parentheses vertically. To illustrate:

    result = ( ( ( x1 + 1 ) * ( x1 + 1 ) ) - ( ( y1 + 1 ) * ( y1 + 1 ) ) ) ;
    111111 1 1 2 3 33 3 3 3 2 3 33 3 3 3 2 1 2 3 33 3 3 3 2 3 33 3 3 3 2 1 1
    ====================================== ^ [lowest nesting level]

becomes:

    result = ( ((x1 + 1) * (x1 + 1)) -
               ((y1 + 1) * (y1 + 1)) );
               ^ [aligned level 2 parentheses]
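The same three principles can be applied to a compilable C function. The sketch below is only an illustration; point_in_rectangle and its parameter names are hypothetical. The parameter list is continued one indent level in, the long condition is broken at its lowest nesting level with the operator left at the end of the line, and the level 2 parentheses are aligned vertically.

```c
#include <assert.h>

/* Returns nonzero when a point lies inside (or on the border of) an
 * axis-aligned rectangle. */
int point_in_rectangle(int point_x, int point_y,
    int left_edge, int top_edge, int right_edge, int bottom_edge)
{
    assert(left_edge <= right_edge && top_edge <= bottom_edge);

    return ( ((point_x >= left_edge) && (point_x <= right_edge)) &&
             ((point_y >= top_edge) && (point_y <= bottom_edge)) );
}
```

The trailing && on the first line of the return statement signals that the expression continues, and the aligned parentheses show at a glance that the two halves are tests of the same kind.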

5.6 Chapter summary

The style rules given in this chapter are general enough to be applicable regardless of the language used. The proposals below will be built upon in the next four chapters, as we consider specific programming languages:

Write code for people first, computers second.

There is no one "best" style; just be consistent about the style you use.

Be considerate toward the reader; follow the 5 C's of communication.

Strive for simple, modular code by minimizing variable scope.

Limit files to 500 lines in length and 80 characters in width.

Use only standard ASCII characters and avoid changing the size of a tab stop.

Comment on the code before and as you write it, not all at once at the end.

Comment on "what" is being done, rather than "how"; the "how" should be obvious from the code itself.

Comment clearly, consistently, and unobtrusively.

Every comment should be accurate and helpful to the reader.

Do not depend on comments to make bad code readable; rewrite it!

To effectively comment a program you must fully understand it.

Studies have found that the best name length is between 8 and 16 characters.

If you opt to use a naming system, use it with consistency.

Abbreviations are helpful to significantly shorten the length of a name.

Symbol names should not look alike, sound alike, or have similar meanings.

Always name temporary variables with the same care as any other variable.

Avoid any abbreviations which result in mispronunciation.

Comment every data declaration as if it were a complex statement.

Double check to make sure that you have not misspelled a name or word.

Do not differentiate variables by case only.

In general, name length should be proportional to the symbol's scope.

Never declare a local variable with a name used by a more global variable.

Boolean variables should be named such that they read naturally in the code.

Routines should follow the same naming rules as any other program symbols.

Never redefine a routine or constant provided by the language or system.

Name files for maximum portability and clarity of purpose.

Too many files in one directory are hard to manage; use subdirectories.

Each line of code should perform only one action to avoid side effects.

Utilize white space as a tool to help clarify and explain code.

Studies have found that the best code indentation size is four characters.

Wrap expressions at the lowest level and align like level parentheses.