5 Formatting

When people look under the hood, we want them to be impressed with the neatness, consistency, and attention to detail that they perceive. If instead they see a scrambled mass of code, then they are likely to conclude that the same inattention to detail pervades every other aspect of the project.
If you are working on a team, then the team should agree to a single set of formatting rules and all members should comply. It helps to have an automated tool that can apply those formatting rules for you.

The purpose of formatting

Code formatting is about communication, and communication is the professional developer’s first order of business. The functionality that you create today has a good chance of changing in the next release, but the readability of your code will have a profound effect on all the changes that will ever be made.

Vertical formatting

How big should a source file be? Small files are usually easier to understand than large files are.

  • The Newspaper Metaphor
    We would like a source file to be like a newspaper article. The name should be simple but explanatory. The topmost parts of the source file should provide the high-level concepts and algorithms. Detail should increase as we move downward. A newspaper is composed of many articles; most are very small. Some are a bit larger. Very few contain as much text as a page can hold. This makes the newspaper usable.
  • Vertical Openness Between Concepts
    Nearly all code is read left to right and top to bottom. Each line represents an expression or a clause, and each group of lines represents a complete thought.
    Those thoughts should be separated from each other with blank lines.
  • Vertical Density
    If openness separates concepts, then vertical density implies close association. So lines of code that are tightly related should appear vertically dense.
  • Vertical Distance
    It is frustrating to spend time and mental energy on trying to locate and remember where the pieces are while trying to understand what the system does. Concepts that are closely related should be kept vertically close to each other. Clearly this rule doesn’t work for concepts that belong in separate files. But closely related concepts should not be separated into different files without a very good reason, which is one of the reasons that protected variables should be avoided.
    Variables should be declared as close to their usage as possible. Instance variables, on the other hand, should be declared at the top of the class.
    There have been many debates over where instance variables should go. In C++ we put all the instance variables at the bottom. The common convention in Java, however, is to put them all at the top of the class. I see no reason to follow any other convention.
    If one function calls another, they should be vertically close, and the caller should be above the callee, if at all possible. This gives the program a natural flow.
    Certain bits of code want to be near other bits. They have a certain conceptual affinity. The stronger that affinity, the less vertical distance there should be between them. This affinity might be based on a direct dependence, such as one function calling another, or a function using a variable. But affinity might also arise because a group of functions performs a similar operation.
  • Vertical Ordering
    In general we want function call dependencies to point in the downward direction. That is, a function that is called should be below a function that does the calling.
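
The vertical rules above can be seen together in one short Java sketch. The class `WordWrapper` and its methods are hypothetical, invented here only to illustrate the layout; the wrapping logic itself is a simple assumption and not taken from the text. Note the instance variable at the top, the blank lines between separate thoughts, and the caller sitting above its callees.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example class; the name and behavior are illustrative only.
final class WordWrapper {
    // Instance variables sit together at the top of the class.
    private final int maxWidth;

    WordWrapper(int maxWidth) {
        this.maxWidth = maxWidth;
    }

    // The high-level entry point comes first, like a headline.
    List<String> wrap(String text) {
        List<String> lines = new ArrayList<>();
        StringBuilder current = new StringBuilder();

        for (String word : text.trim().split("\\s+")) {
            if (!fits(current, word)) {
                lines.add(current.toString());
                current.setLength(0);
            }
            append(current, word);
        }

        if (current.length() > 0) {
            lines.add(current.toString());
        }
        return lines;
    }

    // Callees follow their caller, in the order in which they are called.
    private boolean fits(StringBuilder line, String word) {
        if (line.length() == 0) {
            return true;
        }
        return line.length() + 1 + word.length() <= maxWidth;
    }

    private void append(StringBuilder line, String word) {
        if (line.length() > 0) {
            line.append(' ');
        }
        line.append(word);
    }
}
```

A reader who only needs the gist can stop after `wrap`; the details in `fits` and `append` are there for anyone who keeps reading downward.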

Horizontal formatting

How wide should a line be? The old Hollerith limit of 80 is a bit arbitrary, and I’m not opposed to lines edging out to 100 or even 120. But beyond that is probably just careless.

  • Horizontal Openness and Density
    We use horizontal white space to associate things that are strongly related and disassociate things that are more weakly related.
    Assignment statements have two distinct and major elements: the left side and the right side. Spaces around the assignment operator make that separation obvious. I don’t put spaces between the function names and the opening parenthesis. This is because the function and its arguments are closely related. I separate arguments within the function call parentheses to accentuate the comma and show that the arguments are separate.
    Another use for white space is to accentuate the precedence of operators. Unfortunately, most tools for reformatting code are blind to the precedence of operators and impose the same spacing throughout.
  • Horizontal Alignment
    When I was an assembly language programmer, I used horizontal alignment to accentuate certain structures. I have found, however, that this kind of alignment is not useful. The alignment seems to emphasize the wrong things.
    For example, in a list of declarations you are tempted to read down the list of variable names without looking at their types. So, in the end, I don’t do this kind of thing anymore. Nowadays I prefer unaligned declarations and assignments. If I have long lists that need to be aligned, the problem is the length of the lists, not the lack of alignment.
  • Indentation
    A source file is a hierarchy rather like an outline. Each level of this hierarchy is a scope into which names can be declared and in which declarations and executable statements are interpreted. To make this hierarchy of scopes visible, we indent the lines of source code in proportion to their position in the hierarchy. Programmers rely heavily on this indentation scheme. They visually line up lines on the left to see what scope they appear in.
    It is sometimes tempting to break the indentation rule for short if statements, short while loops, or short functions. Whenever I have succumbed to this temptation, I have almost always gone back and put the indentation back in.
  • Dummy Scopes
    Sometimes the body of a while or for statement is a dummy, that is, an empty body. I try to avoid such structures. When I can’t avoid them, I make sure that the dummy body is properly indented and surrounded by braces.
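
The horizontal rules above can be sketched in Java as follows. The class `Quadratic` and its method names are hypothetical and chosen only for illustration. The precedence spacing (`b*b - 4*a*c`) is written by hand; as noted above, an automated formatter may well normalize it.

```java
// Hypothetical example; class and method names are illustrative only.
final class Quadratic {
    // Spaces surround the assignment operator, no space separates a method
    // name from its opening parenthesis, and a space follows each comma.
    static double discriminant(double a, double b, double c) {
        // Tight factors and loose terms mirror operator precedence.
        double result = b*b - 4*a*c;
        return result;
    }

    static double firstRoot(double a, double b, double c) {
        double d = discriminant(a, b, c);
        if (d < 0) {
            // Even a one-line body keeps its braces and its indentation.
            throw new IllegalArgumentException("no real roots");
        }
        return (-b + Math.sqrt(d)) / (2*a);
    }

    static int countLeadingSpaces(String s) {
        int i;
        for (i = 0; i < s.length() && s.charAt(i) == ' '; i++) {
            // Intentionally empty: the loop header does all the work.
        }
        return i;
    }
}
```

The empty `for` body in `countLeadingSpaces` is the dummy scope: the braces and the comment make it hard to mistake for a stray semicolon.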

Team rules

Every programmer has his own favorite formatting rules, but if he works in a team, then the team rules. We want the software to have a consistent style.
