Unix Programming - Compactness and Orthogonality - The SPOT Rule
The Pragmatic Programmer articulates a
rule for one particular kind of orthogonality that is especially
important. Their “Don't Repeat Yourself” rule is: every
piece of knowledge must have a single,
unambiguous, authoritative representation within a system. In this
book we prefer, following a suggestion by Brian Kernighan, to call
this the Single Point Of Truth or SPOT rule.
Repetition leads to inconsistency and code that is subtly
broken, because you changed only some of the repetitions when you
needed to change all of them. Often, it also means
that you haven't properly thought through the organization
of your code.
Constants, tables, and metadata should be declared and
initialized once, and imported elsewhere. Any
time you see duplicate code, that's a danger sign. Complexity
is a cost; don't pay it twice.
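As a minimal sketch of this principle (module and constant names here are hypothetical), one module owns the definitions and everything else imports them rather than redeclaring them:

```python
# config.py (hypothetical): the single point of truth for these values.
MAX_RETRIES = 3
STATUS_NAMES = {0: "ok", 1: "warning", 2: "error"}

# Every other module imports these rather than redeclaring them:
#     from config import MAX_RETRIES, STATUS_NAMES
# so a change made once in config.py propagates everywhere automatically.
```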
Often it's possible to remove code duplication by
refactoring; that is, changing the organization
of your code without changing the core algorithms. Data duplication
sometimes appears to be forced on you. But when you see it, here are
some valuable questions to ask:
If you have duplicated data in your code because it has to
have two different representations in two different places, can you
write a function, tool or code generator to make one representation from the
other, or both from a common source?
If your documentation duplicates knowledge in your code, can you
generate parts of the documentation from parts of the code, or
vice-versa, or both from a common higher-level representation?
If your header files and interface declarations duplicate
knowledge in your implementation code, is there a way you can generate
the header files and interface declarations from the code?
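The common-source approach behind all three questions can be sketched as follows (the error table and function names are illustrative, not from any real project): a single Python table drives generation of both a C header fragment and a documentation fragment, so neither copy can drift from the other.

```python
# A single table of error codes: the common source of truth.
ERRORS = [
    ("E_OK",      0, "operation succeeded"),
    ("E_NOENT",   2, "no such file or directory"),
    ("E_TIMEOUT", 5, "operation timed out"),
]

def gen_header():
    """Generate C #define lines from the table."""
    return "\n".join(f"#define {name} {code}  /* {desc} */"
                     for name, code, desc in ERRORS)

def gen_docs():
    """Generate a plain-text documentation table from the same source."""
    lines = ["Code  Name       Meaning"]
    lines += [f"{code:<5} {name:<10} {desc}" for name, code, desc in ERRORS]
    return "\n".join(lines)

print(gen_header())
print(gen_docs())
```

Because both outputs are derived, adding a new error code means editing one table entry; the header and the documentation can never disagree.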
There is an analog of the SPOT rule for data structures:
“No junk, no confusion”. “No junk” says
that the data structure (the model) should be minimal, e.g., not made
so general that it can represent situations which cannot exist.
“No confusion” says that states which must be kept
distinct in the real-world problem must be kept distinct in the model.
In short, the SPOT rule advocates seeking a data structure whose
states have a one-to-one correspondence with the states of the
real-world system to be modeled.
From deeper within the Unix tradition, we can add some of our
own corollaries of the SPOT rule:
Are you duplicating data because you're caching intermediate
results of some computation or lookup? Consider carefully whether
this is premature optimization;
stale caches (and the layers of code needed to keep caches
synchronized) are a fertile source of bugs, and
can even slow down overall performance if (as often happens) the
cache-management overhead is higher than you expected.
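When a cache really is warranted, keeping it behind one narrow interface confines the synchronization problem. A sketch using the standard-library functools.lru_cache (the cached function here is a made-up example), which centralizes both the cache and its invalidation:

```python
from functools import lru_cache

@lru_cache(maxsize=128)
def parse_config(text):
    """Hypothetical expensive computation, cached by the standard library."""
    return dict(line.split("=", 1) for line in text.splitlines() if "=" in line)

result = parse_config("host=example\nport=80")
# When the underlying data changes, invalidate in exactly one place:
parse_config.cache_clear()
```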
If you see lots of duplicative boilerplate code, can you
generate all of it from a single higher-level representation,
twiddling a few knobs to generate the different cases?
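One way to sketch that knob-twiddling (all names here are invented for illustration): a single template plus a table of cases generates every repetitive accessor function, so a change to the pattern is made once.

```python
# One higher-level description of the cases...
FIELDS = [("name", "str"), ("age", "int"), ("email", "str")]

TEMPLATE = '''\
def get_{field}(record):
    """Return the {field} field (a {typ})."""
    return record["{field}"]
'''

# ...generates all the boilerplate accessors from it.
source = "\n".join(TEMPLATE.format(field=f, typ=t) for f, t in FIELDS)
namespace = {}
exec(source, namespace)  # defines get_name, get_age, get_email

print(namespace["get_age"]({"age": 42}))
```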
The reader should begin to see a pattern emerging here.
In the Unix world, the SPOT Rule as a unifying idea has seldom
been explicit — but heavy use of code generators to implement
particular kinds of SPOT is very much part of the
tradition. We'll survey these techniques in Chapter 9.