Unix Programming - Designing for Transparency and Discoverability - Transparency and Editable Representations

On-line Guides

Eclipse Documentation

How To Guides

The Art of Unix Programming
Prev	Home	Next

Unix Programming - Designing for Transparency and Discoverability - Transparency and Editable Representations

Transparency and Editable Representations

Another theme that emerges from these examples is the value of programs that flip a problem out of a domain in which transparency is hard into one in which it is easy. Audacity, sng(1) and the tic(1)/infocmp(1) pair all have this property. The objects they manipulate are not readily conformable to the hand and eye; audio files are not visual objects, and although images expressed in PNG format are visual, the complexities of PNG annotation chunks are not. All three applications turn manipulation of their binary file formats into a problem to which human beings can more readily apply intuition and competences gained from everyday experience.

A rule all these examples follow is that they degrade the representation as little as possible — in fact, they translate it reversibly and losslessly. This property is very important, and worth implementing even if there is no obvious application demand for that kind of 100% fidelity. It gives potential users confidence that they can experiment without degrading their data.

All the advantages of textual data-file formats that we discussed in Chapter5 also apply to the textual formats that sng(1), infocmp(1) and their kin generate. One important application for sng(1) is robotic generation of PNG image annotations by scripts — because sng(1) exists, such scripts are easier to write.

Whenever you face a design problem that involves editing some kind of complex binary object, the Unix tradition encourages asking first off whether you can write a tool analogous to sng(1) or the tic(1)/infocmp(1) pair that can do a lossless mapping to an editable textual format and back. There is no established term for programs of this kind, but we'll call them textualizers.

If the binary object is dynamically generated or very large, then it may not be practical or possible to capture all the state with a textualizer. In that case, the equivalent task is to write a browser. The paradigm example is fsdb(1), the file-system debugger supported under various Unixes; there is a Linux equivalent called debugfs(1). The psql(1) used to browse PostgreSQL databases, and the smbclient(1) program that can be used to query Windows file shares on a SAMBA-equipped Linux machine, are two more. All five are simple CLI programs that could be driven by scripts and test harnesses.

Writing a textualizer or browser is a valuable exercise for at least four reasons:

You gain an excellent learning experience. There may be other ways that are as good to learn about the structure of the object, but none that are obviously better.
You gain the ability to dump the contents of the structure for inspection and debugging. Because such a tool makes dumping easy, you'll do it more. You'll get more information, probably leading to more insight.
You gain the ability to easily generate test loads and unusual cases. This means you are more likely to probe the odd corners of the object's state space — and to break the associated software, so you can fix it before your users break it.
You gain code you may be able to reuse. If you're careful about how you write the browser/textualizer and keep the CLI interpreter properly separated from the marshaling/unmarshaling library, you may find you have code that can be reused for your actual application.

After you've done this, you may well discover that it's possible to apply the “separated engine and interface” pattern (see Chapter11) using your textualizer/debugger as the engine. All the usual benefits of this pattern will apply.

	It is desirable, although often difficult, for a textualizer to be able to read and write even a damaged binary object. For one thing, it lets you generate damaged test cases to stress-test software; for another, it can make emergency repairs a whole lot easier. It may be hard to handle cases in which the structure of the object is messed up, but at least you should handle cases in which the content of the structure is nonsense, e.g., by showing nonsense values in hex and converting the hex back to the values.
-- Henry Spencer

[an error occurred while processing this directive]

The Art of Unix Programming
Prev	Home	Next

Published under free license.

Design by Interspire