Unix Programming - Designing Minilanguages - Extending and Embedding Languages

On-line Guides

Eclipse Documentation

How To Guides

The Art of Unix Programming
Prev	Home	Next

Unix Programming - Designing Minilanguages - Extending and Embedding Languages

Extending and Embedding Languages

One fundamentally important question is whether you can implement your minilanguage by extending or embedding an existing scripting language . This is often the right way to go for an imperative minilanguage, but much less appropriate for a declarative one.

Sometimes it's possible to write your imperative language simply by coding service functions in an interpreted language, which we'll call the ‘host’ language for purposes of this discussion. Your minilanguage programs are then just scripts that load your service library and use the host language's control structures and other facilities as a framework. Every facility the host language supplies is one you don't have to write.

This is the easiest way to write a minilanguage. Old-school Lisp programmers (including me) love this technique and use it heavily. It underlies the design of the Emacs editor, and has been rediscovered in the new-school scripting languages like Tcl , Python , and Perl . There are drawbacks to it, however.

Your host language may be unable to interface to a code library that you need. Or, internally, its ontology of data types may be inadequate for the kind of computation you need to do. Or, after measuring the performance of a prototype, you discover that it's too slow. When any of these things happen, your solution is usually going to involve coding in C (or C++) and integrating the results into your minilanguage.

The option of extending a scripting language with C code, or of embedding a scripting language in a C program, relies on the existence of scripting languages designed for it. You extend a scripting language by telling it to dynamically load a C library or module in such a way that the C entry points become visible as functions in the extended language. You embed a scripting language in a C program by sending commands to an instance of the interpreter and receiving the results back as values in C.

Both techniques also rely on the ability to move data between the type ontology of C and the type ontology of your scripting language. Some scripting languages are designed from the ground up to support this. One such is Tcl , which we'll cover in Chapter14. Another is Guile, an open-source dialect of the Lisp variant Scheme. Guile is shipped as a library and specifically designed to be embedded in C programs.

It is possible (though in 2003 still rather painful and difficult) to extend or embed Perl . It is very easy to extend Python and only slightly more difficult to embed it; Cextension is especially heavily used in the Python world. Java has an interface to call ‘native methods’ in C, though the practice is explicitly discouraged because it tends to break portability. Most versions of shell are not designed for embeddability and extension, but the Korn shell (ksh93 and later versions) is a notable exception.

There are lots of bad reasons not to piggyback your imperative minilanguage on an existing scripting language . One of the few good ones is that you actually want to implement your own custom grammar for error checking. If that's the case, then see the advice about yacc and lex below.

[an error occurred while processing this directive]

The Art of Unix Programming
Prev	Home	Next

Published under free license.

Design by Interspire