The new features introduced by Apache 2.0 and by the Perl 5.6 and 5.8
generations provide the foundation for the new mod_perl 2.0 features. In
addition, mod_perl 2.0 was rewritten from scratch, which made room for
such new features as a new build and testing framework.
Let's look at the major changes since mod_perl 1.0.
24.3.1. Thread Support
In order to adapt to the Apache 2.0 threads architecture (for threaded
MPMs), mod_perl 2.0 needs to use thread-safe Perl interpreters,
also known as ithreads (interpreter threads). This mechanism
is enabled at compile time and ensures that each Perl interpreter
instance is reentrant; that is, multiple Perl interpreters can
be used concurrently within the same process without locking, as each
instance has its own copy of any mutable data (symbol tables, stacks,
etc.). This of course requires that each Perl interpreter instance is
accessed by only one thread at any given time.
The first mod_perl generation has only a single
PerlInterpreter, which is constructed by the
parent process, then inherited across the forks to child processes.
mod_perl 2.0 has a configurable number of
PerlInterpreters and two classes of interpreters,
parent and clone. A parent works as in mod_perl
1.0: it is the main interpreter, created at startup time, that compiles any
preloaded Perl code. A clone is created from the
parent using the Perl API perl_clone( ) function.
At request time, parent interpreters are used only for making more
clones, as the clones are the interpreters that actually handle
requests. Care is taken by Perl to copy only mutable data, which
means that no runtime locking is required and read-only data such as
the syntax tree is shared from the parent, which should reduce the
overall mod_perl memory footprint.
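To get a feel for what cloning means, here is a minimal sketch that uses Perl's own threads module, which also creates each new thread's interpreter with perl_clone( ). It is an illustration only, not mod_perl code: the %config hash and the run_in_clone( ) helper are names made up for this example, and a perl built with ithreads support is assumed.

```perl
use strict;
use warnings;
use threads;    # requires a perl built with ithreads support

# State created before any thread exists, comparable to data set up by
# code preloaded into the parent interpreter.
our %config = (greeting => 'hello');

# Run a code reference in a new thread (that is, in a cloned interpreter)
# and return its scalar result. Hypothetical helper for this sketch.
sub run_in_clone {
    my ($code) = @_;
    my $thread = threads->create($code);    # created in scalar context
    return $thread->join;
}

# The clone starts with a copy of %config and changes only that copy.
my $seen_by_clone = run_in_clone(sub {
    $config{greeting} = 'goodbye';
    return $config{greeting};
});

print "clone:  $seen_by_clone\n";       # clone:  goodbye
print "parent: $config{greeting}\n";    # parent: hello
```

The change made inside the thread never reaches the parent's copy of the hash. This is the same isolation that lets cloned interpreters run without runtime locking.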
Rather than creating a PerlInterpreter for each thread, by
default mod_perl creates a pool of interpreters. The pool mechanism
helps cut down memory usage a great deal. As already mentioned, the
syntax tree is shared between all cloned interpreters. If your server
is serving more than just mod_perl requests, having a smaller number
of PerlInterpreters than the number of threads
will clearly cut down on memory usage. Finally, perhaps the biggest
win is memory reuse: as calls are made into Perl subroutines, memory
allocations are made for variables when they are used for the first
time. Subsequent use of variables may allocate more memory; e.g., if
a scalar variable needs to hold a longer string than it did before,
or an array has new elements added. As an optimization, Perl hangs
onto these allocations even after the values go out of scope, so
the memory can be reused the next time the same code runs.
mod_perl 2.0 has much better control over which
PerlInterpreters are used for incoming requests.
The interpreters are stored in two linked lists, one for available
interpreters and another for busy ones. When an interpreter is needed
to handle a request, one is taken from the head of the available
list, and it's put back at the head of the same list
when it's done. This means that if, for example, you
have ten interpreters configured to be cloned at startup time, but no
more than five are ever used concurrently, those five continue to
reuse Perl's allocations, while the other five
remain much smaller, but ready to go if the need arises.
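The following toy simulation sketches why taking from and returning to the head of the list keeps the working set small. It is not how mod_perl is implemented: plain Perl arrays and hashes stand in for the linked lists, and the acquire( ) and release( ) helpers and the used counter are hypothetical names.

```perl
use strict;
use warnings;

# Toy model of the pool: each "interpreter" is a hash, and "used" counts
# how many requests it has handled.
my @available = map { +{ id => $_, used => 0 } } 1 .. 10;
my %busy;

# Take an interpreter from the head of the available list.
sub acquire {
    my $interp = shift @available or die "pool exhausted\n";
    $busy{ $interp->{id} } = $interp;
    $interp->{used}++;
    return $interp;
}

# Put a finished interpreter back at the head of the same list.
sub release {
    my ($interp) = @_;
    delete $busy{ $interp->{id} };
    unshift @available, $interp;
}

# Simulate 100 bursts of five concurrent requests each.
for my $burst (1 .. 100) {
    my @in_flight = map { acquire() } 1 .. 5;
    release($_) for @in_flight;
}

my @touched = grep { $_->{used} > 0 } @available;
printf "%d of %d interpreters ever handled a request\n",
    scalar @touched, scalar @available;
# prints: 5 of 10 interpreters ever handled a request
```

With ten interpreters and bursts of five concurrent requests, the same five are taken every time, so only those five would grow.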
The interpreter pool mechanism has been abstracted into an API known
as tipool (thread item pool). This pool,
currently used to manage a pool of PerlInterpreter
objects, can be used to manage any data structure in which you wish
to have a smaller number of items than the number of configured
threads.
It's important to note that the Perl ithreads
implementation ensures that Perl code is thread-safe, at least with
respect to the Apache threads in which it is running. However, it
does not ensure that functions and extensions that call into
third-party C/C++ libraries are thread-safe. In the case of
non-thread-safe extensions, if it is not possible to fix those
routines, care needs to be taken to serialize calls into such
functions (either at the XS or Perl level). See Perl
5.8.0's perlthrtut manpage.
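Serializing at the Perl level can be as simple as taking a lock around the call. The sketch below is a standalone illustration that uses Perl's own threads and the standard threads::shared module, and it assumes a perl built with ithreads support. Whether the same approach suits a given mod_perl setup is something to verify separately. legacy_lookup( ) is a hypothetical stand-in for a function that calls into a non-thread-safe library, and safe_lookup( ) is an equally hypothetical wrapper.

```perl
use strict;
use warnings;
use threads;            # requires a perl built with ithreads support
use threads::shared;

# Shared variable that serves only as a lock.
my $legacy_lock : shared;

# Hypothetical stand-in for an XS function wrapping a non-thread-safe
# C library; here it just records the call so the sketch runs anywhere.
my @call_log : shared;
sub legacy_lookup {
    my ($key) = @_;
    push @call_log, $key;
    return uc $key;
}

# Wrapper that lets only one thread at a time into legacy_lookup().
sub safe_lookup {
    my ($key) = @_;
    lock($legacy_lock);    # released automatically at the end of the scope
    return legacy_lookup($key);
}

# Start four threads that all go through the wrapper.
my @threads = map {
    my $key = "key$_";
    threads->create(sub { safe_lookup($key) });
} 1 .. 4;

my @results = map { $_->join } @threads;
@results = sort @results;

print "@results\n";    # KEY1 KEY2 KEY3 KEY4
```

Because every caller goes through safe_lookup( ), the unsafe routine is never entered by two threads at once, at the cost of making those calls sequential.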
Note that while Perl data is thread-private unless explicitly shared
and threads themselves are separate execution threads, the threads
can affect process-scope state, affecting all the threads. For
example, if one thread does chdir("/tmp"), the
current working directory of all threads is now
/tmp. While each thread can correct its current
working directory by storing the original value, there are functions
whose process-scope changes cannot be undone. For example,
chroot( ) changes the root directory of all
threads, and this change is not reversible. Refer to the
perlthrtut manpage for more information.
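The chdir( ) effect is easy to reproduce outside of mod_perl. This small sketch assumes a Unix-like system and a perl built with ithreads support. It changes the directory inside a thread and then checks the working directory seen by the main thread.

```perl
use strict;
use warnings;
use threads;    # requires a perl built with ithreads support
use Cwd qw(getcwd);
use File::Spec;

my $before = getcwd();
my $target = File::Spec->rootdir;    # "/" on Unix-like systems

# Change directory inside a thread and wait for the thread to finish.
threads->create(sub { chdir $target or die "chdir failed: $!" })->join;

my $after = getcwd();
print "before: $before\n";
print "after:  $after\n";    # the main thread now sits in $target too

# Restoring the saved value undoes the change, again for every thread.
chdir $before or die "cannot restore $before: $!";
```

The main thread never called chdir( ) before the second getcwd( ), yet its working directory moved, because the working directory belongs to the process rather than to a thread.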
24.3.2. Perl Interface to the APR and Apache APIs
As we mentioned earlier, Apache 2.0 uses two APIs:
- The Apache Portable Runtime (APR) API, which implements a portable
  and efficient API to generically work with files, threads, processes,
  shared memory, etc.
- The Apache API, which handles issues specific to the web server.

mod_perl 2.0 provides its own very flexible special-purpose XS code
generator, which is capable of doing things none of the existing
generators can handle. It's possible that in the
future this generator will be generalized and used for other
highly complex projects.
This generator creates the Perl glue code for the public APR and
Apache APIs, almost without a need for any extra code (just a few
thin wrappers to make the API more Perlish).
Since APR can be used outside of Apache, the Perl
APR:: modules can be used outside of Apache as
well.
24.3.3. Other New Features
In addition to the already mentioned new features in mod_perl 2.0,
the following are of major importance:
- Apache 2.0 protocol modules are supported. Later we will see an
  example of a protocol module running on top of mod_perl 2.0.
- mod_perl 2.0 provides a very simple-to-use interface to the Apache
  filtering API. This is of great interest because in mod_perl 1.0 the
  Apache::Filter and Apache::OutputChain modules, used for filtering,
  had to go to great lengths to implement filtering and
  couldn't be used for filtering output generated by
  non-Perl modules. Moreover, incoming-stream filtering has now become
  possible. We will discuss filtering and see a few examples later on.
- A full-featured and flexible Apache::Test framework
  was developed especially for mod_perl testing. While intended to test
  the core mod_perl features, it is also used by third-party module
  writers to easily test their modules. Moreover,
  Apache::Test was adopted by Apache and is
  currently used to test Apache 1.3, Apache 2.0, and other ASF projects.
  Anything that runs on top of Apache can be tested with
  Apache::Test, whether the target is written in
  Perl, C, PHP, etc.
- The support of the new MPMs makes mod_perl 2.0 able to scale better
  on a wider range of platforms. For example, if
  you've tried mod_perl 1.0 on Win32, you
  probably know that parallel requests had to be serialized; that is,
  only a single request could be processed at a time, which made the
  Win32 platform unusable with mod_perl as a heavy production service.
  Thanks to the new Apache MPM design, mod_perl 2.0 can now efficiently
  process parallel requests on Win32 platforms (using its native
  win32 MPM).
24.3.4. Improved and More Flexible Configuration
mod_perl 2.0 provides new
configuration directives for the newly added features and improves
upon existing ones. For example, the PerlOptions
directive provides fine-grained configuration for what were
compile-time only options in the first mod_perl generation. The
Perl*FilterHandler directives provide a much
simpler Apache filtering API, hiding most of the details underneath.
We will talk in detail about these and other options in
Section 24.5.
The new Apache::Directive module provides a Perl
interface to the Apache configuration tree, which is another new
feature in Apache 2.0.
24.3.5. Optimizations
The rewrite of mod_perl gives us a chance to build a smarter, stronger,
and faster implementation based on lessons learned over the years
since mod_perl was introduced. There are some optimizations that can
be made in the mod_perl source code, some that can be made in the
Perl space by optimizing its syntax tree, and some that are a
combination of both.