In the following sections, we discuss the specifics of
Perl's behavior under mod_perl.
6.4.1. exit( )
Perl's core exit( ) function shouldn't be used in mod_perl code. Calling it
causes the mod_perl process to exit, which defeats the purpose of
using mod_perl. The Apache::exit( ) function should be used instead.
Starting with Perl Version 5.6.0, mod_perl overrides exit( ) behind the
scenes using CORE::GLOBAL::, a new magical package.
The CORE:: Package
CORE:: is a special package that provides access to Perl's
built-in functions. You may need to use this package to override some
of the built-in functions. For example, if you want to override the
exit( ) built-in function, you can do so with:
use subs qw(exit);
exit( ) if $DEBUG;
sub exit { warn "exit( ) was called"; }
Now when you call exit( ) in the same scope in
which it was overridden, the program won't exit, but
instead will just print a warning "exit( ) was
called". If you want to use the original built-in
function, you can still do so with:
# the 'real' exit
CORE::exit( );
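The override above applies only to the package in which it is declared. As a minimal sketch of the CORE::GLOBAL:: mechanism mentioned earlier, the following plain-Perl script installs a process-wide replacement for exit( ). The names @calls and run_script, and the "EXIT_OVERRIDE" marker string, are assumptions made for this illustration; this is not mod_perl's actual implementation.

```perl
use strict;
use warnings;

our @calls;    # exit statuses requested by the code (illustrative)

BEGIN {
    # The override must be installed before the code that calls
    # exit() is compiled, hence the BEGIN block.
    *CORE::GLOBAL::exit = sub {
        my $status = @_ ? $_[0] : 0;
        push @calls, $status;
        die "EXIT_OVERRIDE\n";    # unwind instead of ending the process
    };
}

# Hypothetical wrapper: runs a piece of code and swallows only the
# exception raised by the overridden exit().
sub run_script {
    my $code = shift;
    eval { $code->(); 1 } or do {
        die $@ unless $@ eq "EXIT_OVERRIDE\n";
    };
    return;
}

run_script(sub { print "before exit\n"; exit(3); print "never reached\n" });
print "still alive, requested exit status: $calls[0]\n";
```

The process keeps running after the call to exit( ), which is the effect a persistent interpreter needs. Code that really must end the process can still call CORE::exit( ).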
Apache::Registry and Apache::PerlRun override exit( ) with
Apache::exit( ) behind the scenes; therefore, scripts running under
these modules don't need to be modified to use Apache::exit( ).
If CORE::exit( ) is used in scripts running under
mod_perl, the child will exit, but the current request
won't be logged. More importantly, a proper exit
won't be performed. For example, if there are some
database handles, they will remain open, causing costly memory and
(even worse) database connection leaks.
If the child process needs to be killed,
Apache::exit(Apache::Constants::DONE) should be used instead. This will
cause the server to exit gracefully, completing the logging functions
and protocol requirements.
If the child process needs to be killed cleanly after the request has
completed, use the $r->child_terminate method. This method can be called
anywhere in the code, not just at the end. It sets the value of the
MaxRequestsPerChild configuration directive to 1 and clears the
keepalive flag. After the request is serviced, the current connection is
broken because the keepalive flag is set to false, and the parent tells
the child to cleanly quit because MaxRequestsPerChild is smaller than or
equal to the number of requests served.
In an Apache::Registry script you would write:
Apache->request->child_terminate;
and in httpd.conf:
PerlFixupHandler "sub { shift->child_terminate }"
You would want to use the latter example only if you wanted the child
to terminate every time the registered handler was called. This is
probably not what you want.
You can also use a post-processing handler to trigger child termination.
You might do this if you wanted to execute your own cleanup code before
the process exits:
my $r = shift;
$r->post_connection(\&exit_child);
sub exit_child {
    my $r = shift; # the callback receives the request object
    # some logic here if needed
    $r->child_terminate;
}
This is the code that is used by the
Apache::SizeLimit module, which terminates processes that
grow bigger than a preset quota.
6.4.2. die( )
die( ) is usually used to abort the flow of the
program if something goes wrong. For example, this common idiom is
used when opening files:
open FILE, "foo" or die "Cannot open 'foo' for reading: $!";
If the file cannot be opened, the script will die( ): script execution
is aborted, the reason for death is printed, and the Perl interpreter is
terminated.
You will hardly find any properly written Perl scripts that
don't have at least one die( )
statement in them.
CGI scripts running under mod_cgi exit on completion, and the Perl
interpreter exits as well. Therefore, it doesn't
matter whether the interpreter exits because the script died by
natural death (when the last statement in the code flow was executed)
or was aborted by a die( ) statement.
Under mod_perl, we don't want the process to quit.
Therefore, mod_perl takes care of it behind the scenes, and
die( ) calls don't abort the
process. When die( ) is called, mod_perl logs the
error message and calls Apache::exit( ) instead of
CORE::die( ). Thus, the script stops, but the
process doesn't quit. Of course, we are talking
about the cases where the code calling die( ) is
not wrapped inside an exception handler (e.g., an eval { } block) that
traps die( ) calls, or the $SIG{__DIE__} sighandler, which allows you to
override the behavior of die( ) (see Chapter 21). Section 6.13 at the
end of this chapter mentions a few exception-handling modules available
from CPAN.
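The two trapping mechanisms mentioned above can be sketched in plain Perl as follows. The risky( ) function and the @seen array are hypothetical names used only for this illustration, and the error text is hardcoded rather than produced by a real open( ) failure.

```perl
use strict;
use warnings;

# Hypothetical helper that fails the way the open() idiom does.
sub risky {
    die "Cannot open 'foo' for reading: No such file or directory\n";
}

my @seen;    # messages observed by the __DIE__ hook

my $ok = eval {
    # The hook is called first; die() then continues and is
    # trapped by the enclosing eval block.
    local $SIG{__DIE__} = sub { push @seen, $_[0] };
    risky();
    1;
};

my $error = $ok ? '' : $@;
print "trapped: $error" unless $ok;
print "hook saw ", scalar(@seen), " message(s)\n";
```

Because the die( ) call is trapped by eval { }, the code after the block keeps running and can decide how to report the failure.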
6.4.3. Global Variable Persistence
Under mod_perl a child process
doesn't exit after serving a single request. Thus,
global variables persist inside the same process from request to
request. This means that you should be careful not to rely on the
value of a global variable if it isn't initialized
at the beginning of each request. For example:
# the very beginning of the script
use strict;
use vars qw($counter);
$counter++;
relies on the fact that Perl interprets an undefined value of
$counter as a zero value, because of the increment
operator, and therefore sets the value to 1.
However, when the same code is executed a second time in the same
process, the value of $counter is not undefined
any more; instead, it holds the value it had at the end of the
previous execution in the same process. Therefore, a cleaner way to
code this snippet would be:
use strict;
use vars qw($counter);
$counter = 0;
$counter++;
In practice, you should avoid using global variables unless there
really is no alternative. Most of the problems with global variables
arise from the fact that they keep their values across functions, and
it's easy to lose track of which function modifies
the variable and where. This problem is solved by localizing these
variables with local( ). But if you are already
doing this, using lexical scoping (with my( )) is
even better because its scope is clearly defined, whereas localized
variables are seen and can be modified from anywhere in the code.
Refer to the perlsub manpage for more details.
Our example will now be written as:
use strict;
my $counter = 0;
$counter++;
Note that it is a good practice to both declare and initialize
variables, since doing so will clearly convey your intention to the
code's maintainer.
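The difference can be seen without a web server by calling the same code several times in one process, which is roughly what happens to a script from request to request. In this sketch the two handle_request_* functions are hypothetical stand-ins for a script body.

```perl
use strict;
use warnings;

our $counter;    # package global: keeps its value between calls

sub handle_request_uninitialized {
    $counter++;              # relies on $counter starting out undefined
    return $counter;
}

sub handle_request_initialized {
    my $count = 0;           # lexical, initialized on every call
    $count++;
    return $count;
}

my @stale = map { handle_request_uninitialized() } 1 .. 3;
my @fresh = map { handle_request_initialized() }   1 .. 3;

print "global:  @stale\n";   # 1 2 3
print "lexical: @fresh\n";   # 1 1 1
```

Only the first call sees the value the code was written to expect from the global; the initialized lexical behaves the same way every time.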
You should be especially careful with Perl special variables, which
cannot be lexically scoped. With special variables, local( ) must be
used. For example, if you want to read in a whole
file at once, you need to undef( ) the input
record separator. The following code reads the contents of an entire
file in one go:
open IN, $file or die $!;
$/ = undef;
$content = <IN>; # slurp the whole file in
close IN;
Since you have modified the special Perl variable
$/ globally, it'll affect any
other code running under the same process. If somewhere in the code
(or any other code running on the same server) there is a snippet
reading a file's content line by line, relying on
the default value of $/ (\n),
this code will work incorrectly. Localizing the modification of this
special variable solves this potential problem:
{
local $/; # $/ is undef now
$content = <IN>; # slurp the whole file in
}
Note that the localization is enclosed in a block. When control
passes out of the block, the previous value of $/
will be restored automatically.
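A self-contained way to check this behavior is to slurp from an in-memory file handle and then read line by line afterward. The slurp( ) function and the sample data below are assumptions made for the sketch.

```perl
use strict;
use warnings;

my $data = "line 1\nline 2\nline 3\n";

# Hypothetical helper: reads everything from an in-memory file.
sub slurp {
    my $text = shift;
    open my $in, '<', \$text or die "Cannot open in-memory file: $!";
    local $/;                 # undef only until slurp() returns
    my $content = <$in>;
    close $in;
    return $content;
}

my $content = slurp($data);

# $/ is back to its default here, so this reads line by line.
open my $in, '<', \$data or die "Cannot open in-memory file: $!";
my @lines = <$in>;
close $in;

print "slurped ", length($content), " characters\n";
print "then read ", scalar(@lines), " lines\n";
```

Here the enclosing subroutine body plays the role of the block: the localized value of $/ disappears as soon as slurp( ) returns.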
6.4.4. STDIN, STDOUT, and STDERR Streams
Under mod_perl, both STDIN and STDOUT are tied to the socket from which
the request originated. If, for
example, you use a third-party module that prints some output to
STDOUT when it shouldn't (for
example, control messages) and you want to avoid this, you must
temporarily redirect STDOUT to
/dev/null. You will then have to restore
STDOUT to the original handle when you want to
send a response to the client. The following code demonstrates a
possible implementation of this workaround:
{
my $nullfh = Apache::gensym( );
open $nullfh, '>/dev/null' or die "Can't open /dev/null: $!";
local *STDOUT = $nullfh;
call_something_thats_way_too_verbose( );
close $nullfh;
}
The code defines a block in which the STDOUT
stream is localized to print to /dev/null. When
control passes out of this block, STDOUT gets
restored to the previous value.
STDERR is tied to a file defined by the ErrorLog directive. When native
syslog support is enabled, the STDERR stream will be redirected to
/dev/null.
6.4.5. Redirecting STDOUT into a Scalar Variable
Sometimes you encounter a black-box
function that prints its output to the default file handle (usually
STDOUT) when you would rather put the output into
a scalar. This is very relevant under mod_perl, where
STDOUT is tied to the Apache
request object. In this situation, the IO::String
package is especially useful. You can re-tie( ) STDOUT (or any other
file handle) to a string by doing a simple select( ) on the
IO::String object. Call select( ) again at the end on the original file
handle to re-tie( ) STDOUT back to its original stream:
use IO::String;
my $str;
my $str_fh = IO::String->new($str);
my $old_fh = select($str_fh);
black_box_print( );
select($old_fh) if defined $old_fh;
In this example, a new IO::String object is
created. The object is then selected, the black_box_print( ) function
is called, and its output goes into the string object. Finally, we
restore the original file handle, by re-select( )ing the originally
selected file
handle. The $str variable contains all the output
produced by the black_box_print( ) function.
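If IO::String is not installed, a similar effect can be sketched with an in-memory file handle, which core Perl has supported since version 5.8. This is a different technique from the IO::String approach above, shown only as an alternative; black_box_print( ) here is a stand-in defined for the example.

```perl
use strict;
use warnings;

# Stand-in for the black-box function in the text.
sub black_box_print { print "Hello from the black box\n" }

my $str = '';
open my $str_fh, '>', \$str or die "Cannot open in-memory handle: $!";

my $old_fh = select($str_fh);   # make $str_fh the default output handle
black_box_print();
select($old_fh);                # restore the previous default handle
close $str_fh;

print "captured: $str";
```

As in the IO::String version, this catches only output sent to the default file handle. A function that prints explicitly to STDOUT is not affected by select( ).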
6.4.6. print( )
Under mod_perl, CORE::print( ) (using either STDOUT as a filehandle
argument or no filehandle at all) will redirect output to
Apache::print( ), since the STDOUT file handle is tied to Apache. That
is, these two are functionally equivalent:
print "Hello";
$r->print("Hello");
Apache::print( ) will return immediately without
printing anything if $r->connection->aborted
returns true. This happens if the connection has been aborted by the
client (e.g., by pressing the Stop button).
There is also an optimization built into Apache::print( ): if any of
the arguments to this function are scalar
references to strings, they are automatically dereferenced. This
avoids needless copying of large strings when passing them to
subroutines. For example, the following code will print the actual
value of $long_string:
my $long_string = "A" x 10000000;
$r->print(\$long_string);
To print the reference value itself, use a double reference:
$r->print(\\$long_string);
When Apache::print( ) sees that the passed value
is a reference, it dereferences it once and prints the real reference
value:
SCALAR(0x8576e0c)
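The single level of dereferencing can be imitated in plain Perl to see why the double reference prints an address. The render( ) function below is a hypothetical stand-in written for this sketch from the description above; it is not Apache::print( ) and makes no claim about how that method is implemented.

```perl
use strict;
use warnings;

# Hypothetical stand-in: dereferences each reference argument once
# and returns the resulting text instead of printing it.
sub render {
    return join '', map {
        my $type = ref $_;
        ($type eq 'SCALAR' || $type eq 'REF') ? "${$_}" : "$_";
    } @_;
}

my $long_string = "A" x 20;
my $once  = render(\$long_string);    # the string itself
my $twice = render(\\$long_string);   # the inner reference, stringified

print "$once\n";
print "$twice\n";
```

The first line of output is the string of A characters. The second has the form SCALAR(0x...), with an address that varies from run to run.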
6.4.7. Formats
The interface to file handles that are linked to variables with Perl's
tie( ) function is not yet complete. The format( ) and write( )
functions are missing. If you configure Perl with sfio, write( ) and
format( ) should work just fine.
Instead of format( ), you can use printf( ). For example, the following
formats are equivalent:
format      printf
--------------------
##.##       %2.2f
####.##     %4.2f
To print a string with fixed-length elements, use the
printf( ) format %n.ms where
n is the length of the field allocated for the
string and m is the maximum number of characters
to take from the string. For example:
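As an illustration, here is a minimal sketch with values chosen for this example (12345 and "John Doe" are assumptions, not data from the text). The brackets are added only to make the field boundaries visible.

```perl
use strict;
use warnings;

# %5.3s: a field five characters wide, at most three characters taken.
# %10.10s: a field ten characters wide, at most ten characters taken.
my $row = sprintf "[%5.3s] [%10.10s]", 12345, "John Doe";

print "$row\n";   # [  123] [  John Doe]
```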
Notice that the first string was allocated five characters in the
output, but only three were used because n=5 and m=3 (%5.3s). If you
want to ensure that the text will always be correctly aligned without
being truncated, n should always be greater than or equal to m.
You can change the alignment to the left by adding a minus sign
(-) after the %. For example:
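As an illustration, the following sketch left-aligns two fields. The values are again chosen only for this example, and the brackets make the padding visible.

```perl
use strict;
use warnings;

# The minus sign pads on the right instead of the left.
my $row = sprintf "[%-5.3s] [%-10.10s]", 12345, "John Doe";

print "$row\n";   # [123  ] [John Doe  ]
```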
Another alternative to format( ) and printf( ) is to use the
Text::Reform module from CPAN.
In the examples above we've printed the number
123 as a string (because we used the
%s format specifier), but numbers can also be
printed using numeric formats. See perldoc -f
sprintf for full details.
6.4.8. Output from System Calls
The output of
system( ), exec( ), and
open(PIPE,"|program") calls will not be sent to
the browser unless Perl was configured with sfio.
To learn if your version of Perl is sfio-enabled,
look at the output of the perl -V command for
the useperlio and d_sfio
strings.
You can use backticks as a possible workaround:
print `command here`;
But this technique has very poor performance, since it forks a new
process. See the discussion about forking in Chapter 10.
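A slightly more careful version of the backticks workaround captures the output in a variable and checks the exit status before printing. In this sketch the command is simply the running perl binary ($^X) executing a one-liner, chosen so that the example does not depend on any particular external program.

```perl
use strict;
use warnings;

# Run a child process and capture everything it writes to STDOUT.
my $output = `$^X -e "print 6 * 7"`;

# $? holds the child's wait status; nonzero means it failed.
die "command failed with status $?" if $? != 0;

print "child printed: $output\n";
```

The cost noted above still applies: every such call forks a new process.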
6.4.9. BEGIN Blocks
Perl executes BEGIN blocks
as soon as possible, when it's compiling the code.
The same is true under mod_perl. However, since mod_perl normally
compiles scripts and modules only once, either in the parent process
or just once per child, BEGIN blocks are run only
once. As the perlmod manpage explains, once a
BEGIN block has run, it is immediately undefined.
In the mod_perl environment, this means that BEGIN
blocks will not be run during the response to an incoming request
unless that request happens to be the one that causes the compilation
of the code. However, there are cases when BEGIN
blocks will be rerun for each request.
BEGIN blocks in modules and files pulled in via require( ) or use( )
will be executed:
Only once, if pulled in by the parent process.
Once per child process, if not pulled in by the parent process.
One additional time per child process, if the module is reloaded from
disk by Apache::StatINC.
One additional time in the parent process on each restart, if
PerlFreshRestart is On.
On every request, if the module with the BEGIN
block is deleted from %INC, before the
module's compilation is needed. The same thing
happens when do( ) is used, which loads the module
even if it's already loaded.
BEGIN blocks in
Apache::Registry scripts will be executed:
Only once, if pulled in by the parent process via
Apache::RegistryLoader.
Once per child process, if not pulled in by the parent process.
One additional time per child process, each time the script file
changes on disk.
One additional time in the parent process on each restart, if pulled
in by the parent process via
Apache::RegistryLoader and
PerlFreshRestart is On.
Note that this second list is applicable only to the scripts
themselves. For the modules used by the scripts, the previous list
applies.
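Some of the cases listed for modules can be reproduced in plain Perl. The sketch below writes a small module to a temporary directory and counts how often its BEGIN block runs. The module name BeginDemo and the counter $begin_runs are assumptions made for the illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

our $begin_runs = 0;    # incremented by the module's BEGIN block

# Create a throwaway module containing a BEGIN block.
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/BeginDemo.pm";
open my $fh, '>', $file or die "Cannot write $file: $!";
print {$fh} "package BeginDemo;\nBEGIN { \$main::begin_runs++ }\n1;\n";
close $fh;

require $file;                 # compiles the file: BEGIN runs
require $file;                 # already in %INC: nothing happens
print "after two require calls: $begin_runs\n";

delete $INC{$file};            # forget that the file was loaded
require $file;                 # recompiled: BEGIN runs again
print "after deleting from %INC: $begin_runs\n";

do $file;                      # do() compiles the file every time
print "after do(): $begin_runs\n";
```

The counter should read 1, 2, and 3 at the three print statements, matching the "only once", "deleted from %INC", and do( ) cases in the first list.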
6.4.10. END Blocks
As the perlmod manpage
explains, an END subroutine is executed when the
Perl interpreter exits. In the mod_perl environment, the Perl
interpreter exits only when the child process exits. Usually a single
process serves many requests before it exits, so
END blocks cannot be used if they are expected to
do something at the end of each request's
processing.
If there is a need to run some code after a request has been
processed, the $r->register_cleanup( ) function should be used. This function
accepts a reference to a function to be called during the
PerlCleanupHandler phase, which behaves just like
the END block in the normal Perl environment. For
example:
$r->register_cleanup(sub { warn "$$ does cleanup\n" });
or:
sub cleanup { warn "$$ does cleanup\n" };
$r->register_cleanup(\&cleanup);
will run the registered code at the end of each request, similar to
END blocks under mod_cgi.
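The contrast between per-request cleanup and END blocks can be imitated outside Apache. In the following sketch, register_cleanup( ), serve_request( ), and the @cleanups queue are hypothetical stand-ins written for this example; they only mimic the idea and are not the mod_perl API.

```perl
use strict;
use warnings;

my @cleanups;   # hypothetical per-request cleanup queue
my @log;        # what happened, in order

sub register_cleanup { push @cleanups, shift }

sub serve_request {
    my $id = shift;
    register_cleanup(sub { push @log, "cleanup after request $id" });
    push @log, "served request $id";
    # End of the request: run and discard whatever was registered.
    while (my $cb = shift @cleanups) { $cb->() }
}

END { print "END block: runs once, when the interpreter exits\n" }

serve_request($_) for 1 .. 2;
print "$_\n" for @log;
```

Each simulated request gets its own cleanup call, while the END block runs a single time at the very end of the process.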
As you already know by now, Apache::Registry
handles things differently. It does execute all
END blocks encountered during compilation of
Apache::Registry scripts at the end of each
request, like mod_cgi does. That includes any END
blocks defined in the packages use( )d by the
scripts.
If you want something to run only once in the parent process on
shutdown and restart, you can use register_cleanup( ) in startup.pl:
warn "parent pid is $$\n";
Apache->server->register_cleanup(
sub { warn "server cleanup in $$\n" });
This is useful when some server-wide cleanup should be performed when
the server is stopped or restarted.