13.3. Buffered Printing and Better print( ) Techniques
As you probably know, this statement:
local $|=1;
disables buffering of the currently select( )ed
file handle (the default is STDOUT). Under
mod_perl, the STDOUT file handle is automatically
tied to the output socket. If STDOUT buffering is
disabled, each print( ) call also calls
ap_rflush( ) to flush Apache's
output buffer.
When output is generated with many separate print( ) calls (poor
style in any case) while buffering is disabled, every call triggers
a flush, and performance degrades. The severity depends on the
number of print( ) calls made.
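For illustration, a fragment of the following kind makes one print( ) call per line of HTML. The markup and the subroutine name are hypothetical, chosen only to show the pattern:

```perl
use strict;
use warnings;

# Hypothetical page fragment: one print( ) call per line of HTML,
# with every embedded double quote escaped by a backslash.
sub print_piecemeal {
    print "<body bgcolor=\"black\" text=\"white\">\n";
    print "<h1>Hello</h1>\n";
    print "<a href=\"foo.html\">foo</a>\n";
    print "</body>\n";
}

print_piecemeal();
```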
This example has multiple print( ) calls, which
will cause performance degradation with $|=1. It
also uses too many backslashes. This makes the code less readable,
and it is more difficult to format the HTML so that it is easily
readable as the script's output. The code below
solves the problems:
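A sketch of such a rewrite, using illustrative markup and a hypothetical subroutine name, puts the whole fragment into a single qq{ } string:

```perl
use strict;
use warnings;

# One print( ) call for the whole fragment. Inside qq{ } the double
# quotes need no backslashes, and the HTML keeps its natural layout.
sub print_at_once {
    print qq{<body bgcolor="black" text="white">
  <h1>Hello</h1>
  <a href="foo.html">foo</a>
</body>
};
}

print_at_once();
```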
You can easily see the difference. Be careful, though, when printing
an <html> tag.
The correct way is:
print qq{<html>
<head></head>
};
You can also try the following:
print qq{
<html>
<head></head>
};
but note that some older browsers expect the first characters after
the headers and empty line to be <html> with
no spaces before the opening left angle bracket.
If there are any other characters, they might not accept the output
as HTML and might print it as plain text. Even if this approach works
with your browser, it might not work with others.
Another approach is to use the here document
style:
print <<EOT;
<html>
<head></head>
EOT
Performance-wise, the qq{ } and here document
styles compile down to exactly the same code, so there should not be
any real difference between them.
Remember that the closing tag of the here document style
(EOT in our example) must be
aligned to the left side of the line, with no spaces or other
characters before it and nothing but a newline after it.
Yet another technique is to pass the arguments to print(
) as a list:
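A minimal sketch of such a call, with hypothetical markup and a hypothetical subroutine name:

```perl
use strict;
use warnings;

# A single print( ) call that receives a list of strings.
# The double-quoted strings still need backslash-escaped quotes.
sub print_list {
    print "<body bgcolor=\"black\" text=\"white\">\n",
          "<h1>",
          "Hello",
          "</h1>\n",
          "<a href=\"foo.html\">foo</a>\n",
          "</body>\n";
}

print_list();
```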
This technique makes fewer print( ) calls but
still suffers from so-called backslashitis
(quotation marks used in HTML need to be prefixed with a backslash).
Single quotes can be used instead:
'<a href="foo.html">foo</a>'
but then how do we insert a variable? The string will need to be
split again:
'<a href="',$foo,'.html">', $foo, '</a>'
This is ugly, but it's a matter of taste. We tend to
use the qq operator:
print qq{<a href="$foo.html">$foo</a>
Some text
<img src="bar.png" alt="bar" width="1" height="1">
};
What if you want to make fewer print( ) calls, but
you don't have the output ready all at once? One
approach is to buffer the output in an array and then print it all
at once:
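As a sketch of this approach (the array name, subroutine name, and markup are hypothetical), the pieces are pushed onto an array as they become available and printed with one call at the end:

```perl
use strict;
use warnings;

# Collect the output in an array as it becomes available,
# then send it with a single print( ) call.
sub print_buffered {
    my @buffer = ();
    push @buffer, qq{<body bgcolor="black" text="white">\n};
    push @buffer, "<h1>Hello</h1>\n";
    # ... other processing could happen between the pushes ...
    push @buffer, qq{<a href="foo.html">foo</a>\n};
    push @buffer, "</body>\n";
    print @buffer;
}

print_buffered();
```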
An even better technique is to pass print( ) a
reference to the string. The print( ) used under
Apache overloads the default CORE::print( ) and
knows that it should automatically dereference any reference passed
to it. Therefore, it's more efficient to pass
strings by reference, as it avoids the overhead of copying.
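A sketch of passing a reference, assuming $r is the mod_perl 1.x request object. The subroutine name and markup are hypothetical. Outside mod_perl, Perl's built-in print( ) does not dereference its arguments, so this is meaningful only with Apache's print( ):

```perl
use strict;
use warnings;

# Build the page in one scalar, then hand print( ) a reference to it.
# Assumption: $r is the mod_perl 1.x request object, whose print( )
# dereferences scalar references. CORE::print would instead output
# something like SCALAR(0x...).
sub send_page {
    my $r = shift;
    my $buffer = qq{<html>\n<head></head>\n};
    $buffer .= qq{<body><h1>Hello</h1></body>\n</html>\n};
    $r->print(\$buffer);
    return;
}
```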
In a benchmark of these techniques, single string print is
clearly the fastest; join, concatenation of
string, array of references to
string, and array of strings are very
close to each other (the results may vary according to the length of
the strings); and print call per string is the
slowest.
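A comparison of this kind can be sketched with the standard Benchmark module. The script below is an illustration only: the test strings and iteration count are made up, it measures plain Perl printing to the null device rather than Apache's print( ), and it leaves out the array-of-references variant, which depends on Apache's dereferencing print( ). Its numbers should not be expected to match any particular server setup.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);
use File::Spec;

# Hypothetical test data: fifteen short strings making up a page.
my @chunks = (
    "<h1>Hello</h1>\n",
    qq{<a href="foo.html">foo</a>\n},
    "<hr>\n",
) x 5;
my $page = join '', @chunks;    # the same page as one prebuilt string

# Each technique prints the same page to the filehandle it is given.
my %techniques = (
    'single string print'     => sub { my $fh = shift; print {$fh} $page },
    'join'                    => sub { my $fh = shift; print {$fh} join '', @chunks },
    'concatenation of string' => sub {
        my $fh = shift;
        my $buffer = '';
        $buffer .= $_ for @chunks;
        print {$fh} $buffer;
    },
    'array of strings'        => sub { my $fh = shift; print {$fh} @chunks },
    'print call per string'   => sub { my $fh = shift; print {$fh} $_ for @chunks },
);

# Discard the output so that the printing itself dominates the timing.
open(my $null, '>', File::Spec->devnull) or die "Cannot open null device: $!";

# Run each technique 100,000 times and print a comparison table.
cmpthese(100_000, {
    map { my $code = $techniques{$_}; ($_ => sub { $code->($null) }) }
        keys %techniques
});
close($null);
```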
Now let's look at the same benchmark, where the
printing was either buffered or not:
First, we see the same picture among different printing techniques.
Second, we can see that the buffered print is always faster, but only
in the case where print( ) is called for each
short string does it have a significant speed impact.
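The effect of buffering on the call-per-string case can be sketched in plain Perl as well. This is an assumption-laden illustration: it uses IO::Handle's autoflush( ) on a handle opened on the null device in place of $|=1 on Apache's tied STDOUT, and the strings and iteration count are hypothetical.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);
use File::Spec;
use IO::Handle;

# Hypothetical test data: fifteen short strings.
my @chunks = (
    "<h1>Hello</h1>\n",
    qq{<a href="foo.html">foo</a>\n},
    "<hr>\n",
) x 5;

# One print( ) call per string, as in the slowest technique.
sub print_each {
    my $fh = shift;
    print {$fh} $_ for @chunks;
}

# Two handles on the null device: one left buffered, one set to
# autoflush, which is what $|=1 does for the selected filehandle.
my $devnull = File::Spec->devnull;
open(my $buffered,   '>', $devnull) or die "Cannot open $devnull: $!";
open(my $unbuffered, '>', $devnull) or die "Cannot open $devnull: $!";
$unbuffered->autoflush(1);

# Run each variant 50,000 times and print a comparison table.
cmpthese(50_000, {
    'buffered'   => sub { print_each($buffered) },
    'unbuffered' => sub { print_each($unbuffered) },
});
```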
Now let's go back to the $|=1
topic. You might still decide to disable buffering, for two reasons:
You use relatively few print( ) calls. You achieve
this by arranging for print( ) statements to print
multiline text, not one line per print( )
statement.
You want your users to see output immediately. If you are about to
produce the results of a database query that might take some time to
complete, you might want users to get some feedback while they are
waiting. Ask yourself whether you prefer getting the output a bit
slower but steadily from the moment you press the Submit button, or
having to watch the "falling stars"
for a while and then getting the whole output at once, even if
it's a few milliseconds faster—assuming the
browser didn't time out during the wait.
An even better solution is to keep buffering enabled and call
$r->rflush( ) to flush the buffers when needed.
This way you can place the first part of the page you are sending in
the buffer and flush it a moment before you perform a lengthy
operation such as a database query. This kills two birds with one
stone: you show some of the data to the user immediately so she
will see that something is actually happening, and you
don't suffer from the performance hit caused by
disabling buffering. Here is an example of such code:
use strict;
use CGI ( );

my $r = shift;    # the Apache request object
my $q = CGI->new;

print $q->header('text/html');
print $q->start_html;
print $q->p("Searching...Please wait");
$r->rflush;       # send what has been printed so far to the client

# imitate a lengthy operation
for (1..5) {
    sleep 1;
}

print $q->p("Done!");
The script prints the beginning of the HTML document along with a
nice request to wait by flushing the output buffer just before it
starts the lengthy operation.
Now let's run the web benchmark and compare
the performance of buffered versus unbuffered printing in the
multi-printing code used in the last benchmark. We are going to use
two identical handlers, the first handler having its
STDOUT stream (tied to the socket) unbuffered. The
code appears in Example 13-7.
Example 13-7. Book/UnBuffered.pm
package Book::UnBuffered;
use strict;
use Apache::Constants qw(:common);

sub handler {
    my $r = shift;
    # Switch off buffering for the duration of the request. The local( )
    # belongs inside the handler: at file scope its effect would end as
    # soon as the module finished loading.
    local $| = 1;
    $r->send_http_header('text/html');
    print "<!DOCTYPE HTML PUBLIC \"-//IETF//DTD HTML//EN\">\n";
    print "<html>\n";
    print " <head>\n";
    print " <title>\n";
    print " Test page\n";
    print " </title>\n";
    print " </head>\n";
    print " <body bgcolor=\"black\" text=\"white\">\n";
    print " <h1> \n";
    print " Test page \n";
    print " </h1>\n";
    print " <a href=\"foo.html\">foo</a>\n" for 1..100;
    print " <hr>\n";
    print " </body>\n";
    print "</html>\n";
    return OK;
}
1;
As you can see, there is not much difference when the overhead of
other processing is added. The difference was more significant when
we benchmarked only the Perl code. In real web requests, a few
percent difference will be felt only if you unbuffer the output and
print thousands of strings one at a time.