6.12. Method Handlers: The Browse and See, Browse and View Example
Let's look at an example of the method-handler
concepts presented in Chapter 4. Suppose you need
to implement a handler that allows browsing the files in the document
root and beneath. Directories should be browsable (so you can move up
and down the directory tree), but files should not be viewable (so
you can see the available files, but you cannot click to view them).
So let's write a simple file browser.
We know what customers are like, so we suspect that the customer will
ask for similar customized modules pretty soon. To avoid having to
duplicate our work later, we decide to start writing a base class
whose methods can easily be overridden as needed. Our base class is
called Apache::BrowseSee.
We start the class by declaring the package and using the
strict pragma:
package Apache::BrowseSee;
use strict;
Next, we import common constants (e.g., OK,
NOT_FOUND, etc.), load the
File::Spec::Functions and
File::Basename modules, and import a few
path-manipulation functions that we are going to use:
use Apache::Constants qw(:common);
use File::Spec::Functions qw(catdir canonpath curdir updir);
use File::Basename 'dirname';
Now let's look at the functions. We start with the
simple constructor:
sub new { bless { }, shift;}
The real entry point, the handler, is prototyped as
($$). The handler starts by instantiating its
object, if it hasn't already been done, and storing
the $r object, so we don't need
to pass it to the functions as an argument:
Next we retrieve the path_info element of the
request record:
$self->{dir} = $r->path_info || '/';
For example, if the request was /browse/foo/bar,
where /browse is the location of the handler,
the path_info element will be
/foo/bar. The default value /
is used when the path is not specified.
Then we reset the entries for dirs and
files:
$self->{dirs} = { };
$self->{files} = { };
This is needed because it's possible that the
$self object is created outside the handler (e.g.,
in the startup file) and may persist between requests.
Now an attempt to fetch the contents of the directory is
made:
eval { $self->fetch( ) };
return NOT_FOUND if $@;
If the fetch( ) method dies, the error message is
assigned to $@ and we return
NOT_FOUND. You may choose to approach it
differently and return an error message explaining what has happened.
You may also want to log the event before returning:
warn($@), return NOT_FOUND if $@;
Normally this shouldn't happen, unless a user messes
with the arguments (something you should always be on the lookout
for, because they will do it).
When the fetch( ) function has completed
successfully, all that's left is to send the HTTP
header and start of the HTML via the head( )
method, render the response, send the end of the HTML via
tail( ),[32] and
finally to return the OK constant to tell the
server that the request has been fully answered:
[32]This could perhaps be
replaced by a templating system. See Appendix D for
more information about the Template Toolkit.
The response is generated by three functions. The head(
) method is a very simple one—it sends the HTTP
header text/html and prints an HTML preamble using
the current directory name as a title:
sub head {
my $self = shift;
$self->{r}->send_http_header("text/html");
print "<html><head><title>Dir: $self->{dir}</title><head><body>";
}
The tail( ) method finishes the HTML document:
sub tail {
my $self = shift;
print "</body></html>";
}
The fetch( ) method reads the contents of the
directory stored in the object's
dir attribute (relative to the document root)
and then sorts the contents into two groups, directories and files:
sub fetch {
my $self = shift;
my $doc_root = Apache->document_root;
my $base_dir = canonpath( catdir($doc_root, $self->{dir}));
my $base_entry = $self->{dir} eq '/' ? '' : $self->{dir};
my $dh = Apache::gensym( );
opendir $dh, $base_dir or die "Cannot open $base_dir: $!";
for (readdir $dh) {
next if $_ eq curdir( ); # usually '.'
my $full_dir = catdir $base_dir, $_;
my $entry = "$base_entry/$_";
if (-d $full_dir) {
if ($_ eq updir( )) { # '..'
$entry = dirname $self->{dir};
next if catdir($base_dir, $entry) eq $doc_root;
}
$self->{dirs}{$_} = $entry;
}
else {
$self->{files}{$_} = $entry;
}
}
closedir $dh;
}
By using canonpath( ), we make sure that nobody
messes with the path_info element, by eliminating
successive slashes and "/."s on Unix and taking
appropriate actions on other operating systems. It's
important to use File::Spec and other
cross-platform functions when developing applications.
While looping through the directory entries, we skip over the current
directory entry using the curdir( ) function
imported from File::Spec::Functions (which is
equivalent to . on Unix) and handle the parent directory entry
specially by matching the updir( ) function (which
is equivalent to .. on Unix). The function dirname(
) gives us the parent directory, and afterward we check
that this directory is different from the document root. If
it's the same, we skip this entry.
Note that since we use the path_info element to
pass the directory relative to the document root, we rely on Apache
to handle the case when users try to mess with the URL and add .. to
reach files they aren't supposed to reach.
Finally, let's look at the render(
)
method:
sub render {
my $self = shift;
print "<p>Current Directory: <i>$self->{dir}</i><br>";
my $location = $self->{r}->location;
print qq{<a href="$location$self->{dirs}{$_}">$_</a><br>}
for sort keys %{ $self->{dirs} || { } };
print qq{$_<br>}
for sort keys %{ $self->{files} || { } };
}
The render( ) method actually takes the files and
directories prepared in the fetch( ) method and
displays them to the user. First the name of the current directory is
displayed, followed by the directories and finally the files. Since
the module should allow browsing of directories, we hyperlink them.
The files aren't linked, since we are in
"see but don't
touch" mode.[33]
[33]In your real code you
should also escape HTML- and URI-unsafe characters in the filenames
(e.g., <, >, &, ", ',
etc.) by using the Apache::Util::escape_html and
Apache::Util::escape_uri functions.
Finally, we finish the package with 1; to make
sure that the module will be successfully loaded. The _
_END_ _ token allows us to put various notes and POD
documentation after the program, where Perl won't
complain about them.
package Apache::BrowseSee;
use strict;
use Apache::Constants qw(:common);
use File::Spec::Functions qw(catdir canonpath curdir updir);
use File::Basename 'dirname';
sub new { bless {}, shift;}
sub handler ($$) {
my($self, $r) = @_;
$self = $self->new unless ref $self;
$self->{r} = $r;
$self->{dir} = $r->path_info || '/';
$self->{dirs} = {};
$self->{files} = {};
eval { $self->fetch( ) };
return NOT_FOUND if $@;
$self->head;
$self->render;
$self->tail;
return OK;
}
sub head {
my $self = shift;
$self->{r}->send_http_header("text/html");
print "<html><head><title>Dir: $self->{dir}</title><head><body>";
}
sub tail {
my $self = shift;
print "</body></html>";
}
sub fetch {
my $self = shift;
my $doc_root = Apache->document_root;
my $base_dir = canonpath( catdir($doc_root, $self->{dir}));
my $base_entry = $self->{dir} eq '/' ? '' : $self->{dir};
my $dh = Apache::gensym( );
opendir $dh, $base_dir or die "Cannot open $base_dir: $!";
for (readdir $dh) {
next if $_ eq curdir( );
my $full_dir = catdir $base_dir, $_;
my $entry = "$base_entry/$_";
if (-d $full_dir) {
if ($_ eq updir( )) {
$entry = dirname $self->{dir};
next if catdir($base_dir, $entry) eq $doc_root;
}
$self->{dirs}{$_} = $entry;
}
else {
$self->{files}{$_} = $entry;
}
}
closedir $dh;
}
sub render {
my $self = shift;
print "Current Directory: <i>$self->{dir}</i><br>";
my $location = $self->{r}->location;
print qq{<a href="$location$self->{dirs}{$_}">$_</a><br>}
for sort keys %{ $self->{dirs} || {} };
print qq{$_<br>}
for sort keys %{ $self->{files} || {} };
}
1;
_ _END_ _
This module should be saved as
Apache/BrowseSee.pm and placed into one of the
directories in @INC. For example, if
/home/httpd/perl is in your
@INC, you can save it in
/home/httpd/perl/Apache/BrowseSee.pm.
To configure
this module, we just add
the following snippet to httpd.conf:
Users accessing the server from /browse can now
browse the contents of your server from the document root and beneath
but cannot view the contents of the files (see Figure 6-2).
Figure 6-2. The files can be browsed but not viewed
Now let's say that as soon as we get the module up
and running, the client comes back and tells us he would like us to
implement a very similar application, except that files should now be
viewable (clickable). This is because later he wants to allow only
authorized users to read the files while letting everybody see what
he has to offer.
We knew that was coming, remember? Since we are lazy and
it's not exciting to write the same code again and
again, we will do the minimum amount of work while still keeping the
client happy. This time we are going to implement the
Apache::BrowseRead module:
package Apache::BrowseRead;
use strict;
use base qw(Apache::BrowseSee);
We place the new module into
Apache/BrowseRead.pm, declare a new package, and
tell Perl that this package inherits from
Apache::BrowseSee using the
base pragma. The last line is roughly equivalent
to:
BEGIN {
require Apache::BrowseSee;
@Apache::BrowseRead::ISA = qw(Apache::BrowseSee);
}
Since this class is going to do the same job as
Apache::BrowseSee, apart from rendering the file
listings differently, all we have to do is override the
render( ) method:
sub render {
my $self = shift;
print "<p>Current Directory: <i>$self->{dir}</i><br>";
my $location = $self->{r}->location;
print qq{<a href="$location$self->{dirs}{$_}">$_</a><br>}
for sort keys %{ $self->{dirs} || { } };
print qq{<a href="$self->{files}{$_}">$_</a><br>}
for sort keys %{ $self->{files} || { } };
}
As you can see, the only difference here is that we link to the real
files now.
We complete the package as usual with 1; and
_ _END_ _:
Now, when accessing files through /read, we can
browse and view the contents of the files (see Figure 6-3). Once we add some
authentication/authorization methods, we will have a server where
everybody can browse, but only privileged users can read.
Figure 6-3. The files can be browsed and read
You might be wondering why you would write a special module to do
something Apache itself can already do for you. First, this was an
example on using method handlers, so we tried to keep it simple while
showing some real code. Second, this example can easily be adapted
and extended—for example, it can handle virtual files that
don't exist on the filesystem but rather are
generated on the fly and/or fetched from the database, and it can
easily be changed to do whatever you (or your
client) want to do, instead of what Apache allows.